WEBVTT

00:00.000 --> 00:22.760
Thank you very much to all of you for being here, and to the organizers for giving me the opportunity

00:22.760 --> 00:32.060
to present this work. So I'm interested. I'm going to talk about this joint work with Professor

00:32.060 --> 00:39.940
Tomás Recio at the Universidad Antonio de Nebrija in Madrid. So we are going to talk about

00:39.940 --> 00:46.460
open science, about research software, about research data, in three steps: definition,

00:46.460 --> 00:52.860
dissemination, evaluation, and then we will go to the conundrum challenges. So we

00:52.860 --> 01:01.060
start with open science. At some point in our work with Tomás, we were finding people saying,

01:01.060 --> 01:11.460
"I don't know very well what open science means." Maybe we can say something. So this is the proposition

01:11.460 --> 01:18.960
of the definition that we propose. Open science is the political and legal framework where

01:18.960 --> 01:26.820
research outputs are shared and disseminated in order to be rendered visible, accessible,

01:26.820 --> 01:34.840
and reusable. As you can see, I always have a lot of information on each page. Please don't

01:34.840 --> 01:45.600
try to read all of it. You have the slides on the website. For each page, I will go to

01:45.600 --> 01:53.520
the main points in order to keep the main ideas, the main information. Then you can go read

01:53.520 --> 02:03.480
the slides when you have time. Please come back to me if you have any questions. This work is not only

02:03.480 --> 02:11.160
about proposing a definition, but also about extracting a landscape of information that is sometimes

02:11.160 --> 02:16.680
a bit complex. And this is much easier to see in this poster. You have copies here,

02:16.680 --> 02:24.720
available for you, if you would like one. So in the poster, you can see what has motivated

02:24.720 --> 02:33.480
us to propose a definition; the definition is again here. This definition stands

02:33.480 --> 02:43.080
on three pillars: the definition of open access to publications, the free software definition,

02:43.080 --> 02:51.320
and something about data that I would rather not go into because it is too complex. The definition

02:51.400 --> 02:57.600
of free software is very important for us because when I started to be confronted with the

02:57.600 --> 03:06.040
open science world at the European level, I said, well, these people never ask, never say

03:06.040 --> 03:10.960
anything about licenses. And the license is very important. The role of licenses is very

03:11.040 --> 03:22.520
important. This is one of the first legal points in this proposition. Here we did studies on

03:22.520 --> 03:30.120
the political context at the European level; for France and Spain, we did studies on political

03:30.200 --> 03:39.480
issues. We can no longer talk about open science without mentioning the UNESCO

03:39.480 --> 03:47.880
work. The UNESCO work started in 2019. You know very well that there was COVID, so that

03:47.880 --> 03:55.720
changed a lot. There were some people thinking about this question. There was a preliminary

03:55.800 --> 04:05.800
report. This is why we did a first version of our work to send our work to the committees

04:05.800 --> 04:15.640
that were working on this. Then there is the final report. This speaks to every country

04:15.720 --> 04:23.720
about all scientific areas. This is the first work I started to write with Tomás in 2018, but

04:27.720 --> 04:37.720
this is work I have been doing in my lab for a long time. When my management asked me to take

04:37.800 --> 04:47.080
care of the research software in my lab, I started to go to my colleagues and ask, what are

04:47.080 --> 04:54.760
you doing, what research software are you producing? The vocabulary did not exist yet at this point. So: what

04:54.760 --> 05:01.640
are you doing as software? People were surprised and they had two questions: is what I do software?

05:01.720 --> 05:11.160
Is what I do software of the lab? And both questions had, and still have, positive answers.

05:12.840 --> 05:20.600
So with Tomás, I proposed a first definition in the context of a research lab

05:20.600 --> 05:28.280
in France, and then with Tomás, we reconsidered this formulation, and now we talk about

05:28.360 --> 05:34.040
a research team, maybe international. So this is the formulation we propose.

05:34.040 --> 05:41.800
Research software is a well-identified set of code that has been written by a well-identified

05:41.800 --> 05:48.680
research team. It is software that has been built and used to produce a result,

05:48.680 --> 05:56.280
published or disseminated in some article or scientific contribution. We are in a digital

05:56.280 --> 06:04.280
world, so a research software is in fact a set of files that contain the source code,

06:04.280 --> 06:12.680
maybe the compiled code. There may be many other things, like documentation, like a license,

06:12.680 --> 06:20.840
examples of use, videos; you can put many things in there.

06:21.640 --> 06:29.400
So again, when I was working in my lab, and also in a project at the French national level,

06:30.440 --> 06:38.360
some people were disseminating their software correctly, with a license. Most people,

06:38.440 --> 06:44.120
a lot of people were saying, no, no, my software is already free. It is there, you can do whatever

06:46.120 --> 06:53.160
you like, why do you need a license? In fact, if you don't have a license, your software is not free,

06:53.640 --> 07:02.200
and this was a message I have been working a lot to send to the community. So please choose a license

07:02.280 --> 07:08.760
and put the license in the software before distributing it. And maybe you should talk

07:08.760 --> 07:17.560
with the heads of the laboratory or the heads of the institutions. So, as I told you, I have done a lot

07:17.560 --> 07:26.520
of training sessions, I am not the only one, obviously. This has a nice result, I think,

07:27.480 --> 07:35.400
there has been a recent national survey in France. We saw that, of the software listed,

07:36.920 --> 07:46.760
80% now have a license. This was not the situation a long time ago. So things

07:46.760 --> 07:51.960
are going well, but the information, the training should keep going.

07:52.920 --> 08:02.920
So this is the work I started to do with Tomás in 2018, because the question is, we are doing

08:02.920 --> 08:12.200
a lot of software in our research laboratories, but what role does this software have in the evaluation

08:13.240 --> 08:19.640
step? Because in research, when you evaluate something, you look at the publications, the articles.

08:20.280 --> 08:31.080
So we proposed this, the CDUR protocol, with four steps. The first one is the

08:31.080 --> 08:36.600
citation, the second one is dissemination, the third one is use, and the fourth one is

08:36.600 --> 08:42.440
research. The first one is citation; as you have already seen in this room today,

08:42.440 --> 08:48.680
citation is very important, but it is also the way to identify a research software

08:48.760 --> 08:56.280
as a research output: you know who has done it, which team has participated

08:56.280 --> 09:04.280
in the output, and the legal point is that this is where you deal with copyright issues.

09:06.120 --> 09:16.520
Then dissemination is where the evaluation committee can check whether the software is

09:16.520 --> 09:22.280
disseminated correctly following a dissemination protocol, like the one we have proposed here,

09:23.480 --> 09:32.280
this corresponds maybe to open science policies, and the legal point is the license.

09:33.560 --> 09:39.480
The use step is where you concentrate all the aspects related to the software,

09:40.440 --> 09:45.640
quality, etc, and you can go as far as you like, and this is the point that we have

09:45.640 --> 09:51.080
associated with the reproducibility of the research results, and the fourth point is research.

09:52.760 --> 10:00.440
So, evaluation of research, as you saw now. Now, how do we go from software to data?

10:02.120 --> 10:07.480
One of the things that is important to realize for us is that we have never tried to say what

10:07.560 --> 10:15.640
software is. We have adopted the legal vision, because of the importance of authors' rights;

10:18.040 --> 10:27.800
can we do the same with data? Unfortunately, we cannot do something similar with data, because

10:27.800 --> 10:35.080
data in itself does not fall under any specific legal regime. There is personal data,

10:35.080 --> 10:45.400
there is geographical data, environmental data, but there is no legal status for data.

10:47.000 --> 10:55.160
But this is not a problem for proposing a definition of research data. As this Knowledge Exchange

10:55.160 --> 11:06.840
report says, anyway, data may not be protected, and anyway, you should check on a case by case

11:06.840 --> 11:13.720
basis. And not only from the legal point of view, also from the scientific point of view,

11:13.720 --> 11:22.840
and from the technical point of view, the people that give the data away. So, as I told you,

11:23.720 --> 11:30.600
we don't say what data is, but we can say what research data is, and we have a similar formulation

11:31.320 --> 11:38.520
to the one proposed for research software. So, research data is a well-identified set of data that

11:38.520 --> 11:45.960
has been produced (we are using all this terminology) by a research team; data that has been collected,

11:45.960 --> 11:52.600
processed, in order to produce a result published or disseminated in some contribution.

11:53.720 --> 12:00.040
Again, we are in a digital world, we have a set of files containing the data, and maybe many other

12:00.040 --> 12:08.600
things. One of the characteristics is that data is going to be manipulated with software, and this

12:08.680 --> 12:17.080
software may or may not be research software. So, the dissemination procedure: now we can say,

12:17.080 --> 12:25.320
yes, we can use the same dissemination procedure; what we propose is always very flexible,

12:25.320 --> 12:33.640
it can be adapted to each case. It is the same, and there are two little differences,

12:33.640 --> 12:43.560
one is that, in this step, you will check the authorship, whether there is any or not;

12:44.280 --> 12:51.640
maybe there are other legal contexts. Here, you say, well, maybe, let's say, software,

12:51.640 --> 12:59.000
sorry, software and data are not the same object. Software is something that you write; data is

12:59.000 --> 13:04.040
something that you collect, so you will certainly not deal with the objects in the same way.

13:04.760 --> 13:10.760
And it is the same for the evaluation protocol that we propose: the steps are the same,

13:10.760 --> 13:19.000
there are two differences, exactly the same ones: the legal context, to be checked before dissemination;

13:19.000 --> 13:28.200
you should be aware of the legal context of your data. For use, we concentrate in this

13:28.200 --> 13:35.320
step all the quality aspects related to data, et cetera, but the framework is absolutely the same.

13:36.040 --> 13:44.120
So, the conundrum challenges: this is work by Christine Borgman in 2012,

13:45.160 --> 13:49.800
that is, she is a researcher very well known in the research data world;

13:50.520 --> 13:57.000
as she said, data sharing is a conundrum. The challenges are to understand which data might be

13:57.000 --> 14:04.120
shared, by whom, with whom, under what conditions, why, and to what effects. Answers will

14:04.120 --> 14:13.320
inform data policy and practice. So, here are the answers we try to provide

14:14.920 --> 14:21.080
to these questions. Which data is to be shared, and by whom: this is the research team;

14:21.160 --> 14:25.480
the research team takes the decision of which outputs are to be shared.

14:28.040 --> 14:34.920
How? Following a dissemination procedure, like the one we have proposed. Where? There are many places;

14:34.920 --> 14:43.240
you can find them here. With whom? This is an interesting question, because, as Christine Borgman

14:43.240 --> 14:51.000
already says, intended users may vary from researchers, in a very specific domain

14:51.640 --> 14:56.360
to the general public, and between these two extremes, you can have everything.

14:57.960 --> 15:04.360
Under what conditions? The license: the license gives the sharing conditions. Why:

15:05.400 --> 15:10.440
there are many reasons: maybe open science policies, maybe research evaluation.

15:11.720 --> 15:18.040
The conundrum challenges for research software: the questions in Borgman's

15:18.040 --> 15:26.280
work were for research data, and now we have said, this year, with Tomás: well, what about the conundrum

15:26.280 --> 15:32.680
challenges for research software? In fact, the questions are obviously the same, and the answers are

15:32.680 --> 15:43.400
pretty similar. I am not going to go into detail, because we don't have much time. This is, as for a lot of

15:43.400 --> 15:51.320
work, if you would like more information. So we have spoken about open science,

15:51.320 --> 15:57.480
about research software, about research data, in three steps, definition, dissemination, evaluation.

15:57.480 --> 16:05.320
We have used the information we have from research software, and we have been able to apply

16:05.320 --> 16:12.920
this knowledge; in comparing these two objects, we have been able to use this information for

16:12.920 --> 16:20.840
research data and to say what is not similar. For the conundrum challenges, it is the other way around.

16:22.440 --> 16:29.480
Thus, we have built a framework to study research software, to understand and explain them,

16:30.440 --> 16:36.360
and to promote their contribution to open science. It has been constructed in three stages,

16:36.360 --> 16:41.880
definition, dissemination, evaluation. We have proposed a similar framework for research data,

16:43.880 --> 16:50.360
in this context, we have been able to produce answers to Borgman's conundrum challenges

16:51.000 --> 16:59.000
for research data, and for research software. So that's it. Thank you very much for your attention.

16:59.000 --> 17:21.000
If you would like to take a copy of the poster, please. Okay, I would like, I would like to

17:29.720 --> 17:32.280
take a copy of the poster.

17:38.120 --> 17:44.200
Excuse me, I can't hear you very well, it's too noisy, sorry. Please.

17:48.520 --> 17:48.760
Yes?

17:59.080 --> 18:09.080
I mean, when we proposed this formulation, there were some people saying, excuse me.

18:10.520 --> 18:18.600
So the question is, other people propose definitions of open science with

18:21.800 --> 18:27.800
the process of science, not just the outputs.

18:29.960 --> 18:39.000
There are many different considerations. There was someone when we were preparing this

18:39.000 --> 18:46.280
work, saying, more or less, do people don't know what you like?

18:47.560 --> 18:55.560
Well, what I like is that research outputs are this way. But it is not just me.

18:56.840 --> 19:03.320
It's the scientists I know, that I have been working with all my life: they like that their work

19:03.400 --> 19:10.440
is visible, accessible, reusable. This is what I understand about the scientific community where

19:10.440 --> 19:17.880
I have been for a long time, and not only I, obviously: Tomás agrees, and Tomás is a professor

19:17.880 --> 19:25.960
who is retired and has experience in many scientific areas. So, it's not just about the

19:25.960 --> 19:34.920
outputs, it's about the political and legal framework. And this is not just at

19:34.920 --> 19:41.720
the level of a laboratory; it is at the level of the institution, at the level of a country or the European

19:41.720 --> 19:51.800
Commission. I have always been working on questions related to intellectual property. When I

19:51.880 --> 19:59.320
started to work on research software, the questions were: who is the author of the software,

19:59.320 --> 20:07.960
how does this work, who decides the license. So, for me, it was important to clarify the legal

20:07.960 --> 20:20.360
questions. As for processes, I am not particularly confronted with this kind of work. So, I understand that other

20:20.440 --> 20:31.480
people have other views, it's normal. That is fine. But my idea is to simplify the landscape

20:31.480 --> 20:38.920
of open science. Before this work, it was very complex. I cannot go and talk with the people in my

20:38.920 --> 20:47.160
lab saying, oh, look at this, open science: they have no time, this is too complex. So, you

20:47.800 --> 20:54.920
should give something clear, simple, that is possible to take in one second.

21:06.360 --> 21:07.160
Thank you very much.

