WEBVTT

00:00.000 --> 00:10.920
Hello. I'm Luca Basi. I'm a computer science student at the University of

00:10.920 --> 00:18.040
Bologna. I'm doing a scholarship funded by GARR at INFN-CNAF in

00:18.040 --> 00:23.520
Bologna. Today I'll present SciTags, which is an initiative for network

00:23.520 --> 00:29.320
traffic tagging for scientific computing. Let me introduce a little bit

00:29.320 --> 00:34.120
the environment. So we're talking about scientific communities that

00:34.120 --> 00:40.160
move very large amounts of data between small and large data

00:40.160 --> 00:45.240
centers, in particular using the connectivity provided by national

00:45.240 --> 00:50.520
research and education networks. One of these is the Italian one, which is

00:50.520 --> 00:56.640
called GARR. It provides, for example, a whopping 1.6-terabit-per-second

00:56.640 --> 01:03.200
connection between the CNAF data center in Bologna, which is the

01:03.200 --> 01:07.360
national computing center of INFN, the National Institute for Nuclear Physics, and

01:07.360 --> 01:14.120
CERN, which you probably know, in Geneva. And all of this with a

01:14.120 --> 01:19.800
latency of only nine and a half milliseconds, thanks to multi-domain

01:19.800 --> 01:27.920
spectrum sharing. So to deal with this amount of data transfer, we need to better

01:27.920 --> 01:36.240
understand the network traffic and how to optimize it, to ensure

01:36.240 --> 01:47.360
we use the network as effectively as possible. The Worldwide LHC Computing Grid is

01:47.360 --> 01:53.000
a distributed computing infrastructure that supports the storage, distribution

01:53.000 --> 02:00.120
and analysis of the data produced by the Large

02:00.120 --> 02:09.320
Hadron Collider experiments in Geneva. And we have more than 170 sites

02:09.320 --> 02:17.640
around the world. There is the Tier 0, which is the one in Geneva, 14 Tier

02:17.640 --> 02:26.080
1s, one of which is the one in Bologna managed by CNAF, and about 150

02:26.080 --> 02:34.080
Tier 2s around the globe. So it is quite complex to manage the

02:34.080 --> 02:41.480
transfers between all these sites and provide data with low

02:41.480 --> 02:48.800
latency to all physicists around the world. So the SciTags initiative was

02:48.800 --> 02:55.040
born to provide identification of science domains and their

02:55.040 --> 03:01.440
high-level activities at the network level, and this can be done with

03:01.440 --> 03:06.640
standardized packet and flow marking. In particular, this is important for

03:06.640 --> 03:12.640
network providers, to identify the traffic owner and the purpose of

03:12.640 --> 03:17.920
this traffic; for the experiments, to understand how their data flows

03:17.920 --> 03:25.000
perform along the complex network paths; and for sites, so data centers, to

03:25.000 --> 03:33.120
gain visibility into how the different data flows perform across the network.

03:33.120 --> 03:40.960
In particular, SciTags provides two ways to mark network traffic: flow marking

03:40.960 --> 03:48.240
with UDP fireflies, and packet marking using the IPv6 flow

03:48.240 --> 03:55.360
label field in the header. Each flow can be identified with two pieces of

03:55.360 --> 04:04.400
information: the experiment and the activity of the experiment. For example,

04:04.400 --> 04:21.040
if we take the ATLAS experiment, which is one of the major experiments at

04:21.040 --> 04:27.680
CERN and has experiment ID 2, and it is doing a data consolidation activity,

04:27.680 --> 04:36.080
which has ID 4, the header will be set to 130. All of these IDs are

04:36.080 --> 04:42.880
statically mapped at the moment in a flow registry, which is just a JSON file with

04:42.880 --> 04:48.240
this information. So let's talk about the two ways to mark traffic, and let's

04:48.240 --> 04:54.680
start with flow marking with UDP fireflies. Fireflies are just simple UDP

04:54.680 --> 05:04.200
packets in syslog format with a defined JSON body. The packets are intended to be sent to

05:04.200 --> 05:11.240
the same destination host as the traffic, on a specific port, and this packet

05:11.240 --> 05:17.240
is intended to be world-readable, because we want all the R&Es and network

05:17.240 --> 05:24.200
providers that receive this traffic to be able to intercept these UDP fireflies and retrieve information

05:24.200 --> 05:31.880
about the network traffic. The packets could also be sent to specific

05:31.880 --> 05:40.680
regional or global collectors if the sites prefer. The syslog format was chosen because

05:40.680 --> 05:47.640
it makes it easy to reuse already established tools, for example Logstash, and this type of

05:47.640 --> 05:58.760
marking works for IPv4 and IPv6. The content is not limited, as long as it fits in a

05:58.760 --> 06:09.880
single frame. The other way to mark traffic is with the IPv6 flow label in the header. So an

06:09.880 --> 06:16.360
eBPF traffic control program is used to insert this information, the SciTags

06:16.360 --> 06:27.960
information, in the IPv6 flow label. It is a 20-bit-long field: some bits are used for entropy,

06:27.960 --> 06:34.360
so they are random, some for the community or experiment that produced the activity, and

06:34.600 --> 06:42.760
some other bits are used for the activity. Actually, we are proposing this to the IETF

06:43.720 --> 06:51.400
with an RFC, and a link is available if you're curious about the

06:51.800 --> 06:59.480
specification of this part. So now let's see the implementation of SciTags in

06:59.480 --> 07:06.920
StoRM WebDAV, a piece of software used in WLCG. So let me introduce StoRM:

07:06.920 --> 07:13.320
StoRM stands for Storage Resource Manager. It is a grid storage solution for disk-based storage

07:13.320 --> 07:22.120
systems with a POSIX file system, typically a distributed file system such as

07:22.120 --> 07:33.640
GPFS. In particular in WLCG, StoRM is adopted by the Tier 1 at INFN-CNAF and by some

07:33.640 --> 07:41.480
other Tier 2s. In particular, StoRM is a suite of software, and today we will talk specifically

07:41.480 --> 07:49.000
about StoRM WebDAV, which is the one that provides functionality

07:49.000 --> 07:55.480
through the WebDAV protocol. The WebDAV protocol is an extension of the HTTP protocol that

07:55.480 --> 08:05.240
allows managing files on a remote server. For example, it adds some new HTTP verbs, such as

08:05.240 --> 08:13.560
MKCOL to create a directory, or COPY to copy a file remotely. And in particular, StoRM

08:13.560 --> 08:23.000
WebDAV implements an extension of the COPY verb of WebDAV, so an extension of an extension,

08:23.000 --> 08:30.920
that is called third-party copy. And this is a really useful functionality because

08:30.920 --> 08:38.200
it permits transferring data between data centers without actually making local copies. So you can

08:38.200 --> 08:45.240
say to a remote server: copy the file that you have locally to this other remote endpoint, or

08:45.240 --> 08:53.800
get this file from this remote endpoint and copy it to your own storage. So that is the

08:53.800 --> 09:02.760
most used functionality to transfer data between the various data centers in WLCG.
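
NOTE The two third-party copy directions described above can be sketched roughly as follows. This is only an illustration, not StoRM WebDAV code: the function name and the example.org endpoints are hypothetical, the Destination header comes from the standard WebDAV COPY method, and the Source header for pull mode is an assumption that is not stated in the talk. The request is built but never sent.
```python
from urllib.request import Request
def build_tpc_request(server_url: str, remote_url: str, mode: str = "push") -> Request:
    """Build (without sending) a COPY request for a third-party copy."""
    if mode == "push":
        # Push: the contacted server copies its local file to the remote endpoint.
        header = "Destination"
    elif mode == "pull":
        # Pull: the contacted server fetches the file from the remote endpoint.
        # The "Source" header name is an assumption, not taken from the talk.
        header = "Source"
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return Request(server_url, method="COPY", headers={header: remote_url})
# Hypothetical endpoints, for illustration only.
push = build_tpc_request("https://server-a.example.org/data/file", "https://server-b.example.org/data/file")
pull = build_tpc_request("https://server-a.example.org/data/file", "https://server-b.example.org/data/file", mode="pull")
```
In both cases the client only talks to one server, and the data moves directly between the two storage endpoints.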

09:02.760 --> 09:11.400
The latest release of StoRM WebDAV has added support for the SciTags header.

09:11.400 --> 09:20.200
In particular, it relies on flowd. flowd is a daemon that has different plugins to retrieve

09:20.200 --> 09:26.760
the flow information and a set of backends to mark the traffic. In particular, StoRM WebDAV

09:26.760 --> 09:36.200
uses the np_api plugin, the one that receives flow identifiers through a named pipe, and

09:36.200 --> 09:45.560
sends the UDP fireflies using flowd. So StoRM WebDAV writes a line to a specific file in this

09:45.560 --> 09:54.360
form, where the state is the start or end of the flow; the protocol, so TCP or UDP;

09:54.360 --> 10:03.320
the source IP and port; the destination IP and port; and the information extracted from the SciTags header,

10:03.320 --> 10:11.160
so the experiment and the activity ID. So let's see how third-party copy in push mode,

10:11.160 --> 10:19.960
so "copy this file that you have locally to this remote server", works using the SciTags header.
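
NOTE The line written for flowd, with the fields in the order just listed, might be assembled along these lines. This is a sketch under assumptions: the space separator, the lowercase spellings of the state and protocol values, and the names FlowEvent and format_flow_line are all hypothetical and are not taken from flowd or StoRM WebDAV.
```python
from dataclasses import dataclass
@dataclass(frozen=True)
class FlowEvent:
    state: str          # "start" or "end" of the flow
    protocol: str       # "tcp" or "udp"
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    experiment_id: int  # extracted from the SciTags header
    activity_id: int    # extracted from the SciTags header
def format_flow_line(event: FlowEvent) -> str:
    """Render one flow event as a single text line (separator is an assumption)."""
    if event.state not in ("start", "end"):
        raise ValueError(f"unexpected state: {event.state!r}")
    if event.protocol not in ("tcp", "udp"):
        raise ValueError(f"unexpected protocol: {event.protocol!r}")
    fields = (event.state, event.protocol, event.src_ip, event.src_port,
              event.dst_ip, event.dst_port, event.experiment_id, event.activity_id)
    return " ".join(str(field) for field in fields) + "\n"
# Documentation addresses and the ATLAS example IDs from the talk (experiment 2, activity 4).
line = format_flow_line(FlowEvent("start", "tcp", "2001:db8::1", 40000, "2001:db8::2", 443, 2, 4))
```
One line would be written when the transfer starts and another, with the state set to end, when it finishes.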

10:19.960 --> 10:27.960
The client sends an HTTP COPY request specifying a remote destination. In this case, we are sending

10:27.960 --> 10:36.920
this request to server A, saying: copy this file to destination B. And the request includes

10:36.920 --> 10:47.480
the SciTags header, in this case 65, for example. StoRM WebDAV receives the request, starts

10:47.480 --> 10:55.640
doing the HTTP PUT request to server B, as extracted from the HTTP COPY request, and at the same

10:55.640 --> 11:04.120
time sends the information about the activity ID and experiment ID to flowd, the daemon, which sends

11:04.120 --> 11:12.120
the UDP fireflies. The transfer continues, and when the transfer ends, StoRM WebDAV communicates

11:12.120 --> 11:21.160
the end of the network flow to flowd, and flowd sends the UDP firefly containing the information

11:21.160 --> 11:31.480
that the transfer of the file has ended. So to sum up, SciTags permits marking packets

11:31.480 --> 11:37.160
and network flows with information about the experiment and activity responsible for the network

11:37.160 --> 11:44.520
traffic. The latest release of StoRM WebDAV has added support for passing on

11:44.520 --> 11:53.080
this information. And a port of flowd is in development, since at the moment it is written

11:53.080 --> 12:06.040
in Python, and we are also exploring the use of other fields in the IPv6 header instead of the

12:06.040 --> 12:13.080
flow label. We are trying the destination options and hop-by-hop options as alternatives. So thank

12:13.080 --> 12:19.080
you for your attention. Here you can find the website of the SciTags initiative. There is a mailing list.

12:19.080 --> 12:25.800
We are open to contributions from other types of scientific experiments. At the moment, we are

12:25.800 --> 12:30.840
mainly high-energy physics, because this initiative was born at CERN, but we are open to

12:32.120 --> 12:39.000
other types of scientific experiments. Here are the source code of StoRM WebDAV and all the SciTags

12:39.000 --> 12:45.720
contributors. So thank you. And if there are any questions...

