WEBVTT

00:00.000 --> 00:13.000
OK, welcome to our next session about faster and lighter container image updating.

00:13.000 --> 00:20.000
I'm happy to introduce D4C.

00:20.000 --> 00:28.000
In this session, I will show how D4C leverages three things for faster and lighter container

00:28.000 --> 00:31.000
image updating.

00:31.000 --> 00:37.000
Some products, like IoT devices, now use containers in low-bandwidth networks.

00:37.000 --> 00:43.000
To start or update containers, users need to pull images.

00:43.000 --> 00:51.640
Container images are transferred over cellular or ISP networks, but such

00:51.640 --> 00:58.200
networks sometimes have only limited bandwidth.

00:58.200 --> 01:08.680
If you want to use more applications or libraries, the size of the container images increases.

01:08.680 --> 01:15.680
Image size causes some problems when updating over low-bandwidth networks.

01:15.680 --> 01:25.240
It causes network congestion or cost increases, and it takes much time to provision an

01:25.240 --> 01:26.240
updated image.

01:26.240 --> 01:28.640
So, provisioning takes too much time.

01:28.640 --> 01:33.840
So, a lighter and faster way of image updating is required.

01:33.840 --> 01:42.640
Current container images are layer-based, but some research has already pointed out that the layer-based

01:42.640 --> 01:49.200
format has limitations in providing efficient updates.

01:49.200 --> 01:57.240
This figure shows the time to update an image, for example PostgreSQL from 13.1 to

01:57.240 --> 02:06.840
13.2. Pulling the image takes much time, especially in low-bandwidth networks.

02:06.840 --> 02:16.280
Approaches to improve the time to update images have already been proposed, like lazy

02:16.280 --> 02:17.280
pulling.

02:17.280 --> 02:28.280
Like Starlight or eStargz, they fetch only required files or chunks first; as shown here, the

02:28.280 --> 02:35.120
server sends only new or updated files to the client.

02:35.120 --> 02:43.280
But these works still have room to reduce the data size for image updating.

02:44.280 --> 02:51.320
File-oriented approaches cannot handle partial modifications of files

02:51.320 --> 02:53.280
efficiently.

02:53.280 --> 03:00.960
To handle such modifications, I developed D4C, which uses delta encoding, which

03:00.960 --> 03:08.280
can handle such partial modifications as a delta.

03:08.280 --> 03:18.280
To provide lightweight updating, D4C takes a delta-encoding-based image updating approach.

03:18.280 --> 03:27.280
In D4C, the server generates a delta between the old image and the new image.

03:27.280 --> 03:34.280
This delta is bundled into one content called an update bundle.

03:34.280 --> 03:41.280
Then the client requests this update bundle, receives it, and applies the delta to the old

03:41.280 --> 03:42.280
image.

03:42.280 --> 03:50.280
And then the client can launch a container from the new image.
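
The flow just described can be sketched with a toy copy/insert delta format. This is an illustrative Python sketch, not D4C's actual encoder (D4C uses bsdiff and xdelta3): the server encodes the new file against the old one, and the client rebuilds the new file from its old copy plus the delta.

```python
# Toy delta format: the server generates a delta, the client applies it.
# Only the "insert" payloads (the changed bytes) need to cross the network.
from difflib import SequenceMatcher

def generate_delta(old: bytes, new: bytes) -> list:
    """Server side: encode `new` as copy/insert operations against `old`."""
    ops = []
    for tag, i1, i2, j1, j2 in SequenceMatcher(a=old, b=new).get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))        # reuse bytes the client has
        else:
            ops.append(("insert", new[j1:j2]))  # ship only the changed bytes
    return ops

def apply_delta(old: bytes, ops: list) -> bytes:
    """Client side: rebuild the new file from the old file plus the delta."""
    out = bytearray()
    for op in ops:
        out += old[op[1]:op[2]] if op[0] == "copy" else op[1]
    return bytes(out)

old = b"libfoo 1.0.0 " * 1000
new = old.replace(b"1.0.0", b"1.0.1")
delta = generate_delta(old, new)
assert apply_delta(old, delta) == new
```

Real delta encoders do the same job far more compactly on binary data; the sketch only shows the division of work between server and client.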

03:50.280 --> 03:56.280
This figure shows the components of D4C. The server generates some deltas in

03:56.280 --> 04:00.280
advance and stores them.

04:00.280 --> 04:08.280
When a client requests an update bundle, the server sends the stored deltas, or

04:08.280 --> 04:13.280
merges deltas as required.

04:13.280 --> 04:20.280
Then the client receives the update bundle and starts a container with a snapshotter plugin and

04:20.280 --> 04:23.280
a delta-applying file system called di3fs.

04:23.280 --> 04:28.280
D4C uses a merge strategy.

04:28.280 --> 04:43.280
This is a technique to generate deltas as quickly as possible.

04:43.280 --> 04:49.280
In D4C, delta generation is performed in a file-oriented approach.

04:49.280 --> 04:58.280
If a file is updated, the delta encoding algorithm generates a delta.

04:58.280 --> 05:07.280
If a file is newly created, D4C compresses the file and bundles it into

05:07.280 --> 05:08.280
the update bundle.

05:08.280 --> 05:11.280
And containers require a manifest and configs.

05:11.280 --> 05:21.280
So, the updated manifest and configs are also bundled into the update bundle.
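
A minimal sketch of assembling such an update bundle, assuming a hypothetical archive layout (the real D4C bundle format may differ): deltas for updated files, whole compressed content for new files, plus the updated manifest and config, all in one archive.

```python
import io
import tarfile

def build_update_bundle(deltas: dict, new_files: dict,
                        manifest: bytes, config: bytes) -> bytes:
    """Pack deltas, new files, and image metadata into one gzipped tar."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        def add(name: str, data: bytes) -> None:
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
        for path, delta in deltas.items():
            add(f"deltas/{path}", delta)      # delta for an updated file
        for path, data in new_files.items():
            add(f"new/{path}", data)          # newly created file, shipped whole
        add("manifest.json", manifest)        # updated image manifest
        add("config.json", config)            # updated image config
    return buf.getvalue()

bundle = build_update_bundle(
    deltas={"usr/lib/libfoo.so": b"<delta bytes>"},
    new_files={"usr/bin/newtool": b"<file bytes>"},
    manifest=b"{}", config=b"{}",
)
```

The member names (`deltas/`, `new/`, `manifest.json`, `config.json`) are illustrative assumptions, not D4C's documented layout.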

05:21.280 --> 05:28.280
As I mentioned in the previous slides, D4C uses delta encoding to get better

05:28.280 --> 05:29.280
compression.

05:29.280 --> 05:35.280
But to utilize delta encoding, we need to solve an issue.

05:35.280 --> 05:39.280
The issue is that generating deltas takes much time.

05:39.280 --> 05:47.280
This figure shows the compression ratio and the time to generate deltas.

05:47.280 --> 05:57.280
bsdiff generates better-compressed deltas, but it takes much time to generate them.

05:57.280 --> 06:02.280
And as the file size increases, the time also increases.

06:02.280 --> 06:06.280
Longer generation time increases the overall update time.

06:06.280 --> 06:11.280
So, we need to answer a question:

06:11.280 --> 06:16.280
how can we provide compressed deltas quickly?

06:16.280 --> 06:26.280
As a solution, D4C's delta generation approach is to generate some deltas in advance

06:26.280 --> 06:28.280
and merge them on request.

06:29.280 --> 06:37.280
I found that merging deltas does not take much time compared to generating a delta from scratch.

06:37.280 --> 06:43.280
In this method, the delta between version i and version i+1 is generated in advance,

06:43.280 --> 06:49.280
and deltas are merged when a request arrives.

06:49.280 --> 06:55.280
Suppose a client requests an update

06:55.280 --> 06:57.280
from version 0 to version 1.

06:57.280 --> 07:02.280
The server has already generated the corresponding delta, so it simply responds with it.

07:02.280 --> 07:08.280
But suppose a client requests the delta between version 1 and version 3.

07:08.280 --> 07:11.280
The server does not have that exact delta.

07:11.280 --> 07:16.280
But the server can provide it by merging the deltas from version 1 to version 2

07:16.280 --> 07:18.280
and from version 2 to version 3.

07:18.280 --> 07:24.280
So, the server merges them and responds with the result.
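
The merge step can be sketched with a toy copy/insert delta format (illustrative only; D4C's real plugins merge bsdiff or xdelta3 deltas). The key point is that two consecutive deltas can be merged into one without materializing the intermediate version: every byte of the intermediate file is traced back to the base file or to literal bytes carried in the first delta.

```python
from difflib import SequenceMatcher

def make_delta(old: bytes, new: bytes):
    """Toy delta: list of ("copy", i1, i2) into `old` or ("lit", data)."""
    return [("copy", i1, i2) if tag == "equal" else ("lit", new[j1:j2])
            for tag, i1, i2, j1, j2 in SequenceMatcher(a=old, b=new).get_opcodes()]

def apply_delta(old: bytes, ops) -> bytes:
    return b"".join(old[op[1]:op[2]] if op[0] == "copy" else op[1] for op in ops)

def merge_deltas(d01, d12):
    """Merge delta(v0->v1) and delta(v1->v2) into delta(v0->v2)."""
    spans, pos = [], 0                  # where each d01 segment lives in v1
    for op in d01:
        n = op[2] - op[1] if op[0] == "copy" else len(op[1])
        spans.append((pos, pos + n, op))
        pos += n
    def v1_slice(a, b):                 # express v1[a:b] over v0 and literals
        out = []
        for s, e, op in spans:
            lo, hi = max(a, s), min(b, e)
            if lo < hi:
                out.append(("copy", op[1] + lo - s, op[1] + hi - s)
                           if op[0] == "copy" else ("lit", op[1][lo - s:hi - s]))
        return out
    merged = []
    for op in d12:
        merged += v1_slice(op[1], op[2]) if op[0] == "copy" else [op]
    return merged

v0 = b"AAAA BBBB CCCC"
v1 = b"AAAA XXXX CCCC"   # v0 -> v1 changes the middle
v2 = b"AAAA XXXX DDDD"   # v1 -> v2 changes the tail
d02 = merge_deltas(make_delta(v0, v1), make_delta(v1, v2))
assert apply_delta(v0, d02) == v2
```

Merging here is pure index arithmetic over the two deltas, which is why it can be much cheaper than re-running the encoder on the full files.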

07:24.280 --> 07:35.280
The current implementation of D4C supports two delta encodings:

07:35.280 --> 07:38.280
bsdiff and xdelta3.

07:38.280 --> 07:45.280
To support multiple delta encodings, D4C provides a plugin system with a simple API.

07:45.280 --> 07:49.280
A plugin needs to implement three functions:

07:49.280 --> 07:52.280
generate, apply, and merge.

07:52.280 --> 07:58.280
Generate simply generates a delta from the base and updated files.

07:58.280 --> 08:06.280
And apply simply applies a delta to the base file and produces the updated file.

08:06.280 --> 08:12.280
The third function, merge, is very important in D4C's delta generation strategy.

08:12.280 --> 08:19.280
It receives two delta files, merges them, and generates one delta.

08:19.280 --> 08:23.280
This is used on the server side.
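
A minimal sketch of such a three-function plugin interface, with hypothetical names (D4C itself is not written in Python, and its real API may differ). The trivial plugin here stores the whole updated file as its "delta", which is inefficient but enough to show how generate, apply, and merge compose.

```python
from typing import Protocol

class DeltaPlugin(Protocol):
    """Hypothetical plugin API: the three functions the talk describes."""
    def generate(self, base: bytes, updated: bytes) -> bytes: ...
    def apply(self, base: bytes, delta: bytes) -> bytes: ...
    def merge(self, lower: bytes, upper: bytes) -> bytes: ...

class WholeFilePlugin:
    """Trivial plugin: the 'delta' is simply the whole updated file."""
    def generate(self, base: bytes, updated: bytes) -> bytes:
        return updated
    def apply(self, base: bytes, delta: bytes) -> bytes:
        return delta
    def merge(self, lower: bytes, upper: bytes) -> bytes:
        return upper  # the later delta fully describes the newest version

# A registry like this would hold real encodings such as bsdiff and xdelta3.
PLUGINS: dict = {"wholefile": WholeFilePlugin()}

p = PLUGINS["wholefile"]
d01 = p.generate(b"v0 bytes", b"v1 bytes")
d12 = p.generate(b"v1 bytes", b"v2 bytes")
assert p.apply(b"v0 bytes", p.merge(d01, d12)) == b"v2 bytes"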

08:23.280 --> 08:31.280
D4C supports two delta encoding programs, bsdiff and xdelta3.

08:31.280 --> 08:35.280
But bsdiff does not provide a merge function.

08:35.280 --> 08:42.280
So, D4C's bsdiff plugin implements the merge function by itself.

08:42.280 --> 08:47.280
xdelta3, on the other hand,

08:47.280 --> 08:55.280
already provides merging, which plays very well with this delta generation strategy.

08:55.280 --> 09:00.600
Next, D4C has di3fs to provide faster delta applying.

09:00.600 --> 09:04.280
D4C applies deltas lazily when a file is opened.

09:04.280 --> 09:07.600
So there is no need to apply all the deltas

09:07.600 --> 09:13.200
to thousands or more files at once.

09:13.200 --> 09:15.840
For example, when a process invokes

09:15.840 --> 09:20.880
readdir or getattr, di3fs

09:20.880 --> 09:26.880
returns the corresponding metadata; the metadata is

09:26.880 --> 09:30.320
embedded in the update bundle.

09:30.320 --> 09:33.400
Then, when the process opens the file,

09:33.400 --> 09:38.880
the delta is applied to the corresponding old file

09:38.880 --> 09:41.760
and the new file is generated.

09:41.760 --> 09:43.880
This file is stored on local storage.

09:43.880 --> 09:47.320
And importantly,

09:47.320 --> 09:50.560
when the process then reads the file,

09:50.560 --> 09:55.200
di3fs simply copies the data contents

09:55.200 --> 10:02.920
from the stored result and returns them to the process.
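
A minimal sketch of this lazy apply-on-first-open behavior, with hypothetical names (the real di3fs is a FUSE file system, not a Python class): the delta is applied only the first time a file is opened, and the patched result is cached on local storage so every later open is a plain read.

```python
import os
import tempfile

class LazyDeltaFS:
    """Toy model of di3fs: patch on first open, then serve from the cache."""
    def __init__(self, base_dir, deltas, apply_delta, cache_dir):
        self.base_dir = base_dir        # contents of the old image
        self.deltas = deltas            # path -> delta bytes from the bundle
        self.apply_delta = apply_delta  # the delta encoding's apply function
        self.cache_dir = cache_dir      # local storage for patched files

    def open(self, path):
        cached = os.path.join(self.cache_dir, path)
        if not os.path.exists(cached):  # first open: apply the delta and cache
            with open(os.path.join(self.base_dir, path), "rb") as f:
                base = f.read()
            os.makedirs(os.path.dirname(cached) or ".", exist_ok=True)
            with open(cached, "wb") as f:
                f.write(self.apply_delta(base, self.deltas[path]))
        return open(cached, "rb")       # later opens: just read the cache

base_dir, cache_dir = tempfile.mkdtemp(), tempfile.mkdtemp()
with open(os.path.join(base_dir, "lib.so"), "wb") as f:
    f.write(b"old contents")
# Toy "apply": here the delta simply carries the whole new file.
fs = LazyDeltaFS(base_dir, {"lib.so": b"new contents"},
                 lambda base, delta: delta, cache_dir)
assert fs.open("lib.so").read() == b"new contents"  # patched on first open
assert fs.open("lib.so").read() == b"new contents"  # served from the cache
```

This mirrors the benchmark result later in the talk: the first open pays the patching cost, and subsequent opens behave like an ordinary file system.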

10:02.920 --> 10:06.040
I implemented D4C for containerd

10:06.040 --> 10:11.360
and performed some evaluations in these environments.

10:11.360 --> 10:16.880
The first is the delta size reduction

10:16.880 --> 10:18.880
for updating images.

10:18.880 --> 10:21.200
This figure shows the delta size

10:21.200 --> 10:23.200
reduction results.

10:23.200 --> 10:26.720
I compared the reduction with file-oriented approaches

10:26.720 --> 10:31.520
by generating a compressed file

10:31.520 --> 10:33.840
instead of a delta file.

10:33.840 --> 10:37.920
It shows that D4C generates deltas

10:37.920 --> 10:43.200
of only 5 to 40% of the size of the file-oriented approaches.

10:43.200 --> 10:46.720
It means that D4C can provide 20 times compression

10:46.720 --> 10:49.200
at most.

10:49.200 --> 10:55.280
Most notably, for the PyTorch update

10:55.280 --> 10:59.720
to 0.1, the file-oriented approaches

10:59.720 --> 11:07.560
produce 6,440 megabytes, but D4C

11:07.560 --> 11:12.800
with bsdiff generates only 33 megabytes.

11:12.800 --> 11:17.960
This is a very large reduction for updating.

11:20.520 --> 11:23.240
And this figure shows the breakdown of the size reduction.

11:23.240 --> 11:29.160
From this figure, we can see a huge reduction

11:29.160 --> 11:33.160
for executable files and shared libraries.

11:33.160 --> 11:38.840
But this blue one, the floating-point

11:38.840 --> 11:42.200
data in PyTorch,

11:42.200 --> 11:46.240
does not reduce so much.

11:46.240 --> 11:49.680
This is because bsdiff is designed for shared libraries

11:49.680 --> 11:50.760
or executable files.

11:50.760 --> 11:54.600
But the floating-point data

11:54.600 --> 11:57.680
is bit-encoded,

11:57.680 --> 12:01.240
so bsdiff cannot handle it well.

12:04.360 --> 12:08.800
I also evaluated the time to generate deltas.

12:08.800 --> 12:16.560
For PyTorch 0.1, it takes about 30 minutes

12:16.560 --> 12:18.960
to generate the delta with bsdiff.

12:18.960 --> 12:22.880
This shows that generating deltas on request is not practical.

12:25.880 --> 12:31.560
On the other hand, the left figure shows the time

12:31.560 --> 12:35.960
to merge two deltas.

12:35.960 --> 12:40.720
With D4C, it takes only 25 seconds.

12:40.720 --> 12:46.720
This is 65 times faster than generating the delta from scratch.

12:46.720 --> 12:51.280
And the size of the merged deltas

12:51.280 --> 12:55.280
and of the deltas generated from scratch

12:55.280 --> 12:57.280
is almost the same.

12:57.280 --> 13:02.600
So it means the merge strategy can provide deltas

13:02.600 --> 13:05.760
much more quickly with no side effects.

13:05.760 --> 13:20.160
Finally, as an end-to-end evaluation,

13:20.160 --> 13:24.800
I measured the time to update images.

13:24.800 --> 13:26.920
This is the overall result.

13:26.920 --> 13:32.120
Notably, for the PyTorch update

13:32.120 --> 13:35.680
from 0.0 to 0.1, the file-oriented approach

13:35.680 --> 13:44.240
took two minutes, but bsdiff takes only 12 seconds.

13:44.240 --> 13:50.240
This is almost 10 times faster updating.

13:50.240 --> 13:53.240
Also, we need to be clear about the performance

13:53.240 --> 13:56.440
impact on applications.

13:56.440 --> 14:00.880
I measured the performance with PostgreSQL and Python.

14:00.880 --> 14:04.800
The first is PostgreSQL, and the left table shows

14:04.800 --> 14:06.080
the result.

14:06.080 --> 14:11.040
Both di3fs and the ordinary file system provide

14:11.040 --> 14:12.840
almost the same results.

14:12.840 --> 14:18.240
And the right is Python.

14:18.240 --> 14:21.360
This benchmark basically runs a Fibonacci code,

14:21.360 --> 14:25.520
which means Python is launched repeatedly.

14:25.520 --> 14:27.800
The Fibonacci performance is

14:28.760 --> 14:32.960
almost the same, but the time to load libraries

14:32.960 --> 14:36.320
showed a different result.

14:36.320 --> 14:38.840
The first time, loading libraries takes much time

14:38.840 --> 14:44.640
with di3fs, but the following launches do not.

14:44.640 --> 14:51.800
This is because di3fs applies the bsdiff delta

14:51.800 --> 14:56.240
at the first launch and caches the result

14:56.240 --> 14:57.440
on local storage.

14:57.440 --> 15:00.480
So the first load takes some time,

15:00.480 --> 15:03.680
but the following ones do not.

15:03.680 --> 15:09.480
So, once the container starts

15:09.480 --> 15:16.480
and is warmed up, the following

15:16.480 --> 15:18.280
performance will not be decreased.

15:22.840 --> 15:26.400
Next, the future work.

15:26.400 --> 15:29.560
The current D4C implementation

15:29.560 --> 15:32.880
is just a proof of concept and lacks many features.

15:32.880 --> 15:35.960
I need to implement them,

15:35.960 --> 15:39.920
and a more important issue is how to decide

15:39.920 --> 15:42.240
which deltas to generate in advance.

15:42.240 --> 15:44.960
Currently, it is decided by the operators,

15:44.960 --> 15:49.640
but I think there are some better approaches to decide

15:49.640 --> 15:54.640
which deltas to generate in advance.

15:54.640 --> 15:58.200
As a larger improvement, I'm seeking a combination

15:58.200 --> 16:03.200
with chunk-based approaches; providing updated chunks

16:03.200 --> 16:05.280
with delta encoding would be beneficial,

16:05.280 --> 16:09.640
but it has the same problem: how to choose the base

16:09.640 --> 16:11.520
and updated chunks to generate deltas.

16:16.320 --> 16:19.160
So, to summarize this presentation:

16:19.160 --> 16:22.520
D4C's objective is reducing the data

16:22.520 --> 16:25.800
size and the time to update container images.

16:25.800 --> 16:29.320
The solution is utilizing delta encoding,

16:29.320 --> 16:32.920
and the evaluation shows 20 times

16:32.920 --> 16:36.960
compression at most, and almost 10 times

16:36.960 --> 16:39.640
faster updating,

16:39.640 --> 16:42.520
compared to the file-oriented approaches.

16:42.520 --> 16:46.640
The next step is implementing more features.

16:46.640 --> 16:48.360
That's all, thank you for listening.

16:48.360 --> 16:55.160
Thanks for your talk.

16:55.160 --> 16:57.360
Are there any questions?

17:09.360 --> 17:11.640
Hi, thank you for your talk.

17:11.640 --> 17:14.040
Can you explain the trade-offs?

17:14.040 --> 17:17.520
Because you are trading off network bandwidth

17:17.520 --> 17:20.280
to upload and download the images,

17:20.280 --> 17:22.000
but what's the exchange?

17:22.000 --> 17:24.360
Like, in the client, do you need more CPU

17:24.360 --> 17:27.560
to understand what needs to be delta-encoded,

17:27.560 --> 17:31.800
or where does the trade-off come from?

17:31.800 --> 17:33.960
It must be somewhere.

17:33.960 --> 17:37.800
Ah, so today's container registries

17:37.800 --> 17:43.080
do not require dedicated servers, and users can simply

17:43.080 --> 17:45.840
pull with a content URL.

17:45.840 --> 17:48.520
But with D4C, the clients need dedicated servers,

17:48.520 --> 17:51.880
and users need to generate the deltas

17:51.880 --> 17:54.720
in advance and manage

17:54.720 --> 17:57.000
them by themselves.

17:57.000 --> 18:02.560
This is one trade-off of my approach.

18:02.560 --> 18:05.200
Yes, you mentioned, or in the graphs,

18:05.200 --> 18:10.200
you showed the trade-offs were either time saving

18:10.200 --> 18:12.920
or data saving.

18:12.920 --> 18:16.760
Is there a sweet spot you have found during the trials?

18:16.760 --> 18:21.480
Is there like a middle way between efficiency

18:21.480 --> 18:25.640
and also the bandwidth savings?

18:25.640 --> 18:29.560
D4C saves both.

18:29.560 --> 18:33.400
It is designed to save both time and data size.

18:33.400 --> 18:38.680
And both can be achieved

18:38.680 --> 18:42.480
if the deltas are generated in advance

18:42.480 --> 18:49.280
and the users operate the server by themselves.

18:49.280 --> 18:54.960
So if the corresponding delta for an update

18:54.960 --> 18:58.800
has not been generated, or does not exist,

18:58.800 --> 19:02.040
the server needs to pull the image from the registry

19:02.040 --> 19:05.160
and generate the delta from scratch.

19:05.160 --> 19:11.400
So in such a case, the time increases a lot.

19:11.400 --> 19:17.560
But the data size will not increase.

19:17.560 --> 19:19.240
OK, thank you very much.

19:19.240 --> 19:20.240
Thank you.

