WEBVTT

00:00.000 --> 00:13.400
So, our next speaker, Hugo, he's going to talk about creating or replicating the Linux

00:13.400 --> 00:20.160
init process using Python, so that's going to be competition for systemd. Enjoy!

00:21.160 --> 00:29.160
Hello everybody, so yes, I'm going to talk to you about creating a custom Linux init

00:29.160 --> 00:37.160
in Python, not with the goal to replace, well, maybe, you will see. So, my name is Hugo

00:37.160 --> 00:42.160
Herter, I'm a software engineer, I have loved Python and Linux since 2003, and I joined FOSDEM

00:42.160 --> 00:49.160
around that time, so I don't know how many editions I've been to, but maybe 19. I really love FOSDEM

00:49.160 --> 00:56.160
and open source everything. Basically, FOSDEM made me, a bit.

00:56.160 --> 01:00.760
So, I'm going to talk about this concept that I discovered with a friend on embedded devices

01:00.760 --> 01:06.560
on a project a few years ago. I found it really cool, and then a few years later I had a

01:06.560 --> 01:12.360
use case that required the same thing for virtual machines in the cloud, and I really

01:12.360 --> 01:16.960
like the concept, so I want to share it with you, so you can also build cool stuff

01:16.960 --> 01:21.960
with this idea. So, I'm not going to sell you a software, I'm just going to show you

01:21.960 --> 01:28.360
how you get there and why you might want to do this. Who here doesn't know what Linux

01:28.360 --> 01:35.760
is or doesn't use Linux? Okay, so small note about Linux, something that's really cool

01:35.760 --> 01:39.960
in this use case is that it's a monolithic kernel, which means all the drivers can be

01:39.960 --> 01:44.560
included, which makes our life much easier when you don't want to manage separate processes

01:44.560 --> 01:55.160
to manage drivers. You have smaller, more modular kernels that make it more

01:55.160 --> 01:59.160
annoying because then you need to actually do more work to have drivers. With Linux, you

01:59.160 --> 02:04.640
have a lot of stuff working built in, so that makes life easier. Typically, when you're

02:04.640 --> 02:10.240
booting a computer, you're booting the UEFI, then the UEFI bootloader,

02:10.240 --> 02:15.640
then the Linux kernel, then the Linux kernel boots the init, which is the first process

02:15.640 --> 02:20.320
that will launch on the machine, and that's the one launching all the apps that you have on

02:20.320 --> 02:27.080
your machine. So, it's the first process started by the Linux kernel during boot. It continues

02:27.080 --> 02:34.920
running until the system is shut down. If it does not, Linux is not happy and crashes.

02:35.920 --> 02:39.920
Sorry. Can you make that go away? Sorry.

02:56.920 --> 02:57.920
Which route?

03:04.920 --> 03:11.320
It's the direct or indirect ancestor of all the other processes. If a process loses

03:11.320 --> 03:17.520
its parent and becomes orphaned, it will now belong to init. It's started using a hard-coded file name,

03:17.520 --> 03:23.520
well, in fact, Linux tries a few in case the first one doesn't work. And it's typically

03:23.520 --> 03:30.920
assigned the process ID number one, so it's PID 1, people call it like this.

03:30.920 --> 03:37.920
You probably know some common Linux init systems: systemd, OpenRC, SysV, there are plenty.

03:37.920 --> 03:43.920
systemd is the one that you have on most machines nowadays, and it does a shitload of stuff.

03:43.920 --> 03:48.920
Some people are happy with and some people think it's too much.

03:48.920 --> 03:55.920
So, why would you want to write your own init? The first reason is because you can,

03:55.920 --> 04:02.920
probably the most important reason. You can also do it to understand your system better or in this case,

04:02.920 --> 04:07.920
in some cases, for performance on a simple system with few tasks. Don't try to compete with

04:07.920 --> 04:12.920
systemd if you're running a lot of stuff on a desktop, for example, but if you have only a few tasks,

04:12.920 --> 04:17.920
in fact, it makes your life simpler when something goes wrong. There are not so many things going on.

04:17.920 --> 04:23.920
So, it also gives you full control of your system. You know exactly what's running, and you can stop everything

04:23.920 --> 04:30.920
very easily. You can also do some environment-specific system initialization.

04:30.920 --> 04:36.920
Something needs Wi-Fi, something needs to wait until a certain condition on the system is ready.

04:36.920 --> 04:41.920
You can absolutely do this. You could also do that because you don't like systemd,

04:41.920 --> 04:46.920
but I showed you a list of alternatives that might be better than writing your own.

04:46.920 --> 04:52.920
So, typically, this is mostly useful if you have virtual machines or containers or embedded systems,

04:52.920 --> 04:57.920
where you're not running a lot of stuff inside.

04:57.920 --> 05:05.920
Virtual machines, so they are secure sandboxes. It's handy because you can launch them without needing root.

05:05.920 --> 05:10.920
So, you can run stuff in virtual machines on any desktop without needing root access,

05:10.920 --> 05:16.920
and you can still run stuff as root inside. So, that can be handy for some use cases.

05:17.920 --> 05:24.920
One of the use cases where I used this was to make, like, ephemeral cloud functions, so running untrusted code in the cloud.

05:24.920 --> 05:31.920
You have tight control of side effects. People speak about WebAssembly and how you can really control all the I/O of WebAssembly.

05:31.920 --> 05:37.920
You can do the same with VMs: development sandboxes, running stuff as root.

05:37.920 --> 05:42.920
Another use case is embedded systems. You want to have full control on what's going on.

05:42.920 --> 05:49.920
You don't want to have so many things running, but sometimes you want to have tight interactions, like, depending on the voltage of the power

05:49.920 --> 05:56.920
supply, you want to run this software or not. You want to log files to disk or not.

05:56.920 --> 06:01.920
You also have full control of the disk writes, because what can write on the disk?

06:01.920 --> 06:09.920
Well, Linux has all its logs in RAM, so only your init and the processes it launches can write stuff on disk.

06:09.920 --> 06:18.920
So sometimes maybe you don't need to do an overlay file system with a read-only mount. You could just not write to disk.

06:18.920 --> 06:28.920
So, I will talk a bit more here about how you can use Linux, Python and Firecracker to experiment with this idea on a PC.

06:28.920 --> 06:35.920
So, who here is not familiar with Python or has never used Python? Good.

06:35.920 --> 06:42.920
So, the good thing with Python here is that it interfaces really well with C and the Linux system.

06:42.920 --> 06:50.920
So, you can access all the Linux APIs very easily and I think you know the rest.

06:50.920 --> 06:56.920
So, to do this, you will need a Linux file system. It can be any distribution. It doesn't matter too much.

06:56.920 --> 07:03.920
The idea is to provide tools that you will want to use during execution and also to install Python in your file system.

07:03.920 --> 07:10.920
So, you can actually launch Python. You have some instructions on the right on how you can create a minimal 200-megabyte file system based on Debian.

07:10.920 --> 07:20.920
And this file system from Debian, in fact, doesn't come with any init, so you need to add your own anyway.

07:20.920 --> 07:29.920
Firecracker is an interesting tool. It was open-sourced by Amazon. It's used by AWS Lambda functions and a few other cloud things they do.

07:29.920 --> 07:36.920
It's fast. It's really fast. That's really cool. It runs VMs in a very simple way.

07:36.920 --> 07:41.920
It doesn't know about USB, it doesn't know about PCI, it doesn't know about older PC stuff.

07:41.920 --> 07:46.920
It just knows about what is relevant inside a VM, and that allows it to go really fast.

07:46.920 --> 07:49.920
It only runs Linux on Linux as well.

07:49.920 --> 07:56.920
QEMU also has something similar called QEMU microvm, which is the same thing based on the QEMU code base.

07:56.920 --> 08:02.920
This is an example of a configuration you can use, so just a JSON configuration you can pass to Firecracker.

08:02.920 --> 08:09.920
To boot your VM, you specify, this is my Linux kernel, this is my root file system, some boot arguments.

08:09.920 --> 08:14.920
And that's about all you need.
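
NOTE
A minimal sketch, not the speaker's slide, of generating the kind of JSON configuration described above from Python. The key names follow Firecracker's config-file format as commonly documented; the helper name, the kernel and rootfs paths, the vCPU count and the memory size are placeholder assumptions.
```python
import json
def firecracker_config(kernel, rootfs, boot_args="console=ttyS0 reboot=k panic=1 pci=off"):
    """Build a minimal Firecracker config: one kernel, one root drive."""
    return {
        "boot-source": {"kernel_image_path": kernel, "boot_args": boot_args},
        "drives": [
            {
                "drive_id": "rootfs",
                "path_on_host": rootfs,
                "is_root_device": True,
                "is_read_only": False,
            }
        ],
        # Assumed sizing, adjust to your workload.
        "machine-config": {"vcpu_count": 1, "mem_size_mib": 128},
    }
# Hypothetical file names, for illustration only.
config_text = json.dumps(firecracker_config("vmlinux.bin", "rootfs.ext4"), indent=2)
```
Saved to a file, this could then be passed with something like `firecracker --api-sock /tmp/firecracker.socket --config-file vm_config.json`.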

08:14.920 --> 08:22.920
And if you just do this, you just launch Firecracker with your configuration file and define where the API socket is.

08:22.920 --> 08:27.920
Well, you didn't put any init, so you can see here that Linux is really gentle and tries:

08:27.920 --> 08:31.920
Okay, can I run /sbin/init? No, I did not find it.

08:31.920 --> 08:38.920
Okay, I'll try another path, another path, and at the end it falls back on a shell, so that's really kind of Linux to do this.

08:38.920 --> 08:45.920
It's also quite cool if you want to experiment or debug stuff, you just get directly a shell.

08:46.920 --> 08:50.920
But that's not our goal, our goal is to use Python.

08:50.920 --> 09:00.920
And in fact, with this shebang line, I think of it like an inscription, with the shebang you can basically just have one file and say, well, in fact, this is an executable file.

09:00.920 --> 09:03.920
And it has to be executed with Python 3.11.

09:03.920 --> 09:12.920
And you just put it in /sbin/init, and it will just launch it, run it, and print hello, as you can see here.

09:12.920 --> 09:30.920
And so the only thing is that, well, it printed hello, and then it exited, and then Linux is unhappy, because the init stopped, and you have a whole stack trace of: I'm not happy, something crashed, this is bad.

09:30.920 --> 09:38.920
So you don't really care, because maybe everything you wanted to do is done already, but it could be handled a little nicer.

09:39.920 --> 09:43.920
So if you want to properly handle this, you may want to shut down the system.

09:43.920 --> 09:51.920
So you could try to do, like, os.system shutdown or halt, but, well, these are in fact tools that talk to the init system.

09:51.920 --> 10:00.920
So they're not present, so you need to tell the Linux kernel to shut down manually, because you don't have any shutdown, halt or reboot.

10:00.920 --> 10:05.920
Those are all provided by systemd or SysV or other init systems.

10:05.920 --> 10:13.920
So you need to get a little bit into magic, and the system calls with magic numbers.

10:13.920 --> 10:20.920
So this is the cheat sheet. These are the three numbers, well, the four numbers you need in order to reboot the Linux system.

10:20.920 --> 10:28.920
So the first one is the syscall for reboot, and you have two magic numbers to make sure that you're really doing what you want.

10:28.920 --> 10:32.920
This one even spells a sentence, like "feel dead".

10:32.920 --> 10:42.920
And the last one is the actual action you want to do, whether it's halt or reboot; there are different magic codes there.

10:42.920 --> 10:53.920
And if you do this, you can actually get a clean shutdown. So here we can see the machine printed hello, and then we have a clean shutdown.
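
NOTE
A minimal sketch of the idea above, not the code from the slides: an init file with a shebang that prints hello and can ask the kernel to shut down. The constants come from linux/reboot.h; the syscall number 169 is the x86_64 value and is an assumption for other architectures, and the function name is hypothetical. Which command makes the hypervisor itself exit can depend on the VMM and boot arguments.
```python
#!/usr/bin/python3.11
import ctypes
import os
# Values from <linux/reboot.h>; the first one reads "feel dead" in hex.
LINUX_REBOOT_MAGIC1 = 0xFEE1DEAD
LINUX_REBOOT_MAGIC2 = 672274793
LINUX_REBOOT_CMD_RESTART = 0x01234567
LINUX_REBOOT_CMD_HALT = 0xCDEF0123
LINUX_REBOOT_CMD_POWER_OFF = 0x4321FEDC
SYS_REBOOT = 169  # assumption: x86_64 syscall number, it differs elsewhere
def kernel_reboot(command):
    """Ask the kernel directly to restart, halt or power off (needs root)."""
    os.sync()  # reboot(2) does not flush pending writes for us
    libc = ctypes.CDLL(None, use_errno=True)
    numbers = (SYS_REBOOT, LINUX_REBOOT_MAGIC1, LINUX_REBOOT_MAGIC2, command)
    args = [ctypes.c_long(value) for value in numbers]
    if libc.syscall(*args, None) != 0:
        errno = ctypes.get_errno()
        raise OSError(errno, os.strerror(errno))
print("hello")
# As PID 1, the last line of the init would be something like:
# kernel_reboot(LINUX_REBOOT_CMD_POWER_OFF)
```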

10:53.920 --> 10:57.920
We can see the same thing here.

10:57.920 --> 11:06.920
This is about the same code except it says hello FOSDEM, and we can just launch it and see how much time it takes.

11:06.920 --> 11:10.920
Okay, that took about 100 milliseconds. That was quite nice.

11:10.920 --> 11:15.920
Starting the VM, starting the Linux kernel, printing hello and shutting it down.

11:15.920 --> 11:21.920
So when I said Firecracker is fast, I mean, it's almost instant.

11:21.920 --> 11:28.920
And that makes it really, really handy. I find it really cool compared to launching standard VMs.

11:28.920 --> 11:40.920
The next thing you might want to do is child process management, to actually be an init that launches other processes. Depending on which school you're from, you may want to do it sync or async.

11:40.920 --> 11:43.920
That's up to you to choose.

11:44.920 --> 11:47.920
And this is how you can do this.
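
NOTE
A minimal sketch of the synchronous flavor, using only the standard library. The function name is hypothetical and the commands are whatever your init needs to start; an asyncio variant would use asyncio.create_subprocess_exec instead.
```python
import subprocess
def run_services(commands):
    """Start every command, then wait for all of them and return exit codes."""
    children = [subprocess.Popen(command) for command in commands]
    return [child.wait() for child in children]
```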

11:47.920 --> 11:52.920
You may also want to do things like handling system tasks, like launching reboots.

11:52.920 --> 11:58.920
So there are also two approaches in the init environment.

11:58.920 --> 12:09.920
With some inits, basically, when you launch shutdown, you just launch a script that will tell Linux to shut down and kill the processes.

12:09.920 --> 12:15.920
Or you have a Unix socket that talks to the init, and the init is doing the proper shutdown itself.

12:15.920 --> 12:19.920
That's what systemd and more advanced init systems are doing.

12:19.920 --> 12:23.920
They don't just let other stuff kill everything.

12:23.920 --> 12:26.920
You may also want to mount file systems.

12:26.920 --> 12:31.920
You can do this by running commands, or you can use libc,

12:31.920 --> 12:36.920
the same way you use the syscalls.

12:36.920 --> 12:40.920
You can call the mount function directly from libc, from Python.

12:40.920 --> 12:42.920
You don't need to install anything to do this.

12:42.920 --> 12:46.920
No pip install is required for anything I'm presenting here.
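
NOTE
A minimal sketch of calling mount(2) through libc with ctypes, standard library only. The wrapper name and its defaults are hypothetical; on a real init it would run as root, for example to mount /proc.
```python
import ctypes
import os
libc = ctypes.CDLL(None, use_errno=True)
def mount(source, target, fstype, flags=0, data=""):
    """Thin wrapper around mount(2); raises OSError when the kernel refuses."""
    result = libc.mount(
        source.encode(),
        target.encode(),
        fstype.encode(),
        ctypes.c_ulong(flags),
        data.encode(),
    )
    if result != 0:
        errno = ctypes.get_errno()
        raise OSError(errno, os.strerror(errno), target)
# In the init (as root), for example:
# mount("proc", "/proc", "proc")
```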

12:46.920 --> 12:54.920
So you will probably want to set up network interfaces, and then loading kernel modules, updating the system clock if you're on an embedded device.

12:54.920 --> 12:57.920
These are all kinds of stuff that you may want to do.

12:57.920 --> 13:02.920
And some examples on how to do them there.

13:02.920 --> 13:12.920
So when you are in a VM, you may want to interface with the host system and there are only a few ways to do this with micro VMs.

13:12.920 --> 13:15.920
The first one is called virtio-vsock.

13:15.920 --> 13:25.920
So it's a socket transport between the VM host and the guest through the virtualization stack provided by Firecracker or QEMU.

13:25.920 --> 13:31.920
And this provides you a Unix socket on the host and a special socket type inside the guest.

13:31.920 --> 13:38.920
That has a number, and it has a specific family: it's not AF_UNIX or TCP, it's AF_VSOCK in this case.

13:38.920 --> 13:47.920
And this is also a very performant way of communication, and you can have bidirectional listeners, clients and servers, on sockets.

13:47.920 --> 13:53.920
You can use network interfaces to have IP addresses and the whole network stack.

13:53.920 --> 13:59.920
You can have a serial console, that's how we have the logs, and you can type back.

13:59.920 --> 14:06.920
Block devices: you can interact with partitions, but that's not ideal for high-throughput reads and writes.

14:06.920 --> 14:15.920
But you could. Like, cloud-init is basically providing configuration by providing an extra block device

14:15.920 --> 14:18.920
that looks like a CD-ROM with all the configuration of the VM in it.

14:18.920 --> 14:21.920
That's what's happening on most cloud platforms.

14:21.920 --> 14:26.920
And because it's a microVM, it has no USB or PCI,

14:26.920 --> 14:32.920
things that are actually used when you want to share directories between the host and the VM.

14:32.920 --> 14:37.920
Often that's using something like virtio, virtual I/O.

14:37.920 --> 14:45.920
There are ways to share that, but it's emulating PCI, which doesn't work in microVMs.

14:45.920 --> 14:49.920
So when you want to do this it looks a little bit like this on the server.

14:49.920 --> 14:52.920
You're just manipulating Unix sockets from Python.

14:52.920 --> 14:55.920
But inside the guest you're manipulating vsock sockets.

14:55.920 --> 15:01.920
And you have a bit of magic numbers like you need to give a number to your sockets.

15:01.920 --> 15:07.920
I chose 52 because that was in the documentation of Firecracker, but you could use any number there.

15:08.920 --> 15:14.920
And you have this slightly weird quirk to select which socket you talk to when you are on the host.

15:14.920 --> 15:21.920
You actually need to first send CONNECT and the number of the socket to actually connect to it.

15:21.920 --> 15:29.920
So this is how you can create a bidirectional connection without any network stack between the host and the VM.
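
NOTE
A minimal sketch of both ends, under the assumption that the VMM exposes the guest's vsock as a Unix socket on the host and expects a "CONNECT port" line first, as described above. Function names and the greeting are hypothetical; port 52 is the number mentioned in the talk.
```python
import socket
def connect_to_guest(uds_path, port):
    """Host side: open the Unix socket and ask for a guest vsock port."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.connect(uds_path)
    sock.sendall(f"CONNECT {port}\n".encode())
    reply = b""
    while not reply.endswith(b"\n"):
        chunk = sock.recv(1)
        if not chunk:
            raise ConnectionError("vsock proxy closed the connection")
        reply += chunk
    if not reply.startswith(b"OK "):
        raise ConnectionError(f"unexpected reply: {reply!r}")
    return sock
def serve_in_guest(port=52):
    """Guest side: listen on a vsock port and greet one client."""
    with socket.socket(socket.AF_VSOCK, socket.SOCK_STREAM) as server:
        server.bind((socket.VMADDR_CID_ANY, port))
        server.listen()
        connection, _peer = server.accept()
        with connection:
            connection.sendall(b"hello from the guest\n")
```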

15:29.920 --> 15:38.920
You may also want to use socat, which is a really nice Swiss-army-knife tool to convert between Unix sockets, vsock and TCP.

15:38.920 --> 15:50.920
So if you have an HTTP server running inside the VM, with socat you can very easily have it exposed as HTTP on TCP for your web browser on the host, stuff like this.

15:50.920 --> 15:58.920
If you want to experiment and debug, that's really cool with Python, because it's interpreted, so you can execute new code on the fly.

15:58.920 --> 16:05.920
You have an interactive shell. You can have a debugger in the console, or a remote Python shell.

16:05.920 --> 16:12.920
Or you can spin up SSH and use common debugging tools, but what is nice with a Python shell

16:12.920 --> 16:18.920
is that you can actually have a shell inside the namespace of the init to debug it.

16:18.920 --> 16:22.920
For example, if I look here.

16:22.920 --> 16:34.920
Let's say I have my init, this one. So, I'll put it in light mode.

16:34.920 --> 16:43.920
So here I'm launching Nginx as a subprocess, and then I'm launching, this is also in the standard library of Python, an interactive console.

16:43.920 --> 16:48.920
And then it just does a clean shutdown once this REPL exits.

16:48.920 --> 17:00.920
And this allows me to very simply say okay let's update this small shortcut.

17:00.920 --> 17:08.920
So I update and I have a Python shell and I have my Nginx object that I can do for example.

17:14.920 --> 17:23.920
And I'm just inside my Nginx, inside my VM, from my shell here, so that makes it really handy to debug and experiment with stuff.

17:23.920 --> 17:27.920
I don't need to recompile or rebuild the VM or the root file system.

17:27.920 --> 17:31.920
So Python is really cool for that.
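
NOTE
A minimal sketch of the debugging setup described above, using the standard library's code module. The helper name and the commented usage (including the Nginx command line) are illustrative assumptions, not the code shown in the demo.
```python
import code
def make_console(namespace):
    """Return an interactive console that sees the init's own objects."""
    return code.InteractiveConsole(locals=namespace)
# In the init, after starting services (names here are hypothetical):
#     nginx = subprocess.Popen(["nginx", "-g", "daemon off;"])
#     make_console({"nginx": nginx}).interact(banner="init console")
#     ...then shut the system down once the console exits.
```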

17:31.920 --> 17:40.920
If you want, I also put a simplified remote shell on the right that you can use from vsock or from TCP/IP.

17:41.920 --> 17:46.920
So, some advantages and limitations you have when you actually do this with Python.

17:46.920 --> 17:52.920
So Python starts in fact pretty fast compared to other services and even compared to the Linux kernel.

17:52.920 --> 17:59.920
Most of the time in these 100 milliseconds was taken by Linux, not by CPython.

17:59.920 --> 18:04.920
It's easy to read, direct and concise code.

18:04.920 --> 18:18.920
You have the read-eval-print loop, you have the Python shell, you have a lot of libraries included. You don't especially need third-party requirements; you can always add stuff with pip, but you don't actually really need it.

18:18.920 --> 18:27.920
The source code can be extracted so that can allow for the debugging and maintenance and you have one of the richest ecosystems of third party modules.

18:27.920 --> 18:42.920
As for limitations, well, it's not the fastest, you have larger memory usage than if you do it in a compiled language, and typing is not mandatory; some people like mandatory typing or borrow checkers.

18:42.920 --> 18:53.920
If you're processing large logs, it's not ideal; usually you don't process lots of data in your Python code, but I think most people know about this.

18:53.920 --> 19:04.920
The source code can be extracted, and some people don't like this. I think it's great because I love open source, so it's always nice. I also like to ship JavaScript that's never minified.

19:04.920 --> 19:21.920
And complex code can be dangerous in the init: if it crashes, well, it's better not to crash. You know this ugly thing, a try/except around anything, that might be useful sometimes.

19:21.920 --> 19:32.920
The conclusion: it's not that hard to do this, it's great to better understand the operating system, and it's easy to port it to your favorite language if you don't want to use Python.

19:32.920 --> 19:39.920
It's a good use for Linux and open source.

19:39.920 --> 19:57.920
And you can find the slides on this link or by scanning the QR code. I want to thank my friend Hashtag for the idea, and Aleph.im for paying me to develop this init system in VMs and allowing me to experiment with this.

19:57.920 --> 20:03.920
Thank you so much.

20:03.920 --> 20:31.920
A bit of time for questions. Any questions? No questions? One. When you have PID 1, one of the tasks of PID 1 is to remove dead processes. So when you have a process that dies, sometimes it stays in the process list, and PID 1 has to take care of dead processes and clean that stuff.

20:31.920 --> 20:33.920
Did you do that?

20:33.920 --> 20:59.920
Yeah, so you can in fact register a callback using a specific syscall, with specific Linux kernel signals, to say I want to handle zombie processes. And then you have an implementation with sync or asyncio as usual to just handle this, and you have like a callback: your function can handle the process and decide what to do.
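
NOTE
A minimal sketch of one way to reap zombies as PID 1, assuming a SIGCHLD handler plus non-blocking waitpid; this is not necessarily the speaker's implementation, and the function names are hypothetical. Note that reaping with waitpid(-1) can also collect children started through subprocess, so a real init would coordinate the two.
```python
import os
import signal
def reap_children():
    """Collect every child that has already exited, without blocking."""
    reaped = []
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # no children left at all
        if pid == 0:
            break  # children exist, but none has exited yet
        reaped.append((pid, os.waitstatus_to_exitcode(status)))
    return reaped
def install_sigchld_handler():
    """Call reap_children() whenever the kernel reports a dead child."""
    signal.signal(signal.SIGCHLD, lambda signum, frame: reap_children())
```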

20:59.920 --> 21:01.920
Any questions?

21:01.920 --> 21:03.920
Yes one.

21:07.920 --> 21:19.920
Yeah, you showed the magic numbers, right, but I didn't understand where you have to send these magic numbers to.

21:19.920 --> 21:33.920
You send these magic numbers to the Linux kernel. You do a syscall, so a call to the Linux kernel, to say: hey, I want to run this instruction, the operation is this. Yeah.

21:33.920 --> 21:46.920
To the Linux kernel, and then it checks that these numbers are actually the right ones and that you're not making a mistake where in fact you called function 169, that's the one for reboot, but you wanted to call another one, for example.

21:46.920 --> 21:50.920
Okay, so it's a syscall. All right, then I understand, thanks.

21:54.920 --> 21:56.920
Okay, well I have to.

21:58.920 --> 22:00.920
How come up there? I know just a second.

22:00.920 --> 22:22.920
So if you let this run for a while, how stable is Python? Do you see any crashes or memory leaks or stuff like that?

22:22.920 --> 22:45.920
In the use I have, I didn't see any issue that was not because of memory leaks from my own code, because I had a list that kept growing or something. But in fact I had no issue with this; I didn't face any stability issue with this.

22:46.920 --> 23:01.920
You could also argue that if you have a stability issue it takes 100 milliseconds to get back up, but that's not on embedded devices, that's on VMs on powerful PCs. When you are on an embedded device, the Linux kernel will take longer to boot and you don't want to do that as often.

23:01.920 --> 23:29.920
So thank you for presenting this interesting use case. I once heard, I don't know if it's entirely true, that Python was also developed to maybe, in terms of functionality, replace BusyBox or something like that. From your experimenting, can you say how complete the support is? Could you do everything that you need, or what are the functional limitations?

23:31.920 --> 23:51.920
I think you have a lot of stuff. I was surprised by how much you can do just from the standard library of Python in these use cases, and then you can always call third-party processes, just launch subprocesses to launch other programs, so it's really not limiting in fact.

23:52.920 --> 24:06.920
I think the main limitation you have is more like performance. Like, systemd processes all the logs into a binary format and there's a lot of stuff, and if you start to want to do all of that, then performance will matter more.

24:07.920 --> 24:18.920
Even if maybe most of it is still delegated to other processes, so it wouldn't matter too much with Python, that's still a bit of an issue, at least as long as we have the GIL and we are an interpreted language.

24:18.920 --> 24:25.920
Okay, there's one question on the chat: how did you get Python in your minimal root file system?

24:25.920 --> 24:28.920
So I just used debootstrap.

24:28.920 --> 24:56.920
So here it's a bit small to read, but basically when you launch debootstrap with a minimal variant you can say also include this package, and you just need to include the version of Python you want.

24:56.920 --> 25:07.920
Don't include just python3, because that's a meta package and it doesn't really work in that use case, but if you specify the version of Python it works. You can also add Nginx and all the software you want.

25:07.920 --> 25:17.920
This is specific to Debian: you just say, when you use debootstrap to create the root file system, this is the list of all the packages I want to have in my file system.
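
NOTE
A minimal sketch of the debootstrap invocation described above, built from Python. The --variant and --include options are standard debootstrap options; the helper names, the suite name, the target directory and the exact package list are assumptions for illustration.
```python
import subprocess
def debootstrap_command(target, packages, suite="bookworm"):
    """Build a debootstrap command line for a minimal Debian root file system."""
    include = "--include=" + ",".join(packages)
    return ["debootstrap", "--variant=minbase", include, suite, target]
def build_rootfs(target, packages):
    """Run debootstrap (needs root and network access)."""
    subprocess.run(debootstrap_command(target, packages), check=True)
# Hypothetical target directory; the versioned Python package avoids the meta package.
command = debootstrap_command("rootfs", ["python3.11-minimal", "nginx"])
```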

25:18.920 --> 25:35.920
Thank you. There's a very small question: in a system like Debian, what's the init written in, like C or C++? systemd is written, I guess, in C. It could be C++, but I would suspect more C.

25:35.920 --> 25:43.920
And I have one final question, a short one. Have you ever tried using MicroPython instead of python3.11-minimal?

25:43.920 --> 25:53.920
I have not. I think it should work, but you know the standard library is more limited, so that would be more difficult.

25:53.920 --> 26:12.920
It is, but it's also faster and smaller. I don't know if it's faster. I mean, it's smaller, and it starts up faster, but the performance after the first 25 milliseconds might not be better. That may be true, I haven't tried that, but the startup time is amazing, so.

26:13.920 --> 26:31.920
But then, if you already have 80 milliseconds from the Linux kernel, maybe then you want to boot a unikernel, where you have everything built in. So the unikernel is basically: you just boot one binary that does everything, contains everything, the kernel and the application.

26:31.920 --> 26:33.920
Okay thank you.

26:42.920 --> 26:44.920
Thank you.

