WEBVTT

00:00.000 --> 00:12.200
But the way the compatibility layer is achieved is through relibc and redox-rt.

00:12.200 --> 00:19.040
So first, our C library, relibc, and again, somewhat ironically, it's written almost entirely

00:19.040 --> 00:25.200
in Rust; even many of the header files are auto-generated using cbindgen from Rust declarations.

00:25.200 --> 00:31.840
So the only real C code in relibc is essentially what POSIX requires in the form of some

00:31.840 --> 00:39.360
macros and other special C-specific definitions.

00:39.360 --> 00:45.280
And it's been rustified over time, reducing unsafe; so, for example, handling C strings can

00:45.280 --> 00:49.640
be done safely, encoded using the Rust type system.
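
NOTE
A minimal sketch of how C string handling can be encoded in the Rust type system.
BorrowedCStr and path_is_absolute are hypothetical names chosen for illustration,
not relibc's actual API; only the Rust standard library is assumed.
```rust
use std::ffi::{c_char, CStr};
use std::marker::PhantomData;
use std::ptr::NonNull;
/// Hypothetical wrapper: a non-null, NUL-terminated C string borrowed for 'a.
#[derive(Clone, Copy)]
pub struct BorrowedCStr<'a> {
    ptr: NonNull<c_char>,
    _borrow: PhantomData<&'a [u8]>,
}
impl<'a> BorrowedCStr<'a> {
    /// The only unsafe entry point: the caller vouches that `ptr` is either
    /// null or points to a NUL-terminated string that stays valid for 'a.
    pub unsafe fn from_ptr(ptr: *const c_char) -> Option<Self> {
        NonNull::new(ptr as *mut c_char).map(|ptr| Self { ptr, _borrow: PhantomData })
    }
    /// Safe constructor from an already validated std CStr.
    pub fn from_cstr(s: &'a CStr) -> Self {
        // SAFETY: a &CStr never points to null.
        let ptr = unsafe { NonNull::new_unchecked(s.as_ptr() as *mut c_char) };
        Self { ptr, _borrow: PhantomData }
    }
    /// Bytes without the trailing NUL; safe because of the type's invariant.
    pub fn to_bytes(self) -> &'a [u8] {
        // SAFETY: the constructors established validity for 'a.
        unsafe { CStr::from_ptr(self.ptr.as_ptr()) }.to_bytes()
    }
}
/// Example of an internal helper that needs no `unsafe` of its own.
pub fn path_is_absolute(path: BorrowedCStr<'_>) -> bool {
    path.to_bytes().first() == Some(&b'/')
}
```
The unsafe part is confined to the boundary where a raw pointer enters; everything
past that point works with a type that carries the validity guarantee.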

00:49.640 --> 00:54.640
There are both a Linux and a Redox backend; the Linux backend isn't particularly interesting

00:54.640 --> 00:55.640
for this talk.

00:55.640 --> 01:06.880
But the Redox backend is what implements the large parts that constitute our POSIX support.

01:06.880 --> 01:15.760
So redox-rt is currently built on top of simpler kernel APIs, for example, creating

01:15.760 --> 01:20.320
address spaces, sending pages between address spaces, switching the current address space,

01:20.320 --> 01:29.960
and so on, and importantly also the, at this point quite large, signal handling subsystem.

01:29.960 --> 01:35.520
So what we're trying to achieve with this signal subsystem is to essentially support the

01:35.520 --> 01:38.360
user space analog of hardware interrupts.

01:38.360 --> 01:41.200
So they're really similar.

01:41.200 --> 01:47.440
There are roughly 64 signals, even though that's implementation specific.

01:47.440 --> 01:53.400
They're asynchronous; they can interrupt execution at any time when they're not masked

01:53.400 --> 02:02.160
or ignored, and they can also interrupt syscalls that would otherwise block.

02:02.160 --> 02:07.280
But with the way our current redox-rt approach is implemented,

02:07.280 --> 02:15.120
this poses quite a huge problem when signals can occur, because we, for example, store

02:15.120 --> 02:22.640
the current working directory state, which needs to be accessed every time you open something.

02:22.640 --> 02:28.200
Since this state is a global variable, and since multi-threading is supported, this needs

02:28.200 --> 02:30.120
to be synchronized.

02:30.120 --> 02:37.920
And although that doesn't necessarily have to be implemented using a mutex, in practice

02:37.920 --> 02:40.920
that will always be sort of required.

02:40.920 --> 02:44.440
And the problem is that functions like open are async-signal-safe, which means they can be called

02:44.480 --> 02:45.440
reentrantly.

02:45.440 --> 02:48.440
So this poses a huge problem when you need synchronization.
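
NOTE
A small sketch of why a plain mutex around global state conflicts with reentrant
calls. The CWD static and both function names are hypothetical stand-ins; a real
signal handler is replaced by an ordinary call on the same thread, and try_lock is
used so the demonstration reports the conflict instead of hanging.
```rust
use std::sync::Mutex;
/// Stand-in for the C library's global current-working-directory state.
static CWD: Mutex<String> = Mutex::new(String::new());
/// Models what an open()-like call inside a signal handler would have to do:
/// take the CWD lock. Returns whether the lock could be taken.
pub fn handler_can_take_lock() -> bool {
    CWD.try_lock().is_ok()
}
/// Models open() on the normal path being interrupted while it holds the lock.
pub fn open_interrupted_by_signal() -> bool {
    let _guard = CWD.lock().unwrap();
    // Imagine a signal arriving here, on this same thread.
    handler_can_take_lock()
}
```
With a blocking lock() in the handler, the same situation would never make
progress, which is why signals have to be held off around such sections.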

02:48.440 --> 02:52.440
Now before all of this functionality was moved to userspace, when it still resided in the

02:52.440 --> 02:57.440
kernel, this would have been a trivial problem, because interrupts are almost always disabled,

02:57.440 --> 03:01.600
at least on the Redox microkernel.

03:01.600 --> 03:08.400
And doing that manually would have been more than 10 times cheaper, give or take.

03:08.400 --> 03:14.400
So we really need to be able to quickly disable signals for short critical sections.

03:14.400 --> 03:21.720
So that brings us to the question: what if we could bypass the syscall overhead entirely

03:21.720 --> 03:25.920
and, say, use shared memory instead?

03:25.920 --> 03:29.280
And that's exactly what this protocol does.

03:29.280 --> 03:36.800
So it keeps the cost of sigprocmask low, and even bypasses the kernel entirely, at least

03:36.800 --> 03:41.520
in terms of switching to kernel mode.
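
NOTE
A sketch of a sigprocmask-like operation done as one atomic swap on a word that
both userspace and the kernel can read, with no switch to kernel mode. The
ThreadSigCtl and SignalsBlocked types and the bit layout (bit n-1 for signal n) are
assumptions for illustration, not the actual redox-rt data structures.
```rust
use std::sync::atomic::{AtomicU64, Ordering};
/// Hypothetical per-thread control word; a set bit means "signal allowed".
pub struct ThreadSigCtl {
    allow: AtomicU64,
}
impl ThreadSigCtl {
    pub const fn new(allow: u64) -> Self {
        Self { allow: AtomicU64::new(allow) }
    }
    /// SIG_SETMASK-like operation: a single atomic swap, returning the old set.
    pub fn set_allowed(&self, new_allow: u64) -> u64 {
        self.allow.swap(new_allow, Ordering::SeqCst)
    }
    pub fn allowed(&self) -> u64 {
        self.allow.load(Ordering::SeqCst)
    }
}
/// RAII guard for a short critical section with every signal blocked.
pub struct SignalsBlocked<'a> {
    ctl: &'a ThreadSigCtl,
    saved: u64,
}
impl<'a> SignalsBlocked<'a> {
    pub fn enter(ctl: &'a ThreadSigCtl) -> Self {
        let saved = ctl.set_allowed(0);
        Self { ctl, saved }
    }
}
impl Drop for SignalsBlocked<'_> {
    fn drop(&mut self) {
        // A real implementation would also have to notice signals that became
        // pending while blocked; this sketch only restores the previous mask.
        self.ctl.set_allowed(self.saved);
    }
}
```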

03:41.520 --> 03:48.160
And it keeps all of the signal configuration state in userspace, except the pointers the kernel uses

03:48.160 --> 03:51.760
to access that same state.

03:51.760 --> 03:56.400
And it also provides a basic IPC cancellation primitive.

03:56.400 --> 04:00.680
So essentially all of the configuration is done using shared memory, which is visible to both userspace

04:00.680 --> 04:02.400
and kernel space.

04:02.400 --> 04:08.560
Now this allows the kernel implementation to be really simple, rather than having to save

04:08.560 --> 04:15.120
and restore registers inside the user address space that's going to be signaled.

04:15.120 --> 04:19.280
The only thing the kernel needs to do is just remember what the instruction pointer was,

04:19.280 --> 04:25.280
set the new instruction pointer to the signal trampoline, as well as save one extra scratch register

04:25.280 --> 04:30.320
so the assembly in the signal trampoline can actually do anything useful.
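
NOTE
A toy model of the kernel-side step just described: remember the old instruction
pointer and one scratch register, then point the thread at the trampoline. The
Regs and Saved layouts and both function names are hypothetical; real register
state is architecture-specific.
```rust
/// Toy model of the user-visible registers of an interrupted thread.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Regs {
    pub ip: usize,
    pub scratch: usize,
}
/// What the kernel stashes for the trampoline to pick up later.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Saved {
    pub ip: usize,
    pub scratch: usize,
}
/// Kernel side: save ip and one scratch register, then redirect execution.
pub fn kernel_deliver(regs: &mut Regs, trampoline: usize) -> Saved {
    let saved = Saved { ip: regs.ip, scratch: regs.scratch };
    regs.ip = trampoline;
    saved
}
/// User side: once the handler has run, the trampoline resumes the old code.
pub fn trampoline_return(regs: &mut Regs, saved: Saved) {
    regs.scratch = saved.scratch;
    regs.ip = saved.ip;
}
```
Everything else (saving the remaining registers, choosing the signal, calling the
handler) is left to the userspace trampoline in this model.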

04:30.320 --> 04:35.360
So it makes the kernel really simple, but it does make userspace quite a lot more complicated,

04:35.440 --> 04:41.280
because now it essentially needs to implement a logical AND with the per-thread allow mask.

04:41.280 --> 04:46.960
So each thread can disable or enable signals, and have signals sent to that thread specifically

04:46.960 --> 04:49.360
or to the entire process.

04:49.360 --> 04:57.840
Of course there are also some ad hoc protocol behaviors for things like stop signals

04:57.840 --> 05:01.760
and real-time signals that have special requirements.

05:02.720 --> 05:09.920
But to depict the way this works: again, you have these per-thread

05:09.920 --> 05:14.800
areas, generally in the TCB, as well as the per-process area, which is just a global variable.

05:16.640 --> 05:22.800
The pending set here shows the logical OR between the per-process and per-thread sets,

05:24.000 --> 05:26.800
and the allow shows the per-thread allow set.

05:26.800 --> 05:31.200
Now the logical AND of these will give the set of pending unblocked signals, from which

05:32.960 --> 05:38.240
POSIX requires, for real-time signals at least, that the lowest signal be picked.

05:39.360 --> 05:44.080
And then there is also this additional bit that doesn't

05:44.880 --> 05:49.440
affect the way signals can interrupt things, et cetera, between threads,

05:49.440 --> 05:52.720
but just controls, on the same thread, whether the kernel will jump to userspace.

05:53.680 --> 06:02.160
So when this mask bit is clear and you take the signal, it will be possible to deliver that signal.
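
NOTE
The set arithmetic above can be sketched with plain 64-bit masks. The mapping of
signal n to bit n-1, the function names, and the `inhibit` flag standing in for the
additional bit are assumptions of this sketch; for simplicity it picks the lowest
deliverable signal for all signals, not only real-time ones.
```rust
/// Signal n is represented by bit n-1 (an assumption of this sketch).
pub fn sig_bit(sig: u32) -> u64 {
    1u64 << (sig - 1)
}
/// Lowest-numbered signal that is pending (per-thread OR per-process) AND allowed.
pub fn next_deliverable(thread_pending: u64, proc_pending: u64, allow: u64) -> Option<u32> {
    let deliverable = (thread_pending | proc_pending) & allow;
    if deliverable == 0 {
        None
    } else {
        Some(deliverable.trailing_zeros() + 1)
    }
}
/// Whether the kernel would jump to the trampoline on this thread right now.
/// When `inhibit` is set, delivery on this thread is deferred.
pub fn kernel_would_jump(thread_pending: u64, proc_pending: u64, allow: u64, inhibit: bool) -> bool {
    !inhibit && next_deliverable(thread_pending, proc_pending, allow).is_some()
}
```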

06:05.120 --> 06:12.240
So for this to work, the protocol could, in theory, have been implemented as multi-producer

06:12.240 --> 06:19.600
multi-consumer, but that would have made the protocol way too complex to handle, I think.

06:20.560 --> 06:25.920
So the way it works is essentially, you need mutual exclusion for sending signals.

06:26.480 --> 06:32.960
Only the kernel is currently allowed to do this; later, the process manager will do it instead.

06:34.640 --> 06:40.640
When you kill a single thread, you set the pending bit for the corresponding signal for that thread.

06:41.280 --> 06:45.600
You then check, as a separate atomic operation, whether that signal was allowed.

06:46.480 --> 06:54.880
And if both conditions are upheld, you will notify that thread and then wait for the signal

06:54.880 --> 07:00.640
trampoline later to clear the bit for that thread. For processes it's a bit more complex.

07:00.640 --> 07:05.120
You set the process-level pending bit, then walk through each individual thread,

07:05.120 --> 07:11.280
looking for one where that signal is unblocked. Then once you've found one, you unblock it and later wait

07:11.280 --> 07:18.080
for it to, at some point, clear the signal bit. This can, in theory, result in

07:18.080 --> 07:30.080
spurious signals, but this shouldn't be a problem considering that's, in my opinion, misuse of the API.
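
NOTE
A sketch of the two send paths under the single-sender assumption: the caller is
taken to already hold whatever lock provides mutual exclusion between senders.
ThreadCtl, its `notified` flag (a stand-in for the kernel waking the thread), and
all function names are hypothetical; bit n-1 stands for signal n.
```rust
use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};
/// Hypothetical per-thread shared state.
#[derive(Default)]
pub struct ThreadCtl {
    pub pending: AtomicU64,
    pub allow: AtomicU64,
    pub notified: AtomicBool,
}
pub fn bit(sig: u32) -> u64 {
    1u64 << (sig - 1)
}
/// Thread-directed send: set the pending bit, then check the allow set as a
/// separate atomic operation, and notify only if the signal is allowed.
pub fn send_to_thread(t: &ThreadCtl, sig: u32) -> bool {
    t.pending.fetch_or(bit(sig), Ordering::SeqCst);
    let allowed = (t.allow.load(Ordering::SeqCst) & bit(sig)) != 0;
    if allowed {
        t.notified.store(true, Ordering::SeqCst);
    }
    allowed
}
/// Process-directed send: mark the signal pending process-wide, then wake the
/// first thread that has it unblocked. Returns that thread's index, if any.
pub fn send_to_process(proc_pending: &AtomicU64, threads: &[ThreadCtl], sig: u32) -> Option<usize> {
    proc_pending.fetch_or(bit(sig), Ordering::SeqCst);
    let idx = threads
        .iter()
        .position(|t| (t.allow.load(Ordering::SeqCst) & bit(sig)) != 0)?;
    threads[idx].notified.store(true, Ordering::SeqCst);
    Some(idx)
}
/// Receiving side, as the trampoline would later do: clear the bit and report
/// whether this caller was the one that claimed the signal.
pub fn trampoline_claim(pending: &AtomicU64, sig: u32) -> bool {
    (pending.fetch_and(!bit(sig), Ordering::SeqCst) & bit(sig)) != 0
}
```
Because setting the bit and checking the allow set are separate steps, a woken
thread may find nothing left to claim, which is where spurious wakeups come from.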

07:31.040 --> 07:43.280
For interruption to work while there are IPC calls ongoing, this of course needs

07:43.280 --> 07:52.320
to support some form of cancellation. First, to explain the basic IPC model on redox,

07:53.280 --> 07:59.040
the way it works is, essentially, if you have, for example, read from the socket, you'd call

07:59.040 --> 08:06.000
SYS_READ. This is synchronous for the client, but asynchronous for the server. So,

08:06.000 --> 08:11.280
you'll wait for the scheduler to context switch to the server.

08:11.280 --> 08:24.800
And for cancellation to work, this essentially happens asynchronously as well.

08:25.360 --> 08:30.960
It just sends a cancellation request, and this is for I/O state to not be broken,

08:30.960 --> 08:35.920
say if there were bytes that were just about to arrive, and then there was a signal. Of course,

08:35.920 --> 08:45.040
those bytes should arrive before EINTR is sent. A signal can force cancellation, however.
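
NOTE
A toy model of why cancellation is a request and not an immediate abort: the
server decides the outcome, so bytes that already arrived are delivered before an
interruption is reported. ReadOutcome, PendingRead and handle_cancel are
hypothetical names, not the Redox scheme protocol itself.
```rust
/// Result the client of a blocked read eventually sees.
#[derive(Debug, PartialEq, Eq)]
pub enum ReadOutcome {
    Data(Vec<u8>),
    Interrupted,
}
/// Toy server-side record of one blocked read request.
pub struct PendingRead {
    pub wanted: usize,
}
/// Server handling a cancellation request for `req`. Bytes that already
/// arrived are returned first; only an empty queue yields an EINTR-like result.
pub fn handle_cancel(req: &PendingRead, arrived: &mut Vec<u8>) -> ReadOutcome {
    if arrived.is_empty() {
        return ReadOutcome::Interrupted;
    }
    let n = req.wanted.min(arrived.len());
    ReadOutcome::Data(arrived.drain(..n).collect())
}
```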

08:46.800 --> 08:51.920
So, yeah, in general, this signals implementation has allowed for increased

08:52.000 --> 08:59.360
POSIX coverage, both in terms of raw newly implemented APIs, and the

09:02.160 --> 09:10.160
versatility, in technical terms, of the underlying signals implementation.

09:11.200 --> 09:16.720
The standard has, in general, been quite easy to work with, with a few exceptions, and that's

09:16.800 --> 09:25.920
mainly the fact that POSIX is derived from Unix, which has all of these, like, ad hoc exceptions,

09:25.920 --> 09:32.160
like what happens if a stop signal is interrupted by a continuation signal, and so on,

09:32.720 --> 09:38.080
that perhaps wouldn't have been there if this protocol had been designed for microkernels

09:38.080 --> 09:42.960
originally. There are also ambiguities, say, what happens if a real-time signal was sent

09:42.960 --> 09:50.080
the standard way, or vice versa. The implementation will need some further testing, since it

09:51.920 --> 09:59.920
essentially requires separate sort-of interrupt handlers for each arch,

10:00.560 --> 10:06.240
but the protocol should be pretty robust. Additionally, the process manager will take over the

10:06.240 --> 10:14.800
kernel's role as the one responsible for sending signals. But in general, with

10:15.440 --> 10:24.480
a lot of dynamic linking progress having been made recently, this will greatly improve the mechanisms

10:24.480 --> 10:31.680
for further userspaceification, now that the signal performance problem has, in large part, been

10:32.640 --> 10:40.400
overcome by this new memory interface. And this way, it will potentially be possible to

10:40.400 --> 10:46.000
userspaceify even deeper parts of POSIX, such as the file table allocation,

10:46.000 --> 10:52.720
and so on, every time you access a file descriptor. So yeah, I hope the userspaceification can

10:52.800 --> 11:00.080
continue. That's the end of my talk.

11:03.440 --> 11:08.880
Thank you very much for this great talk. We have two minutes for questions in the back.

11:23.680 --> 11:31.680
Sorry, can you repeat the question for you?

11:31.680 --> 11:37.440
Oh, right. So your question was if the implementation uses,

11:38.640 --> 11:46.000
like, supports real-time signal queueing, rather than the behavior that the same signal occurs

11:46.000 --> 11:52.560
at most once before it is delivered, which is the case for the lower 32 bits.