WEBVTT

00:00.000 --> 00:17.600
Okay. Let's continue. Up next, we have Javier Martinez, who at some point had reason to write

00:17.600 --> 00:23.160
a GCC plugin for something that he will tell us about, I guess, and then eventually contributed

00:23.160 --> 00:29.020
this implementation upstream into GCC trunk without the plugin infrastructure.

00:29.020 --> 00:30.020
Please go ahead.

00:30.020 --> 00:36.020
Thank you. That's here.

00:36.020 --> 00:44.220
Hi, everyone. This is really, really cool. It's a full room. My name is Javier. I'm a C++ developer

00:44.220 --> 00:49.180
at DRW working on low-latency C++. Before that, I did computer science at Cambridge

00:49.180 --> 00:53.100
and electrical engineering and nothing else. So, I'm into like low-level performance

00:53.100 --> 00:58.660
kind of things. Also, compilers. There is someone here. I've spent most of my compiler

00:58.700 --> 01:03.660
hacking time on OCaml, not C or GCC or anything like that. But I do have the

01:03.660 --> 01:10.140
tiniest of contributions to GCC, which was part of my internship with DRW, which is the

01:10.140 --> 01:16.620
topic of today. So, kind of the whole thesis behind this is that most code is cold in

01:16.620 --> 01:21.340
an application. So, if you see this code, there is, you know, a test at some point for whether

01:21.340 --> 01:25.100
this is zero. If it is, it looks like it's going to throw an exception; otherwise, it just

01:25.100 --> 01:31.620
returns. And, you know, to the compiler's heuristics, this is, you know, the

01:31.620 --> 01:37.420
exception-throwing path; it is clearly a very cold kind of case. You know, it's an early return

01:37.420 --> 01:41.780
from a function, it's checking an int against zero, it's returning an error, it's throwing an exception.

01:41.780 --> 01:46.300
It fits all of the cases. So, I guess, if you were the compiler, and this is kind

01:46.300 --> 01:51.460
of funny to ask here, because a bunch of you are the compiler. But if you were the compiler

01:51.460 --> 01:56.020
and you were to think about what you can do with, you know, knowing that some code is cold,

01:56.020 --> 02:00.820
you can reorganize your branches for better prediction. At some point, with

02:00.820 --> 02:04.420
the Pentium 4, you could even have some hints for the CPU; that is not the case anymore. But

02:04.420 --> 02:08.540
you can at least make the likely case a fall-through or a closer branch than the unlikely

02:08.540 --> 02:14.820
case. You can also optimize for size. And, in my opinion, well, for my case, most importantly,

02:14.820 --> 02:20.020
what you see here is that you can try to keep the cold functions, well, you are hinting this

02:20.020 --> 02:23.460
to the linker, as you see with the cold clone, and you try to keep them together. And

02:23.460 --> 02:27.460
the idea here is that, you know, all of your cold stuff is far away, and it's not

02:27.460 --> 02:31.100
messing with your instruction cache when you run your hot code. So, we wanted to do this,

02:31.100 --> 02:34.740
and the hint to tell the compiler that something is cold is the cold attribute. It works

02:34.740 --> 02:40.660
on functions. And for my case, it didn't feel like the right granularity to express the

02:40.660 --> 02:44.740
cold hint, because, well, I don't want to spray it on every single function, on a thousand

02:44.740 --> 02:49.220
functions. And at the component level, I can express optimizing for size, but I couldn't express

02:49.300 --> 02:56.260
the cold hint. So, we thought about writing a plugin. The mental model behind the

02:56.260 --> 03:00.340
plugin should be simple, like a preprocessor pass where, you know, you annotate a class

03:00.340 --> 03:05.780
as being cold, and it goes through every function and marks it as cold. So, the way you write

03:05.780 --> 03:10.420
plugins in GCC is that you tell the compiler which points of compilation you're

03:10.420 --> 03:16.020
interested in. There are 27 such points; it actually gets multiplexed into more points, because you

03:16.100 --> 03:21.700
can specify concrete optimization passes that you want to aim for. And then when you hit

03:21.700 --> 03:27.700
that point, the compiler just gives you a handle, and you can look at its data. So, if you

03:27.700 --> 03:31.540
need to do this, you basically open that file or the documentation, and one of these points is,

03:31.540 --> 03:36.500
you know, after finishing parsing a type. So, you know, it seems to fit my use case.
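
NOTE
Editor's sketch of the registration model described above, not GCC's actual API.
In GCC the corresponding pieces are register_callback and events such as
PLUGIN_FINISH_TYPE; Registry, Event, and Callback below are hypothetical names
used only to show the shape: subscribe to a point, get handed the data later.
```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>
// Hypothetical compilation points; the talk mentions 27 real ones.
enum class Event { FinishType, FinishUnit };
// Callbacks receive an opaque handle to the compiler's data for that point.
using Callback = std::function<void(void* data)>;
class Registry {
public:
    // The plugin says which point it is interested in.
    void subscribe(Event event, Callback callback) {
        callbacks_[event].push_back(std::move(callback));
    }
    // The compiler reaches a point and hands its data to every subscriber.
    void fire(Event event, void* data) const {
        auto found = callbacks_.find(event);
        if (found == callbacks_.end()) return;
        for (const Callback& callback : found->second) callback(data);
    }
private:
    std::map<Event, std::vector<Callback>> callbacks_;
};
```
A subscriber for FinishType is only invoked when that point fires; firing any
other point leaves it untouched.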

03:37.380 --> 03:43.220
And as you see, in just a couple of lines, you get this pointer to GCC data, and it's a tree;

03:43.220 --> 03:49.060
everything is a tree in the IR. And, you know, simple enough: if the tree

03:49.060 --> 03:56.100
is a type, like a union or a struct or a class, you can iterate the type's fields;

03:56.100 --> 04:00.500
if the field is a function declaration, you know, sure enough, you just add the attribute.
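
NOTE
Editor's sketch of the loop just described, as a plain C++ model rather than
the plugin itself. GCC's real nodes are trees (TYPE_FIELDS, FUNCTION_DECL);
Decl, ClassType, and mark_member_functions_cold are hypothetical stand-ins.
```cpp
#include <cassert>
#include <string>
#include <vector>
// Hypothetical stand-in for a declaration node.
struct Decl {
    std::string name;
    bool is_function;                     // models "this field is a function declaration"
    std::vector<std::string> attributes;  // models the declaration's attribute list
};
// Hypothetical stand-in for a class, struct, or union type and its fields.
struct ClassType {
    std::string name;
    std::vector<Decl> fields;
};
// Models the callback body: visit every field, annotate only the functions.
// Returns how many declarations were marked.
int mark_member_functions_cold(ClassType& type) {
    int marked = 0;
    for (Decl& field : type.fields) {
        if (!field.is_function) continue;  // data members are left alone
        field.attributes.push_back("cold");
        ++marked;
    }
    return marked;
}
```
The model leaves out everything the real callback has to deal with, such as
checking that the handle is actually a class type.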

04:01.940 --> 04:06.500
So, as you see, you can get something like this going in just about 30 lines of code, and it's

04:06.500 --> 04:11.060
mostly boilerplate, which is really nice if you're new to it. And then the question is whether,

04:11.220 --> 04:16.900
well, if you want to go from a plugin to an in-tree patch, it's like, how difficult is this?

04:17.860 --> 04:22.900
Well, the good thing is that the data structures and the interfaces are all mostly

04:22.900 --> 04:28.980
just the same. So, in my case, kind of my main problem was just finding the right point at which

04:28.980 --> 04:33.620
to do the work that I was doing in the plugin anyways. So, you start browsing code.

04:34.580 --> 04:38.580
I remember there was something called class, a file called class, sounds great. You look at the

04:38.900 --> 04:43.940
header comment, sounds great. And then one of the functions says, well, this is going into

04:43.940 --> 04:48.500
a class and checking the members and maybe adding some implicitly defined, implicitly declared

04:48.500 --> 04:52.900
special members. So, I thought this was the right place to add it, and sure enough, at the very,

04:52.900 --> 04:58.020
very end of this function, I do exactly the same kind of loop that I did before, just with a few more

04:58.020 --> 05:02.980
safety checks. You have to think about what happens if you're marking a class cold that has

05:03.540 --> 05:09.220
a function that is already marked hot, that kind of thing. But yeah, that does it. It was a really,

05:09.220 --> 05:14.500
really simple transition from the plugin into the in-tree patch. Sure, there's a couple of surprises.
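
NOTE
Editor's sketch of the hot-versus-cold case mentioned above, written by hand
with the per-function attributes that GCC and Clang accept. It assumes one of
those compilers. DivisionErrors and checked_divide are hypothetical names, and
the policy shown (an explicit hot is kept) is an assumption for illustration.
```cpp
#include <cassert>
#include <stdexcept>
// Hand-written equivalent of "class marked cold, one member already marked hot".
struct DivisionErrors {
    // Explicitly hot: a class-wide cold marking should not override this.
    __attribute__((hot)) static bool is_zero(int value) { return value == 0; }
    // What a class-wide marking would add to each remaining member function.
    __attribute__((cold)) static void raise(const char* what) {
        throw std::invalid_argument(what);
    }
};
// The early-return shape from the talk: the throwing path is the cold one.
int checked_divide(int numerator, int denominator) {
    if (DivisionErrors::is_zero(denominator)) DivisionErrors::raise("division by zero");
    return numerator / denominator;
}
```
The attributes only guide optimization and code placement; the function
behaves the same with or without them.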

05:14.500 --> 05:21.380
I think for me, the main pitfall is that this says it adds any implicitly declared functions;

05:21.380 --> 05:27.300
in reality, there are a few lazily inserted implicitly declared functions. There is a second

05:27.300 --> 05:33.060
code path that has to repeat that loop. But other than that, you know, very simple straightforward,

05:33.060 --> 05:37.860
then the whole patch is about a hundred lines of code if you remove some documentation changes and

05:37.860 --> 05:45.220
tests. So, yeah, I didn't teach you how to write plugins, but hopefully I showed you that it is kind

05:45.220 --> 05:52.020
of a nice introduction to the topic. So, I wrote a workshop that I guess I'll place the link on

05:52.020 --> 05:56.820
Matrix later that guides you through writing your first plugin. So, it's a custom attribute,

05:56.900 --> 06:01.540
a static analysis pass that looks at the names of functions, and some instrumentation to

06:03.060 --> 06:06.740
basically call a logging function. And this is all packed into the exercise of writing an

06:06.740 --> 06:13.940
aspect-oriented C++ toy language, which I think is, you know, a cool exercise. But yeah, that's it. Thank you.

06:14.260 --> 06:20.260
Thank you.

06:25.140 --> 06:26.580
I guess if we have any questions.

06:30.580 --> 06:37.220
Was the code simplified on your slide where you added the cold attribute, or is it the original code?

06:37.220 --> 06:39.620
No, it's definitely simplified, but you're talking about this code?

06:39.780 --> 06:47.380
Yeah, because I was wondering whether it really makes sense to put that [inaudible]

06:47.380 --> 06:48.260
inside the loop.

06:49.860 --> 06:54.340
Right. So, the question was whether this code is simplified. It is simplified to some extent.

06:54.340 --> 07:01.380
This is indeed hoisted out. There are probably a couple more cases. I mean, this does not fully

07:01.380 --> 07:05.380
represent the exact code, because I also had to check whether there were attributes already present.

07:05.860 --> 07:10.020
And perhaps something that I found surprising about this one is that this is additive. So you don't

07:10.020 --> 07:16.260
have to, you know, this is a list in GCC. So you don't have to look at all of the past attributes,

07:16.260 --> 07:21.780
and then concatenate the list and then pass it. This is aware of the existing attributes.

07:23.380 --> 07:26.340
So yeah, it's not exactly the same code. Close enough.
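
NOTE
Editor's sketch of what "additive" means for the caller, as a plain C++ model.
The talk only says attributes are a list and that adding one does not require
rebuilding it; AttributedDecl and add_attribute are hypothetical, and linking
the new entry at the front is an assumption of this model.
```cpp
#include <cassert>
#include <forward_list>
#include <string>
// Hypothetical declaration whose attributes form a singly linked list.
struct AttributedDecl {
    std::forward_list<std::string> attributes;
};
// Additive: the new attribute is linked in next to whatever is already there,
// so the caller never reads, copies, or concatenates the existing list.
void add_attribute(AttributedDecl& decl, const std::string& name) {
    decl.attributes.push_front(name);
}
```
Existing entries survive the call unchanged, which is the property the answer
describes.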

07:27.300 --> 07:35.300
Other than hot and cold, what can you apply to an entire class?

07:39.380 --> 07:45.780
We don't do this. Like this part is only for cold and hot. You could do, I could see size.

07:45.780 --> 07:52.580
I mean, size you can do with pragmas. You could do the always_inline kind of thing; that is

07:52.580 --> 07:58.180
a common one. I guess you could start [inaudible]. Maybe visibility.

08:06.180 --> 08:12.580
Okay, any more questions? Thank you again.

