WEBVTT

00:00.000 --> 00:10.880
Thank you very much. Hello, my name is Johan Herland and you can find me as jherland

00:10.880 --> 00:16.680
on GitHub, Mastodon, LinkedIn, elsewhere. I'm Norwegian, but I'm currently living in

00:16.680 --> 00:21.480
Delft, the Netherlands, and I work as a developer productivity engineer at Tweag, which is part

00:21.480 --> 00:26.240
of the larger Modus Create consultancy. We help our clients solve their problems with

00:26.240 --> 00:30.720
open source software and we contribute back to the open source community whenever possible.

00:30.720 --> 00:34.880
Today I will talk to you about a somewhat obscure topic that lies between the dark arts

00:34.880 --> 00:40.600
of compiler toolchains, debuggers and executable binary formats. For the average developer,

00:40.600 --> 00:46.000
I suspect these areas are often associated with a certain amount of black magic. Certainly,

00:46.000 --> 00:50.160
I know that I had worked for many years as a C and C++ developer without ever encountering

00:50.160 --> 00:55.640
today's topic as a possibility. I should start by saying that I have not invented any of

00:55.640 --> 00:59.640
the features or things that I'm presenting here. I merely stumbled onto this while trying

00:59.640 --> 01:04.640
to improve debug builds at a client. And I ended up writing a blog post about it to shed some

01:04.640 --> 01:09.920
light on what's going on here and this talk is based on that blog post. So what am I talking

01:09.920 --> 01:14.840
about today? I will give you a practical introduction to debug fission, also known as split

01:14.840 --> 01:19.000
DWARF, where DWARF is the name for the standard debugging data format used on Linux and most

01:19.000 --> 01:25.080
other Unix systems. This talk is limited to ELF files on Linux and we will be using GCC

01:25.080 --> 01:29.840
and the gold linker. LLVM/Clang works with debug fission too, and there are also other modern

01:29.840 --> 01:34.320
linkers that support the things covered here. A related topic that I will not cover here

01:34.320 --> 01:39.320
is how to compress the debug symbols, but there are some links here if you're interested.

01:39.320 --> 01:43.800
So let's quickly go through the basics. What are debug symbols? In short, given the classical

01:43.800 --> 01:49.640
hello world program, it's the difference between this and this. On the left hand side, we build

01:49.640 --> 01:54.360
with default compiler options and on the right hand side, we pass -g to GCC in order to build

01:54.360 --> 01:58.280
with debug symbols. On the left hand side, the debugger is not too helpful, although it knows

01:58.280 --> 02:01.880
where the main function is and we can set a breakpoint on it, it is not able to give us

02:01.880 --> 02:06.840
much more help. On the right hand side, however, GDB can tell us that the main function

02:06.840 --> 02:11.800
is located on line 4 in hello.cpp, and we can even get source code listings.

02:13.160 --> 02:17.960
The debug symbols are what provide these helpful links or references that map the machine code

02:17.960 --> 02:22.680
currently being executed back into the higher level source code concepts like variables and

02:22.760 --> 02:28.760
functions with file and line numbers. However, debug symbols take a lot of space. In the

02:28.760 --> 02:33.000
build without debug symbols, our toy executable is only about 8 kilobytes large, but the debug

02:33.000 --> 02:38.120
symbols add almost 300% to the original size. So what's actually being added to the executable?

02:38.840 --> 02:43.800
We can use readelf --sections to look at our two executables and we will see that a bunch

02:43.800 --> 02:47.800
of new sections that are called debug something have been added to the bigger executable. These

02:47.800 --> 02:51.960
sections contain these debug symbols that the debugger can use to provide extra help when debugging.

02:53.240 --> 02:56.920
So what's the problem with these debug symbols? Well, they not only take up a lot of

02:56.920 --> 03:00.920
space, they also take extra time, time to generate the debug symbols in the first place,

03:00.920 --> 03:04.840
but also as we'll see, time to copy them from the initial object files through the

03:04.840 --> 03:09.320
intermediate build artifacts and into the final executables. Not to mention that this extra

03:09.320 --> 03:13.400
duplication throughout the build process means that enabling debug symbols in a build

03:13.400 --> 03:17.960
can dramatically increase the total space needed. In the end, you need to consider the benefit

03:17.960 --> 03:22.440
of the debug symbols versus their cost. In other words, how often do you actually need the

03:22.440 --> 03:27.400
debug symbols versus how much does it cost to build with them? And very often a project will

03:27.400 --> 03:31.400
choose to build without the debug symbols by default and then it's up to the developer to request

03:31.400 --> 03:35.480
the build with the debug symbols when needed. But then also in many cases, you don't know

03:35.480 --> 03:39.240
that you need the debug symbols until after the build is complete and the release has been deployed.

03:40.600 --> 03:45.320
So let's continue by looking at stripped executables briefly here. Stripping executables is often

03:45.320 --> 03:49.000
done as part of a release build in order to make the smallest possible release artifact.

03:49.720 --> 03:54.440
We can see that applying strip to our executable with debug symbols creates a stripped executable

03:54.440 --> 03:59.560
that is even smaller than the executable built with default options. But the result is completely

03:59.560 --> 04:03.880
unusable in the debugger, of course. This time GDB doesn't even know that there is a main function,

04:03.880 --> 04:08.840
and if I want to set a breakpoint, I need to use hardcoded addresses. But GDB does have a trick

04:08.840 --> 04:13.000
up its sleeve because if we have access to the unstripped version of the same executable,

04:13.160 --> 04:17.400
then we can supply the unstripped executable to GDB with the symbol-file command.

04:18.200 --> 04:24.600
GDB will now use that unstripped executable to basically make sense of the stripped executable

04:24.600 --> 04:27.960
that is running on the CPU. And we get our debugging comfort back.

04:29.000 --> 04:33.480
We can take this one step further and add a debug link into the stripped executable that references

04:33.480 --> 04:38.040
the file that contains the debug symbols. In this case the debug link only adds 96 bytes

04:38.360 --> 04:43.720
to the stripped executable and when this debug link is present, GDB will automatically follow

04:43.720 --> 04:46.040
it to find the debug symbols and we get the debugging we want.

04:48.040 --> 04:52.040
So maybe the solution is to build with the debug symbols and then keep both the unstripped and

04:52.040 --> 04:56.760
the stripped executable. We can then distribute the stripped executable and as long as we can retrieve

04:56.760 --> 05:00.760
the corresponding unstripped executable when we need to debug, we can get the best of both worlds.

05:01.400 --> 05:06.280
This does sound like a good idea at first. We get the smallest possible release artifact and as long

05:06.360 --> 05:11.080
as we don't lose track of the corresponding unstripped executable, we also get the debugger's

05:11.080 --> 05:15.960
help when we need it. However, it's not completely without drawbacks. In order to have both

05:15.960 --> 05:20.840
the unstripped and stripped version of the same executable, we must first build the unstripped version.

05:20.840 --> 05:23.720
And as we saw before, building everything with the debug symbols is very expensive.

05:24.840 --> 05:28.280
Then only at the end of the build process can we strip the executable and make the final release

05:28.280 --> 05:34.040
artifacts. This ends up on the critical path in our build graph. For any change we make to the source code,

05:34.120 --> 05:38.920
we must now rebuild the unstripped executable and then strip it. If we end up not actually using these

05:38.920 --> 05:42.760
debug symbols for most builds, then it can be very hard to justify this build time cost.

05:43.320 --> 05:48.120
So to make this trade-off more interesting, we should try to decrease the overall cost of building

05:48.120 --> 05:53.720
with debug symbols. So what we really want to do here can be summed up in two questions. Number one,

05:53.720 --> 05:58.360
can we somehow split off debug symbols while we are compiling? And number two, can we still

05:58.360 --> 06:03.400
keep the separate debug symbols around for debugging time? So let's finally talk about debug fission.

06:04.360 --> 06:09.160
At compile time, you can supply the -gsplit-dwarf option and now in addition to the regular

06:09.160 --> 06:15.400
object file, you will also get an accompanying .dwo file. Compared to the original object file,

06:15.400 --> 06:20.600
we can see that around two thirds of the data has been moved into the DWO file and one third remains

06:20.600 --> 06:26.120
in the new object file. Together, the total overhead is about 4% over the old original object file.

06:26.840 --> 06:31.640
We can look closer at the new object file with readelf --debug-dump and it shows that there are now references

06:31.720 --> 06:38.040
from this object file to sections inside the accompanying DWO file. And these are even loaded

06:38.040 --> 06:42.520
automatically by readelf. So it seems that the -gsplit-dwarf option has moved most of the debugging

06:42.520 --> 06:46.920
information from the object file into the separate DWO file and put in references instead.

06:48.120 --> 06:52.040
When it comes time to link the final executable, the linker now has to handle an object that is

06:52.040 --> 06:57.080
only one third of the original size. This ultimately results in smaller executables and considerably

06:57.400 --> 07:02.440
faster link times. So let's go ahead with linking. There are no special link options needed

07:02.440 --> 07:07.320
for debug fission per se, but of course we do need to use a linker that supports it. Here we are using

07:07.320 --> 07:13.880
gold. This new hello.split executable is equivalent to the previous unstripped executable.

07:13.880 --> 07:17.640
The only difference is that it is based on an object that was built with -gsplit-dwarf.

07:19.080 --> 07:23.400
Indeed, if we run readelf --debug-dump on the executable, we can see that the linker has carried

07:23.400 --> 07:30.600
forward references to the DWO file that was created at compile time. Although not strictly necessary

07:30.600 --> 07:35.160
for debug fission, there is a useful link option that is worth mentioning here and that is --gdb-index.

07:35.160 --> 07:39.880
It is not easy to find good documentation on this option, but its main objective seems to be

07:39.880 --> 07:44.680
to speed up GDB when loading the executable and its debug symbols. In effect trading link time

07:44.680 --> 07:50.520
for debugging time. When we use this, we see that we actually save over 8 kilobytes in the final executable.

07:51.480 --> 07:54.840
If we look at what happened to the ELF sections, we find that three of the debug sections in the

07:54.840 --> 07:59.960
executable have been replaced by a much smaller .gdb_index section. I'm guessing here and I'm sure

07:59.960 --> 08:03.560
someone in the room can tell me later what's going on, but I'm guessing that there was some

08:03.560 --> 08:07.960
duplication between the remaining debug information in the executable and in the DWO file.

08:07.960 --> 08:11.800
And with this option, we are able to eliminate this duplication at link time.

08:13.240 --> 08:18.040
So let's take a closer look at the DWO file that we generated. Here we have only one

08:18.040 --> 08:22.440
but in a larger project, you would naturally have one DWO file per compilation unit.

08:23.080 --> 08:27.160
And a given executable might have anywhere between a few to several thousand compilation units.

08:28.760 --> 08:33.720
Can we somehow consolidate this into a better debug package so that we don't have to carry all of these

08:33.720 --> 08:40.040
DWO files around with us? Yes we can. There is a tool called dwp that works like a linker

08:40.040 --> 08:45.960
but only for debug information. The best way to run this is with the --exec option; dwp will then look

08:46.040 --> 08:52.040
at the DWO references in the given executable and consolidate all those DWO files into a single DWP package.

08:53.400 --> 08:59.880
We can then remove all the DWO files and when we run gdb on the executable, gdb is clever enough

08:59.880 --> 09:05.560
to find the DWP package with the same base name as the executable and debugging still just works

09:05.560 --> 09:12.760
automatically. Finally, we have accomplished debug fission. What we have at this point is an executable

09:12.840 --> 09:16.920
that we can easily debug and it's only slightly larger than a stripped executable.

09:17.560 --> 09:21.960
Most of the debug information is now in a separate DWP package that we only need to have available

09:21.960 --> 09:27.160
when we need to run the debugger. And we achieved this by splitting off the debug information already

09:27.160 --> 09:33.240
at compile time and linking it separately from the main executable. Now let's move on to

09:33.240 --> 09:37.720
how we can enable debug fission in higher level build systems. After all, very few of us build

09:37.720 --> 09:42.360
our projects by invoking the compiler and linker directly. We'll look at CMake first and then

09:42.360 --> 09:47.160
briefly at Bazel. So here is a minimal CMake configuration for our toy project. We declare

09:47.160 --> 09:51.480
a minimum CMake version, a project name and we add our executable and the sources it is based on.

09:52.120 --> 09:56.040
When we build this with CMake, we will quickly discover that CMake by default builds without

09:56.040 --> 09:59.960
debugging symbols and it also uses a default linker that does not support debug fission.

10:01.160 --> 10:09.160
So we fix those two things by adding -fuse-ld=gold as a link option and by telling CMake

10:09.240 --> 10:14.520
that we want to use the debug build type. The next thing we need to do is to pass the options needed

10:14.520 --> 10:19.080
for debug fission. This tells CMake to pass the same compiler and linker flags that we saw previously.

10:20.360 --> 10:25.560
Finally, we get to the trickier part. How do we tell CMake how to link together the DWO files

10:25.560 --> 10:31.640
into a DWP package? There is currently no built-in logic to do this in CMake but here is a workaround.

10:31.640 --> 10:35.400
The workaround is explained in more detail in the blog post but in short, we're extending the

10:35.400 --> 10:39.880
default executable action in CMake with a custom command that generates the debug package that

10:39.880 --> 10:45.320
accompanies each executable. There is an open issue on the CMake bug tracker requesting better

10:45.320 --> 10:49.480
support for debug fission and someone actually added a comment there referencing my blog post.

10:49.480 --> 10:53.640
In that comment they refer to this as a hacky workaround and kind of fragile, and I cannot

10:53.640 --> 10:58.040
say that I actually disagree with them. Having proper support for this in CMake would make things

10:58.040 --> 11:03.880
much smoother. Let's move on to Bazel. Here the story is fortunately a lot simpler. Bazel supports

11:03.960 --> 11:08.040
debug fission since version 6 and it works as long as the underlying toolchain advertises the

11:08.040 --> 11:13.000
per_object_debug_info feature. With this in place you would then pass --fission=yes either

11:13.000 --> 11:18.200
via the command line or in your .bazelrc configuration file. The DWP package is built separately

11:18.200 --> 11:22.600
by requesting the executable target name, but with a .dwp appended.

11:24.760 --> 11:30.200
Then I've done a small experiment here. I have to admit that this was done by building the

11:30.200 --> 11:35.560
LLVM project. I know this is the GCC devroom, sorry for that, but this was the one project that I found

11:35.560 --> 11:41.800
online where it was easy to generate a comparison. They have a flag in their build configuration to enable

11:41.800 --> 11:46.440
this and I'm basically doing three builds: a release build with no debug symbols, that's our baseline;

11:46.440 --> 11:51.240
a debug build, a regular build with debug symbols; and then a fission build with debug symbols and debug

11:51.240 --> 11:57.800
fission enabled. The blog post has the absolute numbers and much more details but I will only do a

11:57.800 --> 12:02.760
rough comparison here. The darker blue shows the cost of the debug build without debug fission.

12:02.760 --> 12:08.040
The lighter blue is with debug fission enabled. For overall build times the debug build takes almost 40

12:08.040 --> 12:13.640
percent longer than the release build and the fission build only takes 25 percent longer. If we include the

12:13.640 --> 12:18.040
install time, things look even worse for the debug build, and I didn't really understand this at first

12:18.040 --> 12:22.920
because the install phase of LLVM is almost purely copying files from A to B, so why does it take

12:22.920 --> 12:28.920
so much longer with a debug build? It is actually purely down to the size of the artifacts. A debug build

12:28.920 --> 12:33.560
takes 15 times as much space as a release build and you'll notice that I had to resort to a

12:33.560 --> 12:39.560
logarithmic scale here to fit it into the slide. Now the debug symbols do take a lot of space but most

12:39.560 --> 12:43.960
of this enormous overhead is not due to a single copy of the debug symbols; rather the debug symbols are

12:43.960 --> 12:49.560
copied from object files into intermediate archives and from there into the final executables. When we enable

12:49.560 --> 12:54.920
debug fission, that size goes down a lot, to five times, and the numbers are similar for the

12:54.920 --> 13:00.600
install artifacts. Anyway, what can we conclude from this comparison? Debug symbols take a lot of

13:00.600 --> 13:04.440
space of course, but the real space overhead has less to do with debug symbols and more to do with

13:04.440 --> 13:09.400
the number of times things are copied. When we use debug fission we can eliminate much of this

13:09.400 --> 13:13.960
duplication and simply by reducing the wasted space we can also improve the build time significantly.

13:14.840 --> 13:19.000
So with that we reach the larger conclusion of this talk. Debug fission can save you both time

13:19.000 --> 13:23.640
and space at the cost of some added complexity, especially if debug fission is not already supported in

13:23.640 --> 13:29.240
your build system. Is it worth it? It depends, as always. If you compare debug fission to building

13:29.240 --> 13:33.880
an unstripped executable and then distributing the stripped version, the stripped plus unstripped will

13:33.880 --> 13:37.800
offer the smallest release size. No matter what you do, the executable with debug fission will be

13:37.800 --> 13:43.720
somewhat larger. But the stripped plus unstripped will also be the least efficient way to do your build.

13:43.720 --> 13:47.960
First you will make a full debug build and then you must strip that in order to generate the release.

13:48.040 --> 13:51.480
So if you struggle with this overhead then looking at debug fission can be a really valuable

13:51.480 --> 13:56.520
investment. So that's it for me. Thank you for listening. If you think that this talk rather

13:56.520 --> 13:59.560
should have been a blog post, then you're absolutely right. Here's the blog post.

14:00.600 --> 14:04.760
You can find everything in this talk plus even more details, side quests and links to more resources

14:04.760 --> 14:09.560
that I couldn't fit into here. More generally, on the Tweag blog you can also read about all the

14:09.560 --> 14:13.240
other things that Tweagers are up to, and if you want to learn more about Tweag or Modus Create,

14:13.320 --> 14:17.800
there are links for those. If you want to reach me, I'm jherland on GitHub, Mastodon, LinkedIn and elsewhere,

14:17.800 --> 14:26.040
and with that you can always find me afterwards if you have questions or want to discuss more.

