WEBVTT

00:00.000 --> 00:10.440
Falco Girgis again, talking about Dreamcast stuff with GCC. Last year

00:10.440 --> 00:15.560
he had some Fortran or whatever code running on the Dreamcast. Let's see now what he's done

00:15.560 --> 00:16.560
with it.

00:16.560 --> 00:18.560
Uh-huh, this is the theme.

00:18.560 --> 00:25.440
Okay, so I'm here on behalf of the Sega Dreamcast community to talk about a lot of

00:25.440 --> 00:30.520
the stuff that we're doing with GCC. I was here last year, and last year was kind

00:30.520 --> 00:34.960
of getting infrastructure set up and getting a bunch of new languages on the Dreamcast.

00:34.960 --> 00:39.560
This year, I'm happy to say that we've done something with them. We've done a lot

00:39.560 --> 00:40.760
of stuff with them.

00:40.760 --> 00:43.120
First of all, what is the Sega Dreamcast?

00:43.120 --> 00:48.400
It was a sixth-generation console, not a fifth one, not like the Nintendo 64; I hear that

00:48.400 --> 00:49.400
a lot.

00:49.400 --> 00:53.440
It was meant to compete directly against the PlayStation 2, GameCube, and Xbox, which came out

00:53.440 --> 00:59.760
a little bit later. It launched in quarter four, 1999, and then it was discontinued,

00:59.760 --> 01:05.520
I guess it's not up there, March 31st, 2001, so it had a very short lifespan and it was

01:05.520 --> 01:11.200
considered a commercial failure, as you'll see, but that doesn't mean anything to us. It

01:11.200 --> 01:17.440
has a Hitachi SH4 CPU, and that's what links us to GCC, as you'll see; that's why we're

01:17.440 --> 01:18.440
here.

01:18.440 --> 01:23.440
GCC was the only compiler that ever supported the SuperH architecture other than

01:23.440 --> 01:29.400
Hitachi-specific ones for a small amount of time, and nowadays everything's done on GCC

01:29.400 --> 01:35.560
for SH. There are several communities that still use it, and there was a forum post

01:35.560 --> 01:40.560
asking if LLVM would be interested in a backend, and they said no, the SuperH is

01:40.560 --> 01:43.760
a toy architecture, so they're right about that.

01:43.800 --> 01:45.760
You'll see how cool the toy is.

01:45.760 --> 01:52.240
It has an Imagination PowerVR2 GPU, which is the previous iteration of what wound up

01:52.240 --> 01:57.080
in the iPhone, actually, so that whole line, the PowerVR from Imagination, wound up hugely

01:57.080 --> 02:03.680
successful in mobile. It had 16 megabytes of RAM, 8 megabytes of video RAM, and a proprietary

02:03.680 --> 02:08.720
media format called GD-ROM, but as you'll see, it can still read CD-ROM; that's why we're

02:08.720 --> 02:14.160
able to do homebrew on it. It came online-ready with a built-in 56k modem.

02:14.160 --> 02:18.160
Nowadays, we have hardware mods that will let you dial into a Raspberry Pi, so you don't

02:18.160 --> 02:23.360
actually have to use dial-up anymore. And then even the memory card is cool: it's

02:23.360 --> 02:31.480
a 128-kilobyte flash, 8-bit microcontroller Visual Memory Unit that also has

02:31.480 --> 02:33.480
its own homebrew scene.

02:33.480 --> 02:36.680
So this was the year 2001, what happened?

02:36.680 --> 02:42.720
So this cool platform: "Sega forsaking consoles to focus on software," "Sega quits the

02:42.720 --> 02:47.880
Sega Dreamcast," "Sega scraps it," "Sega to halt Dreamcast production." This thing died, and

02:47.880 --> 02:50.280
basically nobody cared about it for a long time.

02:50.280 --> 02:57.800
But if you read the news now this year, that's what it looks like, and how did, how did

02:57.800 --> 02:58.800
this happen?

02:58.800 --> 03:04.560
We were covered by Digital Foundry the other day, because we ported Grand Theft Auto 3 from

03:04.560 --> 03:08.360
the PlayStation 2, well, that's from the PC, it was originally on the PlayStation 2,

03:08.360 --> 03:10.560
to the SEGA Dreamcast, we're going to go into that.

03:10.560 --> 03:15.240
We've had a bunch of other really exciting, successful ports. We were covered by

03:15.240 --> 03:23.120
Hackaday. With Duke Nukem 3D, its creator was impressed that we ported it to the Dreamcast,

03:23.120 --> 03:28.040
and yeah, we've had a lot of fun this year.

03:28.040 --> 03:33.120
So, believe it or not, people still make Dreamcast games and Dreamcast homebrew, and

03:33.120 --> 03:38.000
people port all kinds of crazy stuff to the Dreamcast, because it's actually pretty powerful,

03:38.000 --> 03:42.320
but it's still embedded, um, I think that appeals to a lot of people with it, and it just

03:42.320 --> 03:47.320
has a whole lot going for it, and it's an entirely open ecosystem too, as you'll see, that

03:47.320 --> 03:51.040
kind of draws people in. Why do they do it?

03:51.040 --> 03:55.600
Like I mentioned, I think the same kind of audience that likes Raspberry Pi development or embedded

03:55.600 --> 04:00.560
systems, I think they're drawn to this, because it was very powerful for

04:00.560 --> 04:01.560
its time.

04:01.560 --> 04:05.360
It's still pretty powerful, and as you'll see, we're doing a lot of stuff on here that maybe

04:05.360 --> 04:11.280
has no business running on a Sega Dreamcast, but we do it anyway, and, as you'll

04:11.280 --> 04:12.880
see, it works out pretty well.

04:12.880 --> 04:16.440
On the lower end, you can treat it kind of like a weaker

04:16.440 --> 04:21.720
PC; on the higher end, you can optimize for it and go to town on it and get really impressive

04:21.720 --> 04:22.720
results.

04:22.960 --> 04:27.960
At this point in time we have, thanks to you guys, modern compilers and tools, so part

04:27.960 --> 04:35.080
of the appeal is you can develop for this extremely retro platform with modern GCC 14.2.0, as

04:35.080 --> 04:38.040
you'll see, and be on the bleeding edge of that stuff.

04:38.040 --> 04:41.720
We're in Compiler Explorer, believe it or not, which makes it fun.

04:41.720 --> 04:43.840
I don't even remember x86 assembly anymore.

04:43.840 --> 04:48.760
I just look at SuperH, and that's how we optimize a lot.

04:49.240 --> 04:52.880
As I said, you can run custom code with no modifications on this thing.

04:52.880 --> 04:56.800
That's how it got extremely popular is you can just burn a CD and play it.

04:56.800 --> 05:00.160
Nowadays, we have SD cards, we have hard drive mods.

05:00.160 --> 05:04.080
There's a little bit more that you can do now than just burn CDs.

05:04.080 --> 05:11.240
It has cool toys and peripherals: it has light guns, it has maracas, it has fishing controllers,

05:11.240 --> 05:15.880
just a lot of fun toys with it, and it has, at this point in time, a pretty thriving

05:15.880 --> 05:19.600
homebrew community, and so does the memory card.

05:19.600 --> 05:24.440
So this is a run-through of our SDK that I work on.

05:24.440 --> 05:25.440
I'm a contributor to it.

05:25.440 --> 05:27.160
It's called KallistiOS.

05:27.160 --> 05:31.760
It's been around for about 20 years, but it didn't start to get really good until a lot

05:31.760 --> 05:36.000
more recently, when it drew people in to work on it.

05:36.000 --> 05:39.320
It's pretty much a custom mini operating system.

05:39.320 --> 05:40.880
It's a unikernel.

05:40.880 --> 05:44.000
Technically, I think it's considered a library OS.

05:44.000 --> 05:48.840
You link to our kernel, and by default, your applications run in kernel space, which is

05:48.840 --> 05:50.480
both good and bad.

05:50.480 --> 05:56.640
You can do some really fun stuff with it with C++20 in kernel space, and yeah.

05:56.640 --> 06:03.000
We sit on top of Newlib for file I/O, date/time, libc kind of stuff.

06:03.000 --> 06:07.720
We also implement a whole lot of libc and POSIX APIs ourselves, just because the

06:07.720 --> 06:11.240
more complete our API coverage is, the easier it is for people to target the platform, and the

06:11.240 --> 06:12.240
more ports we get.

06:12.600 --> 06:17.360
We have a really good virtual file system where you can just use C standard file I/O

06:17.360 --> 06:22.720
or C++ file I/O, and it just works, and depending on your virtual path, it can go to the

06:22.720 --> 06:26.160
memory card, it can stream off the network, et cetera.
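
NOTE
Editor's sketch, not part of the talk: a minimal illustration of the "just use standard file I/O" idea. Nothing here is KallistiOS-specific; the helper names are hypothetical, and the idea that a prefix such as /vmu, /cd or /pc selects the backing device is an assumption about what a virtual path might look like on the console.
```cpp
#include <cassert>
#include <cstdio>
#include <string>
// Write `text` to `path` with plain C stdio; returns true on success.
// On a system with a virtual file system, the path prefix alone would decide
// where the bytes end up (assumed examples: a memory card or a network share).
bool write_text_file(const char* path, const std::string& text) {
    assert(path != nullptr);
    std::FILE* f = std::fopen(path, "wb");
    if (!f) return false;
    const std::size_t written = std::fwrite(text.data(), 1, text.size(), f);
    const bool closed_ok = (std::fclose(f) == 0);
    return written == text.size() && closed_ok;
}
// Read the whole file at `path`; returns an empty string if it cannot be opened.
std::string read_text_file(const char* path) {
    assert(path != nullptr);
    std::string out;
    std::FILE* f = std::fopen(path, "rb");
    if (!f) return out;
    char buf[256];
    std::size_t n = 0;
    while ((n = std::fread(buf, 1, sizeof buf, f)) > 0) out.append(buf, n);
    std::fclose(f);
    return out;
}
```
The calling code stays identical whichever device sits behind the path, which is the property the speaker is describing.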

06:26.160 --> 06:32.960
We have an IPv6 stack on it, and we have drivers for almost everything the Dreamcast has inside

06:32.960 --> 06:35.680
of it, hardware-wise.

06:35.680 --> 06:38.840
We have texture tools, asset utilities, stuff like that.

06:38.920 --> 06:44.280
We have a whole bunch of examples that are how I learned to program in general, actually.

06:44.280 --> 06:48.680
I grew up using the examples; that's how I learned C. So now I work on them, and it's kind

06:48.680 --> 06:49.680
of cool.

06:49.680 --> 06:54.060
Then we have add-ons and ports, like if you know OpenGL, legacy OpenGL, you can

06:54.060 --> 06:59.200
code for the Dreamcast; we have that. We have SDL, we have raylib, we have OpenAL, we have

06:59.200 --> 07:04.800
a pretty rich set of libraries. Obviously, the higher up the abstraction chain you get, the

07:04.800 --> 07:08.800
more you're sacrificing performance, and there's a trade-off there, depending on the kind

07:08.800 --> 07:13.360
of game you want to put on there, but I mean we have game engines written entirely in Lua,

07:13.360 --> 07:18.000
so they're just interpreted scripts, which is pretty cool, and games

07:18.000 --> 07:19.000
written with those.

07:19.000 --> 07:23.000
We actually had a game jam, and we had 24 submissions this year, and several of them

07:23.000 --> 07:28.200
were written in Lua by people who have never programmed for like these embedded retro consoles

07:28.200 --> 07:33.120
before, and it was accessible enough that just someone writing Lua scripts for like,

07:33.120 --> 07:38.320
I don't know, what engine, LÖVE 2D or something, they could come over and be productive

07:38.320 --> 07:42.120
and contribute to a game jam and release a game for the Dreamcast, I thought that was pretty

07:42.120 --> 07:43.620
cool.

07:43.620 --> 07:51.760
Our toolchains: that's the version of Binutils, 2.43.1. For GCC, 14.2.0 is currently

07:51.760 --> 07:56.680
what most of us are using, although we have a preview toolchain configuration that's

07:56.680 --> 08:01.440
just always on the tip of 15, so that we make sure that no one breaks anything and that everything's

08:01.440 --> 08:03.080
good, and everything looks great right now.

08:03.080 --> 08:12.040
GDB 15.2. In terms of language support, we have up to C23, partially supported. Basically,

08:12.040 --> 08:17.480
in terms of core language, whatever GCC supports is what we have on the front end, and partially

08:17.480 --> 08:22.120
C++26. We have very good standard library support for both of them.

08:22.120 --> 08:27.560
From last year, you might know that we have std::async, we have threads,

08:27.560 --> 08:35.300
we have coroutines in C++, a lot of really powerful things. We have Objective-C,

08:35.300 --> 08:39.040
but we have yet to complete our Foundation port.

08:39.040 --> 08:45.560
Someone by the name of Luna actually began work on KallistiOS bindings for D, using

08:45.560 --> 08:52.760
the GDC front end for Dlang, and that's pretty cool. And then we have Rust to some

08:52.760 --> 08:53.760
extent.

08:53.760 --> 08:57.040
It's a work in progress, and there are kind of two ways you can use Rust on Dreamcast,

08:57.040 --> 09:00.480
but I'll get into that.

09:00.480 --> 09:03.680
This is my Dreamcast setup yet again; I just want to share it.

09:03.680 --> 09:08.800
It's a little special, I've modded it for 32 megs of RAM so that if I'm profiling a game

09:08.800 --> 09:14.440
that's close to the RAM limit, I have a little bit of extra RAM and wiggle room,

09:14.440 --> 09:18.240
and some people are trying to get stuff on the Dreamcast that they just need more RAM to

09:18.240 --> 09:23.960
start with. Even if you're not within 16 megs yet, it's nice to not just crash or throw an

09:23.960 --> 09:29.000
exception on out-of-memory or return null from malloc, so that's why it's pretty cool.

09:29.000 --> 09:31.680
Mine has a hard drive, a lot of people are doing that now.

09:31.680 --> 09:35.600
You can have that alongside the disc drive. I have an Ethernet adapter; that's how I stream

09:35.600 --> 09:38.640
stuff off the network, and that's how I generally test.

09:38.640 --> 09:43.440
I submit binaries over the network, and it's got a bootloader that will run the ELF off

09:43.440 --> 09:44.440
of the network.

09:44.440 --> 09:50.000
I have a custom BIOS, custom fan, custom power supply, and I plug it into a Wi-Fi wall

09:50.040 --> 09:55.360
outlet, so I can have this thing ready to go on my Jenkins build server that will turn

09:55.360 --> 10:02.080
it on automatically and run CI on it, so yeah, a little crazy.

10:02.080 --> 10:08.040
Like I said, since last time, you guys released GCC 14.2.0. When this happens, we like to have

10:08.040 --> 10:13.440
a release party. We like to say, hey, day one on the Dreamcast, you can come

10:13.440 --> 10:18.680
use the latest GCC on Dreamcast. So yeah, we had a day-one party with that, and we played with

10:18.680 --> 10:19.680
the new features.

10:19.680 --> 10:28.440
I will say, we are getting better performance now with the latest GCC 14 on a 1999 processor;

10:28.440 --> 10:31.080
that's pretty exciting.

10:31.080 --> 10:36.240
We have fewer ICEs, but we still have a couple, not going to lie about that.

10:36.240 --> 10:40.960
We do have one thing that's a little bit of a regression that I should mention.

10:40.960 --> 10:43.360
We do use -Os

10:43.360 --> 10:47.000
quite a bit in our community, and I'll get into that, to fit into RAM. Because we're

10:47.080 --> 10:52.520
porting things from PCs, bigger game engines, these days we use -Os a lot more, and

10:52.520 --> 10:56.360
it looks like the binaries, while they're getting faster, are getting a little bit

10:56.360 --> 10:59.080
bigger since GCC-12.

10:59.080 --> 11:05.360
Okay, absolutely, thank you.

11:05.360 --> 11:06.360
Okay, awesome.

11:06.440 --> 11:09.480
That's important, let's file some bugs.

11:09.480 --> 11:12.480
Yeah, we absolutely do.

11:12.480 --> 11:14.480
Okay.

11:14.480 --> 11:17.080
We need some examples.

11:17.080 --> 11:18.080
Okay.

11:18.080 --> 11:19.920
Absolutely, we'll get on that.

11:19.920 --> 11:21.520
Thank you, I appreciate it.

11:21.520 --> 11:28.440
Oh, and then you guys turned implicitly declared C functions into errors. Thank you.

11:28.440 --> 11:29.440
Thank you so much.

11:29.440 --> 11:34.840
I know everyone was not happy about that, but I was. It fixed a lot of bad code, especially

11:34.840 --> 11:35.840
with LTO.

11:35.840 --> 11:42.160
I thought that was a good move, a lot of people were not happy, but I was.

11:42.160 --> 11:46.240
C++23 features that we immediately got started playing with.

11:46.240 --> 11:50.640
Look guys, we got a new standard output mechanism, std::print and std::println, from

11:50.640 --> 11:54.520
print.h, well, not .h, just print.

11:54.520 --> 11:59.200
I think they got it from Rust or Python, doesn't that look like syntactically how their

11:59.200 --> 12:01.160
print mechanism works?

12:01.240 --> 12:07.480
We also got deducing this, which is a little complicated, but it helps with representing

12:07.480 --> 12:12.520
the curiously recurring template pattern in C++, and another trick that it allows

12:12.520 --> 12:18.440
you to do that you could never do before is to recursively call a lambda,

12:18.440 --> 12:23.360
which you couldn't do before because you had to pre-declare a lambda, and yeah, you couldn't

12:23.360 --> 12:24.360
identify it.

12:24.360 --> 12:28.280
So that fixed that, and we can use all that on the Dreamcast now, which is pretty cool.

12:28.280 --> 12:33.960
So that's the output running on the Dreamcast.

12:33.960 --> 12:38.920
Next, what we started playing with: in our community, most people aren't writing assembly,

12:38.920 --> 12:45.040
but a lot of the people who are at the upper echelons of performance, they either write

12:45.040 --> 12:50.240
assembly or there's a fast math library that's been floating around that everyone just

12:50.240 --> 12:56.920
copy-pastes into their projects. We took a step back, and we're like, okay, how fast is

12:56.920 --> 12:59.240
this inline assembly here?

12:59.240 --> 13:05.840
And so what we have here is this fast-math inline assembly that's being used all over

13:05.840 --> 13:08.520
the place in the community in different ports.

13:08.520 --> 13:13.240
For example, this is in the OpenGL back end for our OpenGL driver.

13:13.240 --> 13:15.040
It's fast multiply-accumulate.

13:15.040 --> 13:20.280
The SH4 was apparently one of the first ones to have a hardware-accelerated FMAC that

13:20.280 --> 13:23.520
wasn't for a DSP, and it's only an approximation.

13:23.520 --> 13:31.680
You have to either use inline assembly to do an FMAC, which is A times B plus C.

13:31.680 --> 13:32.680
That's all it is.

13:32.680 --> 13:37.800
The standard C function fmaf is similar.
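
NOTE
Editor's sketch, not part of the talk: the multiply-accumulate "A times B plus C" written in portable C++ rather than inline assembly. The function names are hypothetical. Whether a given compiler and flag set lowers either form to a single hardware instruction on SH4 is not claimed here; the point is only that both spellings leave the optimizer free to inline and fold them.
```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
// Plain spelling of a multiply-accumulate: a * b + c.
inline float mac_plain(float a, float b, float c) { return a * b + c; }
// Same operation through the standard library's fused multiply-add
// (the float overload of std::fma, the C++ counterpart of C's fmaf).
inline float mac_fma(float a, float b, float c) { return std::fma(a, b, c); }
// Dot product built from repeated multiply-accumulate steps.
float dot(const float* x, const float* y, std::size_t n) {
    assert(n == 0 || (x != nullptr && y != nullptr));
    float acc = 0.0f;
    for (std::size_t i = 0; i < n; ++i) acc = mac_fma(x[i], y[i], acc);
    return acc;
}
```
Because these are ordinary functions, a compiler can constant-fold or hoist them, which opaque inline assembly prevents.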

13:37.800 --> 13:39.040
So we have two scenarios.

13:39.040 --> 13:45.400
We have the inline assembly thing that everyone loves to use in our community, and then

13:45.400 --> 13:51.320
we have an inverse square root, which I don't know if you guys have heard of it, but

13:51.320 --> 13:58.720
Quake 3 had this famous, I don't know how to describe it, integer bit math trick that

13:58.720 --> 14:03.200
allowed it to do fast inverse square roots, and that's one instruction on the Dreamcast.

14:03.200 --> 14:05.360
So that's another reason we do inline assembly.

14:05.360 --> 14:10.840
The multiply-accumulate is another one, and then we have this instruction that is a floating

14:10.840 --> 14:13.560
point sine-cosine pair.

14:13.560 --> 14:18.800
You pass it an angle, and then in two different registers, you get the sine and cosine,

14:18.800 --> 14:19.800
and you can extract that.

14:19.800 --> 14:21.440
That's also an approximation.

14:21.440 --> 14:27.360
So to get the compiler to use these, you have to have -ffast-math enabled, for obvious reasons.

14:27.360 --> 14:32.560
You don't want to apply it across the board, and you have to have two flags enabled that

14:32.560 --> 14:37.160
independently enable those instructions, the inverse square root and the combined sine

14:37.160 --> 14:40.880
cosine.
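
NOTE
Editor's sketch, not part of the talk: the inverse square root and the sine/cosine pair written as ordinary C++, which is the form a compiler would need to see in order to pick approximate hardware instructions by itself. The talk does not name the two flags; the editor's assumption is that they are GCC's SH options -mfsrra and -mfsca, used together with -ffast-math. The function names are hypothetical, and on a desktop compiler this simply calls the normal math library.
```cpp
#include <cassert>
#include <cmath>
#include <utility>
// Reciprocal square root written the obvious way; x must be positive.
inline float inv_sqrt(float x) {
    assert(x > 0.0f);
    return 1.0f / std::sqrt(x);
}
// Sine and cosine of the same angle, requested together so that a target
// with a combined instruction has the chance to compute both at once.
inline std::pair<float, float> sin_cos(float radians) {
    return { std::sin(radians), std::cos(radians) };
}
// Normalise a non-zero 3D vector in place using inv_sqrt.
inline void normalize3(float v[3]) {
    const float s = inv_sqrt(v[0] * v[0] + v[1] * v[1] + v[2] * v[2]);
    v[0] *= s; v[1] *= s; v[2] *= s;
}
```
Keeping such code in one translation unit with its own flags is one way to avoid applying fast-math across the whole project.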

14:40.880 --> 14:42.160
So we put it to the test.

14:42.160 --> 14:43.480
We made a simple example.

14:43.480 --> 14:44.480
This is what I see all the time.

14:44.480 --> 14:47.560
And this is what I saw in our OpenGL driver.

14:47.560 --> 14:52.960
We had a static loop that was looping over some preset number of vertices.

14:52.960 --> 14:58.640
It might have been four because it was a quadrilateral, or no, that would be six, sorry.

14:58.640 --> 15:07.240
So basically, if we define the preprocessor macro FMAC to 0, we're using this fmac, which is just

15:07.240 --> 15:11.840
us letting C inline that and do what it can.

15:11.840 --> 15:15.680
Otherwise, we're using this thing that everyone's copy-pasting into their code base, which

15:15.680 --> 15:18.560
is faster, one instruction, right?

15:18.560 --> 15:25.840
So we finally tested it out, and what do you know: this is C. The compiler decided that

15:25.840 --> 15:32.320
it can factor out the multiplication, and that became a constant.

15:32.320 --> 15:38.600
And here, it loop unrolled, and it became something much worse.

15:38.600 --> 15:45.640
It loop unrolled the FMAC, which never even had to happen, and yeah.

15:45.640 --> 15:47.640
So you lose a lot of performance.

15:47.640 --> 15:49.280
So this is with a bunch of tweaks.

15:49.280 --> 15:52.520
This is Mario 64, not going to let it play for too long.

15:52.520 --> 15:55.760
We thought it'd be boring to just play Mario 64 on a Dreamcast, because it should run on a

15:55.760 --> 15:56.760
Dreamcast.

15:56.760 --> 16:01.360
So this is it at 60 frames per second, uncapped, to make it more interesting.

16:01.360 --> 16:04.680
So yeah, it runs pretty well at uncapped frame rate.

16:04.680 --> 16:08.200
There is a little visual glitching on Mario's face.

16:08.200 --> 16:11.480
That's an issue we actually fixed, but I kept it, because I think it's endearing, that's

16:11.480 --> 16:15.680
Dreamcast, Mario.

16:15.680 --> 16:20.280
Something I thought was really cool is one day I was playing with the new compiler version,

16:20.280 --> 16:24.560
and I was like, I wonder what happens in standard C++ if you don't have a time zone

16:24.560 --> 16:25.560
database.

16:25.560 --> 16:29.560
So I tried to access it, and I had a time zone database on a Dreamcast.

16:29.560 --> 16:31.800
And I was like, how did this happen?

16:31.800 --> 16:38.120
I could print and walk the time zone database, and it turns out that a GCC build script

16:38.160 --> 16:44.680
or something was added where it will go online and download the time zone database and embed

16:44.680 --> 16:52.040
it in read-only RAM, or, I'm sorry, the read-only segment, and yeah, you actually

16:52.040 --> 16:55.120
wind up with the time zone database, and you can turn it off too if you don't want it

16:55.120 --> 16:56.120
on embedded.

16:56.120 --> 16:58.800
So we thought that was really cool.

16:58.800 --> 17:02.360
This is our Sonic Mania port, which does a whole bunch of things that are pretty common

17:02.360 --> 17:05.000
for the community.

17:05.000 --> 17:07.400
This is way too big to fit into RAM initially.

17:07.400 --> 17:11.120
We cannot do anything other than -Os initially to fit into RAM.

17:11.120 --> 17:17.800
So what we do is we -Os, or, sorry, we -O3 all of the critical things, like

17:17.800 --> 17:24.400
the renderer, the graphics, the mathematics, the physics, and then we -Os everything else, and

17:24.400 --> 17:28.200
we kind of mix them to get the best of performance and size.

17:28.200 --> 17:32.440
And typically when we do that, we can get the performance as good as running with -O3 on

17:32.440 --> 17:33.440
everything.

17:33.440 --> 17:35.840
That's pretty nice.

17:35.840 --> 17:40.120
We use LTO a whole lot in the Dreamcast community.

17:40.120 --> 17:43.760
I always tell people to use it. If you're watching remotely, you should definitely be using

17:43.760 --> 17:44.760
LTO.

17:44.760 --> 17:48.800
It's like free frame rate gains just for a little bit of extra compile time.

17:48.800 --> 17:54.640
And yeah, this is the console logging over the network from Sonic.

17:54.640 --> 18:00.760
We had a little fun with C++23, and we made an IRQ handler that can basically be invoked

18:00.760 --> 18:09.840
from any generic, callable C++ type by making a bunch of concepts that allow you to basically

18:09.840 --> 18:15.440
constrain a template argument, based on whether it's a callable with the correct syntax

18:15.440 --> 18:20.200
or not, and then use any generic callable as an IRQ handler.
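
NOTE
Editor's sketch, not part of the talk and not the project's code: one way to accept "any generic callable" as a handler while rejecting callables with the wrong signature at compile time. The talk describes C++ concepts; this stand-in uses C++17 type traits (std::is_invocable_r_v with std::enable_if_t) so that it builds with older language modes. IrqTable, irq_code and kSlots are hypothetical names, and std::function is used for brevity rather than for interrupt-grade performance.
```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <type_traits>
#include <utility>
using irq_code = std::uint32_t; // hypothetical interrupt identifier type
class IrqTable {
public:
    static constexpr unsigned kSlots = 8;
    // Accept lambdas, functors and function pointers callable as void(irq_code);
    // anything else is rejected when the template is instantiated.
    template <typename F,
              typename = std::enable_if_t<std::is_invocable_r_v<void, F&, irq_code>>>
    void set_handler(unsigned slot, F&& handler) {
        assert(slot < kSlots);
        handlers_[slot] = std::forward<F>(handler);
    }
    // Invoke the handler registered for `slot`, if any; returns whether one ran.
    bool raise(unsigned slot, irq_code code) const {
        assert(slot < kSlots);
        const auto& h = handlers_[slot];
        if (!h) return false;
        h(code);
        return true;
    }
private:
    std::function<void(irq_code)> handlers_[kSlots];
};
```
With C++20 available, the enable_if constraint could be replaced by a requires-clause built on std::invocable, which is closer to what the speaker describes.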

18:20.200 --> 18:25.920
This is our Doom 64 port to the Sega Dreamcast, which is superior to the actual commercial

18:25.920 --> 18:27.920
game in a lot of different ways.

18:27.920 --> 18:30.920
We've got 16 simultaneous dynamic lights.

18:30.920 --> 18:32.920
We have bump mapping at a target 60 FPS.

18:32.920 --> 18:34.920
There's no lighting on the original game.

18:34.920 --> 18:36.920
There's no lighting on the Steam re-release.

18:36.920 --> 18:43.040
So this has all been added just for Dreamcast, and the author, Jan Martin, prides himself

18:43.040 --> 18:45.560
on this: he likes to rip inline assembly out of his code.

18:45.560 --> 18:50.400
He just likes to do it in C, like, "to prove I can achieve this in C," and he absolutely

18:50.400 --> 18:51.400
can.

18:51.400 --> 18:56.680
Finally, the one that I want to talk about is our Grand Theft Auto 3

18:56.680 --> 18:58.760
port to the Sega Dreamcast.

18:58.760 --> 19:02.280
This was called impossible for like the last 20 years.

19:02.280 --> 19:04.880
This was the selling point of the PlayStation 2.

19:04.880 --> 19:10.520
You've got to buy a PS2 so you can play GTA 3, because the Dreamcast isn't powerful enough.

19:10.520 --> 19:14.680
So that was BS.

19:14.680 --> 19:16.880
We're going to look at a little bit of code for it.

19:16.880 --> 19:19.000
It was a lot of work.

19:19.000 --> 19:23.480
It was done by a really talented team, and it was just amazing.

19:23.480 --> 19:27.000
The whole Dreamcast community came together and made this happen, and it was beautiful.

19:27.000 --> 19:30.400
And we had like a hundred players testing, and it was just beautiful.

19:30.400 --> 19:33.080
This is just drawing sprites, okay.

19:33.080 --> 19:44.560
This is a crazy hybrid C++20-abusing, prefetching, loop-unrolling code base that is just

19:44.560 --> 19:48.960
mixing modern and old and inline assembly, and it's kind of glorious.

19:49.000 --> 19:50.760
It's kind of beautiful.

19:50.760 --> 19:56.400
For rendering primitives, there are two layers of lambdas just to do that, but then

19:56.400 --> 20:02.400
we're also being extremely careful with prefetching into the cache and doing all that.

20:02.400 --> 20:08.400
Believe it or not, just for 2D, these lights, the bloom around it, the lens flare here,

20:08.400 --> 20:11.720
the rain, those are all just flat textures.

20:11.720 --> 20:15.520
So it's actually important that 2D rendering was pretty fast.

20:15.520 --> 20:18.440
This is something I made on the team.

20:18.440 --> 20:22.840
This was born because I wanted to profile the game, but I was too lazy to figure out how

20:22.840 --> 20:27.000
the UI works, and so I decided to render it to the VMU.

20:27.000 --> 20:32.880
So we spawn a thread, a std::thread in C++11, and we're able to do a bunch of cool

20:32.880 --> 20:33.880
stuff with it.

20:33.880 --> 20:38.680
Relatively easily, that's all the code it took to make that real-time debugger, and we

20:38.680 --> 20:40.600
actually released with it.

20:40.600 --> 20:46.240
Next, I want to talk about what, in my opinion, was the craziest part of this entire project:

20:46.240 --> 20:48.640
the transform and lighting loop.

20:48.640 --> 20:54.520
It's the most expensive part of any Dreamcast game, because they decided that the Dreamcast

20:54.520 --> 20:58.720
CPU would be the thing processing and transforming all the vertices, whereas like the

20:58.720 --> 21:01.120
PlayStation 2 used a coprocessor.

21:01.120 --> 21:03.280
So this has to be very fast.

21:03.280 --> 21:08.800
Just to take a quick look at it, what we've done is this is just transformation.

21:08.800 --> 21:13.360
Notice we have a bunch of non-type template arguments that are Boolean.

21:13.360 --> 21:19.500
This allows us to basically generate versions of this at compile time, where we use if constexpr,

21:19.500 --> 21:22.840
and we can compile out the branches, right?

21:22.840 --> 21:26.880
So there are multiple instantiations of this loop.

21:26.880 --> 21:30.360
None of them have branching in them, and we take the variation.

21:30.360 --> 21:34.840
We choose the variation at runtime that's basically best suited, same with whether we're

21:34.840 --> 21:41.000
clipping or not, or prefetching, and yeah, I think it's a little glorious

21:41.000 --> 21:46.720
being able to do that on a Dreamcast and mix all those different constructs.
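
NOTE
Editor's sketch, not the project's code: Boolean non-type template parameters combined with if constexpr, so that each instantiation of the loop contains only the work it needs, and the variant is chosen once at run time outside the loop. Vertex, transform, transform_dispatch and the lighting formula are hypothetical placeholders; the real loop also deals with prefetching, which is omitted here.
```cpp
#include <cassert>
#include <cstddef>
struct Vertex { float x, y, z; float light; };
// One loop body, specialised at compile time. The checks on Lit and Clipped
// are resolved during instantiation, so a disabled feature leaves no code
// behind in that version of the loop.
template <bool Lit, bool Clipped>
std::size_t transform(const Vertex* in, Vertex* out, std::size_t n, float scale) {
    assert(n == 0 || (in != nullptr && out != nullptr));
    std::size_t written = 0;
    for (std::size_t i = 0; i < n; ++i) {
        Vertex v = in[i];
        v.x *= scale; v.y *= scale; v.z *= scale;
        if constexpr (Lit) v.light = v.light * 0.5f + 0.5f; // placeholder lighting term
        if constexpr (Clipped) { if (v.z <= 0.0f) continue; } // drop vertices behind the camera
        out[written++] = v;
    }
    return written;
}
// Pick the instantiation once, at run time, before entering the hot loop.
std::size_t transform_dispatch(bool lit, bool clipped, const Vertex* in,
                               Vertex* out, std::size_t n, float scale) {
    if (lit) return clipped ? transform<true, true>(in, out, n, scale)
                            : transform<true, false>(in, out, n, scale);
    return clipped ? transform<false, true>(in, out, n, scale)
                   : transform<false, false>(in, out, n, scale);
}
```
The cost of this pattern is code size: four copies of the loop exist in the binary, which matters on a machine where fitting into RAM is already a concern.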

21:46.720 --> 21:48.480
This is what it turned out looking like.

21:48.480 --> 21:51.640
The audio was a little messed up in this capture, because it was earlier in development,

21:51.640 --> 21:52.680
so I just turned it off.

21:52.680 --> 21:58.720
But you can see that obviously it was possible to run Grand Theft Auto 3 on the Sega Dreamcast.

21:58.720 --> 21:59.720
It looks pretty good.

21:59.720 --> 22:03.440
It doesn't run as well as the PlayStation 2 yet, but we're still working on it.

22:03.440 --> 22:04.760
It has dynamic lighting.

22:04.760 --> 22:09.520
It has a pretty high polygon count for that era.

22:09.520 --> 22:14.800
It has a lot of special effects, and yeah, it was pretty cool.

22:14.800 --> 22:16.520
So what about other programming languages?

22:16.520 --> 22:18.880
Sorry, I went too far.

22:18.880 --> 22:22.640
I just ruined the surprise.

22:22.640 --> 22:24.520
What about other programming languages?

22:24.520 --> 22:27.440
Well, we have a couple that are coming up.

22:27.440 --> 22:33.520
We have some that aren't complete, but does anyone know what this is?

22:33.520 --> 22:37.360
This is the language in case you need to do some government contracting for your Dreamcast

22:37.360 --> 22:40.680
or defense programming.

22:40.680 --> 22:44.640
But we have a bunch of people who are excited to use this on the Dreamcast, because this

22:44.640 --> 22:49.320
is apparently related to VHDL, or it shares a lot of influence with VHDL.

22:49.320 --> 22:52.800
We have a bunch of programmers who do that.

22:52.800 --> 23:00.640
So this is binding to our C API to make OpenGL calls, and this is Ada.

23:00.640 --> 23:08.080
We have the entire Ada toolchain going, all the Ada tools, GNAT and all its friends,

23:08.080 --> 23:12.200
on the Dreamcast now, and we can bind it to the C APIs, and I think that was really cool.

23:12.200 --> 23:15.560
That was Mark, so we're very happy with that.

23:15.560 --> 23:18.920
And there's a bunch of people who are excited to play with it.

23:18.920 --> 23:25.000
Rust on the Dreamcast is in an interesting state, because, okay, that's intentional.

23:25.000 --> 23:27.880
That's a benchmark.

23:27.880 --> 23:29.560
That one is a benchmark, I promise.

23:29.560 --> 23:35.720
It's not just bad code, but that's our polygon-pushing benchmark to see how many polygons

23:35.720 --> 23:36.880
can we push.

23:36.880 --> 23:46.920
So we actually have gccrs, and we have rustc_codegen_gcc, which

23:46.920 --> 23:52.000
is the main path chosen for now, by the person doing most of the Rust work, because it's

23:52.000 --> 23:54.600
a little bit further along.

23:54.600 --> 24:02.160
It's an experimental codegen for the rustc compiler that interfaces with libgccjit.

24:02.160 --> 24:05.880
We have full standard library support with it, and it works with Cargo.

24:05.880 --> 24:08.800
We have the Tokio async runtime.

24:08.800 --> 24:10.520
We have a bunch of APIs bound to it.

24:10.520 --> 24:14.480
There are some that we've made safe, but not very many of them.

24:14.480 --> 24:15.480
Not yet.

24:15.480 --> 24:17.480
It requires patching for SH4.

24:17.480 --> 24:20.200
It requires patching for a lot of different platforms

24:20.200 --> 24:21.600
it doesn't technically support.

24:21.600 --> 24:25.400
So that's unfortunate, that's a little bit annoying.

24:25.400 --> 24:28.040
We've been getting a lot of good use from it, but there are definitely some strengths

24:28.040 --> 24:30.680
that we're looking forward to from gccrs.

24:30.680 --> 24:35.920
One of them is that it will actually fully support the platforms we have, and there's

24:35.920 --> 24:38.440
another issue, actually, that we've encountered.

24:38.440 --> 24:43.880
Performance has been really good with this approach, but when it really matters in that

24:43.880 --> 24:50.560
transform and lighting loop, this is the critical loop in unsafe Rust.

24:50.560 --> 24:53.440
We're submitting polygons extremely quickly.

24:53.440 --> 24:57.400
You send the primitive header, which just tells the GPU what kind of polygon it is, and

24:57.400 --> 25:04.600
then for all the polygons, you map to a special buffer that submits vertices to this thing called

25:04.600 --> 25:08.600
a tile accelerator extremely quickly, and you basically commit the vertex.

25:08.600 --> 25:11.920
You map to the buffer, fill in the vertex, and commit it.

25:11.920 --> 25:18.240
And the way you commit is, right now, in our SDK, it's with the prefetch.

25:18.240 --> 25:22.480
The prefetch instruction in inline assembly, that's how the SH4 works.

25:22.480 --> 25:28.240
You launch a prefetch with the store queues enabled, and it will basically copy extremely quickly

25:28.240 --> 25:31.480
32 bytes of data, which happens to be a vertex size.

25:31.480 --> 25:38.240
What we have to do with the non-gccrs Rust configuration, well, notice this is always

25:38.240 --> 25:41.680
inline assembly.

25:41.680 --> 25:45.600
They don't have support for SH4 technically, so we can't use inline assembly, so we have

25:45.600 --> 25:48.960
to do something a little bit different that you wouldn't think would matter that much,

25:48.960 --> 25:49.960
but it does.

25:49.960 --> 25:53.320
We have to wrap it in an actual function call, okay.

25:53.320 --> 25:59.500
So think about having to call a function 2 million times a second when this was just one

25:59.500 --> 26:00.500
instruction.
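
NOTE
A minimal sketch, not the SDK's actual code: it contrasts the two commit paths described here, assuming a hypothetical 32-byte vertex_t and hypothetical helper names (commit_inline, commit_call, submit_vertex). On SH4 the inlined path is a single pref instruction; off-target, a counter stands in so the sketch still runs.
```c
#include <assert.h>
#include <stdint.h>
/* Hypothetical 32-byte vertex; field names are illustrative, not the SDK's. */
typedef struct {
    uint32_t flags;
    float x, y, z;
    float u, v;
    uint32_t argb, oargb;
} vertex_t;
static_assert(sizeof(vertex_t) == 32, "vertex must fill one 32-byte burst");
/* Host-side stand-in so the sketch runs off-target: counts commits. */
static unsigned long commit_count = 0;
/* Inlined commit: on SH4 this is one pref instruction on the mapped address. */
static inline void commit_inline(void *sq) {
#if defined(__sh__)
    __asm__ __volatile__("pref @%0" : : "r"(sq) : "memory");
#else
    (void)sq;
    commit_count++;
#endif
}
/* Out-of-line wrapper: what a binding must call when it cannot emit the
 * instruction itself, so every commit also pays for a call and a return. */
__attribute__((noinline)) void commit_call(void *sq) {
    commit_inline(sq);
}
/* Fill one vertex slot, then commit it through the chosen path. */
void submit_vertex(vertex_t *slot, float x, float y, float z, int use_call) {
    slot->flags = 0;
    slot->x = x;
    slot->y = y;
    slot->z = z;
    slot->u = 0.0f;
    slot->v = 0.0f;
    slot->argb = 0xffffffffu;
    slot->oargb = 0;
    if (use_call)
        commit_call(slot);
    else
        commit_inline(slot);
}
```
Both paths end in the same instruction; the only difference is the call and return wrapped around it, which is the overhead the talk attributes to the Rust bindings.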

26:00.500 --> 26:07.080
So actually that hurts, that hurts us, well, the statistics aren't there, oh yes, there.

26:07.080 --> 26:12.860
3 million polygons per second peak with C right now, 2.4 million with the Rust bindings, doing

26:12.860 --> 26:17.460
literally the exact same thing, just because of that function, that function call overhead.
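
NOTE
A rough back-of-the-envelope sketch of what those two figures imply, assuming the Dreamcast's 200 MHz SH4 clock (not stated in the talk) and hypothetical function names. It yields extra cycles per polygon; since each polygon is several vertex commits, the cost per commit would be smaller.
```c
/* Seconds spent per polygon at a given throughput (polygons per second). */
double seconds_per_polygon(double polys_per_sec) {
    return 1.0 / polys_per_sec;
}
/* Extra CPU cycles per polygon implied by a drop from fast_pps to slow_pps
 * on a CPU running at clock_hz. */
double extra_cycles_per_polygon(double fast_pps, double slow_pps, double clock_hz) {
    double extra_seconds = seconds_per_polygon(slow_pps) - seconds_per_polygon(fast_pps);
    return extra_seconds * clock_hz;
}
```
Calling extra_cycles_per_polygon(3.0e6, 2.4e6, 200.0e6) gives roughly 16.7 cycles per polygon, which is in the range a handful of calls and returns could plausibly cost.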

26:17.460 --> 26:26.940
So that's something that gccrs would definitely help with, so looking forward to that.

26:26.940 --> 26:29.220
That was kind of the survey of what we've had going on.

26:29.220 --> 26:34.460
I want to thank everyone who's contributed to GCC, who's allowed all this to happen,

26:34.460 --> 26:41.820
because it's from everyone's work that we can even have nice things on the Dreamcast still.

26:41.820 --> 26:44.780
There's a link for how you get started with Dreamcast development, and I want to mention

26:44.780 --> 26:49.060
that we're not the only community that does crazy stuff with GCC.

26:49.060 --> 26:56.620
Notice, PlayStation Portable, Sega Saturn, which is SH2, they're our CPU cousins, libdragon and

26:56.620 --> 26:57.620
Nintendo 64.

26:57.620 --> 27:03.020
They're all on the latest GCC 14, so you guys are powering a lot of homebrew, a lot of

27:03.020 --> 27:09.020
homebrew consoles, and we definitely all appreciate that, and we're language enthusiasts,

27:09.020 --> 27:14.940
so we definitely use a lot of what GCC has to offer.

27:14.940 --> 27:18.100
That was it for the presentation, are there any questions?

27:18.100 --> 27:19.100
Yeah.

27:19.100 --> 27:20.100
Thanks.

27:20.100 --> 27:29.100
Thanks.

27:29.100 --> 27:37.460
That is a really good, okay, two things: Obbe, who was the lead engineer of the GTA 3 engine,

27:37.460 --> 27:41.580
is one of the coolest people I've ever met, he supported us on Twitter and tweeted out

27:41.580 --> 27:45.940
that he was impressed, and he always thought it would work on the Dreamcast, and no, we have not

27:45.940 --> 27:46.940
been shut down.

27:46.940 --> 27:53.180
Every day we look for the cease-and-desist letter in the mail, and it hasn't come yet.

27:53.180 --> 27:57.300
Yeah, yeah, definitely.

27:57.300 --> 28:01.060
If you ever see this, thank you very much.

28:01.060 --> 28:02.060
Yes.

28:03.060 --> 28:05.060
Yes, I'm from Huntsville, Alabama.

28:05.060 --> 28:10.380
And you show in the picture of the Dreamcast, the logo is a group, it's not a group, it's

28:10.380 --> 28:11.380
bread.

28:11.380 --> 28:12.380
Yes.

28:12.380 --> 28:14.380
But in the States, the Dreamcast is a group of people.

28:14.380 --> 28:20.460
No, in the States, it's orange, actually, yeah, it's not the same.

28:20.460 --> 28:25.940
I have like 10 different Dreamcasts, yeah, that must have been, yeah, I have a little bit

28:25.940 --> 28:29.780
of a problem, I have a lot of them in my office, I didn't even notice that, that's

28:29.780 --> 28:35.500
really good, that's really perceptive, yeah, that must have been, that must have been

28:35.500 --> 28:36.500
exactly what it is.

28:36.500 --> 28:39.500
Yeah, no, that's a good question.

28:39.500 --> 28:40.500
Any other questions?

28:40.500 --> 28:41.500
Yeah.

28:41.500 --> 28:45.940
How do you port games to it, like, do you have the source?

28:45.940 --> 28:46.940
Yes.

28:46.940 --> 28:47.940
Where do you get it from?

28:47.940 --> 28:51.940
Yes, so the question is, how do you port games to it, do you have the source code

28:51.940 --> 28:52.940
to games?

28:52.940 --> 28:56.780
Yes, we have to have something to go off of, but fortunately, just like there's a community

28:56.780 --> 29:01.820
of us doing Dreamcast stuff, for every big game, really big games, there seems to be

29:01.820 --> 29:07.140
communities that emerge, that reverse engineer them, that decompile them, and then allow

29:07.140 --> 29:09.980
you to port them using our APIs and our SDKs.

29:09.980 --> 29:14.620
So these were mostly based on decompilations and reverse-engineering projects.

29:14.620 --> 29:18.380
And there's more appearing constantly for other games.

29:18.380 --> 29:20.140
So that's pretty nice.

29:20.140 --> 29:21.140
Interesting.

29:21.140 --> 29:22.140
Oh, yes.

29:22.140 --> 29:26.140
I can just add that the reason why the European ones have a blue logo is that

29:26.140 --> 29:30.140
there was a trademark dispute with an Italian company called T-Body, T-Body

29:30.140 --> 29:33.940
or something like that, and they had an orange swirl, so they changed the swirl to

29:33.940 --> 29:34.940
blue then.

29:34.940 --> 29:38.980
I didn't even know the full story, nice.

29:38.980 --> 29:41.980
Anyone else?

29:41.980 --> 29:44.700
Okay, well, thank you very much.

