WEBVTT

00:00.000 --> 00:11.000
Hello, my name is Pedro, I'm a QA engineer at ClickHouse.

00:11.000 --> 00:19.000
I want to talk to you about a topic that will probably make us think for a while.

00:19.000 --> 00:24.000
It's something that might not be easy, which is fuzzing databases.

00:25.000 --> 00:32.000
I joined ClickHouse recently, but it has already been quite a ride, and many things have already happened.

00:32.000 --> 00:37.000
I just want to ask first if any of you know the ClickHouse project.

00:37.000 --> 00:39.000
Okay, a few, okay, thanks.

00:39.000 --> 00:45.000
For those of you that don't know, ClickHouse is an open source database written in C++.

00:45.000 --> 00:51.000
It's known to be very fast for OLAP queries and aggregations.

00:51.000 --> 00:57.000
It can scale very well, from one node to many, to hundreds.

00:57.000 --> 01:05.000
And we have customers running, oh sorry, running with petabytes of data.

01:05.000 --> 01:12.000
But what I want to focus on here today is testing databases.

01:12.000 --> 01:17.000
Because it's not as easy as you might think.

01:17.000 --> 01:21.000
First, I want to just talk about fuzzing for a bit.

01:21.000 --> 01:25.000
I've been doing mostly fuzzing in ClickHouse for these past few months.

01:25.000 --> 01:29.000
And it has been a research topic, as you know, for all these years,

01:29.000 --> 01:34.000
in many fields in computer science, in security, but also databases.

01:34.000 --> 01:36.000
And at ClickHouse,

01:36.000 --> 01:40.000
we have also been, of course, fuzzing a lot over the years,

01:40.000 --> 01:46.000
with many fuzzers, with all kinds of strategies to find all kinds of issues in ClickHouse.

01:47.000 --> 01:49.000
There are a few here.

01:49.000 --> 01:56.000
A few of them you probably already know, like SQLancer, which was designed to find wrong results.

01:56.000 --> 02:04.000
There are also, like, AFL and libFuzzer, which are known to do coverage testing, to try to find new paths and find issues with that.

02:04.000 --> 02:10.000
And then there's the AST fuzzer that we developed at ClickHouse ourselves.

02:10.000 --> 02:14.000
And there's this new one, BuzzHouse, for ClickHouse,

02:14.000 --> 02:19.000
which in the past few months was able to find lots of new issues.

02:19.000 --> 02:23.000
And all these fuzzers, they are not perfect.

02:23.000 --> 02:27.000
They can find some issues very easily, others not that easily.

02:27.000 --> 02:35.000
And this is the point of this presentation: there's no perfect fuzzer to find all the issues in a database.

02:35.000 --> 02:40.000
But why? Well, databases are very complex.

02:40.000 --> 02:45.000
They have to do lots of things, from query optimization to storage.

02:45.000 --> 02:49.000
They have to guarantee that the data is always there.

02:49.000 --> 02:55.000
So even if a crash happens and it restarts, everything must be there, running as smoothly as possible.

02:55.000 --> 03:00.000
And there are many other things in databases.

03:00.000 --> 03:03.000
They are known to have many interface languages.

03:03.000 --> 03:08.000
But SQL, which is the one ClickHouse uses, is the best-known one.

03:08.000 --> 03:13.000
And it's also quite extensive and it has some complexity in there.

03:13.000 --> 03:16.000
And there are other things, if you run a fuzzer.

03:16.000 --> 03:22.000
We have to make sure that the state is kept between queries,

03:22.000 --> 03:27.000
because the database has a catalog with tables, columns and data.

03:27.000 --> 03:35.000
And you must know that, if you are going to find something, you want to make sure that you are always targeting something

03:35.000 --> 03:38.000
and testing new code paths there.

03:38.000 --> 03:41.000
And what makes it more difficult: databases

03:41.000 --> 03:46.000
have to run for a very long time, like days or a year, without interruption.

03:46.000 --> 03:49.000
And they also have to keep their performance through time.

03:49.000 --> 03:56.000
And it's difficult to detect these things in a more, I'd say, automated way through a fuzzer.

03:56.000 --> 04:03.000
So what can we do to test a database with a fuzzer?

04:03.000 --> 04:06.000
Well, you can think about many things.

04:06.000 --> 04:09.000
You can start to think about, okay, we have SQL.

04:09.000 --> 04:14.000
We can try to generate SQL queries, but there are many ways you can do this.

04:14.000 --> 04:19.000
And also, we must make sure that we are fuzzing something that exists there in the database,

04:19.000 --> 04:23.000
like a catalog with tables and columns, et cetera, many types.

04:24.000 --> 04:26.000
There are many things that we have to consider.

04:26.000 --> 04:35.000
And it's difficult to do this in a random way with a fuzzer, because fuzzers work in a random way.

04:35.000 --> 04:41.000
And also, we cannot make it very strict, because if the fuzzer is very strict in what it can generate,

04:41.000 --> 04:44.000
you reduce the amount of input you can target

04:44.000 --> 04:47.000
and the issues you could possibly find.

04:47.000 --> 04:51.000
And also, don't forget that not all the issues are just plain crashes.

04:51.000 --> 04:56.000
We have to find wrong results, performance issues, et cetera.

04:56.000 --> 04:58.000
Those things are more difficult to find.

04:58.000 --> 05:04.000
So there are many fuzzers out there that we also tried in ClickHouse, and that we have running on our CI.

05:04.000 --> 05:09.000
There are AFL and libFuzzer that I mentioned, which do coverage testing.

05:09.000 --> 05:14.000
But I think they are rather slow to generate inputs, and it takes some time,

05:14.000 --> 05:18.000
and we have to keep that in mind.

05:18.000 --> 05:22.000
And that makes it a little slow.

05:22.000 --> 05:28.000
But I just want to show here the AST fuzzer that was developed at ClickHouse years back, before I joined.

05:28.000 --> 05:36.000
What it basically does is take inputs from queries, from clients, or from the test cases we have for ClickHouse,

05:36.000 --> 05:44.000
and mutate the syntax tree to create new combinations and possibly find new issues.

05:44.000 --> 05:50.000
So I have an example here from the AST fuzzer: you create a table there.

05:50.000 --> 05:53.000
You are running a client, and you fire a query.

05:53.000 --> 05:56.000
And it simply starts doing mutations of the query.

05:56.000 --> 06:02.000
You have here a SELECT from the table, and you can see it can start adding a DISTINCT clause.

06:02.000 --> 06:04.000
It could change the WHERE clause.

06:04.000 --> 06:10.000
Or add another clause, or change the type of query to EXPLAIN, something like that.

06:10.000 --> 06:12.000
It works by mutating.
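
NOTE
Editor's sketch, not ClickHouse's AST fuzzer: a toy illustration of mutating
a query tree as described above (toggle DISTINCT, change WHERE, switch to
EXPLAIN). The dict-based tree and the names mutate and render are
hypothetical and exist only for this example.
```python
import random
# Hypothetical, simplified query tree: a dict with optional parts.
def mutate(query, rng):
    """Return a copy of `query` with one random mutation applied."""
    q = dict(query)
    choice = rng.choice(["distinct", "where", "explain"])
    if choice == "distinct":
        q["distinct"] = not q.get("distinct", False)
    elif choice == "where":
        q["where"] = rng.choice([None, "a > 0", "a IS NULL", "NOT (a = 1)"])
    else:
        q["explain"] = not q.get("explain", False)
    return q
def render(q):
    """Turn the tree back into SQL text."""
    sql = "SELECT " + ("DISTINCT " if q.get("distinct") else "") + ", ".join(q["columns"])
    sql += " FROM " + q["table"]
    if q.get("where"):
        sql += " WHERE " + q["where"]
    return ("EXPLAIN " if q.get("explain") else "") + sql
seed_query = {"columns": ["a", "b"], "table": "t"}
rng = random.Random(0)
current = seed_query
for _ in range(5):
    # Each step mutates the previous result, so changes accumulate.
    current = mutate(current, rng)
    print(render(current))
```
Every mutation starts from a query someone supplied, which is the limitation
the talk goes on to describe.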

06:12.000 --> 06:18.000
But this is rather slow, for one fundamental reason: we have to give the input to the fuzzer first.

06:18.000 --> 06:25.000
And then it starts doing its permutations, and tries to find new issues.

06:25.000 --> 06:32.000
It doesn't try things by itself, randomly, and this is also kind of slow because it will always start from that input

06:32.000 --> 06:34.000
and start mutating it.

06:34.000 --> 06:41.000
And you are not sure if this input is the right query to find the next bug in ClickHouse.

06:41.000 --> 06:45.000
So, let's then go back a little.

06:45.000 --> 06:47.000
And let's try to design a fuzzer ourselves.

06:47.000 --> 06:50.000
Well, because we are engineers and we do these things.

06:50.000 --> 06:57.000
And I have an example query here from TPC-H, query 5, which probably all of you know.

06:57.000 --> 06:59.000
Just to give some notes here.

06:59.000 --> 07:03.000
We have a SELECT with a FROM with many tables.

07:03.000 --> 07:08.000
Then you have here a WHERE clause joining the tables.

07:08.000 --> 07:11.000
Plus, we have some filtering clauses on dates.

07:11.000 --> 07:13.000
And then you have a GROUP BY with aggregates.

07:13.000 --> 07:17.000
So we must follow the grouping rules, and then an ORDER BY at the end.

07:17.000 --> 07:20.000
You can see already that there's a lot to consider here.

07:20.000 --> 07:24.000
If you go completely random, you can get a lot of errors,

07:24.000 --> 07:27.000
like if you miss these grammar rules.

07:27.000 --> 07:31.000
And many of these fuzzers still fail on this.

07:31.000 --> 07:35.000
So, let's then design one.

07:35.000 --> 07:37.000
Let's then think about something.

07:37.000 --> 07:40.000
I have a simple example here.

07:40.000 --> 07:43.000
Okay, the most basic one here.

07:43.000 --> 07:45.000
You have probabilities here.

07:45.000 --> 07:51.000
With these probabilities, you can either create a table, insert into it, delete, alter or drop,

07:51.000 --> 07:53.000
or run a query most of the time,

07:53.000 --> 07:55.000
which is probably what you want.
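
NOTE
Editor's sketch of the naive design just described: pick the next statement
kind by fixed probabilities. Only the 5% for DROP comes from the talk; the
other weights and the name next_statement are assumptions made for
illustration.
```python
import random
# Hypothetical weights (percent); DROP at 5% as in the talk, the rest are illustrative.
WEIGHTS = {"CREATE TABLE": 5, "INSERT": 15, "DELETE": 5, "ALTER": 5, "DROP": 5, "SELECT": 65}
def next_statement(rng):
    """Pick the next statement kind according to WEIGHTS."""
    kinds = list(WEIGHTS)
    return rng.choices(kinds, weights=[WEIGHTS[k] for k in kinds], k=1)[0]
rng = random.Random(42)
counts = {k: 0 for k in WEIGHTS}
for _ in range(10_000):
    counts[next_statement(rng)] += 1
print(counts)
```
The counts show how often each kind is chosen, but nothing here knows whether
the catalog still has tables, which is where the problems start.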

07:55.000 --> 07:58.000
This one's nice to start with.

07:58.000 --> 08:01.000
But there's already a few issues here.

08:01.000 --> 08:07.000
You can probably already see them, if someone wants to try to give

08:07.000 --> 08:10.000
or make a comment on this.

08:10.000 --> 08:12.000
I don't know.

08:12.000 --> 08:14.000
If not, I can already tell.

08:14.000 --> 08:16.000
There are issues here.

08:16.000 --> 08:21.000
First, we have a probability of generating a DROP statement of five percent.

08:21.000 --> 08:24.000
So, about every 20 statements, a DROP statement will be generated.

08:24.000 --> 08:29.000
Which means a table will probably not last more than 20 statements,

08:29.000 --> 08:31.000
which is bad.
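
NOTE
Editor's worked arithmetic for the 5% figure, assuming each statement is an
independent draw and every DROP hits the one table of interest (a
simplification; with several tables the drops are spread across them).
```python
P_DROP = 0.05  # from the talk: 5% of generated statements are DROPs
def expected_statements_until_drop(p):
    """Mean of a geometric distribution: 1 / p."""
    return 1.0 / p
def survival_probability(p, n):
    """Chance that none of the next n statements is a DROP."""
    return (1.0 - p) ** n
print(expected_statements_until_drop(P_DROP))
print(round(survival_probability(P_DROP, 20), 3))
```
Under these assumptions a DROP arrives every 20 statements on average, and a
table has only about a one in three chance of surviving 20 statements.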

08:31.000 --> 08:35.000
And also, you have to consider the complexity of a CREATE TABLE.

08:35.000 --> 08:40.000
That can be more complex than a DROP, because it comes with many types,

08:40.000 --> 08:43.000
and it can have expressions, for constraints and everything.

08:43.000 --> 08:47.000
It can be more complex to handle.

08:47.000 --> 08:51.000
So there's a chance that there will be many successful drops

08:51.000 --> 08:54.000
and few successful creates instead.

08:54.000 --> 08:57.000
And we'd probably end up with no tables in the catalog, which is bad.

08:57.000 --> 08:59.000
And that doesn't happen in real life.

08:59.000 --> 09:05.000
And also, the reads could, many of the reads could be recomputedly.

09:05.000 --> 09:09.000
And the inserts could be complex, because the inserts must follow the rules of the table.

09:09.000 --> 09:13.000
And you can end up with empty tables most of the time,

09:13.000 --> 09:16.000
which also doesn't happen a lot in real life.

09:16.000 --> 09:18.000
So there are already some issues with this.

09:18.000 --> 09:21.000
And we have to think of a way to avoid these issues,

09:21.000 --> 09:24.000
and try to have a balance between being completely random

09:24.000 --> 09:27.000
and being more like real life.

09:28.000 --> 09:33.000
So, I think there was something missing on our CI, at ClickHouse also.

09:33.000 --> 09:39.000
I came up with BuzzHouse to try to fix this gap here.

09:39.000 --> 09:43.000
And this is what it is for now.

09:43.000 --> 09:48.000
25, 4,000 lines of code, which is probably a bit much for a fuzzer.

09:48.000 --> 09:55.000
Lots of issues found, more than 100, and lots of people busy fixing them.

09:56.000 --> 09:59.000
Probably, it's already a good start.

09:59.000 --> 10:01.000
And we have been running this for a few months.

10:01.000 --> 10:06.000
And we were able to find many issues in ClickHouse.

10:06.000 --> 10:10.000
So, what does it do exactly?

10:10.000 --> 10:14.000
I could try to show the demo here, but probably there's no time to do it.

10:14.000 --> 10:16.000
So, I'm going to stick with the presentation.

10:16.000 --> 10:20.000
You can find this in a blog post that I wrote about this.

10:20.000 --> 10:24.000
So, BuzzHouse in action just starts by creating some tables,

10:25.000 --> 10:29.000
a random number of them, to try to be more random,

10:29.000 --> 10:32.000
and try to cover as much as possible.

10:32.000 --> 10:37.000
And then it starts generating queries, to try to find issues.

10:37.000 --> 10:43.000
And it makes sure there's always a number of tables in the catalog.

10:43.000 --> 10:46.000
You want to hold back on dropping them.

10:46.000 --> 10:52.000
Make sure some stay for very long, so that we have something to test on.
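
NOTE
Editor's sketch of one way to keep tables in the catalog: only offer DROP
when the catalog is above a floor. MIN_TABLES, MAX_TABLES and both function
names are hypothetical; the talk does not say how BuzzHouse implements this.
```python
import random
MIN_TABLES = 3   # hypothetical floor, not a BuzzHouse setting
MAX_TABLES = 10  # hypothetical ceiling
def allowed_actions(num_tables):
    """Actions the generator may pick, given the current catalog size."""
    actions = []
    if num_tables < MAX_TABLES:
        actions.append("create")
    if num_tables > MIN_TABLES:
        actions.append("drop")
    if num_tables > 0:
        actions += ["insert", "select"]
    return actions
def simulate(steps, seed):
    """Run a toy session and return the catalog size after each step."""
    rng = random.Random(seed)
    tables, history = 0, []
    for _ in range(steps):
        action = rng.choice(allowed_actions(tables))
        if action == "create":
            tables += 1
        elif action == "drop":
            tables -= 1
        history.append(tables)
    return history
history = simulate(1000, seed=7)
print(min(history), max(history), history[-1])
```
Once the catalog reaches the floor it never falls below it again, so later
queries always have something to run against.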

10:52.000 --> 10:55.000
Also, set a limit on the query size.

10:55.000 --> 10:57.000
You don't want to run enormous queries,

10:57.000 --> 10:59.000
because the combinations are many.

10:59.000 --> 11:03.000
There's a big chance that the query won't do anything useful.
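
NOTE
Editor's sketch of a size limit on generated queries: a recursive expression
generator that is forced to emit a leaf once a depth bound is reached.
MAX_DEPTH, the 0.3 leaf probability and the column names are assumptions for
illustration only.
```python
import random
MAX_DEPTH = 3  # hypothetical limit
def gen_expr(rng, depth=0):
    """Generate a random arithmetic expression, forcing leaves at MAX_DEPTH."""
    if depth >= MAX_DEPTH or rng.random() < 0.3:
        return rng.choice(["c0", "c1", "1", "42"])
    op = rng.choice(["+", "-", "*"])
    return f"({gen_expr(rng, depth + 1)} {op} {gen_expr(rng, depth + 1)})"
def nesting(expr):
    """Maximum parenthesis nesting of an expression string."""
    level = best = 0
    for ch in expr:
        if ch == "(":
            level += 1
            best = max(best, level)
        elif ch == ")":
            level -= 1
    return best
rng = random.Random(5)
print(gen_expr(rng))
```
Without the depth check the recursion could, in principle, keep branching
and produce queries too large to be useful.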

11:03.000 --> 11:09.000
And try to have better inserts, to make sure that the tables have something.

11:09.000 --> 11:13.000
And also, dump the session seed,

11:13.000 --> 11:17.000
if you want to replay it, because some of the issues are not deterministic

11:17.000 --> 11:20.000
and they are more difficult to reproduce

11:20.000 --> 11:26.000
if you try manually, so sometimes we just run the session again.
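
NOTE
Editor's sketch of seed-based replay: if every random choice comes from one
generator seeded with a logged value, the same statement stream can be
regenerated later. generate_session is a hypothetical stand-in for a real
query generator.
```python
import random
def generate_session(seed, length=5):
    """Generate a toy list of statements that depends only on `seed`."""
    rng = random.Random(seed)
    kinds = ["SELECT", "INSERT", "ALTER"]
    return [f"{rng.choice(kinds)} /* id={rng.randrange(1000)} */" for _ in range(length)]
seed = random.SystemRandom().randrange(2**32)
print("session seed:", seed)  # log this so the run can be replayed
first = generate_session(seed)
replay = generate_session(seed)
print(first == replay)
```
Replaying the seed reproduces the statements, not the server's timing, so a
non-deterministic bug may still need several reruns, as the talk notes.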

11:26.000 --> 11:29.000
So, what does BuzzHouse generate?

11:29.000 --> 11:32.000
Well, this is a sample of a larger query.

11:32.000 --> 11:35.000
It's still not as coherent as TPC-H query five.

11:35.000 --> 11:39.000
Okay, I have a few dots here, because I also test the new JSON type

11:39.000 --> 11:41.000
that we have at ClickHouse.

11:41.000 --> 11:45.000
And so we can have more levels than table and table column,

11:45.000 --> 11:47.000
and then a subcolumn in JSON.

11:47.000 --> 11:49.000
So it makes it a bit more complex.

11:49.000 --> 11:51.000
These are other queries.

11:51.000 --> 11:53.000
Sometimes it can also produce some small ones.

11:53.000 --> 11:56.000
But you can spot here some more careful patterns,

11:56.000 --> 12:00.000
like we try to make sure all the columns used in the query

12:00.000 --> 12:04.000
are in the FROM clause, so that it's as correct as possible.

12:04.000 --> 12:06.000
And sometimes also for something like the GROUP BY clause,

12:06.000 --> 12:09.000
to make sure we have correct semantics.
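
NOTE
Editor's sketch of scope-aware generation: column references are drawn only
from the tables already chosen for the FROM clause. CATALOG and
generate_select are hypothetical and far simpler than a real generator.
```python
import random
# Hypothetical catalog: table name mapped to its columns.
CATALOG = {"t0": ["c0", "c1"], "t1": ["c0", "c2", "c3"], "t2": ["c4"]}
def generate_select(rng):
    """Build a SELECT whose column references all come from the chosen FROM tables."""
    tables = rng.sample(sorted(CATALOG), k=rng.randint(1, 2))
    in_scope = [f"{t}.{c}" for t in tables for c in CATALOG[t]]
    columns = rng.sample(in_scope, k=rng.randint(1, len(in_scope)))
    return tables, columns, "SELECT " + ", ".join(columns) + " FROM " + ", ".join(tables)
rng = random.Random(3)
for _ in range(3):
    print(generate_select(rng)[2])
```
Picking the FROM tables first and the columns second is what keeps every
reference resolvable.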

12:09.000 --> 12:14.000
So we are able to find more issues.

12:14.000 --> 12:15.000
What has it found?

12:15.000 --> 12:18.000
Yeah, as I said, what kind of issues?

12:18.000 --> 12:23.000
So, issues: crashes, often, like segmentation faults,

12:23.000 --> 12:27.000
and other things like that, which are probably the easiest to find.

12:27.000 --> 12:31.000
There are also the logical errors, which are kind of the assertions

12:31.000 --> 12:33.000
in ClickHouse, basically an assertion,

12:33.000 --> 12:35.000
like in a query plan or something else.

12:35.000 --> 12:38.000
You assert something, and then it fails.

12:38.000 --> 12:40.000
A few wrong results.

12:40.000 --> 12:43.000
There are a few ways, which I'm going to talk about in a minute,

12:43.000 --> 12:45.000
to detect them.

12:45.000 --> 12:48.000
There were a few OOM kills,

12:48.000 --> 12:51.000
and there are some issues.

12:51.000 --> 12:54.000
They're kind of corner cases about memory management,

12:54.000 --> 12:56.000
but that's not a big issue.

12:56.000 --> 12:59.000
And of course, the queries that get stuck forever,

12:59.000 --> 13:01.000
never end, maybe

13:01.000 --> 13:05.000
because there's a deadlock or a bad loop that never ends.

13:05.000 --> 13:10.000
They're not that common, but it also happens.

13:10.000 --> 13:14.000
So what do I do to find wrong results?

13:14.000 --> 13:18.000
There are many ways, many ways, a few, I mean, to find them.

13:18.000 --> 13:21.000
Some are probably better known than others.

13:21.000 --> 13:23.000
There's something as simple as dumping a table,

13:23.000 --> 13:27.000
reading it back again, and comparing the contents.
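
NOTE
Editor's sketch of the dump-and-reload check, using Python's csv module as a
stand-in for exporting a table in some format and importing it again. The
function names are hypothetical; a real run would go through the database's
own export and import paths.
```python
import csv
import io
def dump_csv(rows):
    """Serialize rows (lists of strings) to CSV text."""
    buf = io.StringIO()
    csv.writer(buf).writerows(rows)
    return buf.getvalue()
def load_csv(text):
    """Parse CSV text back into rows."""
    return [row for row in csv.reader(io.StringIO(text, newline=""))]
def round_trip_ok(rows):
    """Oracle: what we read back must equal what we dumped."""
    return load_csv(dump_csv(rows)) == rows
rows = [["1", "plain"], ["2", "with,comma"], ["3", 'with "quotes"'], ["4", "multi\nline"]]
print(round_trip_ok(rows))
```
Values with separators, quotes and newlines are included on purpose, since
those are the cases where a format writer and reader tend to disagree.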

13:27.000 --> 13:30.000
It sounds a bit too simple,

13:30.000 --> 13:33.000
but I was able to find a lot of issues,

13:33.000 --> 13:35.000
mostly in the data formats,

13:35.000 --> 13:38.000
because ClickHouse supports lots of data formats,

13:38.000 --> 13:43.000
like Parquet, CSV, Arrow, and all these formats.

13:44.000 --> 13:48.000
And another thing I can run is a query oracle;

13:48.000 --> 13:51.000
this was covered in a talk a few hours ago.

13:51.000 --> 13:54.000
We can do something like, for example,

13:54.000 --> 13:57.000
a SELECT count from a query with a predicate,

13:57.000 --> 13:58.000
and compare

13:58.000 --> 14:01.000
the number of rows returned by that query

14:01.000 --> 14:04.000
with a SELECT sum over that predicate,

14:04.000 --> 14:07.000
and compare, then, the number of rows of the first query

14:07.000 --> 14:10.000
with the result of the sum of the second one.

14:10.000 --> 14:13.000
That was also able to find a few issues.
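
NOTE
Editor's sketch of the count-versus-sum oracle, run against SQLite from the
Python standard library as a stand-in database. The helper count_vs_sum and
the table are made up for illustration; NULL predicate results are counted
as non-matching on both sides.
```python
import sqlite3
def count_vs_sum(conn, table, predicate):
    """Return (rows matched via WHERE, rows counted via SUM of the predicate)."""
    matched = conn.execute(f"SELECT count(*) FROM {table} WHERE {predicate}").fetchone()[0]
    summed = conn.execute(
        f"SELECT coalesce(sum(CASE WHEN {predicate} THEN 1 ELSE 0 END), 0) FROM {table}"
    ).fetchone()[0]
    return matched, summed
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, b TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [(1, "x"), (2, "y"), (None, "z"), (5, None)])
for predicate in ["a > 1", "b IS NULL", "a + 1 = 2 OR b = 'z'"]:
    matched, summed = count_vs_sum(conn, "t", predicate)
    print(predicate, matched, summed, "OK" if matched == summed else "MISMATCH")
```
The two queries exercise different execution paths (filtering versus
evaluating the predicate on every row), so a mismatch points at a bug in one
of them.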

14:13.000 --> 14:15.000
There are more things,

14:15.000 --> 14:17.000
even simpler, like running a query with

14:17.000 --> 14:20.000
different settings, like

14:20.000 --> 14:22.000
parallelism being enabled or not,

14:22.000 --> 14:25.000
enabling or disabling external sorting,

14:25.000 --> 14:26.000
things like that,

14:26.000 --> 14:28.000
which were also able to find a few issues.

14:28.000 --> 14:30.000
And then there's this one,

14:30.000 --> 14:32.000
that probably not many people

14:32.000 --> 14:34.000
think about:

14:34.000 --> 14:37.000
comparing the results with another database.

14:37.000 --> 14:40.000
Comparing with other ClickHouse versions

14:40.000 --> 14:42.000
is probably the easier way,

14:42.000 --> 14:44.000
because they have exactly the same SQL dialect,

14:44.000 --> 14:47.000
but if you want to start working with other databases,

14:47.000 --> 14:50.000
like MySQL or PostgreSQL,

14:50.000 --> 14:52.000
it starts to become a bit more difficult,

14:52.000 --> 14:55.000
because the SQL language, as you know,

14:55.000 --> 14:59.000
is not very well defined in some ways,

14:59.000 --> 15:03.000
and some results may not be the same between the databases,

15:03.000 --> 15:04.000
and that's fine.

15:04.000 --> 15:07.000
It's just by design.

15:07.000 --> 15:11.000
But how's this?

15:11.000 --> 15:12.000
Okay, nice.

15:12.000 --> 15:14.000
You have found a solution for ClickHouse,

15:14.000 --> 15:15.000
that's great.

15:15.000 --> 15:17.000
But yeah, it also has issues,

15:17.000 --> 15:18.000
like all the other fuzzers.

15:18.000 --> 15:20.000
And the first thing,

15:20.000 --> 15:21.000
probably, as you can see from that,

15:21.000 --> 15:22.000
as was clear,

15:22.000 --> 15:26.000
is that the combinations start to add up a lot,

15:26.000 --> 15:29.000
with all these lots of things.

15:29.000 --> 15:31.000
Well, ClickHouse is, like,

15:31.000 --> 15:33.000
quite a large project,

15:33.000 --> 15:35.000
which supports many things,

15:35.000 --> 15:38.000
like supporting many types,

15:38.000 --> 15:41.000
from integers, to strings,

15:41.000 --> 15:43.000
to nested types, like arrays

15:43.000 --> 15:45.000
and key-value maps.

15:45.000 --> 15:48.000
There is also now the new JSON type.

15:48.000 --> 15:50.000
Oops, sorry for this.

15:50.000 --> 15:52.000
This is, I...

15:52.000 --> 15:54.000
Okay, sorry.

15:54.000 --> 15:57.000
And one aspect of this.

15:57.000 --> 15:59.000
This is the new JSON type,

15:59.000 --> 16:01.000
which is, like, the next talk.

16:01.000 --> 16:03.000
We have a talk in another room about it.

16:03.000 --> 16:05.000
And yes, we have many things,

16:05.000 --> 16:08.000
like, many functions and settings,

16:08.000 --> 16:10.000
more than 1,000 of each,

16:10.000 --> 16:11.000
easily.

16:11.000 --> 16:12.000
And that really stacks up.

16:12.000 --> 16:13.000
And we also have many,

16:13.000 --> 16:16.000
many, many table engines;

16:16.000 --> 16:17.000
basically, engines

16:17.000 --> 16:19.000
define how the table behaves.

16:19.000 --> 16:21.000
The most common one,

16:21.000 --> 16:23.000
basically, is like a table that

16:23.000 --> 16:26.000
merges huge parts over time.

16:26.000 --> 16:27.000
So basically,

16:27.000 --> 16:29.000
we insert lots of

16:29.000 --> 16:31.000
huge chunks of data into tables

16:31.000 --> 16:33.000
and merge it over time.

16:33.000 --> 16:35.000
That's the MergeTree, in a basic sense.

16:35.000 --> 16:37.000
And there are some variations

16:37.000 --> 16:39.000
like SummingMergeTree,

16:39.000 --> 16:41.000
that do, like, some aggregation combinations on it,

16:41.000 --> 16:42.000
but that's for later.

16:42.000 --> 16:43.000
And there are also more things,

16:43.000 --> 16:45.000
even, like, reading from S3,

16:45.000 --> 16:47.000
and even external tables,

16:47.000 --> 16:49.000
like MySQL tables.

16:49.000 --> 16:51.000
So you have a lot of things.

16:51.000 --> 16:52.000
And, furthermore,

16:52.000 --> 16:53.000
there are many other settings

16:53.000 --> 16:55.000
that you can tune in tables.

16:55.000 --> 16:57.000
And you can also think about

16:57.000 --> 17:00.000
ClickHouse in a multi-node setup.

17:00.000 --> 17:02.000
Like, it also supports replication and these things.

17:02.000 --> 17:04.000
It becomes more complicated

17:04.000 --> 17:06.000
if you want to fuzz these things.

17:06.000 --> 17:08.000
And,

17:08.000 --> 17:10.000
BuzzHouse,

17:10.000 --> 17:11.000
then,

17:11.000 --> 17:13.000
gets some issues from this.

17:13.000 --> 17:16.000
With these combinations, still,

17:16.000 --> 17:18.000
many of the queries fail.

17:18.000 --> 17:21.000
I'd say a large share of them, easily.

17:21.000 --> 17:23.000
The biggest issue here,

17:23.000 --> 17:26.000
is these type combinations that are not checked,

17:26.000 --> 17:28.000
like comparing a string with an integer,

17:28.000 --> 17:31.000
or with a tuple, or something like that.

17:31.000 --> 17:33.000
It's, yeah, it's an error.

17:33.000 --> 17:35.000
You cannot do that, obviously.

17:35.000 --> 17:36.000
Yeah, I could add checks for this,

17:36.000 --> 17:38.000
but it becomes more complex

17:38.000 --> 17:40.000
even to handle.

17:40.000 --> 17:42.000
And you can see the code base is already

17:42.000 --> 17:44.000
at quite some lines.

17:44.000 --> 17:46.000
Also, if I want to test something like

17:46.000 --> 17:48.000
performance, it's also difficult.

17:48.000 --> 17:50.000
You have to have big tables,

17:50.000 --> 17:51.000
to make sure, like,

17:51.000 --> 17:54.000
you're using a lot of memory, to make sure that

17:54.000 --> 17:56.000
you can find these performance issues.

17:56.000 --> 17:58.000
But it becomes difficult,

17:58.000 --> 18:00.000
because some queries can be completely random,

18:00.000 --> 18:01.000
like cross products,

18:01.000 --> 18:03.000
and then, yeah, it's obviously slow.

18:03.000 --> 18:05.000
And there's nothing you can do about it.

18:05.000 --> 18:07.000
And then,

18:07.000 --> 18:09.000
there could be some false positives

18:09.000 --> 18:10.000
for oracles.

18:10.000 --> 18:12.000
For example,

18:12.000 --> 18:14.000
an oracle might

18:14.000 --> 18:16.000
move some part of a query,

18:16.000 --> 18:19.000
and the new query could trigger a runtime error,

18:19.000 --> 18:21.000
but the original, with the part not moved, does not,

18:21.000 --> 18:22.000
and, of course,

18:22.000 --> 18:24.000
they give different results,

18:24.000 --> 18:26.000
an error versus a successful result,

18:26.000 --> 18:28.000
but yeah, that is expected.

18:28.000 --> 18:30.000
And there are other things:

18:30.000 --> 18:32.000
so far, I don't change the probabilities,

18:32.000 --> 18:33.000
like, of the actions,

18:33.000 --> 18:36.000
so maybe you could change these at run time.

18:36.000 --> 18:38.000
And also,

18:38.000 --> 18:42.000
I've hard-coded the grammar of the queries,

18:42.000 --> 18:44.000
so I cannot generate incorrect things,

18:44.000 --> 18:47.000
or those outdated queries,

18:47.000 --> 18:49.000
that some of these more,

18:49.000 --> 18:51.000
some of the

18:51.000 --> 18:52.000
other fuzzers do,

18:52.000 --> 18:54.000
like the AST fuzzer or something.

18:54.000 --> 18:55.000
But,

18:55.000 --> 18:57.000
it is already quite good.

18:57.000 --> 18:59.000
It already found many issues

18:59.000 --> 19:00.000
in ClickHouse.

19:00.000 --> 19:02.000
And as I said,

19:02.000 --> 19:04.000
this is more like a complement to the others.

19:04.000 --> 19:05.000
And,

19:05.000 --> 19:06.000
beyond these,

19:06.000 --> 19:08.000
there are even more questions

19:08.000 --> 19:10.000
that you can think about for fuzzers.

19:10.000 --> 19:12.000
What about running clients in parallel,

19:12.000 --> 19:13.000
like,

19:13.000 --> 19:16.000
you have hundreds of clients running at the same time?

19:16.000 --> 19:18.000
How are they going to synchronize?

19:18.000 --> 19:20.000
Like, what are they going to do?

19:20.000 --> 19:21.000
You have to

19:21.000 --> 19:23.000
make sure that it doesn't slow down the fuzzer,

19:23.000 --> 19:26.000
and that there isn't too much contention between them.

19:26.000 --> 19:28.000
What about fuzzing the server side?

19:28.000 --> 19:30.000
Like, restarting the server,

19:30.000 --> 19:31.000
or crashing and restarting it,

19:31.000 --> 19:32.000
is another thing.

19:32.000 --> 19:35.000
And if you have, like, more than one node,

19:35.000 --> 19:38.000
it becomes more complex to think about.

19:38.000 --> 19:39.000
Then there are other things,

19:39.000 --> 19:40.000
like,

19:40.000 --> 19:42.000
size of the tables,

19:42.000 --> 19:43.000
and the queries,

19:43.000 --> 19:44.000
or many columns,

19:44.000 --> 19:47.000
say you have a table with like 1,000 columns.

19:47.000 --> 19:51.000
Some customers have these very wide tables,

19:51.000 --> 19:52.000
with lots of columns.

19:52.000 --> 19:54.000
And then there's, like,

19:54.000 --> 19:55.000
also checking the error messages.

19:55.000 --> 19:57.000
Sometimes they are legitimate,

19:57.000 --> 19:58.000
sometimes not,

19:58.000 --> 20:00.000
but it depends on the case.

20:00.000 --> 20:02.000
It's difficult to track this.

20:02.000 --> 20:06.000
And there's also a thing that you should think about,

20:06.000 --> 20:07.000
like,

20:07.000 --> 20:08.000
for how

20:08.000 --> 20:09.000
long a table should stay in the catalog,

20:09.000 --> 20:12.000
because probably you also want to test another table with another combination

20:12.000 --> 20:15.000
that might be more likely to trigger an issue.

20:15.000 --> 20:17.000
So maybe you want to swap at some point,

20:17.000 --> 20:18.000
but after how long,

20:18.000 --> 20:19.000
you don't know,

20:19.000 --> 20:21.000
and there are other things that we can't find.

20:21.000 --> 20:23.000
So,

20:23.000 --> 20:26.000
so what's the conclusion?

20:26.000 --> 20:27.000
The conclusion is obvious,

20:27.000 --> 20:28.000
like,

20:28.000 --> 20:30.000
you won't be able to find all the issues,

20:30.000 --> 20:31.000
like,

20:31.000 --> 20:32.000
with one fuzzer.

20:32.000 --> 20:35.000
There are still

20:35.000 --> 20:37.000
many things that you can think about there.

20:37.000 --> 20:40.000
And also, fuzzers usually have the issue that,

20:40.000 --> 20:41.000
yeah,

20:41.000 --> 20:42.000
you find some of the issues,

20:42.000 --> 20:43.000
and then after some time,

20:43.000 --> 20:46.000
it stops finding new issues.

20:46.000 --> 20:47.000
So we have to think about them

20:47.000 --> 20:49.000
and change them over time.

20:49.000 --> 20:53.000
I could also add more features to BuzzHouse

20:53.000 --> 20:54.000
to counter this,

20:54.000 --> 20:59.000
but then the code base becomes quite complex to handle.

20:59.000 --> 21:03.000
And I can tell you that fuzzers can also have issues in them,

21:03.000 --> 21:07.000
and debugging fuzzers is also kind of a weird experience,

21:07.000 --> 21:09.000
because you see many queries running there.

21:09.000 --> 21:12.000
And then there's something you expected it to generate,

21:12.000 --> 21:13.000
but it never happens,

21:13.000 --> 21:14.000
and you don't know,

21:14.000 --> 21:16.000
maybe it's because it's very rare to happen,

21:16.000 --> 21:19.000
or there really is an issue and it never generates it.
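
NOTE
Editor's sketch of one way to tell "rare" from "never": count how often each
construct is generated over a long run and list the ones that never appear.
The clause names and probabilities are hypothetical; WINDOW is set to zero
on purpose to mimic a generator bug.
```python
import random
from collections import Counter
# Hypothetical clause probabilities; WINDOW is deliberately unreachable to mimic a generator bug.
CLAUSE_PROBABILITY = {"WHERE": 0.5, "GROUP BY": 0.2, "LIMIT": 0.01, "WINDOW": 0.0}
def generate_clauses(rng):
    """Return the optional clauses picked for one toy query."""
    return [c for c, p in CLAUSE_PROBABILITY.items() if rng.random() < p]
def never_generated(num_queries, seed):
    """Count clause usage over a run and report clauses that never appeared."""
    rng = random.Random(seed)
    seen = Counter()
    for _ in range(num_queries):
        seen.update(generate_clauses(rng))
    return sorted(c for c in CLAUSE_PROBABILITY if seen[c] == 0), seen
missing, seen = never_generated(10_000, seed=1)
print("never generated:", missing)
print(dict(seen))
```
A rare clause such as LIMIT still shows up in a long enough run, while an
unreachable one stays at zero, which narrows down where to look.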

21:19.000 --> 21:23.000
And it becomes a bit of a nightmare to debug this.

21:23.000 --> 21:25.000
And also,

21:25.000 --> 21:26.000
yeah,

21:26.000 --> 21:27.000
with more features,

21:27.000 --> 21:29.000
we add more combinations,

21:29.000 --> 21:30.000
and then,

21:30.000 --> 21:31.000
it's less likely

21:31.000 --> 21:33.000
to have queries succeeding,

21:33.000 --> 21:34.000
or even,

21:34.000 --> 21:37.000
the corner case that has a bug,

21:37.000 --> 21:39.000
it becomes a bit of an impossibility to find,

21:39.000 --> 21:41.000
because there are more,

21:41.000 --> 21:45.000
the domain is larger and there are more things to find.

21:45.000 --> 21:46.000
So,

21:46.000 --> 21:48.000
what's the solution?

21:48.000 --> 21:52.000
Try to use as much as you can.

21:52.000 --> 21:53.000
Like,

21:53.000 --> 21:54.000
with fuzzers,

21:54.000 --> 21:56.000
different techniques,

21:56.000 --> 21:59.000
to test,

21:59.000 --> 22:02.000
different invariants, in other ways.

22:02.000 --> 22:04.000
Try to find a way to,

22:04.000 --> 22:05.000
like,

22:05.000 --> 22:07.000
add oracles to find more issues,

22:07.000 --> 22:08.000
like,

22:08.000 --> 22:10.000
I showed at least a few before.

22:10.000 --> 22:11.000
And,

22:11.000 --> 22:12.000
you can try,

22:12.000 --> 22:13.000
also try to share,

22:13.000 --> 22:14.000
like,

22:14.000 --> 22:16.000
code between one fuzzer and another;

22:16.000 --> 22:17.000
if they are

22:17.000 --> 22:19.000
in the same language it's even easier.

22:19.000 --> 22:20.000
For example,

22:20.000 --> 22:22.000
you can use,

22:22.000 --> 22:23.000
BuzzHouse,

22:23.000 --> 22:25.000
part of its query generation, in the AST fuzzer,

22:25.000 --> 22:27.000
to help it mutate, for example.

22:27.000 --> 22:28.000
You can do things like that,

22:28.000 --> 22:30.000
or use BuzzHouse

22:30.000 --> 22:31.000
output

22:31.000 --> 22:32.000
for the AST fuzzer,

22:32.000 --> 22:33.000
and let it mutate,

22:33.000 --> 22:34.000
and see if you can find anything else.

22:34.000 --> 22:36.000
We can try to,

22:36.000 --> 22:37.000
to,

22:37.000 --> 22:39.000
add these combinations between them,

22:39.000 --> 22:40.000
to, even,

22:40.000 --> 22:42.000
to increase your chances of finding issues.

22:42.000 --> 22:43.000
So,

22:43.000 --> 22:45.000
the solution,

22:45.000 --> 22:47.000
I probably would like to,

22:47.000 --> 22:48.000
to have,

22:48.000 --> 22:49.000
is,

22:49.000 --> 22:50.000
our CI,

22:50.000 --> 22:51.000
crowded with fuzzers,

22:51.000 --> 22:52.000
right,

22:52.000 --> 22:53.000
why not.

22:53.000 --> 22:56.000
We have all these different techniques here.

22:56.000 --> 22:57.000
AFL,

22:57.000 --> 22:58.000
or libFuzzer,

22:58.000 --> 22:58.640
should I say

22:58.640 --> 23:00.440
that you sometimes

23:00.440 --> 23:01.440
read those things,

23:01.440 --> 23:02.440
and if you please write them out,

23:02.440 --> 23:03.480
like,

23:03.480 --> 23:04.880
that's for example,

23:04.880 --> 23:06.040
for sky'da.

23:06.040 --> 23:07.960
C amigos meet, that's very non,

23:07.960 --> 23:11.280
to messment whereresh lasted.

23:11.320 --> 23:12.480
Also, something I have to dare,

23:12.480 --> 23:16.000
SQLancer for correctness,

23:16.000 --> 23:17.000
uh,

23:17.000 --> 23:19.000
there's this one,

23:19.000 --> 23:20.160
pstress.

23:20.160 --> 23:22.000
That is probably not,

23:22.000 --> 23:23.840
not well known by many,

23:23.840 --> 23:26.040
with non- patteredels,

23:26.040 --> 23:27.000
also like,

23:27.000 --> 23:29.480
But it's nice to have it running for hours,

23:29.480 --> 23:31.240
hours and see what happens, like,

23:31.240 --> 23:34.920
a thing that you found after 20 hours; it's fun,

23:34.920 --> 23:37.960
how to debug that, you know, but it is able to find that.

23:37.960 --> 23:39.480
And then you have some other things,

23:39.480 --> 23:43.560
others like the AST fuzzer and BuzzHouse,

23:43.560 --> 23:47.320
for now to fill this gap and hopefully find

23:47.320 --> 23:48.680
many other issues.

23:48.680 --> 23:50.920
We hope, but there will always be something

23:50.920 --> 23:52.520
that you'll never find.

23:52.520 --> 23:56.840
So what I talked about now, I posted in a blog post

23:56.920 --> 24:00.920
on our company blog, like last week

24:00.920 --> 24:02.520
or something, if I remember.

24:02.520 --> 24:03.880
So you can always go check there,

24:03.880 --> 24:07.480
and there are also the slides that I put there.

24:07.480 --> 24:10.360
Yes, I also have posts about the other fuzzers,

24:10.360 --> 24:12.440
and I am done.

24:12.440 --> 24:15.000
So that's what I wanted to talk about.

24:15.000 --> 24:18.280
But before I leave, I just want to say that we,

24:18.280 --> 24:20.440
the ClickHouse people, tonight

24:20.440 --> 24:22.680
are going to have a small dinner.

24:22.760 --> 24:24.360
Here is the address.

24:24.360 --> 24:27.160
We invite everyone to join us.

24:27.160 --> 24:28.920
I think there's going to be some snacks there,

24:28.920 --> 24:32.040
like beer and waffles, as far as I know.

24:32.040 --> 24:34.760
So you're all invited.

24:34.760 --> 24:37.400
So see you there, you can also talk about anything

24:37.400 --> 24:40.920
nerdy related, so it's always fun to be there.

24:40.920 --> 24:43.560
So thank you, everyone.

24:43.560 --> 24:46.040
Now we have time for questions, if you want.

24:47.000 --> 24:48.040
Thank you.

24:51.720 --> 24:52.920
Two questions, okay.

24:52.920 --> 24:54.920
Yeah, anyone?

24:56.600 --> 24:58.120
No, okay, one, one.

24:58.120 --> 25:01.160
One of the questions, first time we had to select one.

25:01.160 --> 25:02.360
Yes.

25:02.360 --> 25:03.160
Three.

25:03.160 --> 25:08.360
Are you doing this fuzzing as part of your development process,

25:08.360 --> 25:13.720
and does it cover, like, also fuzzing new PRs that are coming in,

25:13.960 --> 25:15.880
as part of ClickHouse development?

25:15.880 --> 25:19.560
Yes, we also have new features being added.

25:19.560 --> 25:22.680
Some of these features, I can add them easily.

25:22.680 --> 25:25.000
Some others, probably not that easily.

25:25.000 --> 25:26.520
So it depends, because ClickHouse now,

25:26.520 --> 25:28.200
yes, has many people working on it.

25:28.200 --> 25:30.520
So there are actually a few people doing QA.

25:30.520 --> 25:35.720
So that's, yeah, usually keep the nerdy.

25:35.720 --> 25:38.120
Any more questions there?

25:38.120 --> 25:41.120
Yes, and I wanted to ask about

25:41.120 --> 25:43.600
the last thing: should you run these in nightly builds,

25:43.600 --> 25:46.160
or if you do something that you'd like to see,

25:46.160 --> 25:48.320
because it doesn't matter to question.

25:49.520 --> 25:52.880
If you find an error that is probably unrelated to the PR,

25:52.880 --> 25:56.320
How do you know that we're talking about to use that?

25:56.320 --> 25:59.200
Yes, actually, I can tell, sorry.

25:59.200 --> 26:01.680
Actually, I can tell, I had a discussion with my manager about this,

26:01.680 --> 26:03.840
like a few months back, actually.

26:03.840 --> 26:05.440
For me, I, for me, I would say,

26:05.440 --> 26:09.200
more in the nightly builds, and write issues on the fly.

26:10.160 --> 26:12.720
Actually, in ClickHouse, for now, we have it running

26:12.720 --> 26:14.640
as part of pull requests.

26:14.640 --> 26:17.280
And then if the fuzzer finds an issue,

26:17.280 --> 26:20.640
if the fuzzer finds an issue, we create an issue there for

26:20.640 --> 26:27.920
anyone to look at, but for me, you can start the pipeline, yes.

26:27.920 --> 26:31.200
And see, because sometimes they don't find anything,

26:31.200 --> 26:33.040
because they run for a short time.

26:33.040 --> 26:35.280
So sometimes you can be safe.

26:35.280 --> 26:38.240
Most, I, I do.

26:38.240 --> 26:40.240
No more questions, no more time.

26:40.240 --> 26:41.120
No more time.

26:41.120 --> 26:41.920
Thank you.

26:41.920 --> 26:42.720
Thank you.

