WEBVTT

00:00.000 --> 00:13.000
All right, so a talk about MySQL 8.4 upgrades at Booking.com. Good morning, thanks for coming.

00:13.000 --> 00:20.800
Try again. As usual, comments and things here are basically my opinions, and obviously the work

00:20.800 --> 00:27.680
I've done and things is the work I've been doing at Booking. The comments and things

00:27.680 --> 00:33.080
are basically about our usage at Booking.com, things about running MySQL, upgrading to MySQL

00:33.080 --> 00:41.080
8.4, why, how, things we've seen, progress, and the conclusions. About us: I guess, first of

00:41.080 --> 00:46.400
all, myself: I'm Simon Mudd, I've been working at Booking for 17 years, running

00:46.400 --> 00:52.400
MySQL there, other things outside with databases, with Sybase, which no one probably

00:52.400 --> 00:56.680
knows these days, and prior to that working in financial markets.

00:56.680 --> 01:01.880
I've been an ACE, and a MySQL Rockstar, but these things all expire unfortunately, but

01:01.880 --> 01:07.560
the Rockstar doesn't expire. Okay, the ACEs expire, but you can do it again, and most

01:07.560 --> 01:15.320
of all it's nice to have both, being recognized for that. The MySQL team at Booking

01:15.320 --> 01:21.000
is about 20 engineers, we're located in various places in the world: in Amsterdam, primarily,

01:21.040 --> 01:27.160
which is where the head office is, in Cambridge, where one of the original offices for Booking

01:27.160 --> 01:32.680
was located. I'm in Madrid, I've been sitting there outside the world, and there's also some

01:32.680 --> 01:39.120
colleagues in Manchester at our rental car offices. I guess you know about Booking; this is

01:39.120 --> 01:43.160
a picture of the head office in Amsterdam, which is quite pretty and things. Go there if

01:43.160 --> 01:47.840
you want to go to Amsterdam, it's nice to see; go in the summer when the weather's good. And

01:47.920 --> 01:53.200
what we do, a typical thing, accommodation, rental cars, flights, and attractions, and things,

01:53.200 --> 02:00.640
I guess most of you know us. MySQL is 30, so congratulations on that, and thanks to Oracle for

02:00.640 --> 02:07.600
looking after MySQL since acquiring Sun in 2009. And I think the main thing from the Booking

02:07.600 --> 02:13.760
dot com perspective is we probably wouldn't be where we are, if MySQL didn't exist, at least

02:13.840 --> 02:20.560
in its current form. The usage of MySQL at Booking: we started off being a Postgres shop,

02:22.400 --> 02:28.480
which was interesting, and that seemed to be a good idea at the time, but eventually we saw that

02:28.480 --> 02:37.360
basically in 2003, 2004, we had some scalability issues, and we started to use MySQL on big

02:37.360 --> 02:43.280
servers running 32-bit Linux, four gigs of RAM, this is big heavy stuff, you know,

02:43.280 --> 02:49.920
MyISAM tables and MySQL 4, and since then we've basically been running MySQL and scaled up,

02:49.920 --> 02:56.400
and using slightly more modern versions of MySQL since then. The original database, called BP,

02:56.400 --> 03:02.640
which was Booking's portal, still exists in the company, and it's had, we run replication,

03:02.640 --> 03:08.480
as I think most of you realize, and the cluster's changed size, it's been split up, with data

03:08.480 --> 03:14.400
being split out and things, and it's been as big as a thousand servers in one cluster, and today

03:14.400 --> 03:20.000
it's a bit smaller, but it's still running, because we can't shut down MySQL, the company is up

03:20.000 --> 03:30.640
24/7, 365, and Booking's been using MySQL basically for 22 years. I'm sorry. So, our MySQL usage:

03:32.720 --> 03:40.320
currently we're managing MySQL with a single framework, which we call badminton, it runs on-prem,

03:40.320 --> 03:45.120
with on-prem bare metal servers, we run with OpenStack, and we can also run on

03:45.120 --> 03:49.840
Amazon EC2; in principle we could extend that to other cloud providers, we've not run

03:49.840 --> 03:55.280
any on virtual machines, we don't do that at the moment, we've still got thousands of production

03:55.280 --> 04:01.360
instances running on hundreds of clusters, and even our development environment is a similar scale,

04:01.360 --> 04:05.600
it's smaller, smaller VMs, we don't need as many replicas and things, but it's also quite

04:05.600 --> 04:13.040
a large setup, and basically the centralized management offloads a lot of work from the developers,

04:13.040 --> 04:18.160
they don't have to care about managing databases, they just make their applications

04:18.160 --> 04:22.640
talk to the database, store the data, retrieve it, and try and get on with

04:22.640 --> 04:30.640
making the applications and the business work better. We do look at some cloud vendor offerings

04:30.640 --> 04:36.640
for MySQL, and are using some of those, and there are different discussions about where that's

04:36.640 --> 04:42.800
going, but primarily at the moment we're on MySQL in those environments, and we'll probably

04:42.800 --> 04:50.880
move more to the cloud as the days move forward. Okay, so upgrading: the first question is, you know,

04:50.880 --> 04:59.360
why? I guess the obvious reasons are: 8.0 is end of life next year. A lot of people forget,

04:59.360 --> 05:03.680
because 5.7 went end of life last year, and they said, yeah, we've got some time

05:03.680 --> 05:10.480
with 8.0, blah, blah, blah, but it's end of life next year, so if you're on 8.0,

05:10.640 --> 05:16.320
you need, like, you need to get off, because basically don't leave it to June, or don't leave

05:16.320 --> 05:22.560
it to the last minute, because you'll suddenly find you have problems. MySQL 8.4 is the latest

05:22.560 --> 05:29.840
stable version that's provided by Oracle; 9.7 will be coming out soon, so basically the

05:29.840 --> 05:35.280
idea is get on to the latest stable version, which currently is 8.4, and in theory,

05:35.280 --> 05:40.640
take advantage of new features, take advantage of improvements in performance, which have been shown that

05:40.640 --> 05:46.800
I think 8.4 is better than 8.0, and in theory, stay more secure, so those are some of the

05:46.800 --> 05:54.160
reasons. The new features: I don't see so many new features yet; there's a lot about HeatWave,

05:54.160 --> 06:00.640
which we're not using very much. Basically, the new features are in the optimizer world,

06:00.640 --> 06:05.760
which is not really exposed, except if you're using HeatWave, but I think the new optimizer,

06:05.760 --> 06:10.000
which I've heard a few things about, does some good things if you're doing analytics and things,

06:10.000 --> 06:14.720
even on a normal MySQL server. It's not exposed, and I think it'd be really nice to make it visible.

06:15.760 --> 06:21.200
If not in 8.4, in 9.x when it goes GA, so that we can play with it, we can see if it works well,

06:21.200 --> 06:26.640
if it doesn't work well. There's an OpenTelemetry plugin, that's Enterprise only, that would be nice to

06:26.640 --> 06:33.360
have to use on the community versions, which we're running. The 8.4 upgrades are a bit

06:33.360 --> 06:37.760
of a question mark, but you have to do it, basically, because you have to get off 8.0.

06:39.280 --> 06:44.880
The other perspective, which I think is important is looking forward, is that Oracle's current

06:44.880 --> 06:50.480
innovation release, which has the new things they're adding to the system, is in 9; 9.2

06:50.480 --> 06:58.400
is out in January, so a few weeks ago, and that's got a few more features compared to what's in 8.4,

06:58.400 --> 07:05.920
and basically, the plan for Oracle is that 9.7, which will be April next year,

07:05.920 --> 07:11.840
when 8.0 goes end of life, will become LTS. So probably you don't need to jump to

07:11.840 --> 07:16.640
that version when it goes LTS, but maybe also, if there are new features and good things to

07:16.640 --> 07:22.160
make us go there, it may be worthwhile moving forward. So I'm hoping that between now and next

07:22.160 --> 07:27.280
year, we'll see more features, and that will provide things that will lead us to go to 9,

07:27.280 --> 07:35.840
so that we get better things. Sorry, wrong direction, sorry, yeah. So this is basically what I'm saying

07:35.840 --> 07:42.560
that. I'll sort of go wrong direction, excuse me. Okay, and then the other question is how,

07:42.960 --> 07:50.960
how do we upgrade? So our environment: a fleet of several thousand servers, we've got hundreds of clusters,

07:50.960 --> 07:56.000
we're running MySQL spanning three regions; they're all in Europe, but there are still three regions,

07:56.000 --> 08:02.000
we run three-tier replication, we also run with GR, with asynchronous replicas behind the GR,

08:02.560 --> 08:06.640
and of course we don't want to shut down the site while we do upgrades, because that wouldn't be a

08:06.640 --> 08:11.760
good idea. So when you're upgrading, what do you have to think about? The consumers,

08:11.760 --> 08:16.880
so the SQL users, they're the typical people. Some of the changes in 8.4: MySQL native

08:16.880 --> 08:22.880
password is going away, so in theory you've got to get off it. That also implies that you need to use

08:22.880 --> 08:26.880
TLS, so make sure that's set up, which from a security perspective you should have anyway,

08:26.880 --> 08:30.800
but if you're not, that has to be done. A lot of configuration changes in the settings,

08:30.800 --> 08:35.920
a lot of settings that were deprecated in 8.0 have been removed, so on 8.4 you have to remove them all

08:35.920 --> 08:41.040
to make sure the system starts up. And then from our perspective, we use replication a lot,

08:41.520 --> 08:46.800
all the settings that mention master and slave go away and they're replaced by source and replica.

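NOTE
Editor's sketch, not something shown in the talk: one way tooling might rewrite
the removed master/slave statements into their source/replica forms. It assumes
plain SQL strings; the helper name modernize_statement is hypothetical, and the
list covers only a few common statement renames (not the renamed server
variables), so treat it as a starting point rather than a complete mapping.
```python
import re
# A few statement forms removed in MySQL 8.4, paired with their replacements.
# Patterns are applied in order and matched case-insensitively.
RENAMES = [
    (r"\bCHANGE\s+MASTER\s+TO\b", "CHANGE REPLICATION SOURCE TO"),
    (r"\bSHOW\s+MASTER\s+STATUS\b", "SHOW BINARY LOG STATUS"),
    (r"\bSHOW\s+SLAVE\s+STATUS\b", "SHOW REPLICA STATUS"),
    (r"\bSHOW\s+SLAVE\s+HOSTS\b", "SHOW REPLICAS"),
    (r"\bRESET\s+MASTER\b", "RESET BINARY LOGS AND GTIDS"),
    (r"\b(START|STOP|RESET)\s+SLAVE\b", r"\1 REPLICA"),
    (r"\bMASTER_(HOST|PORT|USER|PASSWORD|AUTO_POSITION)\b", r"SOURCE_\1"),
]
# Hypothetical helper: rewrite one SQL statement using the table above.
def modernize_statement(sql):
    for pattern, replacement in RENAMES:
        sql = re.sub(pattern, replacement, sql, flags=re.IGNORECASE)
    return sql
```
For example, modernize_statement("SHOW SLAVE STATUS") gives "SHOW REPLICA STATUS".
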
08:46.800 --> 08:51.760
So you need to change that, and if you manage replication yourself, and we've got a big setup that

08:51.760 --> 08:55.760
does that, then you've got to change all of that, because otherwise 8.4 doesn't understand what you're

08:55.760 --> 09:01.600
talking about. The other thing is the users need to check that the SQL behaves the same,

09:01.600 --> 09:06.880
sometimes optimizer changes mean that the queries may behave differently, so you also need to

09:06.880 --> 09:12.000
sort of check that. And then the other thing which we saw less frequently, but we're seeing

09:12.000 --> 09:17.360
more now: we now have a lot of binlog consumers. So the data then gets pushed into analytics

09:17.360 --> 09:23.520
and things like that, which is not running on MySQL. So your binlog consumers tend to be

09:23.520 --> 09:28.480
simpler and they sometimes don't like any new formats and may not handle that. So this is to allow

09:28.480 --> 09:33.680
them, when you're upgrading, to make sure that they can support and ingest the binlogs correctly.

09:34.000 --> 09:42.080
Technically, in terms of other things that we have to involve: we have MySQL plugins

09:42.080 --> 09:46.480
for specific things, and then obviously all your workflows you need to check, to make sure that

09:46.480 --> 09:50.960
everything works. We have multiple applications with different groups of what we call

09:50.960 --> 09:57.040
pools of dedicated servers for different groups of applications. We have topology rearrangements

09:57.040 --> 10:02.080
of the replication environment and things. Credential management, you need to get off of the

10:02.080 --> 10:06.640
native password, you need to get onto caching_sha2 and so on. So there's multiple things that

10:06.640 --> 10:10.160
you need to, when you're upgrading, make sure it all works and doesn't

10:10.160 --> 10:18.960
cause any problems. So the code base from our perspective, as I say, the replication changes

10:18.960 --> 10:24.000
are quite important, because with the replication-related changes functionality doesn't change, but what

10:24.000 --> 10:30.800
does is all the commands used. And the main thing is that really one of the things that I've seen

10:30.800 --> 10:37.200
from this is Oracle deprecates lots of things quite frequently, but actually it never goes away.

10:37.200 --> 10:42.560
So what do you do? Ignore it. The noise is there. Oh yes, I know about that. Until eventually

10:42.560 --> 10:45.920
they do remove it and then of course you didn't pay any attention for the last

10:45.920 --> 10:52.160
X years, which you could have done. So because of that, that's also a problem. It's easy to

10:52.160 --> 10:56.960
ignore things and then you pay the price for having ignored it. And it's also hard within I think

10:56.960 --> 10:59.920
we're at the technical level, we've got management above us. You say we need to do something,

10:59.920 --> 11:04.320
we need to clean something up. It's hard to push that to the upstream management.

11:04.320 --> 11:08.880
Do you need to do it now? No? Okay, we'll leave it. And getting that balance right is hard.

11:10.800 --> 11:17.440
I think also with the deprecations, sorry, with the deprecations, it would be good if, you know,

11:17.440 --> 11:21.040
when you're going to, so this is for the Oracle people, when you know you're going to

11:21.040 --> 11:26.640
remove it, make it clearer when you've basically decided. So we know with more

11:26.640 --> 11:30.800
time, so we have more time to plan for it: this is going to go away, we have to handle it.

11:34.640 --> 11:39.680
In terms of upgrades, I guess most people here, I assume, don't have single servers and

11:39.680 --> 11:45.280
are probably running replication. So if you can use replication, it's easier. Don't upgrade

11:45.280 --> 11:50.320
in place, and also read the manual, because if you upgrade a single server in place and

11:50.320 --> 11:54.080
don't pay any attention to reading the manual, probably it won't start because the settings might be

11:54.080 --> 11:57.920
wrong. If you fix the settings and it starts up, you might not be able to get back into the server because

11:57.920 --> 12:03.120
you're probably still using MySQL native passwords. One of the things I'd love from Oracle: if

12:03.120 --> 12:10.720
it noticed that the fully privileged users won't be able to access the server because

12:10.720 --> 12:16.400
they use an authentication plugin which doesn't exist anymore, then it should

12:16.400 --> 12:20.880
not upgrade the server. But they let you do that and then you can't get back in. Luckily there's

12:20.960 --> 12:26.560
a way to avoid that and you can basically turn on native passwords temporarily to get back in again.

12:26.560 --> 12:31.760
But you do have to restart the server to do that. And then if you're running with replication,

12:31.760 --> 12:35.600
you can start up 8.4 and you can test things on the replicas. You don't have to touch

12:35.600 --> 12:39.040
your existing infrastructure and your existing servers, so that makes life much better.

12:40.640 --> 12:41.120
Yes.

12:41.120 --> 12:45.120
Technically, you could keep going with the native password until the end of 8.4?

12:46.080 --> 12:51.520
Well, I mean, you could do it forever until 9, I guess. But I assume that they wanted to turn

12:51.520 --> 12:57.600
it off completely. And someone said, if you do that, you're going to cause us a huge amount of pain.

12:59.040 --> 13:04.400
Which I'm sure about, luckily, because I did that on my home PC. I was just upgrading to see

13:04.400 --> 13:09.360
what happens, doesn't matter. I can't get back in again. And okay, I could and I had to poke someone

13:09.360 --> 13:13.680
who said, yes, it's really easy to do this. But if you don't know that, you go round in circles trying to fix

13:13.680 --> 13:22.560
it, yeah. And they probably have exactly that, yeah, so that if on the 6th, if you're developing

13:22.560 --> 13:25.280
locally, you've got some test data, blah, blah, blah. You

13:25.280 --> 13:29.360
upgrade to 8.4 and you can't get back in. Also, you've upgraded, so you can't go back to 8.0.

13:30.640 --> 13:34.320
The thing is you can't go back because you've upgraded to 8.4. So the server will let you

13:34.320 --> 13:43.360
upgrade and then you can't get in. Yeah, yeah, okay. But the point is that some of these things

13:43.360 --> 13:48.560
they could be safer, and they're less safe than I would like, but maybe, you know, maybe I should

13:48.560 --> 13:54.880
read the manual. Anyway, so this is a typical, I mean, we've shared our topology in previous

13:54.880 --> 13:58.880
presentations myself and other colleagues and things. So we tend to have like a three-tier architecture

13:58.880 --> 14:04.560
with the master or the source here on the left, intermediate masters so that we can spread out

14:04.560 --> 14:09.360
to a large number of replicas on the right-hand side. The color scheme here is different colors

14:09.440 --> 14:13.920
for different data centers, or in theory regions. In theory we have three regions, but

14:13.920 --> 14:23.520
actually we have more for various reasons in some cases. Basically, applications read

14:23.520 --> 14:28.160
from the right-hand side and if they need to write, they write to the left-hand side. And the good

14:28.160 --> 14:32.720
thing about it is that you can scale out the right-hand side as much as you like within reason.

14:33.680 --> 14:39.920
Yeah, and sometimes we will split a cluster like this up into two because the data sets

14:39.920 --> 14:44.160
get too large and then of course the topology gets even bigger and more complicated and it might

14:44.160 --> 14:51.120
be four or five tiers deep while we're doing that. Yeah, so reading, as I say, from the leaf,

14:51.120 --> 14:57.600
we read from the leaf replicas, we write to the masters. GR basically is the same as

14:58.000 --> 15:03.600
asynchronous replication. The difference is that Orchestrator, the way it's built,

15:03.600 --> 15:08.800
or the way we patched it, shows the primary on the left because we use single

15:08.800 --> 15:13.760
primary. The rest of the thing is the same. The other GR secondaries are there and then basically

15:13.760 --> 15:25.760
async nodes read off that, so you can do that. Yeah, so again, in-place upgrades. So once

15:25.760 --> 15:29.680
you're running, if you want to upgrade, the easiest thing is to just, you could do in place

15:29.680 --> 15:33.680
upgrades on the nodes on the right-hand side. If a node's not in use, no one will notice,

15:33.680 --> 15:37.600
and if it works, it works fine, you can check that replication works and then you could check

15:37.600 --> 15:42.320
your applications individually, you could do some basic testing, but basically anything on the right-hand

15:42.320 --> 15:45.920
side, you can just plug into your existing infrastructure and it doesn't touch anything, so

15:45.920 --> 15:50.320
it's really easy to do that. And in terms of a full upgrade process, you basically do some like

15:50.960 --> 15:54.880
initial testing on a few servers, then basically you upgrade the right-hand side completely,

15:54.960 --> 16:01.680
you then do the intermediate masters, the intermediate servers or secondaries,

16:01.680 --> 16:06.240
and then eventually you change the primary or the master and then you're running your new version.

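NOTE
Editor's sketch, not from the talk: the tier-by-tier order described above
(leaf replicas first, then intermediate masters, the primary last) can be
derived by sorting hosts on replication depth. This assumes the topology is
available as a mapping from each host to the host it replicates from (None for
the primary) and contains no cycles. The host names and the function
upgrade_order are hypothetical.
```python
# Hypothetical helper: group hosts by replication depth, deepest tier first.
def upgrade_order(source_of):
    def depth(host):
        hops = 0
        while source_of[host] is not None:
            host = source_of[host]
            hops += 1
        return hops
    tiers = {}
    for host in source_of:
        tiers.setdefault(depth(host), []).append(host)
    return [sorted(tiers[level]) for level in sorted(tiers, reverse=True)]
# Hypothetical three-tier cluster: host mapped to the host it replicates from.
topology = {
    "primary": None,
    "im1": "primary",
    "im2": "primary",
    "leaf1": "im1",
    "leaf2": "im1",
    "leaf3": "im2",
}
plan = upgrade_order(topology)
```
Here plan lists the three leaves first, then the two intermediate masters, and the primary as the final group.
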
16:06.240 --> 16:10.480
So the process is quite simple. This process though, in production, when we do it, we've done it

16:10.480 --> 16:15.920
from 5.7 to 8 and earlier versions, it may take a few weeks. You can do it faster, but sometimes

16:15.920 --> 16:20.640
you want to leave it running a little bit longer to make sure that users notice if there are

16:20.640 --> 16:28.880
any changes or any problems or any issues. Basically, most of the risks, by following this process,

16:28.880 --> 16:35.840
are reduced. Once you've upgraded the master, you have to fix it yourself and figure it out.

16:35.840 --> 16:39.680
Basically, that's the bottom line. So any testing you could do previously, you could also

16:39.680 --> 16:43.280
test in your development environment; you might have a replica set in development in which

16:43.280 --> 16:46.880
you can test there. If you break development, just build it again, you don't care. So I mean,

16:46.880 --> 16:50.480
there's lots of ways around that, but it's good to do that because sometimes things break and

16:50.960 --> 16:54.560
it's better for it not to break in production, and you've hopefully caught most things.

16:56.480 --> 17:01.200
Issues we've seen, basically, internal tooling doesn't quite do the right thing, it gets confused

17:01.200 --> 17:06.880
between 8.0 and 8.4 or the new version. Sometimes we see behavioral changes in queries, which

17:06.880 --> 17:11.280
you know, the optimizer changes, it does something slightly different. So again, that might be a

17:11.280 --> 17:16.080
report to Oracle or something, saying it doesn't work. Maybe MySQL crashes or does something

17:16.080 --> 17:20.320
weird on the query, but if that's the same thing, you can report that if necessary.

17:21.280 --> 17:25.680
But in most cases, if something breaks and you don't really want to proceed, you can just

17:25.680 --> 17:31.360
reclone any of the replicas from 8.0 from existing servers, and once the

17:31.360 --> 17:39.360
problem's fixed, you move forward. MySQL clients: Oracle recommends running the latest, 9.2

17:39.920 --> 17:45.600
or the 9.x version that's being used. Most clients tend to be on slightly older versions, so

17:45.680 --> 17:49.680
it's good to poke your developers and tell them, please upgrade your clients. Use the latest

17:49.680 --> 17:56.560
9.x, even against 8.4 or against 8.0. If you want to know what clients are actually connecting

17:56.560 --> 18:01.680
to your server, performance_schema.session_connect_attrs tells you, and you can

18:01.680 --> 18:08.080
actually see what spread of versions is used. I did this yesterday, just quickly, to have a quick

18:08.080 --> 18:12.400
check on one specific server, and as you can see it's quite a mix there, most of

18:12.720 --> 18:21.440
MySQL, as you see they're not all running 8x421 or 8x421, which is 8x480. You see there's a couple

18:21.440 --> 18:27.760
of libmariadb. That's strange. Why is MariaDB used to talk to MySQL? Well, ProxySQL uses

18:27.760 --> 18:34.080
libmariadb, so I think these are ProxySQL connections into the, anyway, it's interesting to see

18:34.080 --> 18:39.280
that because I saw that and I thought, what are we doing? It turns out to be something that's

18:39.280 --> 18:45.360
expected, but it might surprise you. Grant management: obviously the users have to change

18:45.360 --> 18:49.520
grants, so you need to get off native password. There's no way to rotate from one,

18:50.480 --> 18:54.240
from one authentication mechanism to another one; it might have been nice to have a plugin

18:54.240 --> 18:59.600
where, when you rotate grants, it does that automatically; as it is, we have to do that. And we have to go

18:59.600 --> 19:06.080
through all our, I don't know, 25, 34, tens of thousands of users and upgrade them and modify them.

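NOTE
Editor's sketch, not the tooling described in the talk: one way to list the
accounts still on mysql_native_password and generate a statement per account
that moves it to caching_sha2_password. RANDOM PASSWORD is used here only to
keep the example self-contained; a real fleet would more likely supply
credentials from its own secret store. The function names are hypothetical.
```python
# Query listing accounts still on the plugin that 8.4 no longer enables by default.
AUDIT_QUERY = (
    "SELECT user, host FROM mysql.user "
    "WHERE plugin = 'mysql_native_password'"
)
# Escape a value for use inside a single-quoted MySQL string literal.
def _escape(text):
    return text.replace("\\", "\\\\").replace("'", "''")
# Hypothetical helper: quote an account name as 'user'@'host'.
def quote_account(user, host):
    return "'{}'@'{}'".format(_escape(user), _escape(host))
# Hypothetical helper: one ALTER USER per (user, host) row from AUDIT_QUERY.
def migration_statements(rows):
    template = (
        "ALTER USER {} IDENTIFIED WITH caching_sha2_password "
        "BY RANDOM PASSWORD"
    )
    return [template.format(quote_account(user, host)) for user, host in rows]
```
The statements are only generated, not executed, so they can be reviewed before being applied on the primary.
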
19:06.160 --> 19:12.080
The new percent wildcard for database grants we are using in a few places, not in many, so that needs

19:12.080 --> 19:18.880
to be fixed, or we need to change our tooling to handle that. And the secondary password

19:18.880 --> 19:25.680
is also an option which is useful, but that still uses the same authentication method. So you

19:25.680 --> 19:32.640
can't change, you have to change your authentication method while upgrading. Orchestrator, we use

19:32.640 --> 19:38.080
Orchestrator a lot and we used it right from the beginning, when Shlomi came to work with us for a while

19:38.080 --> 19:41.680
and he said, you know, why are you doing these silly manual failovers and things? You guys

19:41.680 --> 19:46.720
got no idea what you're doing. He didn't say that, but he helped us do it automatically and made life

19:46.720 --> 19:52.880
much, much, much, much better, so thank you very much, Shlomi. But Orchestrator, and

19:52.880 --> 19:58.000
Shlomi's no longer maintaining it, works up to 8.0, but it doesn't support 8.4.

19:58.720 --> 20:03.040
So we had to get some help with the code, to help us make Orchestrator 8.4

20:03.040 --> 20:08.880
aware, and we're now testing that and using it and it looks quite good. So it looks like it wasn't

20:08.880 --> 20:14.480
a big change, but someone has to make the change, and there is now an 8.4-aware Orchestrator

20:14.480 --> 20:22.400
version which can be used if you're upgrading. Orchestrator doesn't handle GR failure; we patched it

20:22.960 --> 20:31.760
and that patch is upstream, so that it's GR-aware. But also one of the other things that's

20:31.760 --> 20:38.240
happening now is because Oracle are providing features to make automatic failover part of the

20:38.240 --> 20:42.880
server process. I think it's not as smooth as I'd like and we're looking at it and maybe we'll

20:42.880 --> 20:48.480
move over so we don't use Orchestrator for failover maybe moving forward. However, the visualization

20:49.200 --> 20:56.960
of a topology is brilliant and there's nothing better. Plugins: we run MySQL plugins. We run

20:56.960 --> 21:02.720
our PAM plugin; because we're running on the community version we patch in the

21:02.720 --> 21:09.520
Percona PAM plugin onto MySQL server. We have something to check disk usage so the developers don't

21:09.520 --> 21:13.840
fill up the disk, which they like to do by inserting lots of data. It gives them warnings and that

21:13.840 --> 21:17.520
basically makes them aware of what's happening, and that makes them consider whether they're inserting too much data

21:17.600 --> 21:24.800
or the data size is too large. We have a couple of other specialised audit plugins that we use

21:26.960 --> 21:31.200
and again the audit plugin is available on the Enterprise version, so these were self-built

21:31.200 --> 21:38.880
too, and are also a very lightweight alternative to that. Plugins have been used for a long time. We

21:38.880 --> 21:44.080
have modified this now, so with the 8.4 upgrade we've changed this so we build against 8.0, 8.4

21:44.080 --> 21:49.520
and 9.x, and now the automation process is in place so it's really easy to rebuild this all the time

21:50.560 --> 21:57.280
while we're going through the upgrade process. A comment on the plugins: plugins are going away,

21:57.280 --> 22:02.560
being replaced by components. When I was writing this it still wasn't very visible. I don't

22:02.560 --> 22:07.760
think Oracle say enough about plug-ins please move on to components. It looks like now you can do

22:07.760 --> 22:12.960
everything that could be done with plug-ins with components in which case it's good to look to

22:13.120 --> 22:19.200
move off of them, because I think you also get a better, safer way to use the server and things like that.

22:21.200 --> 22:25.760
And obviously we shared documentation how to use components and things.

22:28.240 --> 22:32.640
Group replication: when you upgrade, we're running group replication with replicas;

22:33.040 --> 22:37.280
if a replica breaks you don't care, if you break a group you do care because the whole group goes down

22:37.280 --> 22:42.480
and you can't take writes. So upgrading a GR cluster is more problematic. You can go through

22:42.560 --> 22:48.560
right at the end upgrading the GR members; we're discussing that, that's one thing. We're also discussing

22:48.560 --> 22:54.080
internally whether to spin up an 8.4 GR cluster which replicates from 8.0, so you can do testing

22:54.080 --> 22:58.080
of that completely independently, and then eventually when everything else is upgraded you just change

22:58.080 --> 23:02.720
your cluster from one to the other and that's something that we're still not quite sure if that'll

23:02.720 --> 23:10.400
be the way we do it or not. Yeah, that's what I'm saying here. When I did the call for papers

23:10.480 --> 23:14.640
and I said I'll do an 8.4 upgrade thing, I was expecting to tell you that it's all complete, it was

23:14.640 --> 23:18.720
really easy, it took a few weeks and was done. Unfortunately we got a bit delayed so we're actually

23:18.720 --> 23:24.240
still work in progress; we've started on it, there's still work to do. I hope maybe in a future

23:24.240 --> 23:30.240
presentation to be able to give the complete things on how painful or not it was but I'm hoping

23:30.240 --> 23:35.120
that, probably in a few weeks, we'll have the 8.4 upgrade complete and then we'll be

23:35.120 --> 23:41.840
going to talk about it in more detail later. Issues seen: a thing about the MySQL native

23:41.840 --> 23:48.080
password: don't get locked out, because that's a pain. The semisync plugin bit us as well. Something

23:48.080 --> 23:54.720
very simple: we have 8.0 boxes configured with semisync plugins using master/slave. We then

23:54.720 --> 23:59.440
change the configuration so that it should be source and replica, but when you clone from one

23:59.440 --> 24:05.280
box to another they all got mixed up and it basically didn't work. I've filed a bug report

24:05.280 --> 24:10.240
for that, but also found out how to fix it, but I had to modify tooling to handle that

24:10.240 --> 24:15.920
a bit better. The hypergraph optimizer: I'd love to try it, and I'd love it to be built into

24:15.920 --> 24:20.240
the community versions that are available so even if it's still experimental it's not complete

24:20.240 --> 24:24.400
it'll be nice to use it. I think you know it may not be better for everything but some things

24:24.400 --> 24:33.200
if it is better I'd love to take advantage of that. And two missing features I'd still

24:33.200 --> 24:37.600
really, really love to see. The first, well, the second is based on the first: I still want to

24:37.600 --> 24:47.600
be able to clone natively from 8.0 to 8.4, or from 8.4 to 9. Oracle's clone plugin now

24:47.600 --> 24:52.000
knows how to copy the data over, and what does it do immediately after doing that? It restarts.

24:52.800 --> 24:57.440
So if you could do that, you don't care about what's on the 9 or, you know, the 8.4 server: you can copy the 8.0

24:57.440 --> 25:02.000
data over, you can do that and say, okay, I now restart, oh, I've got 8.0 data, I can

25:02.000 --> 25:07.040
upgrade. If you could do that it'd just make life easier, because then you can clone from the previous

25:07.040 --> 25:12.400
version to the next major version and then the other thing which I'd love to see is that I can

25:12.400 --> 25:17.440
basically clone and spin up a new server by providing a please clone from that box over there

25:17.440 --> 25:21.680
again also doing it from the previous version because otherwise you have to set it all up you have

25:21.680 --> 25:26.560
to set the config file and all these other things, but if you use Cassandra, or a lot of other tools,

25:26.560 --> 25:30.320
you could just say point at that thing make a copy and it will just work. So I'd love to see

25:30.320 --> 25:34.480
some sort of feature like this which would be really good for upgrading for lots of testing lots

25:34.480 --> 25:38.640
of things that you can spin up. And maybe all the parameters that are now in the config file might not be

25:38.640 --> 25:44.800
enough; maybe there are the custom parameters in the data dir and in the persisted settings.

25:44.800 --> 25:48.880
Whether that's possible, I don't know. It would be a lovely feature to have, because then you can copy boxes

25:48.960 --> 25:55.360
around really, really easily, and it just makes life easier for management. So, yep, the main thing

25:55.360 --> 26:01.520
is: time to upgrade, don't delay. 8.0 is going away; 9.x hopefully will have something really

26:01.520 --> 26:08.000
good. GR upgrades are more complicated, and I'm looking forward to seeing what comes in the 9.7

26:08.000 --> 26:15.920
LTS. And some references, a couple of reports there. Any questions?

26:19.840 --> 26:30.080
Gordon. So when people were upgrading from 5.7 to 8.0, in a lot of us in the agent

26:30.080 --> 26:35.600
which you will please step right away five seven would not work up the date from eight on this

26:35.600 --> 26:43.200
or five you change your main goal. We've not got that far yet. I mean, I think the binlog

26:43.200 --> 26:47.440
format hasn't changed that much, so I assume it will probably replicate unless you do anything strange.

26:47.520 --> 26:51.840
For GTIDs there's a new tagging thing; if you don't use it, then it shouldn't

26:51.840 --> 26:56.480
bite you. My guess is that there were syntax changes between 5.7 and 8.0.

26:56.480 --> 27:00.880
I think those in 8.4 are minimal, except for "master", but that doesn't affect

27:00.880 --> 27:05.920
replication, because that goes from there; that's grant changes and things like that. So

27:05.920 --> 27:11.760
I suspect that, again, like it was from 5.7 to 8.0, it probably won't matter.

27:12.720 --> 27:20.400
With respect to the change from master/slave to source/replica, you probably have a variety of tooling.

27:21.360 --> 27:25.760
Yeah, yeah, and that's painful. This is painful at the moment; we're going through that at the moment.

27:33.760 --> 27:40.320
So the thing is, I'm not a hundred percent sure. Right now I'm fixing in place and supporting both.

27:40.720 --> 27:46.960
The reason for that is, we know that in theory with 8.0 and 8.4 it's the same.

27:46.960 --> 27:53.440
I'm not a hundred percent sure, going to 9, whether it will stay the same or not, because if you go to 9 in one year,

27:54.880 --> 27:59.520
then you've got to handle multiple versions, so maybe being version-aware is better.

27:59.520 --> 28:04.880
I think being version-aware is probably better; it's more flexible, even if the actual configs are the same.

28:05.840 --> 28:09.520
But we'll have to see; it's still something that I need to see.
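
NOTE
A minimal sketch of the "version-aware" tooling idea from this answer, not the speaker's actual code. The function names are hypothetical. It assumes only two facts about MySQL: SHOW REPLICA STATUS was added in 8.0.22, and SHOW SLAVE STATUS was removed in 8.4.
```python
import re
#
def parse_version(version: str) -> tuple:
    """Turn '8.4.3-log' into (8, 4, 3); any suffix after the numbers is ignored."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return tuple(int(part) for part in match.groups())
#
def replication_status_statement(version: str) -> str:
    """Pick the replication status statement that the given server version accepts."""
    # Servers from 8.0.22 accept the new name; 8.4 and later accept only the new name.
    if parse_version(version) >= (8, 0, 22):
        return "SHOW REPLICA STATUS"
    # Older servers (5.7, early 8.0) only know the old name.
    return "SHOW SLAVE STATUS"
```
Monitoring code could call replication_status_statement with the version string reported by each server, so that one code path covers 5.7, 8.0 and 8.4 hosts during a rolling upgrade.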

28:19.200 --> 28:24.000
Okay, so the main thing is that this has really been kicked off very, very recently, for the simple reason

28:24.000 --> 28:29.680
that other business priorities came first and we weren't able to get to it, so I didn't know

28:29.680 --> 28:33.680
that, obviously, when I put in the paper, so a lot of the things here are what you need to do;

28:33.680 --> 28:38.720
in practice we've done very little. Basically, most of this is the plugin handling, because we have

28:38.720 --> 28:44.000
custom plugins that have to be built. And one of the things I'd forgotten is that when you've got

28:44.000 --> 28:49.360
automation, everything works. So when you break it just by changing the version, where you think,

28:49.360 --> 28:55.280
how can it possibly break anything? Everything breaks, because all our monitoring checks slave

28:55.280 --> 29:00.720
status and blah, blah, blah, and it checks this and it does that and it tries to set things, and nothing works.

29:01.360 --> 29:06.400
So you forget: when you have automation in place and everything is smooth and perfect, you change

29:06.400 --> 29:11.680
something that is not compatible, and you suddenly start to realise that, you know, it's

29:11.680 --> 29:14.880
in multiple different libraries, we've got different languages doing different things, we've got

29:14.880 --> 29:19.600
a grant manager, we've got loads of things all over the place, and you suddenly realise how painful that

29:19.600 --> 29:25.200
change is. Had I realised, or had we realised, and I don't think any one

29:25.200 --> 29:30.160
of the team really thought about it much. I was going to do that and I had a plan, but we should have

29:30.160 --> 29:35.120
done this earlier. But that's what you find in practice: that you didn't realise

29:35.120 --> 29:39.120
the consequences, and it would have been better. But I still think there are differences from

29:39.120 --> 29:43.760
8.0 to 8.4 and 9, so possibly it actually wouldn't help; it might save most of the

29:43.760 --> 29:50.160
problems, but you only see this when you do it. And obviously, this is the thing: sometimes it's

29:50.160 --> 29:54.800
good to delay things and do it later, and obviously if you're busy with

29:54.800 --> 29:59.360
and, you know, there are business things to do, it's good not to do these things until you need to do

29:59.360 --> 30:05.120
it. But sometimes it's actually good to do it early, and I think it's good to follow what

30:05.120 --> 30:09.760
Oracle's doing if you're using Oracle's MySQL. Don't be bleeding edge,

30:09.760 --> 30:16.080
but be close to bleeding-edge-ish enough so that if you have to stop, you stop. But don't leave

30:16.080 --> 30:18.880
it too long, because then the jump can be large and then it's going to be painful.

30:24.880 --> 30:33.440
Do you have a rollback plan in case it doesn't work out for you? No, but I mean, the

30:33.440 --> 30:37.200
thing is 8.4 is the same as 8.0, if we're honest. I mean, this is a personal perspective:

30:37.840 --> 30:42.720
8.0 and 8.4 are basically the same. So, I mean, maybe if 9.7 turns out

30:42.720 --> 30:47.360
having loads of new syntax and new things, and the optimizer has been hacked away to do something

30:47.360 --> 30:53.040
really mega different, then you might have really big concerns. But the code base, I think,

30:53.120 --> 31:03.200
it's a lot of cleanup in 8.0. But yeah, you said you're not really at that point yet. I'm really curious

31:03.200 --> 31:10.480
about the findings you will have, like the things you'll find that work in 8.0

31:10.480 --> 31:16.000
and not in 8.4, because I know about at least one which is not different given in 8.0

31:16.000 --> 31:21.440
and will break 8.4. So, I mean, I don't know better. As I say, I was hoping to

31:21.440 --> 31:25.760
come here and say I can tell you everything, blah, blah, blah. No one knows anything, and

31:25.760 --> 31:30.240
sometimes you don't. It's a shame I can't share more at this time,

31:30.240 --> 31:41.280
right? Yeah. And replication is something that we use heavily, as you and a lot of

31:41.280 --> 31:49.120
people do, and if you break replication, that's not good. So, I mean, the

31:49.200 --> 31:55.360
new versions past 8.0 need to be able to absorb the lower-version

31:55.360 --> 32:01.920
things, and if they can't do that, I think it's a bug. It's a verified bug, and I'm

32:02.080 --> 32:06.720
pretty sure there are others, because I think too many things were removed in 8.4. Yeah.

32:09.040 --> 32:11.920
Oh, I think I'm going to be... yeah, yeah. Thank you.