WEBVTT

00:00.000 --> 00:11.000
Thank you, thanks for having me. So I'm Quentin Monnet, I've been working a lot

00:11.000 --> 00:16.000
around eBPF, but I'm not here to talk about BPF today. I've got the wrong T-shirt, sorry.

00:16.000 --> 00:24.000
I'm here to talk about the Open Network Fabric and how to build your cloud infrastructure

00:24.000 --> 00:29.000
at home, not really at home, but in your lab, on your hardware.

00:29.000 --> 00:36.000
So, in short, what this project does is to make on-premises cloud infrastructure easy

00:36.000 --> 00:42.000
to set up and to run by sort of abstracting all the network part away.

00:42.000 --> 00:44.000
So, we'll see about that.

00:44.000 --> 00:51.000
So, can I have a cloud at home? But first, before we even see what we would need a cloud

00:51.000 --> 00:55.000
for, let's go back to the origins of the cloud in a way.

00:55.000 --> 00:59.000
So, initially, in the beginning, we had a lot of binaries, libraries.

00:59.000 --> 01:05.000
It was all a glorious mess in terms of dependencies, right?

01:05.000 --> 01:09.000
And people came up and said, let there be virtualization.

01:09.000 --> 01:11.000
So, we got machines in machines, great.

01:11.000 --> 01:15.000
But it uses a lot of resources, and it's slow sometimes.

01:15.000 --> 01:21.000
So, people said, let there be containers, and we got the whale.

01:22.000 --> 01:26.000
The whale works great, but how do we orchestrate all these containers that we have?

01:26.000 --> 01:32.000
So, people continued and said, what we need is a ship's wheel with seven spokes.

01:32.000 --> 01:35.000
And everything that comes with it.

01:35.000 --> 01:38.000
And we pretty much all know how it goes from here.

01:38.000 --> 01:41.000
Everything got way simpler.

01:41.000 --> 01:44.000
Good.

01:44.000 --> 01:46.000
So, we've got all these projects from the CNCF.

01:46.000 --> 01:49.000
That's the CNCF landscape, okay?

01:49.000 --> 01:54.000
And the good part is that it's mostly software appliances,

01:54.000 --> 01:59.000
and there are solutions to help in the sense that there are cloud providers that you can use

01:59.000 --> 02:04.000
to run this infrastructure for you.

02:04.000 --> 02:06.000
So, you get the tip of the iceberg, which is nice.

02:06.000 --> 02:10.000
You can run your applications relatively easily.

02:10.000 --> 02:13.000
And everything that's underwater is, yeah, that's good.

02:13.000 --> 02:15.000
That's someone else's magic, really.

02:15.000 --> 02:17.000
You don't have to care about that too much.

02:17.000 --> 02:23.000
So, what you get from these cloud providers, typically, is a virtual private cloud,

02:23.000 --> 02:29.000
in which you have your nodes, and then containers.

02:29.000 --> 02:32.000
And you can run your applications in these containers.

02:32.000 --> 02:34.000
You can run everything you want.

02:34.000 --> 02:38.000
You've got some isolation at layer 3, so your cluster

02:38.000 --> 02:43.000
is not directly connected to the other tenants' clusters, which is usually a good thing.

02:43.000 --> 02:47.000
And you start to build your applications,

02:47.000 --> 02:52.000
dealing with these virtual private clouds, the VPCs,

02:52.000 --> 02:58.000
and they sort of constitute the basic building blocks

02:58.000 --> 03:02.000
for a modern cloud infrastructure in a way.

03:02.000 --> 03:10.000
So, now, what if I want to run this sort of architecture in my lab, on premises?

03:11.000 --> 03:14.000
Why would you do that? Well, that can be a matter of cost.

03:14.000 --> 03:19.000
If you realize that actually buying your hardware, your GPUs, or whatever else,

03:19.000 --> 03:24.000
is in fact cheaper than renting it from Amazon, and after a few months,

03:24.000 --> 03:26.000
maybe it's just paid for.

03:26.000 --> 03:31.000
So, maybe you want to cut on costs, maybe you want to reduce latency for your applications.

03:31.000 --> 03:35.000
If you've got some hard constraints on latency,

03:35.000 --> 03:38.000
or if you have some requirements in terms of compliance,

03:38.000 --> 03:41.000
and you really need to process your data on site.

03:41.000 --> 03:47.000
Well, in that case, you may need to get everything back to your company's hardware.

03:47.000 --> 03:54.000
So, some use cases, everything where you have some very specialized cloud application,

03:54.000 --> 04:00.000
everything: smart cities, IoT... What was that one for?

04:00.000 --> 04:04.000
Let me see. Ah, this one.

04:05.000 --> 04:09.000
If you want to do some model training, inference,

04:09.000 --> 04:14.000
closer to the source of data, it makes sense to have something maybe not with a cloud provider,

04:14.000 --> 04:17.000
but something you've got your hands on.

04:17.000 --> 04:20.000
So, we're back to the previous diagram, but this time,

04:20.000 --> 04:22.000
you don't have just the tip of the iceberg,

04:22.000 --> 04:27.000
you get the whole iceberg just for you. How lucky you are.

04:27.000 --> 04:31.000
And it has to become, not someone else's magic, but your magic.

04:32.000 --> 04:35.000
Either you are a good magician, and that's all right.

04:35.000 --> 04:38.000
I see plenty of good networking magicians in this room.

04:38.000 --> 04:39.000
That's awesome.

04:39.000 --> 04:42.000
But it's not the case for everybody, right?

04:42.000 --> 04:48.000
So either you're a good magician, or your life is going to be much more difficult.

04:48.000 --> 04:52.000
Fortunately, there is a third option, coming to the topic of this talk,

04:52.000 --> 04:55.000
which is to turn to the Open Network Fabric.

04:55.000 --> 04:57.000
So, what do we do?

04:57.000 --> 05:02.000
We try to deploy this cloud infrastructure, just like we had at the cloud providers,

05:02.000 --> 05:05.000
on commodity hardware.

05:05.000 --> 05:11.000
And so, we consider branded or white-box switches and servers.

05:11.000 --> 05:14.000
Commodity hardware, really.

05:14.000 --> 05:19.000
On top of that, we want to handle everything that you need for the network.

05:19.000 --> 05:25.000
So, the connectivity, getting also some observability in terms of the flows that come and go.

05:25.000 --> 05:33.000
Some services, some basic things such as DHCP, firewalling, everything that comes around

05:33.000 --> 05:35.000
a network fabric, really.

05:35.000 --> 05:41.000
And on top of that, we also want to provide a way for users to really create,

05:41.000 --> 05:46.000
and use their virtual private clouds directly on their hardware,

05:46.000 --> 05:54.000
just like they would be able to do with AWS or GKE or Google Cloud or whatever.

05:54.000 --> 05:57.000
So, what do we use to reach that objective?

05:57.000 --> 06:00.000
So, first we've got the commodity hardware.

06:00.000 --> 06:03.000
So, the Open Network Fabric today

06:03.000 --> 06:07.000
supports switches from Celestica, Dell, Edgecore, Supermicro,

06:07.000 --> 06:11.000
there are more to come, that's a few brands that we support right now.

06:11.000 --> 06:18.000
And we start by deploying a network operating system on the switches.

06:18.000 --> 06:22.000
And for that we use SONiC, which is open source,

06:22.000 --> 06:28.000
it's based on Debian, it was initiated by Microsoft,

06:28.000 --> 06:30.000
but now it's hosted by the Linux Foundation.

06:30.000 --> 06:32.000
So, there's no vendor lock-in.

06:32.000 --> 06:36.000
It's minimalist, modular, hard.

06:36.000 --> 06:39.000
And so, pretty much everything that you need to do the job.

06:39.000 --> 06:45.000
And you get your switch with this operating system that can do pretty much everything you need.

06:45.000 --> 06:50.000
And to handle these several switches once they are set up,

06:50.000 --> 06:54.000
we are using Kubernetes as a control plane, so that your infrastructure looks like

06:54.000 --> 06:57.000
and feels like a Kubernetes cluster,

06:57.000 --> 07:00.000
with which many people nowadays are pretty familiar.

07:00.000 --> 07:04.000
So, you've got your YAML files, good old YAML, to do

07:04.000 --> 07:07.000
GitOps, to do infrastructure as code.

07:07.000 --> 07:10.000
It integrates very easily into any cloud-native

07:10.000 --> 07:12.000
stack that you might have.

07:12.000 --> 07:18.000
So, really, you can stream to Grafana, all the classics, in a way.

07:19.000 --> 07:25.000
And the result that we get is this fabric on top of the commodity hardware.

07:25.000 --> 07:31.000
And also, the VPC API offered by the Open Network Fabric,

07:31.000 --> 07:37.000
with which you're able to directly create your own virtual private clouds.

07:37.000 --> 07:43.000
And you're able to just, to split your hardware, not physically, I mean,

07:43.000 --> 07:46.000
to split the use of your hardware for different applications,

07:46.000 --> 07:49.000
or for different VPCs for different tenants.

07:49.000 --> 07:56.000
Just the way you would do that with another, bigger public cloud provider.

07:56.000 --> 08:01.000
So, under the hood, we're using a VXLAN-based BGP EVPN.

08:01.000 --> 08:06.000
And we've got this VPC API coming with policies and services.

08:06.000 --> 08:12.000
So, we've got some simple peering API if you want to connect different VPCs

08:12.000 --> 08:16.000
to allow the VPCs from one to connect with the other.

08:16.000 --> 08:20.000
We've got IPAM, DHCP, DHCP relay.

08:20.000 --> 08:22.000
More things will come in the future.

08:22.000 --> 08:26.000
That's more or less what we have today.

08:26.000 --> 08:28.000
And the workflow looks like this.

08:28.000 --> 08:31.000
So, first, you need to get your hardware.

08:31.000 --> 08:33.000
And to wire your switches and servers.

08:33.000 --> 08:37.000
I'm sorry, the software doesn't handle that part for you yet.

08:37.000 --> 08:40.000
So, you've still got to plug in the cables.

08:41.000 --> 08:44.000
This is an example of a Clos network topology.

08:44.000 --> 08:49.000
So, you've got the spines at the top, the leaf switches, and the servers.

08:49.000 --> 08:52.000
And you need to write some wiring diagrams,

08:52.000 --> 08:55.000
which are actually partly generated for you,

08:55.000 --> 08:59.000
but you've got to complete it to make it match your topology.

08:59.000 --> 09:04.000
And you've got to write some metadata, which is mostly the description of the switches.

09:04.000 --> 09:07.000
So, for example, describing the list of the ports available on your switches.

09:08.000 --> 09:11.000
Then you run the fabric, you know, the fabric.

09:11.000 --> 09:16.000
And all of the switches are automatically bootstrapped for you.

09:16.000 --> 09:19.000
It images the switches.

09:19.000 --> 09:26.000
It configures them so that connectivity is enabled between the different machines on your network.

09:26.000 --> 09:30.000
And I think the marketing term is zero-touch provisioning, updates,

09:30.000 --> 09:34.000
and maintenance of the switches once the fabric runs.

09:34.000 --> 09:37.000
So, you don't have to really touch anything on the switches.

09:37.000 --> 09:40.000
You don't have to configure the switches manually.

09:40.000 --> 09:47.000
And that's great because you get your VPCs with nearly zero network knowledge required.

09:47.000 --> 09:52.000
And that's what we want, that's good.

09:52.000 --> 09:54.000
In addition to the fabric itself,

09:54.000 --> 09:57.000
part of the project is the gateway as well,

09:57.000 --> 10:00.000
which is, as it's written, a work in progress,

10:00.000 --> 10:03.000
which means that we're getting started with implementation.

10:03.000 --> 10:05.000
I think it compiles.

10:05.000 --> 10:07.000
I'm not sure if you want to see it.

10:07.000 --> 10:09.000
It's getting there.

10:09.000 --> 10:12.000
So, that's the same idea.

10:12.000 --> 10:17.000
We're using commodity hardware, x86, ARM servers.

10:17.000 --> 10:24.000
We use Flatcar as a base distro to run the fabric.

10:25.000 --> 10:26.000
DPDK on top of that.

10:26.000 --> 10:28.000
And we're using Rust to implement,

10:28.000 --> 10:35.000
so we've got a set of Rust bindings on top of the DPDK library to implement our packet processing.

10:35.000 --> 10:41.000
The idea is to get something to connect the fabric with everything outside very fast, very quick,

10:41.000 --> 10:45.000
leveraging hardware offloads, getting a good throughput,

10:45.000 --> 10:48.000
and also getting a number of services in the future.

10:48.000 --> 10:50.000
So, everything you can imagine on that,

10:50.000 --> 10:54.000
can be NAT, can be QoS, can be firewalling,

10:54.000 --> 10:57.000
so maybe IPsec one day, who knows.

10:57.000 --> 10:59.000
So, a lot of things that we want to do,

10:59.000 --> 11:04.000
and that will come, some of them this year, some of them maybe later.

11:04.000 --> 11:05.000
So, what does it look like?

11:05.000 --> 11:10.000
So, this is a demo, not interactive, just screenshots, sorry.

11:10.000 --> 11:13.000
But before that, I'm afraid we need to talk about the elephant in the room.

11:14.000 --> 11:20.000
The project is open source, but not 100% open source yet.

11:20.000 --> 11:24.000
In the sense that we do not support as of today,

11:24.000 --> 11:26.000
the upstream version of SONiC.

11:26.000 --> 11:28.000
So, we don't work on SONiC ourselves,

11:28.000 --> 11:31.000
but we ship a SONiC image with the project.

11:31.000 --> 11:34.000
If you want to test it, you need to register with Hedgehog,

11:34.000 --> 11:36.000
which is the company behind the project,

11:36.000 --> 11:38.000
and not because we want your details,

11:38.000 --> 11:41.000
well, some people in the company are happy to get them,

11:41.000 --> 11:44.000
but mostly because we need to give you credentials to get the Broadcom image,

11:44.000 --> 11:45.000
which is the one we support,

11:45.000 --> 11:48.000
and the reason is that the first potential customers

11:48.000 --> 11:49.000
were interested in Broadcom,

11:49.000 --> 11:52.000
and we haven't spent the time on supporting upstream SONiC yet.

11:52.000 --> 11:56.000
But this is definitely something that we want to do in the future.

11:56.000 --> 12:01.000
And also, because not everybody yet in the company is super familiar

12:01.000 --> 12:03.000
with the open source aspects.

12:03.000 --> 12:05.000
There's a bit of confusion if you look at the docs,

12:05.000 --> 12:07.000
there's "Hedgehog Fabric" everywhere,

12:07.000 --> 12:11.000
not everywhere, but in a good number of places.

12:11.000 --> 12:14.000
That's actually the same thing as the Open Network Fabric today.

12:14.000 --> 12:16.000
So, there's a bit of confusion,

12:16.000 --> 12:19.000
but we're going to clear that up in time,

12:19.000 --> 12:21.000
as soon as we can, I hope.

12:21.000 --> 12:24.000
So, this is the topology for the demo.

12:24.000 --> 12:27.000
We've got two switches.

12:27.000 --> 12:29.000
And a number of servers;

12:29.000 --> 12:31.000
we only really care about the first two.

12:31.000 --> 12:33.000
And we want to deploy the fabric on that,

12:33.000 --> 12:34.000
and create some VPCs and connect them.

12:34.000 --> 12:37.560
We first need to install everything and deploy the fabric.

12:37.560 --> 12:41.960
So we register the package repository with the credentials

12:41.960 --> 12:44.640
we got, because we need this Broadcom SONiC

12:44.640 --> 12:46.120
image today.

12:46.120 --> 12:51.920
We install... can you see my cursor? Yes. We install this.

12:51.920 --> 12:54.440
We download the installer for the fabric.

12:54.440 --> 12:59.280
And we initialize the fabric. What we get is a fab.yaml file

12:59.280 --> 13:01.600
that contains information about the Fabric that you want

13:01.600 --> 13:04.600
to launch, like the base template for the topology

13:04.600 --> 13:06.160
and these sorts of things.

13:06.160 --> 13:08.480
And then you generate the wiring diagram so

13:08.480 --> 13:10.520
you get the template for a wiring diagram that you

13:10.520 --> 13:12.840
may need to edit to adjust the topology. In that case,

13:12.840 --> 13:14.480
that's the virtual lab that I've been running,

13:14.480 --> 13:19.040
so it's ready, with no additional edits required.

13:19.040 --> 13:23.360
And then you launch hhfab vlab, and then it downloads

13:23.360 --> 13:26.200
all the different components to install the switches.

13:26.200 --> 13:28.480
It downloads the images.

13:28.480 --> 13:32.160
It builds the switch images, pushes everything

13:32.160 --> 13:36.880
to the hardware, boots the switches, and runs the fabric.

13:36.880 --> 13:40.480
And once this finishes, you're all done.

13:40.480 --> 13:42.040
So in that case, in the virtual lab,

13:42.040 --> 13:46.160
you get several virtual machines for the switches and servers.

13:46.160 --> 13:47.480
But it's all set up, basically.

13:47.480 --> 13:50.480
So you can connect to one special node,

13:50.480 --> 13:52.120
which is the control node that connects

13:52.120 --> 13:54.520
to the different pieces of hardware.

13:54.520 --> 13:56.800
And you can SSH into it and, for example,

13:56.800 --> 14:00.800
list information about the switches.

14:00.800 --> 14:02.600
So I won't go through the details here.

14:02.600 --> 14:04.840
We're getting a bit short on time.

14:04.840 --> 14:08.560
So the next step is just to create your VPCs.

14:08.560 --> 14:11.040
So note that we're using kubectl for that.

14:11.040 --> 14:12.720
We've got a kubectl plugin.

14:12.720 --> 14:16.280
So just kubectl fabric vpc

14:16.280 --> 14:20.840
create, the name of your VPC, the subnet

14:20.840 --> 14:23.240
that you're using with DHCP.

14:23.240 --> 14:25.640
Sorry, not the subnet you're using.

14:25.640 --> 14:30.520
And also the DHCP start address, and the VLAN ID.

14:30.520 --> 14:32.800
And your VPC is created, and that's it.

14:32.800 --> 14:35.840
And then you need to attach it to a connection.

14:35.840 --> 14:37.720
A connection is an object

14:37.720 --> 14:42.320
that represents more or less a wire on your topology.

14:42.320 --> 14:44.400
And you still need to connect to the servers

14:44.400 --> 14:46.120
to make sure that the servers are connected too.

14:46.120 --> 14:49.960
So we've got a small utility here to set up the bond connection

14:49.960 --> 14:53.560
between the different interfaces that we're using,

14:53.560 --> 14:57.360
because these two servers, it was maybe not clear,

14:57.360 --> 15:00.160
are multi-homed in that example.

15:00.160 --> 15:03.280
And then we can ping one VPC from the other.

15:03.280 --> 15:05.320
And no, actually, it doesn't work yet.

15:05.320 --> 15:07.200
There is a very last step that is required.

15:07.200 --> 15:10.680
We need to peer the two VPCs together.

15:10.680 --> 15:12.560
And once you peer them, you can ping them.

15:12.560 --> 15:15.560
VPCs are connected, and we have our magic.

15:15.560 --> 15:17.200
So I'm out of time.

15:17.200 --> 15:18.440
So thank you.

15:18.440 --> 15:20.560
Do check out the project if you like,

15:20.560 --> 15:22.680
and contributions are welcome, obviously.

15:22.680 --> 15:24.040
And thanks for your attention.

15:24.040 --> 15:25.040
Thank you so much.

15:25.040 --> 15:32.680
APPLAUSE

15:32.680 --> 15:33.480
Any questions?

15:38.400 --> 15:42.280
You mentioned your work on BPF, but the gateway

15:42.280 --> 15:43.560
is running DPDK.

15:43.560 --> 15:47.360
Can you explain why you didn't choose XDP, for example?

15:47.360 --> 15:48.360
Oh.

15:48.360 --> 15:50.360
So I used to work on BPF.

15:50.360 --> 15:53.920
I'm still doing BPF on the side, but I'm no longer working on BPF

15:53.920 --> 15:54.760
for work.

15:54.760 --> 15:56.040
So this has nothing to do with BPF.

15:56.040 --> 15:56.920
I've got the wrong shirt.

15:56.920 --> 15:57.960
Sorry.

15:57.960 --> 16:02.120
Either way, why did you choose DPDK over something like

16:02.120 --> 16:03.120
XDP, for instance?

16:03.120 --> 16:07.520
Why did we choose DPDK? Because we want to do some hardware

16:07.520 --> 16:10.520
offloads at some point for the gateway.

16:10.520 --> 16:12.320
So I'm working mostly on the gateway,

16:12.320 --> 16:14.040
not really on the fabric myself.

16:14.040 --> 16:15.600
And we're interested in hardware offloads,

16:15.600 --> 16:17.520
and that's not something that you really get today

16:17.520 --> 16:19.600
with BPF and XDP.

16:20.600 --> 16:23.280
Can you please wait for a second to be fair

16:23.280 --> 16:25.520
and have other people ask questions then?

16:30.520 --> 16:32.400
OK, so in one of the last slides,

16:32.400 --> 16:35.200
you show that you control this fabric from Kubernetes

16:35.200 --> 16:37.080
and so on, and you have CRDs for that.

16:37.080 --> 16:38.840
So that means in this whole picture,

16:38.840 --> 16:40.800
you have somewhere a separate Kubernetes

16:40.800 --> 16:44.400
cluster that has flat connectivity to your whole infrastructure

16:44.400 --> 16:47.880
so that you can control it, or do you control it from the inside?

16:47.880 --> 16:51.240
But then my question is, how do you deal with this separation?

16:51.240 --> 16:55.320
If you have a Kubernetes cluster that needs to have a flat access

16:55.320 --> 16:58.080
to the switches, so is there something, you know?

16:58.080 --> 17:00.360
So where is the part that you are not showing us?

17:00.360 --> 17:03.520
Is it some infra outside, or is there some magic

17:03.520 --> 17:05.640
from the inside and it does to complex the server?

17:05.640 --> 17:08.320
So I think... I'm not sure I understand the question.

17:08.320 --> 17:10.440
So my question is, you have kubectl fabric

17:10.440 --> 17:11.480
to control the fabric.

17:11.480 --> 17:16.480
So where is this Kubernetes cluster that controls the infra?

17:16.760 --> 17:18.760
It's running, that's part of the fabric.

17:18.760 --> 17:21.920
So that's not the Kubernetes cluster that you might end up

17:21.920 --> 17:23.880
working with for your application, if that makes sense.

17:23.880 --> 17:27.400
So it's one of those servers that is in the picture that belongs,

17:27.400 --> 17:28.200
it controls the fabric.

17:28.200 --> 17:30.080
And on one of the servers that you get in the end,

17:30.080 --> 17:33.600
where you run your VPC, you can deploy

17:36.000 --> 17:39.080
another instance of Kubernetes, a higher-level instance of Kubernetes,

17:39.080 --> 17:43.160
if that makes sense, to run your traditional Kubernetes workloads.

17:43.160 --> 17:45.440
So you go, OK, so that would be then Kubernetes

17:45.480 --> 17:48.520
that I deploy on the server that we deploy the Sparkle with.

17:48.520 --> 17:49.400
Exactly.

17:49.400 --> 17:53.480
OK, sorry, we're out of time.

17:53.480 --> 17:55.200
But I'm sure you can continue in the hallway.

17:55.200 --> 17:56.200
Thank you.

