WEBVTT

00:00.000 --> 00:21.000
Thank you for coming to my talk. It's about SSSD and identity providers, and how they can work

00:21.000 --> 00:27.000
together. This is also work in progress, so I will show you later that there are some

00:27.000 --> 00:32.000
repositories where you can look at the code and get some packages, but it's not committed

00:32.000 --> 00:40.000
to SSSD upstream yet. If you have ideas or suggestions, we would welcome it if you would

00:40.000 --> 00:48.000
share them with us, so we can improve the experience with SSSD and the integration of IDPs.

00:49.000 --> 00:56.000
My name is Sumit Bose, as was already mentioned. I'm a software engineer at Red Hat, and I'm

00:56.000 --> 01:01.000
presenting for SSSD. I'm a member of the SSSD team, but I'm also currently the maintainer of

01:01.000 --> 01:09.000
the realmd and adcli utilities to join Active Directory or FreeIPA, and maybe during the talk

01:09.000 --> 01:16.000
we will see that even realmd might play a role here in the future as well with the integration of the IDPs.

01:16.000 --> 01:23.000
So, I will start by telling you why we are doing this and how we are doing it, and

01:23.000 --> 01:30.000
then I will try to show a demo. It will be a live demo, so we will see how

01:30.000 --> 01:36.000
badly I shoot myself in the foot. So, where are we coming from? I will talk about

01:36.000 --> 01:43.000
SSSD. SSSD has been around for more than a decade, and it's basically about centralized identity

01:43.000 --> 01:51.000
management. We are using sources like classical LDAP servers, maybe with a Kerberos server,

01:51.000 --> 01:58.000
or the more modern integrated variants like Active Directory and FreeIPA, where we get the user

01:58.000 --> 02:06.000
and group information, and we mostly use the Kerberos protocol, but others as well, to authenticate the

02:06.000 --> 02:14.000
users against those centralized identity management platforms. And so SSSD is on the one hand

02:14.000 --> 02:19.000
getting the information from those servers, and on the other hand, is sending the information

02:19.000 --> 02:25.000
back to the local system. Lennart was already talking about PAM and NSS, and SSH, for example.

02:25.000 --> 02:32.000
So, what SSSD is doing is taking the information from the identity provider, like the user name,

02:32.000 --> 02:39.000
and putting it into the NSS record as the user name, and when the server provides

02:39.000 --> 02:46.000
a UID and a GID, it will also put it into the NSS record, so that the system knows what the

02:46.000 --> 02:52.000
UID and the primary group ID corresponding to the name of the user are, and then there are

02:52.000 --> 02:59.000
the remaining POSIX attributes like the home directory and the user's shell. But as Lennart also

02:59.000 --> 03:03.000
mentioned before, there is much more information about the user, for example the

03:03.000 --> 03:10.000
SSH keys of the user, and SSSD can also, with the same mechanism as systemd and the

03:10.000 --> 03:19.000
varlink interface, present the user's SSH keys to the SSH daemon, and SSSD can also do this with

03:19.000 --> 03:25.000
the host keys, and then there are other items where we can get information from the

03:25.000 --> 03:29.000
servers, like sudo rules, which we can read from the server and make available to the

03:29.000 --> 03:40.000
sudo program, so you can manage your sudo rules, and we also make this information available for the

03:40.000 --> 03:46.000
user, for example via D-Bus. As Lennart also mentioned, originally systemd was also

03:46.000 --> 03:53.000
using D-Bus, and at the time we started to provide the information via D-Bus, this was

03:53.000 --> 04:00.000
kind of state of the art, and so we used D-Bus to provide enriched user information. As

04:00.000 --> 04:07.000
mentioned by Lennart as well, the POSIX NSS interface is quite small, where we do not

04:07.000 --> 04:14.000
put much information; it's basically the information the system needs to log in, but

04:14.000 --> 04:22.000
nothing else, and so we use the D-Bus interface to enrich this, and I guess we will also

04:22.000 --> 04:29.000
follow Lennart's suggestions and implement the varlink interface sooner or later so that

04:29.000 --> 04:36.000
we work together nicely in this respect as well, but I'm told that not every system is using

04:36.000 --> 04:46.000
systemd, so I guess we have to maintain all the interfaces we currently have on our own as well.

04:46.000 --> 04:53.000
With respect to the integration with the local POSIX system, LDAP has the RFC 2307

04:53.000 --> 05:01.000
or RFC 2307bis schema, which defines all the attributes we need for a POSIX

05:01.000 --> 05:09.000
system, and Kerberos is something which is quite old, coming from the mid-80s,

05:09.000 --> 05:15.000
is designed to be platform independent, but has a long tradition in the POSIX and Unix environment,

05:15.000 --> 05:22.000
so that even with this platform independence it works quite well with the POSIX environment.

05:22.000 --> 05:30.000
So both sources we have so far been using intensively for centralized user management are very

05:30.000 --> 05:37.000
well integrated and well known, but the IDPs are kind of new for the POSIX environment,

05:37.000 --> 05:44.000
and I will show you later where we are running into trouble or what kind of obstacles

05:44.000 --> 05:56.000
we have to work around. On the system side, I will just pick two interfaces, because

05:56.000 --> 06:03.000
their relationship is, in my opinion, important. There is on the one hand PAM for authentication,

06:03.000 --> 06:09.000
the pluggable authentication modules, and on the other hand NSS, the name service switch

06:09.000 --> 06:19.000
of glibc, and they are independent by design, so you can look up a user who is not authenticated,

06:19.000 --> 06:27.000
and in theory you can authenticate a user who at the point of authentication is not known to the system,

06:27.000 --> 06:35.000
this is somewhat contradictory, and typically you have to satisfy both, so for example,

06:35.000 --> 06:42.000
when you log in with SSH, SSH does a getpwnam() call before doing any authentication,

06:42.000 --> 06:52.000
so the user must be known to the system before authentication, and this is something which makes the integration

06:52.000 --> 06:59.000
with the IDPs a bit more complicated, because as you might have seen in some of the earlier talks,

06:59.000 --> 07:05.000
when you are doing OIDC authentication, at the end of the authentication you get user information

07:05.000 --> 07:11.000
back as part of the authentication, but this is too late for us; we need it before.
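
NOTE
Editor's sketch, not part of the talk: a minimal illustration of the lookup-before-authentication
order described above. Python's pwd.getpwnam() goes through the same NSS lookup as the C call.
The helper names lookup_user and may_attempt_login are hypothetical.
```python
import pwd
def lookup_user(name):
    """Return the passwd entry for name, or None if NSS does not know the user."""
    try:
        return pwd.getpwnam(name)
    except KeyError:
        return None
def may_attempt_login(name):
    # A login service that resolves the user first rejects unknown users
    # before any authentication step, so a token issued later cannot help.
    return lookup_user(name) is not None
```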

07:11.000 --> 07:22.000
So these are some of the boundary conditions we have to consider here. So why do we want to do this at all?

07:23.000 --> 07:32.000
Because, for example, there is already an integration of FreeIPA and IDPs, so you can use FreeIPA to authenticate against an external IDP,

07:32.000 --> 07:42.000
and use all the fancy authentication methods your IDP provides, and using IPA has, in my opinion, also many benefits

07:42.000 --> 07:52.000
when you have a large environment, so when you have the knowledge and the manpower to handle a somewhat complex environment where you

07:52.000 --> 08:01.000
not only have your IDP, but also some other services like IPA, but even many small environments nowadays depend on the IDPs,

08:01.000 --> 08:11.000
because everything related to web services and authentication is basically running on OAuth 2.0 and OIDC IDPs.

08:11.000 --> 08:22.000
And so these IDPs are kind of a requirement, even for small environments, and so with SSSD we are trying to fill the gap for those small environments,

08:22.000 --> 08:27.000
so that you do not have to add extra complexity by adding new services, new protocols,

08:27.000 --> 08:36.000
but just have your IDP, or run your IDP on your own, and then you can consume all the data you already have, like users, like groups,

08:36.000 --> 08:46.000
the relationships, group memberships, and so on, without any extra services which introduce complexity and maybe some other issues.

08:47.000 --> 08:56.000
So this is what we want to provide in the future, so that SSSD can talk with all the different kinds of identity providers, like Entra ID,

08:56.000 --> 09:12.000
like Keycloak, Google, you name it, Okta, Amazon, GitHub, which was already an example in a previous talk, so this is something,

09:13.000 --> 09:31.000
or this is our goal, which we want to provide in the future. And how do we want to get there? We want to follow, of course, the OAuth 2.0 and OIDC standards

09:31.000 --> 09:40.000
for the authentication and authorization, because this is what those IDPs provide, and this is also something we can build upon,

09:40.000 --> 09:45.000
because those are more or less pretty well standardized.

09:45.000 --> 09:56.000
A problem here, or something we have to respect as well, is that everything here is web browser based, so since it's coming from web services and web service authentication,

09:56.000 --> 10:11.000
there has to be a web browser involved, and you will ask yourself: when I do SSH, where is my web browser? I hope I can show you how it works in the demo later.

10:11.000 --> 10:21.000
As I already mentioned, OIDC provides user identity tokens. It is nice to have them, and we can take some information out of them,

10:21.000 --> 10:36.000
but typically they are coming too late, because most of the services used to log in, like SSH, /bin/login, or whatever, do this getpwnam() call first to check if the user is known to the system before authenticating the user.

10:36.000 --> 10:54.000
The only exception I know is GDM. With GDM, you can just type in the name, and then GDM immediately starts PAM without checking if the user exists or not, but this is so far the only counterexample I know.

10:54.000 --> 11:21.000
Then, what is also a bit of a problem is that although authentication is very nicely standardized, for looking up users and groups without the authentication part, which as I mentioned we have to do, there is currently no standard at all, so basically each provider has its own API for this.

11:21.000 --> 11:45.000
Luckily, more or less all of them share at least some kind of common ground, and most of them also have similar ideas about how to look up users and groups, so we can basically group this into a few kinds of requests against the IDP to look up those users, groups and the relationships.

11:45.000 --> 12:12.000
Another obstacle is that since IDPs were so far not used for POSIX integration or for logging in to those systems, there are typically no POSIX attributes, claims or whatever we could use to get a UID or login shell or a home directory, so we have to find ways to work around this as well.

12:12.000 --> 12:32.000
And finally, to get this information from an IDP, which is not part of the authentication of the user itself, we also need dedicated credentials to request a user, the user attributes and the group memberships of the user from the IDP.

12:32.000 --> 13:01.000
How are we doing this? To get the information from the IDP we have to create some kind of generic object, and in IDP language this would be a client. Typically a web application is a client, and so here we also create a client, which is kind of connected to the computer, so we can create credentials, typically a password, which we can store safely on the computer.

13:01.000 --> 13:11.000
The computer can use this to connect to the IDP and then ask, for example, does the user XYZ exist and does this user have group memberships.

13:11.000 --> 13:25.000
For authentication this client has to allow the device authorization grant, because this is kind of the only authentication flow which can be triggered without a web browser.

13:25.000 --> 13:33.000
Sooner or later we will need a web browser, as I mentioned before, but the initial trigger can be done without a web browser.

13:33.000 --> 13:38.000
And this client must be configured to allow user and group lookups as well.

13:38.000 --> 13:45.000
Basically, you can think about creating this client as joining the domain in the classical sense.

13:45.000 --> 14:02.000
So here is where realmd might come into play in the future, so that we can simplify this and you do not have to create this client manually and give all the permissions to the client manually, but have some utility to do this automatically for you.

14:02.000 --> 14:18.000
Okay, as I already mentioned, an important step here, and also with the FreeIPA integration with an external IDP, is the device authorization grant, because we have to trigger this when the user is logging in, for example with SSH.

14:18.000 --> 14:31.000
Then the server side somehow has to trigger starting or initializing the authentication, because at this point in time we do not have any kind of web browser available.

14:31.000 --> 14:40.000
In the future, graphical interfaces might integrate this; there is work in progress by different Linux vendors.

14:40.000 --> 14:55.000
In this area, at least on this level, when you initially log in to your computer with a graphical interface, there might be the possibility to have a minimal web browser so that you do not have to take some extra steps.

14:55.000 --> 15:04.000
But the basic trigger will be the device authorization grant as well.

15:04.000 --> 15:13.000
So, SSSD will do user and group lookups for you.

15:13.000 --> 15:19.000
We have the credentials to connect to the IDP with the help of the client ID and the client credentials.

15:19.000 --> 15:27.000
And as I mentioned before, most of them have a similar understanding about how to do this, so we will have a request to look up a user.

15:27.000 --> 15:37.000
We will have a request to look up the groups, and then we will have requests to look up the members of a group and the memberships of a user.

15:37.000 --> 16:01.000
And since all of the IDPs are doing this differently and have different APIs, we will try to provide a plugin interface so that it is easy to extend when there is a new IDP coming around that has some additional API.

16:01.000 --> 16:17.000
Currently, what we have in code are interfaces for Entra ID and for Keycloak, and if the demo works, I will show you how it's working with Keycloak.

16:17.000 --> 16:25.000
As I mentioned, IDPs typically do not have POSIX attributes as claims or whatever stored.

16:25.000 --> 16:28.000
So we have to make them up on our own.

16:28.000 --> 16:36.000
And this is basically not new to us, because SSSD can connect to Active Directory.

16:36.000 --> 16:44.000
And as you might know, when you have a default Active Directory installation, there is also no POSIX ID, no home directory and so on.

16:44.000 --> 16:58.000
There are a couple of extensions which were added by Microsoft way back in time and currently only exist in the LDAP schema, but there are no tools anymore to manage them.

16:58.000 --> 17:06.000
Nevertheless, we are used to this, so there are already SSSD options to provide a default shell.

17:06.000 --> 17:10.000
So if the user entry does not have a shell, this shell will be provided.

17:10.000 --> 17:18.000
We also have templates to generate the home directory based on the username, like the default /home/ and then the username.

17:18.000 --> 17:21.000
So those we do not need to add; they already exist.

17:21.000 --> 17:31.000
And also for Active Directory, we added ID mapping for when we have an Active Directory user or a group.

17:31.000 --> 17:40.000
We can generate an ID for it, and this is not just making up a number; we try to do this algorithmically.

17:40.000 --> 17:52.000
So that when you have two different systems joined to the same Active Directory domain, they generate the same UID and the same GID or private GID for the user without any interaction between each other.

17:52.000 --> 18:08.000
How we do this is basically that we take the SID of an Active Directory object, be it a user or a group, take the domain SID part, which defines a range for us, and then take the RID part.

18:08.000 --> 18:17.000
So it's the last part, the last number of the SID, which is counted upwards by Active Directory, so we use this as a kind of offset.

18:17.000 --> 18:30.000
For example, we pick a range for IDs starting with 1 million, and then we add the RID 2030 to this, and this will give us the POSIX ID 1,002,030.

18:30.000 --> 18:42.000
And this will work on all the systems connected to Active Directory, so all the client systems will have the same UID, GID and so on.
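
NOTE
Editor's sketch, not part of the talk: the arithmetic described above (range start plus RID), in simplified form.
Assumptions: the range is fixed here instead of being selected from the domain SID, the SID in the
example is made up, and the function name sid_to_posix_id is hypothetical. This is not SSSD's code.
```python
RANGE_START = 1_000_000  # start of the ID range (example value from the talk)
RANGE_SIZE = 200_000     # range size (the talk mentions 200,000 as a typical default)
def sid_to_posix_id(sid, range_start=RANGE_START, range_size=RANGE_SIZE):
    """Map an object SID to a POSIX ID as range_start + RID."""
    # The RID is the last dash-separated component of the SID.
    domain_sid, _, rid_text = sid.rpartition("-")
    rid = int(rid_text)
    if not domain_sid or rid >= range_size:
        raise ValueError(f"RID {rid} does not fit into the range")
    return range_start + rid
# Every client computes the same value without talking to any other client.
print(sid_to_posix_id("S-1-5-21-1111111111-2222222222-3333333333-2030"))  # 1002030
```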

18:42.000 --> 19:03.000
I hope that with the increasing use of IDPs, sooner or later there will also be some ideas on how to define some POSIX standards in the IDP world, so that those attributes will be available or can be easily set in the IDP as well.

19:03.000 --> 19:12.000
This would make life easier, but as of now we have to generate them on our own.

19:12.000 --> 19:27.000
I described how we map the POSIX IDs for Active Directory, and we picked, at least for this example, a similar scheme for the IDP.

19:27.000 --> 19:37.000
We do not have a domain SID for the IDP, but something which is unique and also available for all IDPs is, for example, the token URL.

19:37.000 --> 19:49.000
Since it has to be accessible on the web, it has to be unique, and since every token has to be requested via this endpoint, it is also always present for the IDP.

19:49.000 --> 19:58.000
So this is one thing we can use as a kind of identifier for the IDP and to select the range.

19:58.000 --> 20:06.000
There are, of course, depending on the IDP, other ways to select the range.

20:06.000 --> 20:18.000
For example, with Entra ID, you have your tenant ID. Of course, the tenant ID is part of the token URL, but you can also restrict on this tenant ID if you're only working with Entra ID, for example.

20:18.000 --> 20:26.000
So this will most probably become configurable over time for the different providers, but the default might be the token URL.

20:26.000 --> 20:38.000
So this selects the range, for example starting from 1 million and going to 2 million, or rather to 1,200,000, as 200,000 is typically our default range size.

20:39.000 --> 20:52.000
And then, unfortunately, there is no such nice number as the RID in Active Directory, which is counted up by the IDP.

20:52.000 --> 21:02.000
At least not generically. Typically objects in IDPs have some UUIDs, which have no structure at all, so we cannot use this.

21:02.000 --> 21:17.000
So as a first step, we are using another hash, the hash of the username, which is unique for the user as well, and this then gives us a UID, a number inside of the range we have selected.

21:17.000 --> 21:23.000
There is one rather severe problem with this: as you know, a hash is not reversible.

21:23.000 --> 21:31.000
So with this scheme, it is easy to get a UID or a GID if I know the username or the group name.

21:31.000 --> 21:37.000
And of course, SSSD will cache this, and then we can find the user for a given ID.

21:37.000 --> 21:48.000
But if you do not have the cache and start just with the ID, we cannot resolve it, so this is a drawback of this scheme,

21:48.000 --> 21:57.000
and depending on the IDP, if there is some incrementing number, it might of course be better to use this.

21:57.000 --> 22:06.000
But the hash of the name is generic and will work for all, with the restriction that it is not reversible or invertible.
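
NOTE
Editor's sketch, not part of the talk: one way the hash-based scheme described above could look.
Assumptions: SHA-256 stands in for whatever hash SSSD really uses (the talk does not name it),
NUM_RANGES and all function and class names are hypothetical, and hash collisions are ignored.
It shows why name to ID is cheap while ID to name needs a previously cached entry.
```python
import hashlib
FIRST_RANGE_START = 1_000_000
RANGE_SIZE = 200_000   # default range size mentioned in the talk
NUM_RANGES = 1_000     # hypothetical number of available ranges
def _hash(text):
    # SHA-256 is only a stand-in hash function for this sketch.
    return int.from_bytes(hashlib.sha256(text.encode("utf-8")).digest()[:8], "big")
def range_start_for(token_url):
    """Select an ID range from the IDP's token URL."""
    return FIRST_RANGE_START + (_hash(token_url) % NUM_RANGES) * RANGE_SIZE
def name_to_id(token_url, name):
    """Derive a UID or GID inside the selected range from the user or group name."""
    return range_start_for(token_url) + _hash(name) % RANGE_SIZE
class IdCache:
    """Remembers name to ID results so that an ID can be resolved back to a name."""
    def __init__(self, token_url):
        self.token_url = token_url
        self.by_id = {}
    def lookup_name(self, name):
        uid = name_to_id(self.token_url, name)
        self.by_id[uid] = name
        return uid
    def lookup_id(self, uid):
        # Without a cached entry there is no way to invert the hash.
        return self.by_id.get(uid)
```
With an empty cache, lookup_id() returns None for any ID, which is the drawback named in the talk.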

22:06.000 --> 22:09.000
Okay, now coming to the demo.

22:10.000 --> 22:24.000
First, if you want to try this at home, as I mentioned before, there is a Copr repository with a couple of packages for recent Fedora and RHEL versions and maybe some other platforms, where you can download the packages.

22:24.000 --> 22:39.000
In this repository, there are also configuration instructions, and you will also find links to my code, my working Git trees, if you want to compile it on your own on some other platform.

22:39.000 --> 22:51.000
For testing, if you do not want to mess with your system, I suggest using the SSSD CI test containers.

22:51.000 --> 23:09.000
We are using them inside of SSSD to run our test scripts. They make it easy to set up containers: an SSSD client, an IPA server, a Samba AD domain controller as Active Directory server, an LDAP server, a Kerberos server and so on.

23:09.000 --> 23:18.000
And nowadays there is also a Keycloak server available, so we have an IDP, we have a client, and so we can use this for testing.

23:18.000 --> 23:39.000
Here, I also put the configuration in the slides. I do not know how familiar you all are with the SSSD configuration, but basically there is a new ID provider called idp, which has options of its own, basically defining the IDP endpoints and so on.

23:39.000 --> 23:59.000
Okay, now let's switch to my client. Okay, first let me stop SSSD. Okay.

23:59.000 --> 24:13.000
So just as a proof: currently, since SSSD is not running, the system does not know the user user01@keycloak.

24:13.000 --> 24:18.000
So this is my configuration, it is basically the same as on the slide.

24:18.000 --> 24:36.000
There are some generic blocks for SSSD, and here, as you can see, I set the shell to bash and have the template for the home directory.

24:36.000 --> 24:43.000
Let's start again.

24:43.000 --> 24:55.000
Or maybe we have a look at the Keycloak server first.

24:55.000 --> 25:21.000
So as I mentioned, we need a client on the Keycloak side. I have already prepared a client.

25:21.000 --> 25:28.000
Okay, and as you see, I allow client authentication, so the client can authenticate with the help of a password.

25:28.000 --> 25:42.000
I allow the device authorization grant and service account roles. I have assigned some roles to the client, basically to allow it to look up users and groups.

25:42.000 --> 25:50.000
And I have some users here, user01, user02.

25:50.000 --> 25:56.000
Going back, so I started SSSD.

25:56.000 --> 26:08.000
I removed the logs, and I removed the cache completely as well, to be sure there is no object already stored on the system in the cache.

26:08.000 --> 26:15.000
Okay, so now the user is present on the system.

26:15.000 --> 26:20.000
I prefer fully qualified names, so the username is user01@keycloak.

26:20.000 --> 26:31.000
This might become important when you have multiple different IDPs, because SSSD allows by design to configure multiple domains or sources.

26:31.000 --> 26:38.000
And yeah, when you collect from multiple IDPs, there is a fair chance that you have a collision of user names.

26:38.000 --> 26:45.000
This is typical, for example, in Active Directory environments, where every domain has an Administrator user

26:45.000 --> 26:53.000
and a Domain Users group, so you have to use fully qualified names in those environments to not confuse yourself and the system.

26:53.000 --> 27:06.000
And you see, the user has a UID and a primary GID. By default we use private groups, so the numerical value of the primary group will be the same,

27:06.000 --> 27:10.000
and it also has an auto-generated home directory and the shell.

27:10.000 --> 27:22.000
There was a second user, user02. It is here as well, and you can see the IDs are coming from the same range,

27:22.000 --> 27:29.000
but differ. So yeah, this is of course important, that you have different UIDs and GIDs for all the users.

27:29.000 --> 27:41.000
Okay, next step: authentication. I will SSH just onto localhost.

27:41.000 --> 27:50.000
And here you can see what device authorization means, because you are getting a prompt back that you should connect to a URL,

27:50.000 --> 27:55.000
and then even type in some authorization code.

27:55.000 --> 28:05.000
But Keycloak is so nice as to also present a URL where you have both in one, so you do not have to type the code on your own.
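
NOTE
Editor's sketch, not part of the talk: how a prompt like the one in the demo can be built from a
device authorization response. The field names follow RFC 8628; the client ID, secret, URLs, codes
and the prompt wording are made-up examples and not SSSD's actual output. No network access is used.
```python
import json
from urllib.parse import urlencode
def build_device_auth_request(client_id, client_secret, scope="openid"):
    """Form-encoded body for the device authorization endpoint (RFC 8628, section 3.1)."""
    return urlencode({"client_id": client_id, "client_secret": client_secret, "scope": scope})
def format_prompt(response_text):
    """Turn the JSON response (RFC 8628, section 3.2) into a login prompt."""
    reply = json.loads(response_text)
    # verification_uri_complete is optional; it already embeds the user code.
    complete = reply.get("verification_uri_complete")
    if complete:
        return f"Authenticate at {complete} and press ENTER."
    return f"Authenticate with code {reply['user_code']} at {reply['verification_uri']} and press ENTER."
# Hypothetical response, shaped like what an IDP returns for this grant.
sample = json.dumps({
    "device_code": "example-device-code",
    "user_code": "ABCD-EFGH",
    "verification_uri": "https://idp.example.test/device",
    "verification_uri_complete": "https://idp.example.test/device?user_code=ABCD-EFGH",
    "expires_in": 600,
    "interval": 5,
})
print(build_device_auth_request("sssd-demo", "s3cret"))
print(format_prompt(sample))
```
After the user confirms in a browser, the client polls the token endpoint with the device_code to finish the login.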

28:05.000 --> 28:14.000
And so I open this URL and say okay, and I'm successfully authenticated.

28:14.000 --> 28:23.000
I'm going back to my shell and press enter, and now I should be able to log in, but it fails. Why does it fail?

28:23.000 --> 28:35.000
It's because of the browser. I've shown you some details of the Keycloak configuration, and I'm still logged in as admin, and not as user01.

28:35.000 --> 28:45.000
So the browser, since it's caching all this information, still thinks I want to log in as administrator.

28:45.000 --> 28:52.000
But on the other hand, I want to log in as user01. So this is already an example that you cannot trick the system.

28:52.000 --> 29:03.000
So you really have to be logged in as the right user, so let me find the logout button.

29:04.000 --> 29:18.000
Okay, closing the windows. Okay, now I use the new key.

29:18.000 --> 29:25.000
Okay, now I'm asked to log in because I'm not logged in at all.

29:25.000 --> 29:34.000
So I'm asked again if I want to allow the access. Okay, successful login on the IDP.

29:34.000 --> 29:41.000
And now I'm also successfully logged in to the local system.

29:41.000 --> 29:54.000
Okay, demo worked. That was basically what I wanted to show you.

29:54.000 --> 30:10.000
Thank you very much for your attention. As I mentioned, this is work in progress. If you have suggestions, just send me an email, or there is also a GitHub issue about this topic, where there is already some discussion happening.

30:10.000 --> 30:26.000
If you have suggestions or questions about Entra ID or about Keycloak with some details, or if you have insight into some of the other IDPs, feel free to reach out to me or send some comments to the GitHub issue.

30:26.000 --> 30:35.000
Thank you very much.

30:35.000 --> 30:51.000
So we don't have time for questions. Can the next speaker, we don't have time for questions, sorry.

30:51.000 --> 31:03.000
Maybe it's better if you don't have time. Okay, time is running out, give me a map, okay.

31:03.000 --> 31:07.000
What about the UIDs of the user's new works? Yeah.

31:07.000 --> 31:18.000
I think systematic, I can do, if you import them in a different way, it doesn't really get access to files, right?

31:18.000 --> 31:21.000
Just, just.

31:21.000 --> 31:26.000
Yep.

31:33.000 --> 31:43.000
Thank you very much.

