“Who can do what,” or why Identity Access Management matters
Build • Olivier Cano • 20/04/23 • 7 min read
On February 22nd, we activated IAM for all Scaleway organizations. Identity and Access Management (IAM) is a security framework to control the authentication and authorization of individuals and manage their access to resources.
IAM is no small task; at Scaleway, every API call goes through IAM before being executed. For each second you spend reading this post, about a thousand permission requests are executed on IAM.
But here’s the thing: on February 22nd, IAM had already been live for eight months. You just didn’t know it. How did we manage to migrate so seamlessly? That’s what we will look at in this series of two blog posts.
In this first part, we’ll dive into the history of IAM at Scaleway and the different database implementations we used over the years. The second post will go further into the technical details of the migration itself.
IAM at Scaleway prior to 2022
IAM can be summarized with the question “who can access what?”. The answer to this question is determined in what we call a policy. The “who” often defines the identity: an individual, a team, or even non-human users. The “what” defines the different levels of granularity we want to give access to: an organization, a project, or a particular resource.
The list of “who” or “what” in a policy depends on the business you run. Let’s have a look at these two example policies:
Two sample sentences to demonstrate access: “Alice can access the organization Foo” and “People with green eyes born during a full moon can access the secret room during odd days”
Even though both are valid policies, the first one is much simpler to understand and implement. In a nutshell, the policy dilemma is that you have to weigh simplicity vs. granularity.
2015: A permissions-based system
Scaleway started in 2015 as an internal startup within Online SAS. At the time, IAM was not called IAM yet. Instead, we would refer to it as the “permissions system,” which was very simple: a user has many permissions.
Disclaimer: database models shown in this blog post are not exhaustive.
Database model showing the permissions-based system
A permission represents an action to perform on a resource in a product. For example, the permission instance:server:read allows a user to read information about servers in their organization, but it does not allow them to create servers. The permission is then checked by the product API managing this resource.
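To make the idea concrete, here is a minimal sketch of such a check, assuming permissions are stored as plain strings attached to the user. The `User` class, the `is_allowed` helper, and the `instance:server:create` permission name are hypothetical and only serve the illustration; they are not Scaleway's actual code.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    name: str
    # Flat set of permission strings: "a user has many permissions".
    permissions: set[str] = field(default_factory=set)


def is_allowed(user: User, permission: str) -> bool:
    """Return True only when the user holds exactly this permission."""
    return permission in user.permissions


alice = User("alice", {"instance:server:read"})

# Alice can read servers, but holds no permission to create them.
# "instance:server:create" is a made-up name used for the example.
can_read = is_allowed(alice, "instance:server:read")
can_create = is_allowed(alice, "instance:server:create")
print(can_read, can_create)
```

In this sketch the product API would call something like `is_allowed` before serving the request.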
This approach worked well for many years, but it had its limits:
Every time we created a user, we had to insert entries for each of the P permissions in the users__permissions table.
Every time we created a permission, we had to insert entries for each of the U users in the users__permissions table.
As Scaleway grew, more permissions were added for new products, and a lot of new users joined Scaleway. At some point, it took a full day to seed a new permission in the database. We had clearly hit the limits of the permissions model.
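The growth problem can be sketched with a small in-memory stand-in for the `users__permissions` table. The sizes below are illustrative only (the post does not give real figures), and the helper functions are hypothetical.

```python
# Illustrative sizes only; real figures are not given in the post.
users = [f"user-{i}" for i in range(1_000)]
permissions = [f"product-{i}:resource:read" for i in range(50)]

# Join table of the flat model: one row per (user, permission) pair.
users__permissions = {(u, p) for u in users for p in permissions}


def add_user(user: str) -> int:
    """Insert one row per existing permission; return the rows written."""
    rows = {(user, p) for p in permissions}
    users__permissions.update(rows)
    users.append(user)
    return len(rows)


def add_permission(permission: str) -> int:
    """Insert one row per existing user; return the rows written."""
    rows = {(u, permission) for u in users}
    users__permissions.update(rows)
    permissions.append(permission)
    return len(rows)


rows_for_user = add_user("new-user")  # P rows
rows_for_permission = add_permission("new-product:resource:read")  # U rows
print(rows_for_user, rows_for_permission, len(users__permissions))
```

Every new permission costs one write per user, so the seeding time grows with the user base, and the table itself grows as U × P.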
2020: The switch to a role-based system
We improved the performance of the permissions model by introducing permission sets — sets of one or multiple permissions that we can attribute to one or many users. We took this refactoring opportunity to introduce a Role-Based Access Control (RBAC) model: when a user assumes a role in an organization, they get access to various permission sets.
Database model showing how the role-based system worked
At the time, organizations equaled users, so each organization could only have one user. The idea with RBAC was to enable a multi-user feature by letting a user assume one of the four predefined roles in any organization:
owner: can do everything in the organization
administrator: can do everything except delete the organization
billing administrator: can only manage the billing and payment
editor: can access cloud products
In this system, users could be part of many organizations, but they remained global across Scaleway. It was a simple approach, but it meant that an organization owner couldn’t take actions on their users other than removing them from the organization. For example, the owner of an organization could not enforce multi-factor authentication on their users.
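The role-based model described above can be sketched as two lookups: a role resolves to permission sets, and a permission set resolves to permissions. The four role names come from the list above; the permission set names, the permission strings other than `instance:server:read`, and the sample memberships are assumptions made up for this example.

```python
# Hypothetical permission sets; only "instance:server:read" appears in the post.
PERMISSION_SETS = {
    "products": {"instance:server:read", "instance:server:write"},
    "billing": {"billing:invoice:read", "billing:payment:write"},
    "organization_admin": {"account:organization:update"},
    "organization_delete": {"account:organization:delete"},
}

# The four predefined roles, each mapped to permission sets.
ROLES = {
    "owner": {"products", "billing", "organization_admin", "organization_delete"},
    "administrator": {"products", "billing", "organization_admin"},
    "billing administrator": {"billing"},
    "editor": {"products"},
}

# A user assumes one role per organization (sample data).
memberships = {("alice", "org-foo"): "owner", ("bob", "org-foo"): "editor"}


def permissions_for(user: str, organization: str) -> set[str]:
    """Resolve role -> permission sets -> permissions for one membership."""
    role = memberships.get((user, organization))
    if role is None:
        return set()
    resolved: set[str] = set()
    for set_name in ROLES[role]:
        resolved |= PERMISSION_SETS[set_name]
    return resolved


def is_allowed(user: str, organization: str, permission: str) -> bool:
    return permission in permissions_for(user, organization)


print(is_allowed("alice", "org-foo", "account:organization:delete"))
print(is_allowed("bob", "org-foo", "billing:invoice:read"))
```

With this shape, adding a permission means editing one permission set rather than writing one row per user.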
2020: Pivoting from user-based to project-based API keys
In 2020, we also introduced a new feature: projects. This feature allows you to organize resources by isolating them in projects. Implementing this into our RBAC model was trivial: instead of assigning a role in an organization, we now assign it within a project.
With this feature, we also introduced project-scoped API keys. Before, API keys were bound to a global user, which meant the same API key carried the user’s permissions across all the organizations the user belonged to. With project-scoped API keys, the API key is bound to a certain role on a particular project. This was a big step toward adding more granularity to our policies.
Before:
Example: “API key has access to all resources of the organizations Bob belongs to”
After:
Example: “API key has access to all resources of a project”
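The difference in reach between the two kinds of keys can be sketched as follows. This is a simplification under stated assumptions: the class names, the sample organizations and projects, and the choice to express the reach of a user-bound key as a list of projects are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UserApiKey:
    """Before: the key is bound to a global user."""
    user: str


@dataclass(frozen=True)
class ProjectApiKey:
    """After: the key is bound to a role on one project."""
    project: str
    role: str


# Sample data, made up for the example.
user_organizations = {"bob": {"org-a", "org-b"}}
organization_projects = {"org-a": {"proj-1", "proj-2"}, "org-b": {"proj-3"}}


def reachable_projects(key) -> set[str]:
    """Return the projects whose resources the key can reach."""
    if isinstance(key, UserApiKey):
        reachable: set[str] = set()
        for organization in user_organizations.get(key.user, set()):
            reachable |= organization_projects.get(organization, set())
        return reachable
    return {key.project}


before = reachable_projects(UserApiKey("bob"))
after = reachable_projects(ProjectApiKey("proj-1", "editor"))
print(sorted(before), sorted(after))
```

A leaked user-bound key exposes everything the user can reach, whereas a leaked project-scoped key is limited to one project and one role.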
The limits of the RBAC model
The project feature was a quick win, but we knew the RBAC model would be limited when it came to defining scope when accessing a product. We identified four possible scopes to configure access to a product:
Accessing a product on a project
Accessing a product on many projects
Accessing a product on all current projects
Accessing a product on all current and future projects
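One way to see why these four scopes differ is to model them as two kinds of values: an explicit list of projects (which covers the first three cases) and a dynamic "every project of the organization" scope (the fourth). The sketch below is purely illustrative; the class names and sample data are assumptions, not the model Scaleway ended up with.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProjectList:
    """One project, many projects, or a snapshot of all current projects."""
    project_ids: frozenset[str]


@dataclass(frozen=True)
class AllProjects:
    """Every project of the organization, including ones created later."""
    organization_id: str


# Sample data, made up for the example.
projects_by_organization = {"org-foo": {"proj-1", "proj-2"}}


def snapshot(organization_id: str) -> ProjectList:
    """Freeze the list of projects that exist right now."""
    return ProjectList(frozenset(projects_by_organization[organization_id]))


def covers(scope, project_id: str) -> bool:
    if isinstance(scope, ProjectList):
        return project_id in scope.project_ids
    # AllProjects is evaluated at check time, so new projects are included.
    return project_id in projects_by_organization.get(scope.organization_id, set())


current_only = snapshot("org-foo")
current_and_future = AllProjects("org-foo")

# A project created after the scopes were defined.
projects_by_organization["org-foo"].add("proj-3")

print(covers(current_only, "proj-3"), covers(current_and_future, "proj-3"))
```

The snapshot stays fixed while the dynamic scope follows the organization, which is exactly the distinction between the third and fourth cases.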
Every time we tried to address these four use cases with RBAC, we ended up twisting the model in a way that didn’t satisfy us. But finer granularity in IAM is the feature our users request most, and we have big plans for IAM, such as adding granularity at the resource level. We didn’t want to compromise here, so we needed a new approach with better handling of project and organization scopes.
Designing our current IAM system
In 2022, we started working on our IAM system as it exists today. The main requirements were:
Simpler user management for audit and security features
Support of both project and organization scopes
Letting the organization owners define the policy they want
Anticipating the need for more granularity in the future
The first thing we did was to analyze many existing IAM systems, especially from the open-source world. None of the existing…