Scaleway, published Jun 22, 2023

Tackling technical debt and creating IP Mobility


Deploy • Victor Ramiro, Pavel Lunin • 22/06/23 • 9 min read

Connecting tens of thousands of virtual instances within a cloud environment is no easy task. You need a solid and well-designed network to support this kind of scale. Additionally, data center networking is one of the fastest-evolving areas of network engineering. Some of the solutions available today were not available ten years ago.

Initially, to support its growth, Scaleway implemented a highly available NAT (Network Address Translation) layer so that IP addresses could move between physical machines along with the instances to which they were assigned. While that was the right choice at the time, those past decisions have since created technical debt that we need to remove in order to keep growing.

In the coming months, we will start rolling out some important changes to the way our internal network operates. We will remove the NAT to make the network simpler to operate. The planned changes will bring several improvements for our users, among them IP stability, support for IPv6, and enhanced security.

Tackling technical debt without judgment

It’s important to say that the solution that we’ve been using so far is not a bad solution. It’s just an old solution, and now we have other options.

The choices made a lot of sense when Scaleway was created because we used to manage hardware servers and not virtual machines. Under those circumstances, NAT was an efficient and sufficient method to quickly switch public IPs from one hardware server to another and also allowed for our product range to grow quickly.

But time has passed and our product catalog has evolved. It was time for this part of the stack to catch up with today’s needs, using more contemporary solutions that solve some long-standing problems and make things easier for us and our customers.

The problem: We need to move IP addresses

In a cloud environment, virtual machines (VMs) are hosted on physical hypervisors, and the network must know which physical host to send packets to in order to deliver them to the VM.

It might seem simple: a virtual machine has its IP address routed like any other on the internet using standard network routing protocols. However, VMs might need to move from one physical hypervisor to another for plenty of reasons:

The customer stops the VM, which “archives” it: the VM’s snapshot is sent to the remote storage while the compute resources of the hypervisor are freed up. If the customer wants to restart the archived instance after some time has passed, a new slot of compute resources is allocated on a potentially different hypervisor, and the snapshot is sent back from the archive to be run on the new host.

All hardware can fail, and hypervisors are no exception. When this happens, all customer instances are moved away from the faulty machine.

There are other types of corrective maintenance and incidents that might require Scaleway to move instances from one hypervisor to another.

When an instance moves to another hypervisor (HV), its IP address has to move as well. Doing this for hundreds of thousands of VMs running on thousands of hypervisors is not trivial. Simple routing and bridging techniques used in enterprise networking wouldn’t scale.

The old solution: One-to-One NAT

Years ago, at the beginning of the Scaleway Elements Cloud ecosystem, we chose to address this problem using the principle of indirection.

Instead of assigning a publicly routable IP address to an instance, we provided it with a private one from the RFC1918 space (10.x.x.x) and then mapped it to a publicly routable IP using a centralized NAT solution.

For example, if we want to move VM2 from HV10 to HV20 and VM5 from HV20 to HV10, we’ll need to assign new hypervisor-bound RFC1918 addresses to them and change the mapping on the NAT to preserve their public IP addresses.
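The remapping described above can be sketched as a small lookup table. This is only an illustration, not Scaleway's implementation: the `OneToOneNat` class, the documentation-range public addresses (203.0.113.x), and the convention that 10.10.x.x belongs to HV10 and 10.20.x.x to HV20 are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class OneToOneNat:
    """Toy 1:1 NAT table: each public IP maps to exactly one private IP."""

    public_to_private: dict[str, str] = field(default_factory=dict)

    def bind(self, public_ip: str, private_ip: str) -> None:
        # Install or overwrite the mapping for one public address.
        self.public_to_private[public_ip] = private_ip

    def translate_inbound(self, dst_public_ip: str) -> str:
        # Inbound traffic: rewrite the public destination to the private one.
        return self.public_to_private[dst_public_ip]


nat = OneToOneNat()
# Hypothetical addressing: 10.10.x.x is bound to HV10, 10.20.x.x to HV20.
nat.bind("203.0.113.2", "10.10.0.2")  # VM2 on HV10
nat.bind("203.0.113.5", "10.20.0.5")  # VM5 on HV20

# Swap the VMs between hypervisors: each one receives a new
# hypervisor-bound private address, and only the NAT mapping changes.
nat.bind("203.0.113.2", "10.20.0.2")  # VM2 now on HV20
nat.bind("203.0.113.5", "10.10.0.5")  # VM5 now on HV10
```

From the outside, both VMs keep their public addresses; the indirection hides the change of private address behind the mapping.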

While these addresses from RFC1918 space are often colloquially called “private” due to their non-publicly routable nature, they don’t have much to do with privacy. These IP addresses are reachable by all Scaleway customers and don’t provide any additional security compared to public IP addresses.

If you are interested in private communication between instances, you should consider using Scaleway Private Networks, which provide a communication channel isolated from other customers.

Meet Natasha, our high-performance stateless NAT engine

While NAT is a well-known technique widely used in enterprise and ISP networks, its primary goal is to reduce the use of public IPv4 addresses by mapping many “private” IPv4 addresses to a single public one in a 1:N fashion.

Such 1:N address translation requires so-called stateful packet processing with port mapping, sometimes called NAPT. While unavoidable in some network applications, it has many limitations and scales poorly, so it is not what Scaleway wanted to use. Instead, we decided to implement 1:1 NAT, sometimes called basic NAT, which doesn’t require stateful packet processing.
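To make the stateless versus stateful distinction concrete, here is a minimal sketch under simplifying assumptions. The `Packet` tuple, the `basic_nat_outbound` function, the `Napt` class, and all addresses and port numbers are hypothetical; a real NAPT also tracks the protocol, handles return traffic, and expires idle flows.

```python
from typing import NamedTuple


class Packet(NamedTuple):
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


# Static 1:1 table (hypothetical addresses). It changes only when
# an operator edits it, never as a side effect of traffic.
PRIVATE_TO_PUBLIC = {"10.10.0.2": "203.0.113.2"}


def basic_nat_outbound(pkt: Packet) -> Packet:
    """Stateless 1:1 NAT: output depends only on the packet and the table."""
    return pkt._replace(src_ip=PRIVATE_TO_PUBLIC[pkt.src_ip])


class Napt:
    """Stateful 1:N NAT: many private hosts share one public IP via ports."""

    def __init__(self, public_ip: str, first_port: int = 40000) -> None:
        self.public_ip = public_ip
        self.next_port = first_port
        # Per-flow state, created on the first packet of each flow.
        self.flows: dict[tuple[str, int], int] = {}

    def outbound(self, pkt: Packet) -> Packet:
        key = (pkt.src_ip, pkt.src_port)
        if key not in self.flows:
            self.flows[key] = self.next_port
            self.next_port += 1
        return pkt._replace(src_ip=self.public_ip, src_port=self.flows[key])
```

Because `basic_nat_outbound` keeps no per-flow memory, any machine holding the same table can translate any packet, which is what makes this kind of NAT easy to spread across a cluster. The `Napt` class, by contrast, has to remember every flow it has seen.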

One of the pitfalls of this approach is that most commercial and open-source NAT implementations don’t support stateless 1:1 mapping and focus on the stateful 1:N translation, as this is the most common NAT use case. So we had to write our own high-performance stateless NAT engine, Natasha.

Despite many doubts and internal debates at the beginning, it turned out that Natasha did the job pretty well, and the performance we achieved was measured in hundreds of gigabits per second. Each Scaleway AZ is equipped with a highly available cluster of 4 to 32 Natasha machines, depending on the size of the AZ.

Known limitations of the NAT solution

But even with great performance, stateless NAT has a number of inconveniences. Let’s look at them more closely.

Lack of transparency for the customer

Customers need to consider that the IP addresses assigned directly to an instance's network interface might change. For example, if you use these addresses for frontend-backend communication with Access Control Lists (ACLs) or security groups, you have to update these ACLs when the addresses change. Scaleway provides a number of tools, for example, dynamic DNS records for public and private addresses, which are updated automatically, but all this still requires action at the customer end.
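As a rough illustration of why this matters, consider an allowlist keyed on an instance's private address. The `allowed_sources` set, the `is_allowed` helper, and the addresses are hypothetical and do not represent a Scaleway API; the point is only that a rule written against the old address silently stops matching after a move.

```python
import ipaddress

# Hypothetical ACL on a backend: only the frontend's private address may connect.
allowed_sources = {ipaddress.ip_address("10.10.0.2")}


def is_allowed(src: str) -> bool:
    """Return True if the source address matches an ACL entry."""
    return ipaddress.ip_address(src) in allowed_sources


before_move = is_allowed("10.10.0.2")  # frontend at its original address

# The frontend moves to another hypervisor and gets a new private address.
after_move = is_allowed("10.20.0.2")   # the stale ACL no longer matches

# The customer has to update the rule by hand (or resolve a DNS name instead).
allowed_sources.discard(ipaddress.ip_address("10.10.0.2"))
allowed_sources.add(ipaddress.ip_address("10.20.0.2"))
after_update = is_allowed("10.20.0.2")
```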

Long switchover time

It takes time to update the NAT mapping tables. When an instance moves to…
