
DevOps coding game: container, deployment, monitoring & CI/CD


Build • Aurélien Maury - CTO WeScale • 28/04/22 • 6 min read

A year ago, WeScale had the idea to create a coding game oriented towards DevOps and Infra-as-Code: a treasure hunt that could amuse experts and push beginners to improve their skills by solving technical puzzles.

Abhra Shambhala

The Abhra Shambhala project started in March 2021, driven by a passion for sharing technical knowledge in a fun way that also provides a challenge for participants, as well as for us. At the time of writing, 235 people have already taken part in our treasure hunt, and 22 have reached the finish line.

Let's take a look at the project in more detail, without spoiling the game.

Challenges and objectives of building the coding game

The primary goal of the game is to familiarize players with git, Docker, Helm manifests, and automated deployment pipelines.

To build a successful coding game platform, we had two requirements: for (obvious) security reasons, users must not be able to modify the build through pull requests, and we must be able to deploy and destroy the platform quickly. Continuous deployment of our application components was also a must-have, so that we could deliver fixes quickly in case of problems. Spoiler alert: there are always problems when building a technical treasure hunt game.

We also wanted to have fun while testing a complete stack on a concrete project, and to bring players into the story.

Tools

The toolbox used to deploy and maintain this architecture is composed of:

Ansible, as the central orchestrator. All operations start with a playbook, and variables are stored as YAML files in group_vars

Terraform to pilot server resources, the initialization of Rancher, and cluster deployment on Kapsule. Terraform's operations are supervised by a playbook that handles prerequisite tasks, output collection, and YAML file generation in group_vars, which makes the output of one operation available to the following playbooks

Make to spare you from typing ansible-playbook all the time

Direnv, for the virtualenv with Ansible and to load environment variables for Ansible and Terraform configurations
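The hand-off between Terraform and the following playbooks can be sketched as below. This is a minimal illustration, not the project's actual code: the function names, the sample output names (`rancher_ip`, `cluster_id`) and their values are hypothetical. It relies only on the shape of `terraform output -json`, where each output is an object carrying a `value` field, and on the fact that JSON values are also valid YAML.

```python
import json


def outputs_to_vars(terraform_output_json: str) -> dict:
    """Keep only the `value` field of each entry from `terraform output -json`."""
    raw = json.loads(terraform_output_json)
    return {name: entry["value"] for name, entry in raw.items()}


def render_group_vars(variables: dict) -> str:
    """Render variables as YAML, one key per line.

    Values are written as JSON, which YAML parsers accept as-is,
    so no third-party YAML library is needed for this sketch.
    """
    lines = ["---"]
    for name in sorted(variables):
        lines.append(f"{name}: {json.dumps(variables[name])}")
    return "\n".join(lines) + "\n"


# Hypothetical sample of what `terraform output -json` could print.
SAMPLE = """
{
  "rancher_ip": {"sensitive": false, "type": "string", "value": "203.0.113.10"},
  "cluster_id": {"sensitive": false, "type": "string", "value": "example-cluster"}
}
"""

if __name__ == "__main__":
    # A playbook task would write this text to a file under group_vars/.
    print(render_group_vars(outputs_to_vars(SAMPLE)), end="")
```

In a setup like the one described, a later playbook would then read these keys like any other group variable.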

Final architecture

The final form of our architecture here revolves around two primary resources: a compute instance and a Kapsule cluster.

Architecture airplane view

Compute Instance: game master

The cornerstone of the platform is a simple Instance running on Debian 11 with:

The DNS domain we manage for our applications

For more flexibility in deployments, we delegated a subdomain to a bind9 daemon that will become our reference DNS authority. The DNS records of application deployments that we expose are managed here, with updates pushed by an Ansible playbook.
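The article does not say how the playbook pushes its updates. One common approach with bind9, assumed here purely for illustration, is a dynamic update sent with the `nsupdate` utility. The sketch below only builds the batch of commands that would be fed to it; the zone, server address, record names and TTL are hypothetical.

```python
def nsupdate_script(server: str, zone: str, records: dict, ttl: int = 300) -> str:
    """Build an nsupdate batch that replaces the A record of each name.

    `records` maps a short host name to an IPv4 address.
    """
    lines = [f"server {server}", f"zone {zone}"]
    for name in sorted(records):
        fqdn = f"{name}.{zone}."
        # Remove any previous A record, then add the current one.
        lines.append(f"update delete {fqdn} A")
        lines.append(f"update add {fqdn} {ttl} A {records[name]}")
    lines.append("send")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical zone and record, using documentation-reserved addresses.
    print(nsupdate_script("127.0.0.1", "game.example.com", {"api": "203.0.113.20"}), end="")
```

The resulting text could be piped to `nsupdate` on the DNS host, typically with a TSIG key for authentication, which is left out of this sketch.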

A user-friendly web interface to manage our Kapsule cluster

We installed Rancher on this server rather than on Kapsule to allow an easier switch between Kapsule clusters without reinstalling Rancher.

Rancher is deployed by Helm charts, and we maintain a local cluster with a single K3S instance to serve as Rancher’s execution platform.

To expose the Rancher service, we installed Nginx to act as a reverse proxy to a local port and carry the TLS certificates. K3S APIs are not exposed externally and are only accessed locally via Ansible.

The deployment of cluster tools

Once Rancher is deployed, any cluster imported into its management scope can receive Rancher tooling, an observability stack detailed in the next section.

Kapsule: game board

Now that the home base has been deployed, we can start a Kapsule Kubernetes cluster through an Ansible playbook that pilots Terraform to create the cluster, retrieve useful output, and launch a second action to import the cluster into Rancher.

When a cluster is imported, Rancher deploys its probes and graphical interface to inspect it. This gives us the tools to view the logs of each pod's workload and to open a terminal in each pod for troubleshooting.

A final Terraform piloting playbook is used to deploy:

Grafana, the well-known dashboard manager, along with dashboards adapted to the following application components

Logging Operator, an automation of Fluent Bit and Fluentd by Banzai Cloud, which makes centralizing logs much easier

Loki, to store logs for Grafana

The Nginx-ingress-controller and cert-manager which will be used to expose our APIs

Phew! Once all this has been deployed, we have a sound and well-equipped working base to accommodate the application part. We will not detail the content of the application here, so as not to spoil the coding game for future participants!

We'll just tell you that there is a continuous deployment component based on Drone (which does the job brilliantly).

Automation

Automation and reproducibility are central to our work. Now that the code base is mature, an environment can be set up with two commands, each of which runs a series of playbooks.

make core

Creation of the Compute instance for Rancher, through Terraform

System Setup

Subdomain delegation, making this instance the DNS authority for the game services' subdomain

Creation of public certificates by DNS challenge with Let's Encrypt

Installation of K3S, Rancher, and exposure through the Nginx reverse-proxy

Initial Rancher setup, via Terraform
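For readers unfamiliar with the DNS challenge used in the certificate step above, here is a small sketch of what the ACME protocol (RFC 8555, `dns-01`) expects to find in DNS. In practice an ACME client computes this for you; the sample key authorization and domain below are hypothetical placeholders.

```python
import base64
import hashlib


def dns01_txt_value(key_authorization: str) -> str:
    """Return the TXT record value for a dns-01 challenge.

    The value is the unpadded base64url encoding of the SHA-256 digest
    of the key authorization string.
    """
    digest = hashlib.sha256(key_authorization.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def challenge_record_name(domain: str) -> str:
    """Return the fully qualified name of the TXT record to publish."""
    name = domain.rstrip(".")
    if name.startswith("*."):
        # A wildcard certificate is validated on the base domain.
        name = name[2:]
    return f"_acme-challenge.{name}."


if __name__ == "__main__":
    # Placeholder key authorization: "<token>.<account key thumbprint>".
    print(challenge_record_name("*.game.example.com"))
    print(dns01_txt_value("example-token.example-thumbprint"))
```

Because the proof lives in DNS, controlling the authoritative server for the delegated subdomain is enough to obtain certificates, including wildcard ones, without exposing a web server first.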

make kapsule

Creation of the Kapsule cluster via Terraform

Import of the cluster into Rancher's management scope

Installation of Helm observability charts and Rancher, via Terraform
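The two targets above can be pictured as ordered lists of playbooks run one after another. The sketch below mimics that idea in Python rather than in a Makefile. Every playbook file name and the inventory path are hypothetical; only the `ansible-playbook -i <inventory> <playbook>` invocation is standard.

```python
import subprocess

# Hypothetical playbook names, one per step listed above.
TARGETS = {
    "core": [
        "core_instance.yml",
        "system_setup.yml",
        "dns_delegation.yml",
        "certificates.yml",
        "k3s_rancher_nginx.yml",
        "rancher_bootstrap.yml",
    ],
    "kapsule": [
        "kapsule_cluster.yml",
        "rancher_import.yml",
        "observability.yml",
    ],
}


def commands_for(target: str, inventory: str = "inventory") -> list:
    """Return the ansible-playbook command lines for a target, in order."""
    return [["ansible-playbook", "-i", inventory, playbook] for playbook in TARGETS[target]]


def run(target: str, dry_run: bool = True) -> None:
    """Print the commands, or execute them and stop at the first failure."""
    for command in commands_for(target):
        if dry_run:
            print(" ".join(command))
        else:
            subprocess.run(command, check=True)


if __name__ == "__main__":
    run("core")
    run("kapsule")
```

Stopping at the first failing playbook mirrors what `make` does when a recipe line fails, which keeps a half-built environment easy to diagnose.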

Thoughts on the tools after the first experience

Scaleway’s Kubernetes Kapsule

Installing one’s own Kubernetes cluster is a hard path to follow. Therefore, a managed K8s orchestrator is the obvious choice. Kubernetes Kapsule is a great choice with:

Available versions that closely follow the K8s roadmap

Transparent integration with Scaleway Load Balancer

Easy-to-handle attached storage

Ansible Terraform

The Ansible-Terraform duo for Infra-as-Code management is a real success, even if the encapsulation of Terraform by playbooks may seem counterintuitive.

In this context, where the scope is clearly defined and involves many heterogeneous tasks, Ansible as an entry point makes it more accessible.

Ansible is an excellent glue for all that, and the Ansible Terraform module fits right in.

Fleet

Fleet, the GitOps component from Rancher Labs, still seems young to us.

The proposed Custom Resource Definition abstraction for managing continuous deployment flows and targets is somewhat complex to grasp. Redeployments…
