Context & Objectives
Aircall is a cloud-based call center solution provider that runs its workloads on Serverless and Kubernetes technologies. The organization uses GitLab CI with the Kubernetes executor for Continuous Integration and Continuous Deployment (CI/CD). The most common use cases for GitLab Runner DinD at Aircall are:
AWS SAM applications. AWS SAM local invoke builds and runs the Lambda function in a Docker container, so a Docker engine is required.
Testing Kubernetes applications (KinD).
In this blog, we will discuss how Aircall implemented Sysbox to enhance the security and efficiency of its containerized workloads.
Note: DinD is short for Docker-in-Docker. We will use this abbreviation throughout the blog for simplicity.
What Is Sysbox and Why We Chose It
Sysbox is an open-source project that enhances Docker's functionality by adding features such as namespace isolation, resource management, and system call filtering. The Sysbox container runtime is based on the OCI runc runtime, which is used by Docker and Kubernetes.
To address the security issues of running GitLab Runner DinD in privileged mode, Aircall chose to implement Sysbox in combination with DinD runners. This provides a more secure and efficient way to build and run containerized workloads.
Gitlab Runner with Sysbox
How We Implemented It
The implementation of Sysbox involved several steps: using a Karpenter provisioner to manage the scaling of the nodes where Sysbox gets installed, setting up Sysbox as a daemonset on the selected nodes, and configuring the GitLab runner to select Sysbox as the container runtime.
The CI DinD job workflow at Aircall starts by triggering the creation of GitLab runner pods through the Kubernetes executor. The runner pods are set to run on specific nodes launched through the Karpenter provisioner, which assigns labels to new nodes according to the provisioner requirements and the nodeSelector definitions in the GitLab Runner and Sysbox pods. Karpenter also assigns a taint to the new nodes that prevents runner pods from being scheduled until Sysbox is running.
Gitlab DinD with Sysbox runtime Workflow
Sysbox is installed as a daemonset. The installation manifest specifies a nodeSelector that the scheduler uses to place the pod on new nodes labeled with sysbox-install: "yes". Sysbox is then installed on the new nodes, and the default container runtime is changed to sysbox-runc. Karpenter adds a taint to the nodes so that no pods are scheduled before the Sysbox installation has succeeded. Once the installation completes, Sysbox removes the taint from the new nodes, allowing the runner pods to be scheduled and the build to proceed.
Sysbox Install manifest
To execute jobs inside containers deployed with Docker + Sysbox, the GitLab runner is configured to select Sysbox as the container runtime and to disable the use of privileged containers. The runner machine has the GitLab runner agent, Docker, and Sysbox installed. The runner configuration includes the following settings:
runtime_class_name, which tells the pod to use sysbox-runc as its container runtime.
runners.kubernetes.volumes.empty_dir, which mounts the Docker certs from the Docker DinD service into the job containers at runtime, allowing job containers to use the Docker API through the DinD service.
runners.kubernetes.node_selector, which selects a node launched through the Karpenter provisioner with the Sysbox daemonset running on it.
Below is a snippet of the GitLab Runner configuration.
Gitlab Runner configuration
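To make the effect of these settings more concrete, here is a minimal Python sketch of the pod fields they roughly translate to. It is an illustration rather than Aircall's actual configuration: the helper build_job_pod, the volume name, the image tags and the /certs/client mount path are assumptions, and the sketch also assumes the runner reuses the sysbox-install: "yes" label as its node selector.

```python
# Hypothetical sketch: the pod fields that the runner settings roughly map to.
# Names, images and paths are illustrative assumptions, not Aircall's values.

def build_job_pod(runtime_class: str, node_selector: dict, certs_path: str) -> dict:
    """Return a minimal pod manifest for a DinD CI job running under Sysbox."""
    certs_mount = {"name": "docker-certs", "mountPath": certs_path}
    unprivileged = {"privileged": False}  # Sysbox removes the need for privileged mode
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"generateName": "runner-job-"},
        "spec": {
            # runtime_class_name -> spec.runtimeClassName
            "runtimeClassName": runtime_class,
            # runners.kubernetes.node_selector -> spec.nodeSelector
            "nodeSelector": dict(node_selector),
            # runners.kubernetes.volumes.empty_dir -> an emptyDir volume shared
            # by the DinD service container and the job container
            "volumes": [{"name": "docker-certs", "emptyDir": {}}],
            "containers": [
                {
                    "name": "build",
                    "image": "docker:latest",
                    "securityContext": unprivileged,
                    "volumeMounts": [certs_mount],
                },
                {
                    "name": "dind",
                    "image": "docker:dind",
                    "securityContext": unprivileged,
                    "volumeMounts": [certs_mount],
                },
            ],
        },
    }


pod = build_job_pod(
    runtime_class="sysbox-runc",
    node_selector={"sysbox-install": "yes"},  # assumed label
    certs_path="/certs/client",               # assumed mount path
)
```

Both containers mount the same emptyDir volume, which is how the TLS certs generated by the DinD service become visible to the job container without any privileged access.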
Karpenter is a controller that runs in Aircall's CI clusters. Its job is to provision new nodes to handle unschedulable pods and to remove those nodes when they are no longer needed. A new provisioner is created for Sysbox that defines how Karpenter handles unschedulable GitLab runner pods. The provisioner configuration specifies the EC2 AMI ID through amiSelector and labels the new nodes so that the scheduler can place GitLab runner and Sysbox pods accordingly.
Karpenter Provisioner manifest
We use Ubuntu because Amazon Linux is not supported at the time of writing. Other distros are also supported, as shown in the compatibility matrix on the Sysbox website.
The labelling approach that Sysbox recommends, using sysbox-runtime=running as a node selector, has an issue with Karpenter. Karpenter propagates this requirement to the node labels, so the label is added to the node during bootstrap. As a result, the scheduler places jobs on the node even though the Sysbox installation is not yet complete (a similar issue affects any CNI plugin implementation).
To work around this issue, taints are added in the deployment to ensure that no pods are scheduled on nodes until the nodes are ready to execute them. This is a temporary fix until Sysbox supports this natively.
Patch for Sysbox install
More details on the upstream fix release can be found in this Slack thread.
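The difference between the label-only approach and the taint workaround can be modelled with a small Python sketch. This is a deliberately simplified model of the Kubernetes scheduler, not its real algorithm: taints are reduced to plain keys (no values, effects or operators), and the taint key sysbox-not-ready is a hypothetical name chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    labels: dict
    taints: set = field(default_factory=set)


@dataclass
class Pod:
    node_selector: dict
    tolerations: set = field(default_factory=set)


def schedulable(pod: Pod, node: Node) -> bool:
    """Simplified check: every selector label matches and every taint is tolerated."""
    labels_match = all(node.labels.get(k) == v for k, v in pod.node_selector.items())
    taints_tolerated = node.taints <= pod.tolerations
    return labels_match and taints_tolerated


# Karpenter copies the provisioner labels onto the node at bootstrap,
# so sysbox-runtime=running is present before Sysbox is installed.
labels = {"sysbox-install": "yes", "sysbox-runtime": "running"}
runner = Pod(node_selector={"sysbox-runtime": "running"})
installer = Pod(
    node_selector={"sysbox-install": "yes"},
    tolerations={"sysbox-not-ready"},  # hypothetical taint key
)

# Label-only approach: the runner is accepted on a node with no Sysbox yet.
label_only_node = Node(labels=dict(labels))
too_early = schedulable(runner, label_only_node)

# Taint workaround: only the installer can land on the fresh node.
tainted_node = Node(labels=dict(labels), taints={"sysbox-not-ready"})
runner_blocked = not schedulable(runner, tainted_node)
installer_allowed = schedulable(installer, tainted_node)

# Once the installation finishes, the taint is removed and jobs can run.
tainted_node.taints.discard("sysbox-not-ready")
runner_after_install = schedulable(runner, tainted_node)
```

In this model the label alone cannot express "Sysbox is ready" because it exists from the moment the node boots, whereas the taint acts as a gate that only the installer daemonset tolerates.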
As part of a Site Reliability Engineering initiative, Aircall has adopted Sysbox to improve its overall operating environment. The implementation involves installing Sysbox as a daemonset and configuring the GitLab runner to select Sysbox as the container runtime. It also uses a Karpenter provisioner to launch nodes running Sysbox for unschedulable GitLab runner pods. With Sysbox, Aircall can execute containerised workloads more securely and efficiently, enabling the organisation to better meet the needs of its customers.
Published on January 2, 2024.