To reduce the cost of provisioned EC2 nodes in lower environments, we needed to scale the EKS nodes provisioned by Karpenter down to zero on a schedule. However, Karpenter has no native support for scheduled scale-to-zero of cluster nodes, which prompted us to explore alternative solutions.
This AWS blog post covers common options for scaling cluster nodes to zero with Karpenter and outlines their respective pros and cons. The options were evaluated against criteria such as flexibility, ease of implementation, and impact on application configurations.
Our post goes beyond the theory and provides practical insights into implementation details, testing, and monitoring.
For more cost-saving options in EKS, have a look at Karpenter consolidation and EC2 Spot Instances.
How we implemented it
Our approach uses CronJobs to scale the nodes created by Karpenter up and down on a schedule. By setting a zero CPU limit on the provisioners and then deleting all their nodes, this flexible solution leaves the workloads deployed while the nodes scale down to zero.
We use Terraform to set up EKS. Our internal EKS module uses the terraform-aws-eks community module, extended with a set of customized sub-modules. For the CronJobs we built a Terraform sub-module inside our internal Terraform EKS module.
The manifest.yaml.tftpl template file holds the YAML configuration for the Kubernetes objects that carry out the scaling operations: ServiceAccount, ClusterRole, ClusterRoleBinding, and CronJob.
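To make the shape of these objects concrete, here is a minimal Python sketch that builds a ClusterRole rule list and a CronJob manifest as plain dictionaries. It is not our actual template: the rule set is what such a job would plausibly need, and the service account name, container image, and TTL default are hypothetical placeholders.

```python
import json


def scale_job_rbac_rules():
    """ClusterRole rules a scale job plausibly needs (an assumption, not the original template)."""
    return [
        # Patch the provisioner limits.
        {"apiGroups": ["karpenter.sh"], "resources": ["provisioners"],
         "verbs": ["get", "list", "patch"]},
        # Cordon (patch), list and delete nodes.
        {"apiGroups": [""], "resources": ["nodes"],
         "verbs": ["get", "list", "patch", "delete"]},
        # kubectl drain lists pods and evicts them.
        {"apiGroups": [""], "resources": ["pods"], "verbs": ["get", "list"]},
        {"apiGroups": [""], "resources": ["pods/eviction"], "verbs": ["create"]},
        # kubectl drain looks up the DaemonSets that own pods.
        {"apiGroups": ["apps"], "resources": ["daemonsets"], "verbs": ["get", "list"]},
    ]


def cron_job(name, schedule, command, namespace="karpenter",
             service_account="karpenter-scale-zero",       # hypothetical name
             image="registry.example.com/kubectl:latest",  # hypothetical image
             ttl_seconds=3600):                            # hypothetical TTL
    """Build a batch/v1 CronJob that runs a shell command on a schedule."""
    return {
        "apiVersion": "batch/v1",
        "kind": "CronJob",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "schedule": schedule,
            "concurrencyPolicy": "Forbid",
            "jobTemplate": {"spec": {
                # Finished Jobs are garbage-collected after this many seconds.
                "ttlSecondsAfterFinished": ttl_seconds,
                "template": {"spec": {
                    "serviceAccountName": service_account,
                    "restartPolicy": "Never",
                    "containers": [{
                        "name": "kubectl",
                        "image": image,
                        "command": ["/bin/sh", "-c", command],
                    }],
                }},
            }},
        },
    }


if __name__ == "__main__":
    # The cron expression is an example only.
    print(json.dumps(cron_job("scale-down", "0 20 * * 1-5", "echo scale down"), indent=2))
```

kubectl accepts JSON as well as YAML, so the printed output could be applied directly in a test cluster.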
The scale-down CronJob involves the following steps:
1. Patching the provisioner to set its CPU limit to zero
2. Draining the nodes
3. Deleting the Karpenter-provisioned nodes
Limits prevent Karpenter from creating new instances once the limit is exceeded. In step 1, setting the CPU limit to zero therefore stops the provisioner from launching any new nodes, so the nodes removed in the following steps are not replaced.
Step 2 is required to safely drain pods from the nodes.
Finally, in step 3 we delete the nodes created by the provisioner. Karpenter adds a finalizer to the nodes it provisions to support graceful node termination, so a request to delete a node via kubectl also triggers termination of the associated EC2 instance.
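The three scale-down steps can be sketched as kubectl invocations. The Python below only builds the argument lists and does not run anything. It assumes the v1alpha5 Provisioner field path spec.limits.resources.cpu and selects nodes by the karpenter.sh/provisioner-name label; the exact commands and flags in our CronJob may differ.

```python
import json
import shlex

PROVISIONER_LABEL = "karpenter.sh/provisioner-name"


def scale_down_commands(provisioner):
    """Return the kubectl argument lists for the three scale-down steps."""
    selector = f"{PROVISIONER_LABEL}={provisioner}"
    # Merge patch that sets the provisioner CPU limit to zero (assumed field path).
    zero_cpu = json.dumps({"spec": {"limits": {"resources": {"cpu": "0"}}}})
    return [
        # Step 1: stop the provisioner from launching new nodes.
        ["kubectl", "patch", "provisioner", provisioner,
         "--type", "merge", "-p", zero_cpu],
        # Step 2: safely evict pods from the provisioner's nodes.
        ["kubectl", "drain", "--selector", selector,
         "--ignore-daemonsets", "--delete-emptydir-data"],
        # Step 3: delete the nodes; Karpenter's finalizer terminates the instances.
        ["kubectl", "delete", "nodes", "--selector", selector],
    ]


if __name__ == "__main__":
    for argv in scale_down_commands("default"):
        print(shlex.join(argv))
```

Building argument lists rather than a single shell string avoids quoting problems with the JSON patch body.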
The scale-up CronJob removes the CPU limit from the provisioner, allowing it to respond to new scheduling requirements again.
We convert the template into YAML after replacing the required variables.
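One way to remove the limit is a JSON merge patch, where a null value deletes the key. The sketch below is an assumption about how this could be done rather than the exact command in our CronJob, and it uses the same assumed field path, spec.limits.resources.cpu.

```python
import json


def scale_up_command(provisioner):
    """Return the kubectl argument list that removes the CPU limit."""
    # In a JSON merge patch, null (None in Python) removes the key.
    remove_cpu_limit = json.dumps({"spec": {"limits": {"resources": {"cpu": None}}}})
    return ["kubectl", "patch", "provisioner", provisioner,
            "--type", "merge", "-p", remove_cpu_limit]


if __name__ == "__main__":
    print(scale_up_command("default"))
```

If the provisioner normally carries a non-zero CPU limit, the patch would restore that value instead of removing the key.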
Aircall's internal Terraform EKS module initialises the karpenter-scale-zero sub-module, allowing seamless integration into the cluster.
Environment-specific configurations, such as schedule intervals and namespaces, are easily customizable.
To deploy the Terraform sub-module in an environment, we add the following input variable to the environment-specific Terraform code.
The above example deploys the CronJobs for two Karpenter provisioners, including whisper. The CronJobs scale the nodes selected by the provisioner name label according to the specified schedule.
It is mandatory to run both Karpenter and the karpenter-scale-zero CronJobs in Fargate. This ensures that the Karpenter pods and the CronJobs can always run, even when the Karpenter nodes are brought down by the scale-to-zero operations.
To enable this, we deploy both Karpenter and the CronJobs in the karpenter namespace, which is configured to run in fargate mode in the Terraform module.
A TTL setting cleans up the Jobs created by the CronJobs a set time after they have finished execution.
Additionally, in a GitOps setup where the Karpenter provisioner is deployed as an Argo CD App, the Application manifest needs ignoreDifferences adjustments so that the app stays in sync after the provisioner is patched.
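As an illustration, an ignoreDifferences entry for the patched limits could look like the structure built below. This is a sketch, not our exact Application manifest: the /spec/limits pointer is an assumption about which part of the provisioner the CronJobs change.

```python
import json


def ignore_provisioner_limits(provisioner_name=None):
    """Build an Argo CD ignoreDifferences entry for Provisioner limits."""
    entry = {
        "group": "karpenter.sh",
        "kind": "Provisioner",
        # Assumed path: the limits block that the scale jobs patch.
        "jsonPointers": ["/spec/limits"],
    }
    if provisioner_name:
        # Optionally restrict the rule to a single provisioner.
        entry["name"] = provisioner_name
    return entry


if __name__ == "__main__":
    # Fragment to merge into the Application spec.
    print(json.dumps({"spec": {"ignoreDifferences": [ignore_provisioner_limits("default")]}}, indent=2))
```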
Testing the solution
The test involves deploying an application that uses nodeSelector to schedule its pods on specific nodes provisioned via Karpenter. In this example we deploy an application called inflate on nodes labeled karpenter.sh/provisioner-name = default.
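A test Deployment along these lines could be built as follows. The name inflate, the nodeSelector key, and the default provisioner come from the text; the app label, replica count, pause image, and CPU request are illustrative assumptions.

```python
import json


def inflate_deployment(provisioner="default", replicas=5):
    """Build a Deployment whose pods only schedule on one provisioner's nodes."""
    labels = {"app": "inflate"}  # hypothetical label
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": "inflate"},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    # Pin the pods to nodes created by the given provisioner.
                    "nodeSelector": {"karpenter.sh/provisioner-name": provisioner},
                    "containers": [{
                        "name": "inflate",
                        # Assumed image: a pause container that only holds resources.
                        "image": "public.ecr.aws/eks-distro/kubernetes/pause:3.7",
                        "resources": {"requests": {"cpu": "1"}},
                    }],
                },
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(inflate_deployment(), indent=2))
```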
We first list the EKS nodes provisioned by Karpenter.
We verify that the jobs are scheduled and executed properly by checking the status of the CronJobs.
When the scale-down schedule is met, all nodes that match the label specified in the scale-down command are deleted. The application pods stay in the Pending state until the scale-up job triggers. Furthermore, examining the logs of the scale-down and scale-up CronJobs confirms that the scaling actions executed correctly.
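A quick way to confirm the expected pod states is to count pods by phase. The helper below is a hypothetical sketch that parses the JSON printed by kubectl get pods -o json; it does not call the cluster itself.

```python
import json
from collections import Counter


def pod_phase_counts(pod_list_json):
    """Count pods per phase from the JSON output of 'kubectl get pods -o json'."""
    items = json.loads(pod_list_json).get("items", [])
    return Counter(item["status"]["phase"] for item in items)


if __name__ == "__main__":
    # Hand-written sample shaped like a PodList, for illustration only.
    sample = json.dumps({"items": [
        {"status": {"phase": "Pending"}},
        {"status": {"phase": "Pending"}},
        {"status": {"phase": "Running"}},
    ]})
    print(dict(pod_phase_counts(sample)))
```

After a scale-down, all pods pinned to the provisioner's nodes should be counted as Pending; after the scale-up they should move back to Running.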
Visualizing the performance and status of the scaling processes is crucial for effective monitoring. Monitoring tools and dashboards provide insight into the dynamic nature of node scaling and help with troubleshooting when needed. We use Datadog to monitor our EKS clusters. As we can see, the nodes are scheduled to be terminated at 8 p.m. and reinstated at 6 a.m.
Karpenter nodes scale in/out
During this time, the application pods scheduled on the nodes created by the Karpenter provisioners also transition to the Pending state.
The nodes are reinstated successfully after the Karpenter provisioner is patched to remove the CPU limit from its spec. The application pods transition back to Running once the nodes are up.
Despite the availability of options, each solution comes with its set of challenges. The decision-making process involves evaluating trade-offs and selecting the option that aligns best with the specific use case. Challenges may arise, especially in GitOps setups, emphasising the need for careful consideration and troubleshooting.
As part of our SRE FinOps cost-saving initiative, scaling Karpenter nodes to zero is a strategic move to achieve significant cost savings in lower environments.
While Karpenter lacks native support for scheduled scale-down, options like setting the spec limit to zero in the Karpenter provisioner offer practical solutions.
The selection between these options hinges on factors like flexibility, ease of implementation, and compatibility with existing setups.
Using this method, we reduced the monthly non-production cost of the Karpenter-provisioned nodes that host Aircall Voice AI products by ~50% and the overall cost of the clusters by 25%.
Published on February 13, 2024.