---
title: "How to horizontally autoscale pods in Kubernetes"
description: "In this article I demonstrate how to set up an autoscaler to scale up the pods when the CPU usage exceeds a certain threshold and back down again."
date: 2021-03-16
modified: 2023-11-20
author: "David Poole"
url: "https://safeswisscloud.com/en/blog/how-to-horizontally-autoscale-pods-in-kubernetes/"
featured_image: "https://i0.wp.com/safeswisscloud.com/wp-content/uploads/2021/03/how-to-horizontally-autoscale-pods-in-kubernetes.png?fit=1200%2C630&ssl=1"
categories:
  - name: "Cloud How To"
    url: "https://safeswisscloud.com/en/blog/category/cloud-how-to/"
  - name: "Kubernetes"
    url: "https://safeswisscloud.com/en/blog/category/kubernetes-openshift-en/"
language: "en-US"
---

# How to horizontally autoscale pods in Kubernetes

One of the many wonderful features of the Kubernetes/OpenShift distribution as implemented at Safe Swiss Cloud is the Horizontal Pod Autoscaler (HPA). As the name suggests, the HPA automatically spins pods up or down for you when a given CPU or memory load threshold is crossed.

### **Goal**

To demonstrate by example how the HPA in OpenShift/Kubernetes can scale the number of application pods from 1 to 3 replicas as load increases and back down to 1 again as the load decreases.

### **Implementation**

Here is an example of how to set up an autoscaler to scale up the pods when the CPU usage exceeds a certain threshold. I will demonstrate doing this using the graphical user interface of OpenShift 4.5. The same can be achieved using the CLI – this method is described in the official documentation <https://docs.openshift.com/container-platform/4.5/nodes/pods/nodes-pods-autoscaling.html>.

1. **Deploy a suitable Pod for our Test** First off, we need to deploy a pod that we can use to test our autoscaler. In this example I went for the OpenShift example <https://github.com/sclorg/django-ex>, since it exposes a web server and an external route, which makes it easy to load with URL requests.   
    ![The single Django pod before autoscaling](https://i0.wp.com/safeswisscloud.com/wp-content/uploads/2021/03/fig-1-single-django-pod-before-autoscaling.png?fit=1894%2C878&ssl=1)_Figure 1: The single Django pod before autoscaling_
2. **Add Metrics to the Deployment YAML** Go to **Administrator** -> **Deployments**, edit the YAML of your deployment and search for the **resources** parameter. By default, this parameter is empty, i.e. **resources: {}**. You need to remove the braces and add a CPU request as shown in Figure 2. The 200m here is just an arbitrary starting value. If you don’t do this, the autoscaler will ignore our pods, since it has no CPU request against which to calculate the utilization of the running pods.   
    ![Adding the cpu: 200m reservation within the Build object](https://i0.wp.com/safeswisscloud.com/wp-content/uploads/2021/03/fig-2-adding-the-cpu.png?fit=1900%2C724&ssl=1)_Figure 2: Adding the cpu: 200m reservation within the Build object_
3. **Create the Horizontal Pod Autoscaler** Go to **Administrator** -> **Workloads** -> **Horizontal Pod Autoscalers**, select **Create Horizontal Pod Autoscaler** and edit the resulting YAML. In the **spec** block, replace **name** with that of your deployment, i.e. django-ex-git. For the purposes of our test, you can reduce **targetAverageUtilization** from 50 to 2, so that the autoscaler scales the pods up (to a maximum of 3 replicas) as soon as the pod CPU load exceeds just 2% rather than 50%.  
    ![Editing the HorizontalPodAutoscaler object](https://i0.wp.com/safeswisscloud.com/wp-content/uploads/2021/03/fig-3-editing-horizontal-pod-austoscaler-object.png?fit=1910%2C1010&ssl=1)_Figure 3: Editing the HorizontalPodAutoscaler object_  
      
    Once you have saved your HPA, you should end up with an active autoscaler object.   
    ![The just created HorizontalPodAutoscaler object](https://i0.wp.com/safeswisscloud.com/wp-content/uploads/2021/03/fig-4-just-created-horizontal-pod-austoscaler.png?fit=1900%2C712&ssl=1)_Figure 4: The just created HorizontalPodAutoscaler object_  
      
    Checking the **Conditions** of the autoscaler object, you should see the two conditions shown in Figure 5 below, both with a status of **True**. If you forgot to define the resources in Figure 2, you will see a message under **Reason** saying that the metrics values could not be read instead of the **ValidMetricFound** message.  
    ![Part of the HorizontalPodAutoscaler object display](https://i0.wp.com/safeswisscloud.com/wp-content/uploads/2021/03/fig-5-part-of-horizontal-pod-austoscaler-object-display.png?fit=1906%2C946&ssl=1)_Figure 5: Part of the HorizontalPodAutoscaler object display_
4. **Load up the Pod to make it Autoscale** Now it’s time to put some load on our Django pod so the HPA has a chance to autoscale. To do this, you can run a simple request loop from your laptop or desktop PC.  
      
    `$ for ((i=0; i<500; i++)); do curl --connect-timeout 3 'http://django-ex-git-<your project>.apps.<your domain>'; done`  
      
    Now the magic will happen. After a short wait, the pods will scale from 1 to 3 and then back again to 1 once the load has been removed i.e. some time after the 500 script requests have completed.  
    ![The Django application has now been scaled to 3 pods by the autoscaler](https://i0.wp.com/safeswisscloud.com/wp-content/uploads/2021/03/fig-6-django-application-scaled-by-autoscaler.png?fit=1886%2C868&ssl=1)_Figure 6: The Django application has now been scaled to 3 pods by the autoscaler_  
      
    Looking at the in-built monitoring metrics for the pods, you can see that the green area in Figure 7 represents the initial pod, while the two shades of blue represent the two new pods that were spun up and stopped again. In order to reduce “thrashing”, there is a default 5-minute cool-down period imposed by Kubernetes when down-scaling (see <https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/>).  
      
    ![In-built pod metrics showing the spin up / down of the extra two pods](https://i0.wp.com/safeswisscloud.com/wp-content/uploads/2021/03/fig-7-in-built-pod-metrics.png?fit=1066%2C750&ssl=1)  
    _Figure 7: In-built pod metrics showing the spin up / down of the extra two pods_
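
For readers who prefer the command line, steps 2 and 3 can be approximated with `oc`. The following is a minimal sketch, not the method used in this article. It assumes that the workload is a Deployment named `django-ex-git` in the current project and that the cluster still serves the `autoscaling/v2beta1` API, which is where the `targetAverageUtilization` field lives. The `DRY_RUN` switch and the function names are hypothetical helpers added for illustration; by default the script only prints what it would send to the cluster.

```shell
#!/usr/bin/env bash
# Sketch only: CLI counterpart of steps 2 and 3.
# Assumptions: a Deployment called django-ex-git in the current project,
# and a cluster that serves the autoscaling/v2beta1 API.
set -eu

DEPLOYMENT="${DEPLOYMENT:-django-ex-git}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print, 0 = really call oc

# Print an HPA manifest with the values used in step 3 (1 to 3 replicas, 2% CPU).
hpa_manifest() {
  cat <<EOF
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: ${DEPLOYMENT}
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ${DEPLOYMENT}
  minReplicas: 1
  maxReplicas: 3
  metrics:
    - type: Resource
      resource:
        name: cpu
        targetAverageUtilization: 2
EOF
}

# Step 2: add the cpu: 200m request so that utilization can be calculated.
set_cpu_request() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "oc set resources deployment/${DEPLOYMENT} --requests=cpu=200m"
  else
    oc set resources "deployment/${DEPLOYMENT}" --requests=cpu=200m
  fi
}

# Step 3: create the autoscaler from the manifest above.
create_hpa() {
  if [ "$DRY_RUN" = "1" ]; then
    hpa_manifest
  else
    hpa_manifest | oc apply -f -
  fi
}

set_cpu_request
create_hpa
```

Run it with `DRY_RUN=0` to apply the changes for real. On newer clusters that only offer `autoscaling/v2`, the same threshold is written as `target: {type: Utilization, averageUtilization: 2}` instead of `targetAverageUtilization: 2`. A shorter alternative to the manifest is `oc autoscale deployment/django-ex-git --min=1 --max=3 --cpu-percent=2`.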

### **Summary**

After you create a horizontal pod autoscaler, OpenShift begins to query the CPU and/or memory resource metrics on the pods. When these metrics are available, the horizontal pod autoscaler computes the ratio of the current metric utilization to the desired metric utilization, and scales up or down accordingly. The query and scaling occur at a regular interval, but it can take one to two minutes before metrics become available.
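
To make the ratio calculation concrete, here is a small sketch of the rule documented for the Kubernetes HPA: desired replicas = ceil(current replicas × current utilization / target utilization), clamped to the configured minimum and maximum. The function name `desired_replicas` and the utilization figures are made up for illustration, and the sketch ignores the tolerance band and the down-scale cool-down that the real controller applies.

```shell
#!/usr/bin/env bash
# Sketch of the HPA replica calculation using integer percentages.
# Arguments: current replicas, current utilization (%), target utilization (%),
#            minimum replicas, maximum replicas.
desired_replicas() {
  hpa_current="$1"
  hpa_util="$2"
  hpa_target="$3"
  hpa_min="$4"
  hpa_max="$5"

  # Integer ceiling of (current * utilization / target).
  hpa_desired=$(( (hpa_current * hpa_util + hpa_target - 1) / hpa_target ))

  # Clamp the result to the configured replica range.
  if [ "$hpa_desired" -lt "$hpa_min" ]; then
    hpa_desired="$hpa_min"
  fi
  if [ "$hpa_desired" -gt "$hpa_max" ]; then
    hpa_desired="$hpa_max"
  fi
  echo "$hpa_desired"
}

# Hypothetical load: 1 pod at 10% CPU with a 2% target, limited to 1..3 replicas.
desired_replicas 1 10 2 1 3   # prints 3
# Hypothetical idle state: 3 pods at 0% CPU fall back to the minimum.
desired_replicas 3 0 2 1 3    # prints 1
```

With the 2% target from step 3, even a modest load pushes the result past the maximum of 3, which is why the test scales straight up to 3 replicas.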

![](https://i0.wp.com/safeswisscloud.com/wp-content/uploads/2020/05/ssc-world-map.png?resize=1200%2C600&ssl=1)

### **Kubernetes at Safe Swiss Cloud**

Learn more about the Kubernetes/OpenShift distribution as implemented at Safe Swiss Cloud.

[REQUEST A BRIEFING](https://safeswisscloud.com/en/request-a-kubernetes-cloud-trial/)

---

**References:**

- <https://docs.openshift.com/container-platform/4.5/nodes/pods/nodes-pods-autoscaling.html>
- <https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/>
