How to horizontally autoscale pods in Kubernetes

In this article I demonstrate how to set up an autoscaler to scale up the pods when the CPU usage exceeds a certain threshold and back down again.

Author: David Poole, 16 March 2021 (David Poole's blog)

One of the many wonderful features of the Kubernetes/OpenShift distribution as implemented at Safe Swiss Cloud is the Horizontal Pod Autoscaler (HPA). As the name suggests, the HPA automatically spins pods up or down for you when a given CPU or memory load threshold is crossed.

Goal

To demonstrate by example how the HPA in OpenShift/Kubernetes can scale the number of application pods from 1 to 3 replicas as load increases and back down to 1 again as the load decreases.

Implementation

Here is an example of how to set up an autoscaler to scale up the pods when the CPU usage exceeds a certain threshold. I will demonstrate doing this using the graphical user interface of OpenShift 4.5. The same can be achieved using the CLI – this method is described in the official documentation https://docs.openshift.com/container-platform/4.5/nodes/pods/nodes-pods-autoscaling.html.

Deploy a suitable Pod for our Test
First off we need to deploy some sort of pod that we can use to test our autoscaler. In this example I went for the OpenShift example https://github.com/sclorg/django-ex since this exposes a web server and external route, which is easy to load up with URL requests.
Figure 1: The single Django pod before autoscaling
Add Metrics to the Deployment YAML
Go to Administrator -> Workloads -> Deployments, edit the YAML of your deployment and search for the resources parameter. By default this parameter is empty, i.e. resources: {}. Remove the braces and add a value for the CPU as shown in Figure 2. The 200m here (200 millicores, i.e. 0.2 of a CPU core) is just an arbitrary starting value. If you don’t do this, the autoscaler will ignore our pods, since it will not be able to compute the CPU utilization of the running pods (utilization is expressed as a percentage of the requested CPU).
Figure 2: Adding the cpu: 200m reservation within the Deployment object
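The same change can probably be made from the command line instead of the web console. The following is a minimal sketch, assuming a logged-in oc session in the right project and that the CPU value in Figure 2 is a request. The function name set_cpu_request is hypothetical; the deployment name and the 200m value are taken from this article.

```shell
# Set a CPU request on a deployment so the autoscaler can compute utilization.
# set_cpu_request is an illustrative helper, not part of the oc CLI.
set_cpu_request() {
    deployment="$1"            # e.g. django-ex-git
    cpu_request="${2:-200m}"   # defaults to the arbitrary 200m used above
    oc set resources "deployment/${deployment}" --requests="cpu=${cpu_request}"
}

# Example (needs access to a cluster):
#   set_cpu_request django-ex-git 200m
```

Changing the pod template this way triggers a new rollout of the deployment, just as editing the YAML in the console does.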
Create the Horizontal Pod Autoscaler
Go to Administrator -> Workloads -> Horizontal Pod Autoscalers, select Create Horizontal Pod Autoscaler and edit the resulting YAML. In the spec block, replace name with that of your deployment, in this case django-ex-git. For the purposes of our test, you can reduce targetAverageUtilization from 50 to 2, so that the autoscaler scales the pods up to a maximum of 3 replicas as soon as the average pod CPU load exceeds just 2% rather than 50%.
Figure 3: Editing the HorizontalPodAutoscaler object
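For readers who prefer the CLI route mentioned earlier, an autoscaler with the same settings can be sketched with oc autoscale. This is an assumption-laden sketch rather than the method used in this article: the helper name create_cpu_autoscaler is hypothetical, and the defaults (1 to 3 replicas, 2% CPU target) simply mirror the values chosen above.

```shell
# Create a CPU-based HorizontalPodAutoscaler for a deployment.
# create_cpu_autoscaler is an illustrative helper, not part of the oc CLI.
create_cpu_autoscaler() {
    deployment="$1"          # e.g. django-ex-git
    min_replicas="${2:-1}"
    max_replicas="${3:-3}"
    cpu_percent="${4:-2}"    # target average CPU utilization in percent
    oc autoscale "deployment/${deployment}" \
        --min="${min_replicas}" --max="${max_replicas}" \
        --cpu-percent="${cpu_percent}"
}

# Example (needs access to a cluster):
#   create_cpu_autoscaler django-ex-git 1 3 2
```

Unless a name is given explicitly, the autoscaler created this way takes the name of the deployment it targets.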

Once you have saved your HPA, you should end up with an active autoscaler object.
Figure 4: The just created HorizontalPodAutoscaler object

Checking the Conditions of the autoscaler object, you should see the two conditions shown in Figure 5 below, both with a status of True. If you forgot to define the resources as in Figure 2, you will see a message under Reason saying that the metrics values could not be read, instead of ValidMetricFound.
Figure 5: Part of the HorizontalPodAutoscaler object display
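The conditions can also be inspected from a terminal. The sketch below assumes that the output of oc describe hpa contains a "Conditions:" section followed by an "Events:" section, which is how it is usually laid out. The helper name show_hpa_conditions is hypothetical, and the autoscaler name is whatever you chose when creating it.

```shell
# Print only the Conditions section of an autoscaler's description.
# show_hpa_conditions is an illustrative helper, not part of the oc CLI.
show_hpa_conditions() {
    hpa_name="$1"
    # Keep the lines from the "Conditions:" header through the "Events:" header
    oc describe hpa "${hpa_name}" | sed -n '/^Conditions:/,/^Events:/p'
}

# Example (needs access to a cluster):
#   show_hpa_conditions <your autoscaler name>
```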
Load up the Pod to make it Autoscale
Now it’s time to put some load on our Django pod so HPA has a chance to autoscale. To do this, you can run a simple request loop from your laptop or desktop PC.

$ for ((i=0; i<500; i++)); do curl --connect-timeout 3 'http://django-ex-git-<your project>.apps.<your domain>'; done

Now the magic will happen. After a short wait, the pods will scale from 1 to 3 and then back again to 1 once the load has been removed, i.e. some time after the 500 requests from the script have completed.
Figure 6: The Django application has now been scaled to 3 pods by the autoscaler

Looking at the built-in monitoring metrics for the pods, you can see that the green area below represents the initial pod, and the two shades of blue the two new pods that were spun up and stopped again. To reduce “thrashing”, Kubernetes imposes a default 5-minute cool-down period when down-scaling (see https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/).

Figure 7: In-built pod metrics showing the spin up / down of the extra two pods

Summary

After you create a horizontal pod autoscaler, OpenShift begins to query the CPU and/or memory resource metrics on the pods. When these metrics are available, the horizontal pod autoscaler computes the ratio of the current metric utilization to the desired metric utilization, and scales up or down accordingly. The query and scaling occur at a regular interval, but it can take one to two minutes before metrics become available.
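The ratio calculation can be illustrated with a small shell function. This is a simplified sketch of the formula given in the Kubernetes documentation, desiredReplicas = ceil(currentReplicas * currentUtilization / targetUtilization), clamped to the configured minimum and maximum. It deliberately ignores details of the real controller such as the default 10% tolerance and the handling of pods that are not yet ready. The function name desired_replicas is hypothetical.

```shell
# Simplified HPA calculation with integer arithmetic.
# Arguments: current replicas, current average CPU utilization (% of request),
# target utilization (%), minimum replicas, maximum replicas.
desired_replicas() {
    current_replicas="$1"
    current_util="$2"
    target_util="$3"
    min_replicas="$4"
    max_replicas="$5"

    # Ceiling division: ceil(a / b) == (a + b - 1) / b for positive b
    product=$((current_replicas * current_util))
    desired=$(( (product + target_util - 1) / target_util ))

    # Clamp the result to the configured replica range
    if [ "$desired" -lt "$min_replicas" ]; then desired="$min_replicas"; fi
    if [ "$desired" -gt "$max_replicas" ]; then desired="$max_replicas"; fi
    echo "$desired"
}

# With the settings from this article (target 2%, 1 to 3 replicas):
#   desired_replicas 1 10 2 1 3   # one pod at 10% CPU   -> 3
#   desired_replicas 3 1 2 1 3    # three pods at 1% CPU -> 2
```

With a 200m request, a 2% target corresponds to only 4m of CPU per pod, which is why even a modest request loop is enough to trigger scaling in this test.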

Kubernetes at Safe Swiss Cloud

Learn more about the Kubernetes/OpenShift distribution as implemented at Safe Swiss Cloud.


Comments [2]

CESAR GUERRA
June 12th, 2021

Do you know what the problem is when activating the metrics-server in OpenShift? It is not installed by default, and without it the HPA does not work and the dashboard graphs are not shown.

Do you know how to do it? The way to do it in Kubernetes (based on YAML files) doesn’t work in OpenShift 4.

David Poole
June 14th, 2021

Hi Cesar
Many thanks for leaving a comment. Actually I did not have to install anything extra (my version = OKD 4.5). Although, this may be due to the fact that I only autoscale on the cpu resource in this example. I saw from this link https://docs.openshift.com/container-platform/4.5/monitoring/exposing-custom-application-metrics-for-autoscaling.html that there is an option to define custom autoscaling metrics. Is this the YAML that you are referring to?
Kind regards
David
