Auto Scaling Applications with Kubernetes HPA

21 January 2020 by Guest

This week’s guest blog is authored by Lee, one of our in-house DevOps engineers who works on developing our cloud products and streamlining our deployment processes. 

At UKFast, we utilise Kubernetes for a variety of applications such as websites and public-facing APIs. Historically, these applications were deployed on virtual machines (VMs) and scaled manually behind a traditional load balancer. Now that our applications are deployed within Kubernetes, we can take advantage of the Kubernetes Horizontal Pod Autoscaler (HPA), which automatically scales pods based on pre-configured metric thresholds such as CPU utilisation.
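Under the hood, the HPA controller periodically compares the observed metric against the target and adjusts the replica count using a simple ratio. As a rough guide (this matches the documented behaviour of the CPU-based autoscaling/v1 HPA we use later in this post):

desiredReplicas = ceil( currentReplicas * currentMetricValue / desiredMetricValue )

# e.g. 2 replicas averaging 100% CPU against a 50% target:
#   ceil( 2 * 100 / 50 ) = 4 replicas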

Below, we’ll walk through configuring and testing a basic HPA using a pre-defined CPU utilisation threshold. A basic understanding of Kubernetes and its CLI tooling is assumed.

Deployment

First, we’ll deploy an application to test the functionality of the HPA. For this example, we’re using the image chiahan1123/docker-hpa-example, which artificially generates CPU load by performing expensive arithmetic operations.
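A couple of assumptions before we begin: the commands below omit the -n flag, so we assume the kubectl context’s default namespace is set to loadtestdemo, and that the cluster’s Metrics API (typically provided by metrics-server) is available so the HPA can read CPU usage. A quick sanity check might look like this:

# Create the namespace and make it the default for the current context
kubectl create namespace loadtestdemo
kubectl config set-context --current --namespace=loadtestdemo

# Confirm the Metrics API is available (required by the HPA for CPU metrics)
kubectl top nodes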

We’ll be deploying this application into a pre-existing Kubernetes cluster, with all resources placed in the namespace loadtestdemo. Let’s go ahead and apply a set of configurations for our application, which includes a Deployment, Service and Ingress:


cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: testapp
  namespace: loadtestdemo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: testapp
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: testapp
    spec:
      containers:
      - image: chiahan1123/docker-hpa-example
        imagePullPolicy: Always
        name: testapp
        ports:
        - containerPort: 80
          protocol: TCP
        resources:
          limits:
            cpu: 200m
          requests:
            cpu: 100m
      restartPolicy: Always
---
apiVersion: v1
kind: Service
metadata:
  name: testapp-service
  namespace: loadtestdemo
spec:
  ports:
  - port: 80
    protocol: TCP
    targetPort: 80
  selector:
    app: testapp
---
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: testapp-ingress
spec:
  rules:
  - host: testapp.ukfast.co.uk
    http:
      paths:
      - path: /
        backend:
          serviceName: testapp-service
          servicePort: 80
EOF


 

To verify our deployment, we can list the pods and check that one is running:

kubectl get pod 

NAME                     READY  STATUS  RESTARTS  AGE 
testapp-7dbd668469-b2trj 1/1    Running    0      25s
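We can also confirm that the Service and Ingress were created, and that the application responds through the Ingress. The curl below assumes DNS (or a hosts entry) for testapp.ukfast.co.uk already resolves to the cluster’s ingress controller:

kubectl get service,ingress

# Assumes testapp.ukfast.co.uk resolves to the ingress controller
curl -I http://testapp.ukfast.co.uk/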
 

 

Next, we’ll create our HPA. For this example, we’ll set the CPU percent threshold to 50 (targetCPUUtilizationPercentage), with a maximum of 10 replicas (maxReplicas):

cat <<EOF | kubectl apply -f -
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: testapp
spec:
  maxReplicas: 10
  minReplicas: 1
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: testapp
  targetCPUUtilizationPercentage: 50
EOF
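As an aside, the equivalent autoscaler could also be created imperatively rather than from a manifest, which can be handy for quick experiments:

# Imperative equivalent of the HPA manifest above
kubectl autoscale deployment testapp --cpu-percent=50 --min=1 --max=10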
 

 

We can retrieve our new HPA with kubectl:

kubectl get hpa 

NAME    REFERENCE          TARGETS MINPODS MAXPODS REPLICAS AGE 
testapp Deployment/testapp 0%/50%     1      10       1     10s
 

Our test application is now deployed and configured for auto scaling, which we will demonstrate below.

Testing

Now that our test application has been deployed and the HPA configured, we’ll run a load test to ensure the HPA scales as expected. For this demonstration, the following tools are required:

  • UKFast’s Load Testing as a Service (LTaaS) with a configured domain for the test application (in this example, our ingress is configured with testapp.ukfast.co.uk)
  • UKFast CLI

First, we’ll retrieve the ID of the domain we’re testing:

ukfast loadtest domain list

+--------------------------------------+----------------------+ 
|                  ID                  |         NAME         | 
+--------------------------------------+----------------------+ 
| d1d1ee4c-c6bc-464f-ab5f-1d47c3507f70 | testapp.ukfast.co.uk | 
+--------------------------------------+----------------------+
 

 

Next, we’ll retrieve the available scenarios for the load test:

ukfast loadtest scenario list 

+--------------------------------------+---------+--------------------------------+ 
|                  ID                  |  NAME   |          DESCRIPTION           | 
+--------------------------------------+---------+--------------------------------+ 
| 7d541984-d198-4815-aa9a-6cd2894c0090 | Smash   | Increases to maximum users     | 
|                                      |         | by 50% of the run time, then   | 
|                                      |         | maintain until the end         | 
| cb6427fb-b98d-481d-a173-eaa6d15d6703 | Step up | Increase the users in 5 equal  | 
|                                      |         | steps over a period of time    | 
| ce1873a4-048e-44d9-92b4-aab62fd41a42 | Incline | Gradually ramp the users up to | 
|                                      |         | 100% over a period of time     | 
+--------------------------------------+---------+--------------------------------+
 

 

For this test, we’ll use the Smash scenario with 1000 users for a duration of 5m:

ukfast loadtest test create --domain-id d1d1ee4c-c6bc-464f-ab5f-1d47c3507f70 \
  --name testapp-loadtest --protocol https --scenario-id 7d541984-d198-4815-aa9a-6cd2894c0090 \
  --authorisation-name Lee --authorisation-company UKFast --authorisation-position Developer \
  --authorisation-agreement-version v1.0 --duration 5m --number-of-users 1000

+--------------------------------------+------------------+----------+------+-----------------+----------+
|                  ID                  |       NAME       | PROTOCOL | PATH | NUMBER OF USERS | DURATION |
+--------------------------------------+------------------+----------+------+-----------------+----------+
| 18fb4101-45b7-4b22-868c-b7292bb3f388 | testapp-loadtest | https    | /    | 1000            | 00:05:00 |
+--------------------------------------+------------------+----------+------+-----------------+----------+
 

 

We’re now ready to run the test that we’ve just created, as below:

ukfast loadtest job create --test-id 18fb4101-45b7-4b22-868c-b7292bb3f388 --run-now 

+--------------------------------------+---------------------+-------------------+---------+ 
|                  ID                  | JOB START TIMESTAMP | JOB END TIMESTAMP | STATUS  | 
+--------------------------------------+---------------------+-------------------+---------+ 
| 3e6b357b-80b3-4f30-857d-36dba5509c6d |                     |                   | Pending | 
+--------------------------------------+---------------------+-------------------+---------+
 

 

Whilst the test is running, we can verify the functionality of our HPA by monitoring the replica count using kubectl:

watch -n1 kubectl get pod
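Alongside the pod count, we can also watch the HPA object itself to see the observed CPU utilisation and replica count change in real time (again assuming the current kubectl context points at the loadtestdemo namespace):

# Watch the HPA's observed CPU utilisation and current replica count
kubectl get hpa testapp --watch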
 

 

We can also monitor the results of the load test via the UKFast CLI:

ukfast loadtest job results show 3e6b357b-80b3-4f30-857d-36dba5509c6d --graph-virtualusers --graph-latency

[Load test results graph: virtual users and latency over the duration of the test]

As can be seen from the load test results above, the latency of our application remains stable for the majority of the test, with the asterisk (*) on the graph marking the point where our HPA started to scale up the deployment replicas:

kubectl get pod
 
NAME                       READY   STATUS              RESTARTS   AGE 
testapp-7dbd668469-6gw5l   1/1     Running             0          5m3s 
testapp-7dbd668469-bknf6   0/1     ContainerCreating   0          1s 
testapp-7dbd668469-h9xxc   0/1     ContainerCreating   0          1s

 

 

We can also verify the scaling of our deployment by observing events:

kubectl get events
 
10m         Normal    ScalingReplicaSet    deployment/testapp    Scaled up replica set testapp-7dbd668469 to 3

 

 

Once the load test has completed and CPU utilisation drops back below the configured threshold, the HPA automatically scales the deployment replicas back down:

kubectl get events 

62s         Normal    ScalingReplicaSet    deployment/testapp    Scaled down replica set testapp-7dbd668469 to 1
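Note that scale-down is deliberately conservative: the HPA controller waits for a stabilisation window (five minutes by default, governed by the kube-controller-manager flag --horizontal-pod-autoscaler-downscale-stabilization) before removing replicas, which is why the scale-down event appears some time after the load test finishes. The HPA's current state and recent scaling decisions can be inspected at any point:

# Shows current/target CPU utilisation, replica counts and recent scaling events
kubectl describe hpa testapp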
 

 

What next?

That concludes the deployment of a basic HPA into Kubernetes, tested using the UKFast load testing service. Kubernetes HPAs are highly customisable and can be tailored to most application scaling requirements. Custom and external metrics can be used to extend HPAs further, allowing developers to supply their own source of metrics data where required.
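As a rough illustration only, a custom-metric HPA (using the autoscaling/v2beta2 API available at the time of writing) might look something like the sketch below. It assumes a custom metrics adapter such as the Prometheus Adapter is installed and exposing a per-pod metric; the metric name http_requests_per_second is purely hypothetical:

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: testapp
  namespace: loadtestdemo
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: testapp
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second   # hypothetical metric served by a custom metrics adapter
      target:
        type: AverageValue
        averageValue: "100"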

Interested in learning more about containers or Kubernetes HPA? Why not take a look at UKFast’s Dedicated Container Platform.
