Setting up Kubernetes cluster autoscaling

This guide takes you through setting up your Brightbox Kubernetes cluster to automatically add new capacity when required.


You’ll need to have deployed a Kubernetes cluster with our Terraform configuration.

Configure the cluster to autoscale

Our Terraform manifests automatically deploy the Brightbox Kubernetes cluster autoscaler, so all we need to do is configure the maximum number of new workers that can be built. Just set the Terraform variable worker_max in your local.auto.tfvars:

worker_max = 5

Then run terraform apply, which configures the cluster with the new setting:

$ terraform apply


Apply complete! Resources: 1 added, 1 changed, 1 destroyed.

And we’re ready to scale up an application!

Connect to your Kubernetes cluster

If you’re using our Terraform configuration, the master output is the public IP address of the Kubernetes master server. You can SSH into this server using your SSH key:

$ ssh ubuntu@$(terraform output master)
Welcome to Ubuntu 18.04.4 LTS (GNU/Linux 4.15.0-111-generic x86_64)
Last login: Thu Jul 16 09:48:47 2020 from

And use kubectl on the master to inspect the cluster. In this particular example cluster, I have 1 master node and just 1 worker node:

$ kubectl get nodes
NAME        STATUS   ROLES    AGE     VERSION
srv-4dbz0   Ready    master   16m     v1.18.5
srv-hrgv1   Ready    worker   14m     v1.18.5

Deploy an example application

We’ll deploy a basic hello world application to play with here. First create a namespace for it:

$ kubectl create namespace example

Then create the Deployment in a file named hello-world-deployment.yaml. We’ll specifically request 512MiB of RAM per pod:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-world
  namespace: example
spec:
  replicas: 1
  selector:
    matchLabels:
      app: hello-world
  template:
    metadata:
      labels:
        app: hello-world
    spec:
      containers:
        - name: app
          image: brightbox/rails-hello-world
          ports:
            - name: web
              containerPort: 3000
              protocol: TCP
          resources:
            requests:
              memory: "512Mi"

And apply it:

$ kubectl apply -f hello-world-deployment.yaml
deployment.apps/hello-world configured
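
Before inspecting the deployment, it can help to wait until the rollout has actually finished. The following is a minimal sketch: wait_for_rollout is a hypothetical helper name and the 120s default timeout is an assumption, while kubectl rollout status and its --timeout flag are standard kubectl.

```shell
#!/bin/sh
# Block until a Deployment has finished rolling out, or fail after a timeout.
# wait_for_rollout is a hypothetical helper; it only wraps "kubectl rollout status".
wait_for_rollout() {
  namespace="$1"
  deployment="$2"
  timeout="${3:-120s}"   # assumed default, adjust to suit your cluster
  kubectl -n "$namespace" rollout status "deployment/$deployment" --timeout="$timeout"
}

# Usage: wait_for_rollout example hello-world
```

The function returns kubectl's exit status, so it can gate later steps in a script.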

Inspect the deployment

So now we can see the hello-world deployment on this cluster is set to just 1 replica, which means there is only 1 pod running the application:

$ kubectl -n example get deployments
NAME          READY   UP-TO-DATE   AVAILABLE   AGE
hello-world   1/1     1            1           3m

$ kubectl -n example get pods -o wide
NAME                           READY   STATUS    RESTARTS   AGE   IP               NODE        NOMINATED NODE   READINESS GATES
hello-world-5f48c6bb68-drw4n   1/1     Running   0          38s   srv-hrgv1   <none>           <none>

Scale up the deployment

Now let’s scale up the deployment. First, let’s increase the replicas to 2:

$ kubectl -n example scale --replicas=2 deployment/hello-world
deployment.apps/hello-world scaled

$ kubectl -n example get pods -o wide
NAME                           READY   STATUS    RESTARTS   AGE     IP               NODE        NOMINATED NODE   READINESS GATES
hello-world-5f48c6bb68-drw4n   1/1     Running   0          8m48s   srv-hrgv1   <none>           <none>
hello-world-5f48c6bb68-fxtnc   1/1     Running   0          3s   srv-hrgv1   <none>           <none>

Here we can see that an additional pod was created for the application, but since there was room on the existing worker (srv-hrgv1), the Kubernetes scheduler simply placed it there. Let’s scale it up a bit further, beyond the existing capacity of the cluster:

$ kubectl -n example scale --replicas=4 deployment/hello-world
deployment.apps/hello-world scaled

$ kubectl -n example get pods -o wide
NAME                           READY   STATUS    RESTARTS   AGE     IP               NODE        NOMINATED NODE   READINESS GATES
hello-world-5f48c6bb68-drw4n   1/1     Running   0          10m   srv-hrgv1   <none>           <none>
hello-world-5f48c6bb68-fxtnc   1/1     Running   0          2m10s   srv-hrgv1   <none>           <none>
hello-world-5f48c6bb68-c2mjx   0/1     Pending   0          4s      <none>           <none>      <none>           <none>
hello-world-5f48c6bb68-qrv2h   0/1     Pending   0          4s      <none>           <none>      <none>           <none>

Now we can see two additional pods are listed in Pending state, as there is no capacity available for them. This is where the autoscaler kicks in. If we give it a couple of minutes and check the pod status again, we’ll see they’ve been allocated to a brand new node (srv-e4isv):

$ kubectl -n example get pods -o wide
NAME                           READY   STATUS    RESTARTS   AGE     IP               NODE        NOMINATED NODE   READINESS GATES
hello-world-5f48c6bb68-drw4n   1/1     Running   0          15m   srv-hrgv1   <none>           <none>
hello-world-5f48c6bb68-fxtnc   1/1     Running   0          7m4s   srv-hrgv1   <none>           <none>
hello-world-5f48c6bb68-c2mjx   1/1     Running   0          4m58s    srv-e4isv   <none>           <none>
hello-world-5f48c6bb68-qrv2h   1/1     Running   0          4m58s    srv-e4isv   <none>           <none>

And indeed there is a new node in the cluster:

$ kubectl get nodes
NAME        STATUS   ROLES    AGE     VERSION
srv-4dbz0   Ready    master   5h41m   v1.18.5
srv-e4isv   Ready    <none>   115m    v1.18.5
srv-hrgv1   Ready    worker   5h39m   v1.18.5
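
While waiting for the new node, we can list only the pods that are still Pending instead of scanning the full table. This is a small sketch: pending_pods is a hypothetical helper name, and it relies on the standard status.phase field selector. Note that the Pending phase also covers pods that have been scheduled but whose containers have not started yet, so a non-empty result does not always mean a lack of capacity.

```shell
#!/bin/sh
# Print the names of pods in the Pending phase for a namespace.
# pending_pods is a hypothetical helper around "kubectl get pods".
pending_pods() {
  namespace="$1"
  kubectl -n "$namespace" get pods --field-selector=status.phase=Pending -o name
}

# Usage: pending_pods example
```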

We can have a peek behind the curtain by inspecting the logs for the cluster autoscaler deployment over in the kube-system namespace. It detected that some pods needed scheduling, figured out how many new workers were needed and built them:

I0716 13:19:14.581283       1 scale_up.go:322] Pod example/hello-world-5f48c6bb68-qrv2h is unschedulable
I0716 13:19:14.581486       1 scale_up.go:322] Pod example/hello-world-5f48c6bb68-c2mjx is unschedulable
I0716 13:19:14.581852       1 scale_up.go:452] Best option to resize: grp-v28su
I0716 13:19:14.581964       1 scale_up.go:456] Estimated 1 nodes needed in grp-v28su
I0716 13:19:14.582073       1 scale_up.go:569] Final scale-up plan: [{grp-v28su 1->2 (max: 5)}]
I0716 13:19:14.582156       1 scale_up.go:658] Scale-up: setting group grp-v28su size to 2
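
The autoscaler logs are verbose, so it can be handy to pull out just the scale-up decisions. The sketch below filters log lines on standard input, based on the message format shown above. scale_up_plans is a hypothetical helper, and the deployment name cluster-autoscaler in the usage comment is an assumption; check the real name with kubectl -n kube-system get deployments.

```shell
#!/bin/sh
# Read autoscaler log lines on stdin and print only the final scale-up plans,
# with the timestamp and source-location prefix stripped.
# scale_up_plans is a hypothetical helper; the message text is taken from the logs above.
scale_up_plans() {
  sed -n 's/^.*] Final scale-up plan: //p'
}

# Usage (the deployment name is an assumption):
#   kubectl -n kube-system logs deployment/cluster-autoscaler | scale_up_plans
```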

Scale the deployment down

And when we’re done, we can scale the application back down to 1 pod and the autoscaler will remove the servers it built after a few minutes, if they’re no longer needed:

$ kubectl -n example scale --replicas=1 deployment/hello-world
deployment.apps/hello-world scaled

Again, the logs from the autoscaler:

I0716 15:30:59.513137       1 scale_down.go:790] srv-e4isv was unneeded for 10m9.233169593s
I0716 15:30:59.513258       1 scale_down.go:1053] Scale-down: removing empty node srv-e4isv
I0716 15:30:59.519680       1 delete.go:103] Successfully added ToBeDeletedTaint on node srv-e4isv

Terraform vs. autoscaler

So here we got a new worker server built by the autoscaler, outside of Terraform, so Terraform knows nothing about it; it’s entirely managed by the autoscaler. The autoscaler, in turn, knows which workers are managed by Terraform and won’t ever touch those, to avoid stepping on any toes.

You should consider the worker nodes built by Terraform as static and the worker nodes built by the autoscaler as dynamic.
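
In the example output earlier, the node built by the autoscaler showed <none> in the ROLES column, while the Terraform-built worker showed worker. Assuming that holds on your cluster too (it is only an observation from this example, not a documented guarantee), a rough way to list the dynamic nodes is to filter on that column. nodes_without_role is a hypothetical helper name.

```shell
#!/bin/sh
# Read "kubectl get nodes" output on stdin (columns: NAME STATUS ROLES AGE VERSION)
# and print the names of nodes whose ROLES column is "<none>".
# Heuristic only: assumes autoscaler-built nodes carry no role label.
nodes_without_role() {
  awk '$3 == "<none>" { print $1 }'
}

# Usage: kubectl get nodes --no-headers | nodes_without_role
```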

Fully automated scaling

This is convenient, but you don’t want to have to scale deployments up and down manually whenever your system’s load changes. To fully automate scaling, something needs to monitor the load and adjust the replicas for your deployments automatically. That something is the Horizontal Pod Autoscaler, which is a topic for a future post.

Last updated: 16 Jul 2020 at 15:52 UTC

