Module 5 - Scale up your app

Difficulty: Beginner Estimated Time: 10 minutes

The goal of this scenario is to scale a deployment with kubectl scale and to see the load balancing in action

Step 1 - Scaling a deployment

First, let’s list the deployments using the get deployment command:

kubectl get deployments

The output should be similar to:

NAME                  READY   UP-TO_DATE   AVAILABLE   AGE
kubernetes-bootcamp   1/1     1            1           11m

We should have 1 Pod. If not, run the command again. This shows:

NAME lists the names of the Deployments in the cluster.
READY shows the ratio of CURRENT/DESIRED replicas
UP-TO-DATE displays the number of replicas that have been updated to achieve the desired state.
AVAILABLE displays how many replicas of the application are available to your users.
AGE displays the amount of time that the application has been running.

To see the ReplicaSet created by the Deployment, run:

kubectl get rs

Notice that the name of the ReplicaSet is always formatted as [DEPLOYMENT-NAME]-[RANDOM-STRING]. The random string is randomly generated and uses the pod-template-hash as a seed.

Two important columns of this command are:

DESIRED displays the desired number of replicas of the application, which you define when you create the Deployment. This is the desired state.
CURRENT displays how many replicas are currently running.

Next, let’s scale the Deployment to 4 replicas. We’ll use the kubectl scale command, following by the deployment type, name and desired number of instances:

kubectl scale deployments/kubernetes-bootcamp --replicas=4

To list your Deployments once again, use get deployments:

kubectl get deployments

The change was applied, and we have 4 instances of the application available. Next, let’s check if the number of Pods changed:

kubectl get pods -o wide

There are 4 Pods now, with different IP addresses. The change was registered in the Deployment events log. The check that, use the describe command:

kubectl describe deployments/kubernetes-bootcamp

You can also view in the output of this command that there are 4 replicas now.

Step 2 - Load Balancing

Let’s check that the Service is load-balancing the traffic. To find out the exposed IP and Port we can use the describe service as we learned in the previous Module:

kubectl describe services/kubernetes-bootcamp

Note for Docker Desktop users: Due to Docker Desktop networking limitations, by default you’re unable to access pods directly from the host. Run minikube service kubernetes-bootcamp, this will create a SSH tunnel from the pod to your host and open a window in your default browser that’s connected to the service. Refresh the browser page to see the load-balancing working. The tunnel can be terminated by pressing control-C, then continue on to Step 3.

Create an environment variable called NODE_PORT that has a value as the Node port:

export NODE_PORT=$(kubectl get services/kubernetes-bootcamp -o go-template='{{(index .spec.ports 0).nodePort}}')
echo NODE_PORT=$NODE_PORT

Next, we’ll do a curl to the exposed IP and port. Execute the command multiple times:

curl $(minikube ip):$NODE_PORT

We hit a different Pod with every request. This demonstrates that the load-balancing is working.

Step 3 - Scale Down

To scale down the Service to 2 replicas, run again the scale command:

kubectl scale deployments/kubernetes-bootcamp --replicas=2

List the Deployments to check if the change was applied with the get deployments command:

kubectl get deployments

The number of replicas decreased to 2. List the number of Pods, with get pods:

kubectl get pods -o wide

This confirms that 2 Pods were terminated.

Feedback

Was this page helpful?

Glad to hear it! Please tell us how we can improve.

Sorry to hear that. Please tell us how we can improve.

Last modified August 11, 2024: Fixed typo (3b2576d89)