By Saiyam Pathak
Director of Technical Evangelism


Learn all about Kubernetes deployments and how the process of deployment works.


Introduction to deployments

In this video, we'll describe how a deployment works. Whenever you create a deployment, you specify the number of replicas and, in the spec section, the image you want to run. Kubernetes then deploys that many pods inside the cluster. So, if you have specified three replicas, three pods are created, and if there is a service on top of them, it starts sending traffic to those pods. Now, what happens when you want to change the image tag, or maybe choose a different image? In this case, the command will be:

    kubectl set image deployment/demo nginx=nginx:1.15.0 --record

Here, demo is the name of the deployment, and nginx on the left of the equals sign is the name of the container whose image is being changed.
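To make the starting point concrete, here is a minimal sketch of the kind of deployment described above, built as a Python dict and printed as JSON (kubectl accepts JSON manifests as well as YAML). The deployment name demo and the container name nginx are taken from the command above; the starting image tag nginx:1.14.2, the app label, and the helper name build_deployment are assumptions made only for illustration.

```python
import json


def build_deployment(name, image, replicas):
    """Return a minimal apps/v1 Deployment manifest as a plain dict."""
    labels = {"app": name}  # assumed label; the selector must match the pod labels
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,  # how many pods Kubernetes should keep running
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    # The container name is what `kubectl set image` refers to.
                    "containers": [{"name": "nginx", "image": image}]
                },
            },
        },
    }


# Hypothetical starting state: three replicas of an older nginx tag.
manifest = build_deployment("demo", "nginx:1.14.2", 3)
print(json.dumps(manifest, indent=2))
```

Saving the printed JSON to a file and running kubectl apply -f on it would be one way to create such a deployment before trying the image change.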

How does the process of Kubernetes deployment work?

Now, let's see how the process works internally. First, a fourth pod gets created with the new image, and Kubernetes checks its liveness and readiness probes, if it has them. If the liveness and readiness checks pass, the pod is added to the load balancer, and traffic starts flowing to the new pod. Now, what happens to the previous pods once this pod is ready and added to the load balancer? One of the previous pods gets deleted, but that pod may still be serving some traffic. Hence, whenever a pod is deleted, it gets a termination grace period. The termination grace period is 30 seconds by default, and you can change it in the deployment. During those 30 seconds, requests that are already in flight can finish, while the pod is removed from the load balancer so that no new requests go to it. Once the grace period is over, the pod is finally deleted, and its share of the traffic goes to the newer pod.
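The add-one, remove-one sequence can be sketched as a small simulation. This is a simplified model and not how the Kubernetes controller is implemented: it assumes every new pod becomes ready immediately, and the names rolling_update, surge, and unavailable are hypothetical. Here surge is how many extra pods may exist during the update, and unavailable is how many pods may be missing.

```python
def rolling_update(replicas, surge=1, unavailable=0):
    """Return the (old_pods, new_pods) counts after each step of a rollout."""
    if min(surge, unavailable) < 0:
        raise ValueError("surge and unavailable must not be negative")
    if surge == 0 and unavailable == 0:
        raise ValueError("surge and unavailable cannot both be zero")
    old, new = replicas, 0
    steps = [(old, new)]
    while old > 0 or new < replicas:
        # Start new pods, never exceeding replicas + surge pods in total.
        to_add = min(replicas + surge - (old + new), replicas - new)
        if to_add > 0:
            new += to_add
            steps.append((old, new))
        # Delete old pods, never dropping below replicas - unavailable pods.
        to_remove = min(old, old + new - (replicas - unavailable))
        if to_remove > 0:
            old -= to_remove
            steps.append((old, new))
    return steps


for old, new in rolling_update(3):
    print(f"old pods: {old}, new pods: {new}")
```

With three replicas, the steps go from (3, 0) to (3, 1), then (2, 1), and so on until (0, 3), so there are never fewer than three pods serving traffic and never more than four pods in total.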

Now, the next pod with the newer image spins up, and again, when its liveness and readiness checks pass, it gets added to the load balancer, and another of the previous pods gets removed. After its termination grace period, that pod is also deleted from the cluster. This repeats until every pod runs the new image. This is how the zero-downtime upgrade works. This is the rolling update strategy, and you can specify maxSurge, which is the maximum number of pods that can be created above the desired replica count during the update. You can also specify maxUnavailable, which is the maximum number of pods that can be unavailable during the update. By default, both are 25%. You can also specify minReadySeconds, which is how many seconds a new pod has to stay ready before it counts as available and the rollout moves on. That's it about how the deployment works internally.
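As a worked example of what 25% means for a small deployment, the sketch below turns the two settings into pod counts. It relies on the documented rounding rule that a maxSurge percentage is rounded up and a maxUnavailable percentage is rounded down; the function names resolve and rollout_bounds are hypothetical and exist only for this illustration.

```python
def resolve(value, replicas, round_up):
    """Turn a setting such as "25%" or 2 into an absolute number of pods."""
    if isinstance(value, str) and value.endswith("%"):
        scaled = int(value[:-1]) * replicas  # percentage times replicas, in hundredths
        # Integer arithmetic avoids floating-point rounding surprises.
        return -(-scaled // 100) if round_up else scaled // 100
    return int(value)


def rollout_bounds(replicas, max_surge="25%", max_unavailable="25%"):
    """Return the pod-count limits that apply while the update is running."""
    surge = resolve(max_surge, replicas, round_up=True)
    unavailable = resolve(max_unavailable, replicas, round_up=False)
    return {
        "max_total_pods": replicas + surge,
        "min_available_pods": replicas - unavailable,
    }


print(rollout_bounds(3))
print(rollout_bounds(10))
```

For three replicas, 25% of 3 is 0.75, which rounds up to one surge pod and down to zero unavailable pods. That is why the walkthrough above starts by creating a fourth pod before any of the old ones is removed.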

Thank you for watching, see you in the next lecture.
