Enhancing High Availability in Kubernetes: Strategies for Robust Application Deployment
Ruan Bekker (@ruanbekker)
In this post, I'll explore various strategies to enhance the high availability of your Kubernetes applications. We'll walk through step-by-step implementations of key practices like Pod Anti-Affinity, Rolling Updates, Health Checks, and Pod Disruption Budgets, among others.
Our objective is to maintain at least three replicas, with each pod placed in a different zone. This way, if one zone experiences a failure, the remaining two pods in the other zones can continue serving traffic. I will go into more detail on how to achieve this throughout the post.
About
This setup will help ensure high availability and resilience across your Kubernetes cluster:
- 3 Replicas: Ensures high availability.
- Pod Anti-Affinity: Ensures pods are distributed across different zones.
- Pod Disruption Budget: Ensures at most one pod is unavailable at a time during voluntary disruptions.
- Rolling Update Strategy: Ensures a smooth update process with minimal downtime.
- Health Checks: Ensures the application is healthy before accepting traffic.
- Resource Management: Resource requests and limits help in optimal resource utilization and ensure that your application does not consume more than its allocated resources.
Prepare your Nodes
I would like to keep the labels consistent with cloud providers, so I will be adding these labels to my nodes:
topology.kubernetes.io/region=za-sktr-1
topology.kubernetes.io/zone=za-sktr-1a (one of za-sktr-1a, za-sktr-1b, za-sktr-1c)
These are my nodes:
kubectl get nodes
node1 # za-sktr-1a
node2 # za-sktr-1b
node3 # za-sktr-1c
node4 # za-sktr-1a
node5 # za-sktr-1b
node6 # za-sktr-1c
Then we can go ahead and label our nodes accordingly:
For node1:
kubectl label node/node1 topology.kubernetes.io/region=za-sktr-1
kubectl label node/node1 topology.kubernetes.io/zone=za-sktr-1a
For node2:
kubectl label node/node2 topology.kubernetes.io/region=za-sktr-1
kubectl label node/node2 topology.kubernetes.io/zone=za-sktr-1b
For node3:
kubectl label node/node3 topology.kubernetes.io/region=za-sktr-1
kubectl label node/node3 topology.kubernetes.io/zone=za-sktr-1c
For node4:
kubectl label node/node4 topology.kubernetes.io/region=za-sktr-1
kubectl label node/node4 topology.kubernetes.io/zone=za-sktr-1a
For node5:
kubectl label node/node5 topology.kubernetes.io/region=za-sktr-1
kubectl label node/node5 topology.kubernetes.io/zone=za-sktr-1b
For node6:
kubectl label node/node6 topology.kubernetes.io/region=za-sktr-1
kubectl label node/node6 topology.kubernetes.io/zone=za-sktr-1c
Now that all our nodes are labeled, we will be able to control the placement of the pods.
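To double-check the labels, you can list the nodes with the region and zone labels shown as columns:
kubectl get nodes -L topology.kubernetes.io/region -L topology.kubernetes.io/zone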
Deployment Overview
The deployment manifest will have the following features:
- Replicas is set to 3.
- Rolling update strategy with maxUnavailable: 1 and maxSurge: 1, which ensures that only one pod is unavailable during updates and only one new pod is created at a time.
- Pod anti-affinity to ensure pods are placed in different zones. The affinity section uses podAntiAffinity so that no two pods of the application are scheduled in the same zone.
- Health checks: the readinessProbe and livenessProbe ensure that a pod is only considered available if it passes these checks.
- Resource requests and limits: requests define the minimum resources (CPU and memory) that the Kubernetes scheduler will reserve for a pod; in this case each pod requests 100m (0.1 CPU) and 32Mi (32 MB of memory). limits define the maximum resources a pod is allowed to use; if a pod exceeds these limits it may be throttled or killed. Here, the CPU is limited to 1000m (1 CPU) and memory to 512Mi (512 MB).
Define our Resources
First we need to define manifests/deployment.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
      maxSurge: 1
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-container
          image: ruanbekker/hostname:latest
          ports:
            - containerPort: 8000
          resources:
            requests:
              memory: "32Mi"
              cpu: "100m"
            limits:
              memory: "512Mi"
              cpu: "1000m"
          readinessProbe:
            httpGet:
              path: /
              port: 8000
            initialDelaySeconds: 5
            periodSeconds: 10
          livenessProbe:
            httpGet:
              path: /
              port: 8000
            initialDelaySeconds: 15
            periodSeconds: 20
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: app
                    operator: In
                    values:
                      - my-app
              topologyKey: topology.kubernetes.io/zone
The Pod Disruption Budget ensures that at most one pod can be unavailable at a time during voluntary disruptions, which we can define in manifests/pdb.yaml:
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: my-app
Next we will create the namespace that we will deploy our resources into:
kubectl create ns ha
Then deploy the resources into the namespace:
kubectl apply -f manifests/ -n ha
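If you want to wait until the deployment has fully rolled out before inspecting the pods, the following will block until all replicas are ready:
kubectl rollout status deployment/my-app -n ha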
Then we can view our pods with wide output so that we can see onto which nodes our pods were placed:
kubectl get pods -n ha -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
my-app-556c5c74cd-d2lnb 1/1 Running 0 60s 10.42.7.229 node2 <none> <none>
my-app-556c5c74cd-rh977 1/1 Running 0 60s 10.42.3.204 node4 <none> <none>
my-app-556c5c74cd-s9tnd 1/1 Running 0 60s 10.42.6.6 node6 <none> <none>
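We can also confirm that the Pod Disruption Budget is active and see how many voluntary disruptions are currently allowed:
kubectl get pdb my-app-pdb -n ha
With 3 replicas and minAvailable: 2, the allowed disruptions should show as 1 once all pods are ready.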
Rolling Updates
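To trigger a new revision, you can for example point the deployment at a new image tag (the v2 tag below is just a placeholder) or restart the rollout without changing the image, and then watch the pods with kubectl get pods -n ha -o wide -w:
kubectl set image deployment/my-app my-container=ruanbekker/hostname:v2 -n ha
# or, without changing the image:
kubectl rollout restart deployment/my-app -n ha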
After we have deployed a new revision, we can see how our pods were replaced one at a time:
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
my-app-556c5c74cd-bbzt7 0/1 ContainerCreating 0 3s <none> node6 <none> <none>
my-app-556c5c74cd-vngvj 0/1 Pending 0 3s <none> <none> <none> <none>
my-app-78f5cfd9ff-9k8z6 1/1 Running 0 7m15s 10.42.7.217 node2 <none> <none>
my-app-78f5cfd9ff-wqsd5 1/1 Running 0 7m15s 10.42.3.201 node4 <none> <none>
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
my-app-556c5c74cd-bbzt7 0/1 Running 0 9s 10.42.6.4 node6 <none> <none>
my-app-556c5c74cd-vngvj 0/1 Pending 0 9s <none> <none> <none> <none>
my-app-78f5cfd9ff-9k8z6 1/1 Running 0 7m21s 10.42.7.217 node2 <none> <none>
my-app-78f5cfd9ff-wqsd5 1/1 Running 0 7m21s 10.42.3.201 node4 <none> <none>
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
my-app-556c5c74cd-bbzt7 1/1 Running 0 15s 10.42.6.4 node6 <none> <none>
my-app-556c5c74cd-fzj7s 0/1 Pending 0 2s <none> <none> <none> <none>
my-app-556c5c74cd-vngvj 0/1 ContainerCreating 0 15s <none> node4 <none> <none>
my-app-78f5cfd9ff-9k8z6 1/1 Running 0 7m27s 10.42.7.217 node2 <none> <none>
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
my-app-556c5c74cd-bbzt7 1/1 Running 0 26s 10.42.6.4 node6 <none> <none>
my-app-556c5c74cd-fzj7s 0/1 ContainerCreating 0 13s <none> node2 <none> <none>
my-app-556c5c74cd-vngvj 1/1 Running 0 26s 10.42.3.202 node4 <none> <none>
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
my-app-556c5c74cd-bbzt7 1/1 Running 0 2m14s 10.42.6.4 node6 <none> <none>
my-app-556c5c74cd-fzj7s 1/1 Running 0 2m1s 10.42.7.227 node2 <none> <none>
my-app-556c5c74cd-vngvj 1/1 Running 0 2m14s 10.42.3.202 node4 <none> <none>
This shows that our pods are replaced in a safe and controlled manner.
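If you want to see the Pod Disruption Budget in action, you could drain one of the nodes that is running a pod (for example node2 in my case); the eviction will only proceed while minAvailable: 2 is satisfied, and you can uncordon the node again afterwards:
kubectl drain node2 --ignore-daemonsets
kubectl uncordon node2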
Thank You
Thanks for reading, if you like my content, feel free to check out my website, and subscribe to my newsletter or follow me at @ruanbekker on Twitter.
- Linktree: https://go.ruan.dev/links
- Patreon: https://go.ruan.dev/patreon