Enhancing High Availability in Kubernetes: Strategies for Robust Application Deployment

In this post, I'll explore various strategies to enhance the high availability of your Kubernetes applications. I will walk through step-by-step implementations of key practices like Pod Anti-Affinity, Topology Spread Constraints, Rolling Updates, Health Checks, and Pod Disruption Budgets, among others.

Our objective is to maintain at least three replicas, ensuring each pod is placed in a different zone. This way, if one zone experiences a failure, the remaining two pods in other zones can continue serving traffic: three replicas distributed across three distinct zones.

I will go into more detail throughout this post.

About

This setup will help ensure high availability and resilience across your Kubernetes cluster:

  • 3 Replicas: Ensures high availability.
  • Pod Anti-Affinity & Topology Spread Constraints: Ensures pods are distributed across different zones.
  • Pod Disruption Budget: Ensures only one pod is unavailable at any time.
  • Rolling Update Strategy: Ensures a smooth update process with minimal downtime.
  • Health Checks: Ensures the application is healthy before accepting traffic.
  • Resource Management: Resource requests and limits help in optimal resource utilization and ensure that your application does not consume more than its allocated resources.

Prepare your Nodes

I would like to keep the labels consistent with cloud providers, so I will be adding these labels to my nodes:

  • topology.kubernetes.io/region=za-sktr-1
  • topology.kubernetes.io/zone=za-sktr-1a (one of za-sktr-1a, za-sktr-1b, or za-sktr-1c, depending on the node)

These are my nodes:

kubectl get nodes

node1 # za-sktr-1a
node2 # za-sktr-1b
node3 # za-sktr-1c
node4 # za-sktr-1a
node5 # za-sktr-1b
node6 # za-sktr-1c

Then we can go ahead and label our nodes accordingly:

For node1:

kubectl label node/node1 topology.kubernetes.io/region=za-sktr-1
kubectl label node/node1 topology.kubernetes.io/zone=za-sktr-1a

For node2:

kubectl label node/node2 topology.kubernetes.io/region=za-sktr-1
kubectl label node/node2 topology.kubernetes.io/zone=za-sktr-1b

For node3:

kubectl label node/node3 topology.kubernetes.io/region=za-sktr-1
kubectl label node/node3 topology.kubernetes.io/zone=za-sktr-1c

For node4:

kubectl label node/node4 topology.kubernetes.io/region=za-sktr-1
kubectl label node/node4 topology.kubernetes.io/zone=za-sktr-1a

For node5:

kubectl label node/node5 topology.kubernetes.io/region=za-sktr-1
kubectl label node/node5 topology.kubernetes.io/zone=za-sktr-1b

For node6:

kubectl label node/node6 topology.kubernetes.io/region=za-sktr-1
kubectl label node/node6 topology.kubernetes.io/zone=za-sktr-1c
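
If you have many nodes, you could also apply the same labels with a small shell loop (a sketch that assumes the node-to-zone mapping listed above):

# label each node with its region and zone (same mapping as the manual commands above)
for entry in node1:za-sktr-1a node2:za-sktr-1b node3:za-sktr-1c node4:za-sktr-1a node5:za-sktr-1b node6:za-sktr-1c; do
  node="${entry%%:*}"
  zone="${entry##*:}"
  kubectl label node/"${node}" topology.kubernetes.io/region=za-sktr-1
  kubectl label node/"${node}" topology.kubernetes.io/zone="${zone}"
done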

Now that all our nodes are labeled, we will be able to control the placement of the pods.
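
To confirm the labels were applied, we can list the nodes with the region and zone shown as extra columns:

kubectl get nodes -L topology.kubernetes.io/region -L topology.kubernetes.io/zone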

Deployment Overview

The deployment manifest will have the following features:

  • Replicas is set to 3
  • Rolling update strategy with maxUnavailable: 1 and maxSurge: 1, which ensures that only one pod is unavailable during updates and only one new pod is created at a time.
  • Node affinity and Pod Placement:
    • Pod Anti-Affinity ensures that no two pods of your application are scheduled in the same zone.
    • Topology Spread Constraints balance pod placement evenly across zones, reducing the chance of overloading any single zone.
  • Health checks: the readinessProbe and livenessProbe ensure that a pod only receives traffic once it is ready, and that it is restarted if it becomes unhealthy.
  • Resource Requests and Limits:
    • requests: The minimum resources (CPU and memory) guaranteed to a pod; the scheduler uses these values to decide where the pod can run. In this case, each pod requests 100m (0.1 CPU) and 32Mi of memory.
    • limits: The maximum resources a pod is allowed to use. If a pod exceeds its CPU limit it is throttled, and if it exceeds its memory limit it is killed. Here, CPU is limited to 1000m (1 CPU) and memory to 512Mi.

Key Pod Placement Strategies

Pod Anti-Affinity

Pod Anti-Affinity ensures that Kubernetes avoids placing multiple replicas of a pod in the same zone or on the same node. By setting Anti-Affinity rules, you can spread pods across different zones or nodes to improve fault tolerance.

affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchExpressions:
        - key: app
          operator: In
          values:
          - my-app
      topologyKey: topology.kubernetes.io/zone
  • Explanation: This ensures no two pods of the same application are placed in the same zone, providing resilience in case of zone failures.
  • Downside: If no zone can satisfy the anti-affinity rule, the pod will remain unscheduled; a softer alternative is shown in the sketch below.
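
If you would rather have pods still scheduled when the rule cannot be satisfied, Kubernetes also supports a soft variant via preferredDuringSchedulingIgnoredDuringExecution. This is a minimal sketch and not part of the demo manifest:

affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 100
      podAffinityTerm:
        labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - my-app
        topologyKey: topology.kubernetes.io/zone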

Topology Spread Constraints

Topology Spread Constraints focus on balancing pods evenly across zones or other topological domains. This allows more flexibility compared to Pod Anti-Affinity.

spec:
  template:
    spec:
      topologySpreadConstraints:
      - labelSelector:
          matchLabels:
            app: my-app
        maxSkew: 1
        topologyKey: topology.kubernetes.io/zone
        whenUnsatisfiable: ScheduleAnyway
  • Explanation: This allows a maximum difference of one pod between the most and least populated zones, ensuring balanced distribution. The whenUnsatisfiable: ScheduleAnyway directive ensures that Kubernetes will still schedule pods even if they cannot perfectly satisfy the constraints.
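
Conversely, if you want the spread enforced strictly, you can set whenUnsatisfiable: DoNotSchedule, which leaves pods Pending rather than letting the skew grow beyond maxSkew (again a minimal sketch, not the demo manifest):

topologySpreadConstraints:
- labelSelector:
    matchLabels:
      app: my-app
  maxSkew: 1
  topologyKey: topology.kubernetes.io/zone
  whenUnsatisfiable: DoNotSchedule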

Key Differences

  • Pod Anti-Affinity: strictly ensures that no two pods of the same application are placed in the same zone or node, which is ideal for maximizing fault tolerance.
  • Topology Spread Constraints: focus on balancing pod distribution while allowing flexibility when constraints are unmet, which is more suitable for balancing load across multiple zones.

Can they be used together?

Yes, Kubernetes allows you to define both Pod Anti-Affinity and Topology Spread Constraints within the same deployment manifest. The two mechanisms serve different purposes, and when used together, they provide a combination of strict placement rules (anti-affinity) and more flexible balancing rules (spread constraints).

Why would you use both?

The combination of Pod Anti-Affinity and Topology Spread Constraints can help achieve both:

  1. Strict separation of pods: Ensuring that no two pods are co-located in the same zone or node (using pod anti-affinity).
  2. Balanced distribution: Ensuring that pods are distributed as evenly as possible across the available zones or nodes (using topology spread constraints).

Example Use Case of using Both

Imagine a scenario where you have a stateful or critical application where both separation and balanced load distribution are important:

  • Strict Separation for High Availability: You may want to ensure that certain pods of your application (such as master nodes, database replicas, or API gateways) are never scheduled on the same zone or node to avoid a single-point-of-failure risk. This is where Pod Anti-Affinity comes in.
  • Balanced Load Distribution for Efficiency: At the same time, you want to ensure that your pods are evenly spread across the available zones or nodes to prevent overloading any single part of your infrastructure. This is where Topology Spread Constraints are useful.

When Would This Be Advisable?

This combination is ideal in scenarios where:

  • Critical Applications: If you're running critical workloads, such as databases or stateful services, that must always remain available even during node or zone failures.
  • High-Availability Clusters: In multi-zone clusters where you want pods spread evenly across zones for fault tolerance and also to distribute load efficiently across nodes or zones.
  • Clusters with Varying Resources: In environments where resources are distributed unevenly across zones (e.g. certain zones might have more capacity), you can use Topology Spread Constraints to ensure a more even distribution while also using Pod Anti-Affinity to avoid pod co-location in risky zones.

Scenario Where This Might Be Used

Suppose you're running a high-availability service where you want each pod to be:

  • In different zones (to prevent a zone failure from taking down more than one replica).
  • Evenly distributed across nodes in each zone to avoid overloading a single node or zone with too many pods.

In this case, using Pod Anti-Affinity ensures strict separation across zones, and Topology Spread Constraints ensure the overall distribution remains balanced across the available nodes or zones.

Example Manifest

Here's how you can combine both strategies in a single deployment:

affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchExpressions:
        - key: app
          operator: In
          values:
          - my-app
      topologyKey: topology.kubernetes.io/zone
topologySpreadConstraints:
- labelSelector:
    matchLabels:
      app: my-app
  maxSkew: 1
  topologyKey: topology.kubernetes.io/zone
  whenUnsatisfiable: ScheduleAnyway

In this example (both blocks go under spec.template.spec of the Deployment):

  • Pod Anti-Affinity ensures that no two pods are placed in the same zone.
  • Topology Spread Constraints ensure that pods are evenly distributed across all zones (with a maximum skew of 1 between the most and least populated zones).

Caveats of Using Both

  1. Complex Scheduling: Combining both strategies may result in more complex scheduling decisions for Kubernetes, which could lead to pod scheduling delays if there are insufficient resources to meet both constraints.
  2. Scheduling Failures: If the resources in a zone or node are too constrained to satisfy the Pod Anti-Affinity rules, pods might not be scheduled. Similarly, overly strict Topology Spread Constraints could prevent Kubernetes from balancing load properly.

When to Avoid Using Both

If your application can tolerate slight imbalances in pod distribution or does not require strict separation, using only Topology Spread Constraints might suffice. If you require strict pod separation but don't care as much about perfect balance, Pod Anti-Affinity alone may be enough.

Conclusion

Using Pod Anti-Affinity and Topology Spread Constraints together can be highly beneficial when you need both strict pod separation and balanced distribution for your high-availability applications. However, it’s essential to balance the complexity of scheduling and ensure your cluster has enough resources to meet both sets of constraints.

Demo Deployment

First we need to define manifests/deployment.yaml that specifies multiple Replicas, Resource Requests, Limits and Pod Anti-Affinity:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
      maxSurge: 1
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: my-container
        image: ruanbekker/hostname:latest
        ports:
        - containerPort: 8000
        resources:
          requests:
            memory: "32Mi"
            cpu: "100m"
          limits:
            memory: "512Mi"
            cpu: "1000m"
        readinessProbe:
          httpGet:
            path: /
            port: 8000
          initialDelaySeconds: 5
          periodSeconds: 10
        livenessProbe:
          httpGet:
            path: /
            port: 8000
          initialDelaySeconds: 15
          periodSeconds: 20
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - my-app
            topologyKey: topology.kubernetes.io/zone

Next we define the Pod Disruption Budget in manifests/pdb.yaml. With minAvailable: 2 and three replicas, it ensures that only one pod can be voluntarily disrupted at a time:

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: my-app

Next we will create the namespace where we will deploy our resources into:

kubectl create ns ha

Then deploy the resources into the namespace:

kubectl apply -f manifests/ -n ha
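
We can optionally wait for the rollout to complete:

kubectl rollout status deployment/my-app -n ha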

Then we can view our pods with the output set to wide, so that we can see onto which nodes our pods were placed:

kubectl get pods -n ha -o wide

NAME                      READY   STATUS    RESTARTS   AGE   IP            NODE     NOMINATED NODE   READINESS GATES
my-app-556c5c74cd-d2lnb   1/1     Running   0          60s   10.42.7.229   node2    <none>           <none>
my-app-556c5c74cd-rh977   1/1     Running   0          60s   10.42.3.204   node4    <none>           <none>
my-app-556c5c74cd-s9tnd   1/1     Running   0          60s   10.42.6.6     node6    <none>           <none>
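
We can also confirm that the Pod Disruption Budget is in place (the output below is illustrative):

kubectl get pdb -n ha

NAME         MIN AVAILABLE   MAX UNAVAILABLE   ALLOWED DISRUPTIONS   AGE
my-app-pdb   2               N/A               1                     60s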

Rolling Updates
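
The revision change itself is not shown here; one way to trigger a rollout is to restart the deployment (any change to the pod template, such as a new image tag, has the same effect):

kubectl rollout restart deployment/my-app -n ha
kubectl get pods -n ha -o wide --watch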

After we have deployed a new revision, we can see how our pods were replaced one at a time:

NAME                      READY   STATUS              RESTARTS   AGE     IP            NODE     NOMINATED NODE   READINESS GATES
my-app-556c5c74cd-bbzt7   0/1     ContainerCreating   0          3s      <none>        node6    <none>           <none>
my-app-556c5c74cd-vngvj   0/1     Pending             0          3s      <none>        <none>   <none>           <none>
my-app-78f5cfd9ff-9k8z6   1/1     Running             0          7m15s   10.42.7.217   node2    <none>           <none>
my-app-78f5cfd9ff-wqsd5   1/1     Running             0          7m15s   10.42.3.201   node4    <none>           <none>

NAME                      READY   STATUS    RESTARTS   AGE     IP            NODE     NOMINATED NODE   READINESS GATES
my-app-556c5c74cd-bbzt7   0/1     Running   0          9s      10.42.6.4     node6    <none>           <none>
my-app-556c5c74cd-vngvj   0/1     Pending   0          9s      <none>        <none>   <none>           <none>
my-app-78f5cfd9ff-9k8z6   1/1     Running   0          7m21s   10.42.7.217   node2    <none>           <none>
my-app-78f5cfd9ff-wqsd5   1/1     Running   0          7m21s   10.42.3.201   node4    <none>           <none>

NAME                      READY   STATUS              RESTARTS   AGE     IP            NODE     NOMINATED NODE   READINESS GATES
my-app-556c5c74cd-bbzt7   1/1     Running             0          15s     10.42.6.4     node6    <none>           <none>
my-app-556c5c74cd-fzj7s   0/1     Pending             0          2s      <none>        <none>   <none>           <none>
my-app-556c5c74cd-vngvj   0/1     ContainerCreating   0          15s     <none>        node4    <none>           <none>
my-app-78f5cfd9ff-9k8z6   1/1     Running             0          7m27s   10.42.7.217   node2    <none>           <none>

NAME                      READY   STATUS              RESTARTS   AGE   IP            NODE     NOMINATED NODE   READINESS GATES
my-app-556c5c74cd-bbzt7   1/1     Running             0          26s   10.42.6.4     node6    <none>           <none>
my-app-556c5c74cd-fzj7s   0/1     ContainerCreating   0          13s   <none>        node2    <none>           <none>
my-app-556c5c74cd-vngvj   1/1     Running             0          26s   10.42.3.202   node4    <none>           <none>

NAME                      READY   STATUS    RESTARTS   AGE     IP            NODE     NOMINATED NODE   READINESS GATES
my-app-556c5c74cd-bbzt7   1/1     Running   0          2m14s   10.42.6.4     node6    <none>           <none>
my-app-556c5c74cd-fzj7s   1/1     Running   0          2m1s    10.42.7.227   node2    <none>           <none>
my-app-556c5c74cd-vngvj   1/1     Running   0          2m14s   10.42.3.202   node4    <none>           <none>

This way you can see that our pods are replaced in a safe and controlled manner.
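
Once the rollout completes, we can also review the revision history of the deployment:

kubectl rollout history deployment/my-app -n ha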

Thank You

Thanks for reading, if you like my content, feel free to check out my website, and subscribe to my newsletter or follow me at @ruanbekker on Twitter.
