Kubernetes High Availability Best Practices: Mastering Pod Distribution
When running applications in Kubernetes, ensuring high availability isn’t just about having multiple replicas; it’s about intelligently distributing those replicas across your infrastructure. In this blog post, we’ll dive deep into advanced pod distribution strategies that help maintain application resilience and optimal performance. Assuming that Kubernetes will always keep these replicas apart automatically is a big mistake: unless pod distribution is configured explicitly, the scheduler can place several replicas of the same application on the same node.
Understanding Pod Distribution Challenges
Before we explore solutions, let’s understand the challenges:
- Multiple pods of the same application could end up on the same node, creating a single point of failure
- Uneven pod distribution across zones can lead to degraded performance during zone failures
- Resource contention between co-located pods can impact application performance
- Network latency between pods in different regions can affect application response times
Pod Anti-Affinity: Keeping Pods Apart
Pod anti-affinity is one of the most powerful tools for ensuring high availability. It allows you to define rules that prevent pods from being scheduled together based on specified criteria. The diagram below shows what soft and hard anti-affinity do.

Here’s an example of how to implement hard pod anti-affinity:
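apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-server
spec:
  replicas: 3
  # apps/v1 requires a selector that matches the pod template labels
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      affinity:
        podAntiAffinity:
          # Hard rule: never co-locate two app=web pods on one node
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: app
                    operator: In
                    values:
                      - web
              topologyKey: kubernetes.io/hostname
      containers:
        - name: web
          image: nginx  # placeholder image; substitute your own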
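This configuration ensures that two pods with the label app: web are never scheduled on the same node. If the cluster has fewer eligible nodes than replicas, the surplus pods remain Pending. For more flexibility, you can use preferredDuringSchedulingIgnoredDuringExecution to implement soft anti-affinity rules, which the scheduler tries to honor but may violate when no suitable node is available.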
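As a minimal sketch of the soft variant, the fragment below would replace the hard rule in the pod template's affinity section. The weight of 100 is an illustrative assumption (valid weights range from 1 to 100); the app: web label and the hostname topology key are carried over from the example above.

```yaml
# Fragment of spec.template.spec; replaces the hard rule shown earlier.
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      # weight (1-100) is an illustrative assumption; higher means a stronger preference
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchLabels:
              app: web
          topologyKey: kubernetes.io/hostname
```

With this rule the scheduler prefers nodes that run no app: web pod, but it still places a replica on an occupied node when nothing better is available, so pods do not get stuck in Pending.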
Topology Spread Constraints: Even Distribution Across Your Cluster
While anti-affinity helps keep pods apart, topology spread constraints ensure even distribution across your infrastructure. This is particularly important in multi-zone clusters. The diagram below shows how pod topology spread helps keep a service resilient to node failure:

Here’s an example of implementing topology spread constraints:
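apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-server
spec:
  replicas: 6
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web  # must match the labelSelector below
    spec:
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              app: web
      containers:
        - name: web
          image: nginx  # placeholder image; substitute your own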
This configuration ensures that:
- Pods are distributed evenly across zones
- The difference in the number of matching pods between any two zones won’t exceed 1
- New pods stay Pending rather than being scheduled in violation of the maxSkew constraint
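To make the maxSkew arithmetic concrete, the sketch below annotates the same constraint with a hypothetical three-zone layout (the zone names and pod counts are assumptions for illustration). It also shows the soft setting, whenUnsatisfiable: ScheduleAnyway, which asks the scheduler to favor placements that reduce skew instead of refusing to schedule.

```yaml
# Fragment of spec.template.spec.
# Hypothetical layout: zone-a has 2 matching pods, zone-b has 2, zone-c has 1.
# Placing the next pod in zone-a (or zone-b) would put 3 pods there against
# a minimum of 1 elsewhere: a skew of 2, which exceeds maxSkew: 1.
# Under DoNotSchedule only zone-c is allowed, giving 2, 2, 2.
topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: topology.kubernetes.io/zone
    # Soft variant: prefer low skew, but schedule anyway if it cannot be met
    whenUnsatisfiable: ScheduleAnyway
    labelSelector:
      matchLabels:
        app: web
```

The soft setting trades strict balance for schedulability, which can be a reasonable starting point before switching to DoNotSchedule.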
Combining Strategies for Maximum Resilience
For optimal high availability, combine both approaches:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-server
spec:
  replicas: 6
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app: web
              topologyKey: kubernetes.io/hostname
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              app: web
      containers:
        - name: web
          image: nginx  # placeholder image; substitute your own
This configuration provides:
- Node-level pod separation through anti-affinity
- Zone-level even distribution through topology spread constraints
- Protection against both node and zone failures
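Note that the hard anti-affinity rule above needs at least as many schedulable nodes as replicas (six here). Where that cannot be guaranteed, one alternative sketch is to express both levels as topology spread constraints, which tolerate more than one replica per node while still balancing them. The ScheduleAnyway choice for the node-level constraint is an assumption; tune it to your cluster.

```yaml
# Fragment of spec.template.spec; an alternative to hard anti-affinity.
topologySpreadConstraints:
  # Zone level: hard requirement, zones may differ by at most one pod
  - maxSkew: 1
    topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: DoNotSchedule
    labelSelector:
      matchLabels:
        app: web
  # Node level: soft preference (assumption), spread across nodes when possible
  - maxSkew: 1
    topologyKey: kubernetes.io/hostname
    whenUnsatisfiable: ScheduleAnyway
    labelSelector:
      matchLabels:
        app: web
```

This keeps zone balance strict while letting replicas share a node when the cluster is short on capacity.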
Best Practices and Recommendations
- Start with Soft Constraints
  Begin with soft anti-affinity rules and topology spread constraints during initial deployment. This provides flexibility while you evaluate the impact on your cluster.
- Monitor Pod Distribution
  Regular monitoring of pod distribution is crucial. Use tools like:
  kubectl get pods -o wide
  kubectl describe nodes | grep -A5 "Non-terminated Pods"
- Consider Resource Requirements
  When implementing distribution strategies, account for:
  - Node resource capacity
  - Reserved resources for system components
  - Resource requirements of your applications
- Plan for Failure Scenarios
  Test your configuration under different failure scenarios:
  - Node failures
  - Zone outages
  - Network partitions
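For the resource-requirements point above, a minimal sketch of container requests and limits follows. The scheduler places pods based on their requests, so a hard distribution rule can only be satisfied on nodes with enough unreserved capacity. The container name, image, and all values here are illustrative assumptions, not recommendations.

```yaml
# Fragment of spec.template.spec; all values are illustrative assumptions.
containers:
  - name: web
    image: nginx  # placeholder image
    resources:
      requests:
        cpu: 250m       # the scheduler reserves this much CPU on the chosen node
        memory: 256Mi
      limits:
        memory: 512Mi   # the container is killed if it exceeds this
```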
Summary
Implementing proper pod distribution strategies is crucial for maintaining high availability in Kubernetes clusters, especially for mission-critical services. By combining pod anti-affinity with topology spread constraints, you can create resilient applications that can withstand various types of infrastructure failures.
Remember that these configurations should be tailored to your specific needs, considering factors like:
- Application architecture
- Infrastructure topology
- Performance requirements
- Business continuity needs
Regular testing and monitoring will help ensure your chosen strategies effectively maintain the desired level of availability for your applications.
This approach also helps us plan maintenance that requires no downtime, for example the cluster patching covered in this post.