4. Scaling and Scheduling Pods in Kubernetes: All About Taints & Tolerations
Managing Pod Placement with Taints and Tolerations
It is recommended to read this blog before continuing with this one.
In Kubernetes, taints and tolerations are like a "Do Not Disturb" sign and a "VIP Pass" for pods and nodes. Here's a simple way to understand:
What are Taints?
Taints are markers applied to nodes to say:
"Hey, only specific pods can run here. Others, stay away!"
What are Tolerations?
Tolerations are added to pods to say:
"Hey, I have permission to run on nodes with this 'Do Not Disturb' sign."
Example:
Imagine a mall with a VIP lounge (node). The lounge has a sign:
"Only VIP members allowed" (Taint: key=vip, effect=NoSchedule).
Now, a pod (customer) wants to enter.
Regular customers without the VIP pass (no toleration) are not allowed inside.
VIP customers with the pass (toleration: key=vip) are allowed.
Why Use Taints and Tolerations?
They help ensure:
- Critical workloads run on specific nodes (e.g., high-performance hardware).
- Other workloads don’t overload sensitive nodes.
Taints & Tolerations: You can add a taint to a node to stop pods from being scheduled there unless they have a matching toleration. A taint is a key, an optional value, and an effect set on the node. It is separate from node labels, and the scheduler checks it against each pod's tolerations.
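Note: In a typical kubeadm-based cluster, the control plane node is tainted by default (node-role.kubernetes.io/control-plane:NoSchedule), which is why regular pods are not scheduled on the control plane.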
Let's understand it with an example:
Taints Concept:
- Create a pod.yml and apply the changes:
apiVersion: v1
kind: Pod
metadata:
  labels:
    run: nginx
  name: nginx
spec:
  containers:
  - image: nginx
    name: nginx
kubectl apply -f pod.yml
The pod is created and runs properly because, by default, none of the worker nodes are tainted, so the scheduler can place it on any of them. Meanwhile, the control plane is tainted, which is why no pods are running on it.
- Let's delete the pod created from pod.yml:
kubectl delete -f pod.yml
- List the nodes and taint both worker nodes with the following commands:
kubectl get nodes
kubectl taint node acrobat-cluster-worker prod=true:NoSchedule
kubectl taint node acrobat-cluster-worker2 prod=true:NoSchedule
kubectl taint: The command used to apply a taint to a node in Kubernetes.
node: Specifies that the taint will be applied to a node.
acrobat-cluster-worker: The name of the node to which the taint is being applied.
prod=true: The taint key-value pair, where prod is the key and true is the value.
NoSchedule: The effect of the taint. No new pods will be scheduled on this node unless they have a matching toleration for prod=true. Pods already running on the node are not evicted.
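To make the anatomy of the taint argument concrete, here is a minimal Python sketch that splits a string such as prod=true:NoSchedule into its key, value, and effect. The parse_taint_spec helper and the TaintSpec type are hypothetical names made up for illustration. They are not part of kubectl, and the sketch only handles the key=value:effect form used in this post, plus the trailing - used for removal.

```python
from typing import NamedTuple, Optional

# The three taint effects Kubernetes defines.
VALID_EFFECTS = {"NoSchedule", "PreferNoSchedule", "NoExecute"}


class TaintSpec(NamedTuple):
    key: str
    value: Optional[str]
    effect: str
    remove: bool  # True when the argument ends with "-" (untaint)


def parse_taint_spec(spec: str) -> TaintSpec:
    """Split a 'key=value:effect' argument into its parts (illustrative only)."""
    remove = spec.endswith("-")
    if remove:
        spec = spec[:-1]
    # The effect is everything after the last colon.
    key_value, sep, effect = spec.rpartition(":")
    if not sep or effect not in VALID_EFFECTS:
        raise ValueError(f"missing or unknown effect in {spec!r}")
    # The value is optional; without "=" only a key is present.
    key, eq, value = key_value.partition("=")
    return TaintSpec(key, value if eq else None, effect, remove)


print(parse_taint_spec("prod=true:NoSchedule"))
print(parse_taint_spec("prod=true:NoSchedule-"))
```

The second call shows the same argument with a trailing -, which is the form kubectl uses to remove a taint instead of adding it.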
Now, you might be wondering, since our control plane is already tainted and we have also tainted the remaining worker nodes, where will our pod run when we apply it? Let's find out.
kubectl apply -f pod.yml
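You will observe that the pod remains in Pending status.
Let's describe the pod and see the result:
kubectl describe pod nginx
The Events section at the bottom of the output should show a FailedScheduling warning, indicating that no node was available because every node carries a taint the pod does not tolerate.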
- To remove the taint, you can use the following command. The command for untainting is the same as the taint command, with the only difference being the - at the end.
kubectl taint node acrobat-cluster-worker2 prod=true:NoSchedule-
Now that acrobat-cluster-worker2 is no longer tainted, the scheduler can place the pending pod there. You can verify that the pod is running by using the command kubectl get pods.
To remove the taint from worker node1, use the following command:
kubectl taint node acrobat-cluster-worker prod=true:NoSchedule-
Tolerations Concept:
- To begin, open the pod.yml file and add a tolerations section:
apiVersion: v1
kind: Pod
metadata:
  labels:
    run: nginx
  name: nginx
spec:
  containers:
  - image: nginx
    name: nginx
  tolerations:
  - key: "prod"
    operator: "Equal"
    value: "true"
    effect: "NoSchedule"
tolerations: This section defines the tolerations for a pod. Tolerations allow the pod to be scheduled on nodes that carry matching taints.
key: "prod": The key of the taint this pod will tolerate. In this case, it matches the prod key from the taint applied to the node.
operator: "Equal": The operator specifies how the value is compared. Equal means the pod will only tolerate the taint if the key-value pair exactly matches (in this case, prod=true).
value: "true": The value for the key prod. The pod tolerates the taint if the value is true, matching the taint applied to the node.
effect: "NoSchedule": The taint effect this toleration applies to. The toleration matches only taints whose effect is NoSchedule, so the pod can be scheduled on a node that carries the prod=true:NoSchedule taint.
What it does:
This toleration allows the pod to be scheduled on a node with the prod=true:NoSchedule taint. Without this toleration, the pod would not be able to run on the node. Note that a toleration only permits scheduling on the tainted node. It does not force the pod onto it.
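The matching rule described by the toleration fields can be sketched in a few lines of Python. This is a simplified model for illustration, not the scheduler's actual code. The Taint and Toleration classes and the tolerates function are hypothetical names, and only the Equal and Exists operators are modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Taint:
    key: str
    value: str
    effect: str


@dataclass(frozen=True)
class Toleration:
    key: str = ""
    operator: str = "Equal"  # Kubernetes defaults to Equal when omitted
    value: str = ""
    effect: str = ""         # an empty effect matches every effect


def tolerates(toleration: Toleration, taint: Taint) -> bool:
    """Return True if the toleration matches the taint (simplified model)."""
    if toleration.effect and toleration.effect != taint.effect:
        return False
    if toleration.operator == "Exists":
        # Exists ignores the value; an empty key matches every taint key.
        return toleration.key in ("", taint.key)
    # Equal: both the key and the value must match.
    return toleration.key == taint.key and toleration.value == taint.value


# The toleration from pod.yml above.
pod_toleration = Toleration(key="prod", operator="Equal", value="true", effect="NoSchedule")

print(tolerates(pod_toleration, Taint("prod", "true", "NoSchedule")))  # True
print(tolerates(pod_toleration, Taint("dev", "true", "NoSchedule")))   # False
```

With operator Exists, the toleration would match any value of the prod key, which is handy when the taint value does not matter.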
- Now delete our pod which was running:
kubectl delete -f pod.yml
- To apply a taint to our worker node1, use the following command to set the taint as prod=true:NoSchedule:
kubectl taint node acrobat-cluster-worker prod=true:NoSchedule
And apply a taint to our worker node2 using the following command to set the taint as dev=true:NoSchedule:
kubectl taint node acrobat-cluster-worker2 dev=true:NoSchedule
- Now both worker nodes are tainted. Let's see whether the pod gets scheduled when we apply the changes:
kubectl apply -f pod.yml
kubectl get pods
You can see it's running.
Why? The pod can only run on acrobat-cluster-worker because it tolerates that node's prod=true taint. It can't run on acrobat-cluster-worker2 because that node has a different taint (dev=true) that the pod doesn't tolerate. To confirm which node the pod landed on, run kubectl get pods -o wide.
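The outcome of this walkthrough can be reproduced with a small Python simulation. It is a rough sketch of the filtering idea only, not how the real scheduler is implemented. The control plane node name acrobat-cluster-control-plane is an assumption (it does not appear in this post), and the matches and schedulable_nodes helpers are hypothetical.

```python
# Taints per node as (key, value, effect) tuples. The control plane node name
# is assumed; the worker names and their taints come from the walkthrough.
NODE_TAINTS = {
    "acrobat-cluster-control-plane": [("node-role.kubernetes.io/control-plane", "", "NoSchedule")],
    "acrobat-cluster-worker": [("prod", "true", "NoSchedule")],
    "acrobat-cluster-worker2": [("dev", "true", "NoSchedule")],
}

# Tolerations from pod.yml as (key, operator, value, effect) tuples.
POD_TOLERATIONS = [("prod", "Equal", "true", "NoSchedule")]


def matches(toleration, taint):
    """Simplified matching rule covering only the Equal and Exists operators."""
    t_key, operator, t_value, t_effect = toleration
    key, value, effect = taint
    if t_effect and t_effect != effect:
        return False
    if operator == "Exists":
        return t_key in ("", key)
    return t_key == key and t_value == value


def schedulable_nodes(tolerations, node_taints):
    """Return nodes whose scheduling-blocking taints are all tolerated."""
    result = []
    for node, taints in node_taints.items():
        # PreferNoSchedule is only a soft preference, so it never blocks.
        blocking = [t for t in taints if t[2] in ("NoSchedule", "NoExecute")]
        if all(any(matches(tol, t) for tol in tolerations) for t in blocking):
            result.append(node)
    return result


print(schedulable_nodes(POD_TOLERATIONS, NODE_TAINTS))  # ['acrobat-cluster-worker']
print(schedulable_nodes([], NODE_TAINTS))               # []
```

The second call models the earlier step where the pod had no tolerations and stayed Pending: with every node tainted, no node is left for it.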
This is an explanation of the concept of Taints and Tolerations.
Happy Learning :)
Chetan Mohod ✨
For more DevOps updates, you can follow me on LinkedIn.