4. Scaling and Scheduling Pods in Kubernetes: All About Taints & Tolerations

Managing Pod Placement with Taints and Tolerations


It is recommended to read the previous blog in this series before continuing with this one.

In Kubernetes, taints and tolerations are like a "Do Not Disturb" sign and a "VIP Pass" for nodes and pods. Here's a simple way to understand them:


What are Taints?

Taints are applied to nodes to say:
"Hey, only specific pods can run here. Others, stay away!"

What are Tolerations?

Tolerations are added to pods to say:
"Hey, I have permission to run on nodes with this 'Do Not Disturb' sign."


Example:

Imagine a mall with a VIP lounge (node). The lounge has a sign:
"Only VIP members allowed" (Taint: key=vip, effect=NoSchedule).

Now, a pod (customer) wants to enter.

  • Regular customers without the VIP pass (no toleration) are not allowed inside.

  • VIP customers with the pass (toleration: key=vip) are allowed.


Why Use Taints and Tolerations?

They help ensure:

  • Critical workloads run on specific nodes (e.g., high-performance hardware).

  • Other workloads don’t overload sensitive nodes.


Taints & Tolerations: You can add a taint to a node to stop pods from being scheduled there unless they have a matching toleration. A taint marks a node so that only pods carrying the right toleration can be scheduled on it.

Note: The control plane is tainted by default, which is why pods are not created on the control plane.
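You can see this default taint in the control-plane node object itself. The fragment below is a sketch of what `kubectl get node <control-plane-node> -o yaml` typically shows; the exact taint key depends on your Kubernetes version (older clusters use node-role.kubernetes.io/master):

```yaml
# Excerpt from a control-plane node's spec (illustrative)
spec:
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/control-plane
```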


Let's understand it with an example:

Taints Concept:

  1. Create a pod.yml and apply the changes:

apiVersion: v1
kind: Pod
metadata:
  labels:
    run: nginx
  name: nginx
spec:
  containers:
  - image: nginx
    name: nginx

kubectl apply -f pod.yml

The pod is created and functioning properly because, by default, worker nodes are not tainted, so the pod can run there. The control plane, meanwhile, is tainted, which is why no pods run on it.

  2. Let's delete our pod.yml:
kubectl delete -f pod.yml
  3. List the nodes and taint both worker nodes with the following commands:
kubectl get nodes

kubectl taint node acrobat-cluster-worker prod=true:NoSchedule
kubectl taint node acrobat-cluster-worker2 prod=true:NoSchedule
  • kubectl taint:
    This is the command used to apply a taint to a node in Kubernetes.

  • node:
    Specifies that the taint will be applied to a node.

  • acrobat-cluster-worker:
    The name of the node to which the taint is being applied. In this case, it’s acrobat-cluster-worker.

  • prod=true:
    This is the taint key-value pair. prod is the key, and true is the value. It’s used to label the node with this specific taint.

  • NoSchedule:
    The effect of the taint. It means that no pods will be scheduled on this node unless they have a matching toleration for prod=true.
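NoSchedule is one of three possible taint effects. The fragment below sketches how each would appear in a node's spec and what it does; only NoSchedule is used in this walkthrough:

```yaml
# The three taint effects, as they would appear in a node's spec.taints (illustrative)
spec:
  taints:
  - key: prod
    value: "true"
    effect: NoSchedule          # new pods without a matching toleration are not scheduled here
  # effect: PreferNoSchedule    # the scheduler tries to avoid the node, but may still use it
  # effect: NoExecute           # also evicts pods already running on the node without a toleration
```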

Now, you might be wondering, since our control plane is already tainted and we have also tainted the remaining worker nodes, where will our pod run when we apply it? Let's find out.

kubectl apply -f pod.yml

You will observe that the pod remains in a pending status.

Let's describe the pod and see the result:

  4. To remove the taint, use the following command. The untaint command is the same as the taint command, with the only difference being the - at the end:
kubectl taint node acrobat-cluster-worker2 prod=true:NoSchedule-

Now you can verify that the pod is running by using the command kubectl get pods:

To remove the taint from worker node1, use the following command:

kubectl taint node acrobat-cluster-worker prod=true:NoSchedule-

Tolerations Concept:

  1. To begin, open the pod.yml file and add a tolerations section:
apiVersion: v1
kind: Pod
metadata:
  labels:
    run: nginx
  name: nginx
spec:
  containers:
  - image: nginx
    name: nginx
  tolerations:
    - key: "prod"
      operator: "Equal"
      value: "true"
      effect: "NoSchedule"
  1. tolerations:
    This section defines the tolerations for a pod. Tolerations allow the pod to accept taints applied to nodes.

  2. - key: "prod"
    The key of the taint this pod will tolerate. In this case, it matches the prod key from the taint applied to the node.

  3. operator: "Equal"
    The operator specifies how the value is compared. Equal means the pod will only tolerate the taint if the key-value pair exactly matches (in this case, prod=true).

  4. value: "true"
    The value for the key prod. The pod tolerates the taint if the value is true, matching the taint applied to the node.

  5. effect: "NoSchedule"
    The effect must match the effect of the taint on the node. NoSchedule means pods without a matching toleration will not be scheduled there; because this pod tolerates prod=true:NoSchedule, it can still be scheduled on the node.

What it does:

This toleration ensures that the pod can be scheduled on a node with the prod=true:NoSchedule taint. Without this toleration, the pod would not be able to run on the node.
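Besides Equal, Kubernetes also supports the Exists operator, which tolerates any taint with the given key regardless of its value. A sketch (not used in this walkthrough; note that no value field is given):

```yaml
# Tolerate any taint with key "prod" and effect NoSchedule, whatever its value
tolerations:
- key: "prod"
  operator: "Exists"
  effect: "NoSchedule"
```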

  2. Now delete the pod that was running:
kubectl delete -f pod.yml
  3. To apply a taint to our worker node1, use the following command to set the taint as prod=true:NoSchedule:
kubectl taint node acrobat-cluster-worker prod=true:NoSchedule

And apply a taint to our worker node2 using the following command to set the taint as dev=true:NoSchedule:

kubectl taint node acrobat-cluster-worker2 dev=true:NoSchedule
  4. Now both nodes are tainted. Let's see whether the pod will be scheduled when we apply the changes:
kubectl apply -f pod.yml

kubectl get pods

You can see it's running:

WHY? The pod tolerates the prod=true taint, so it can be scheduled on acrobat-cluster-worker. It cannot run on acrobat-cluster-worker2 because that node carries a different taint (dev=true) that the pod does not tolerate.

This is an explanation of the concept of Taints and Tolerations.


Happy Learning :)

Chetan Mohod ✨

For more DevOps updates, you can follow me on LinkedIn.