This blog post introduces concepts regarding scheduling and autoscaling in Kubernetes.

In the previous article, we covered how to achieve data persistence for databases and file storage systems, and mentioned how difficult it is to horizontally scale those systems. When we resort to vertical scaling instead, we need to ensure that the pod is always assigned to the specific node whose resources we are upgrading, which we haven't yet discussed. There are also other situations where we want to schedule pods to a specific set of nodes. Hence, in this article, we will discuss several features that allow us to do that, as well as autoscaling, which lets us avoid manually changing the number of replicas as the workload changes.
Node Selectors
The easiest way to assign a single pod to a particular node is to provide the node's name (visible via kubectl get nodes) in the nodeName field of the pod's specification. This is sufficient when a pod must be scheduled to exactly one node, such as for vertical scaling of a database or file storage system in the cluster. However, we might want to allow the pod to be scheduled to any of several nodes that meet certain requirements (e.g., when the pod runs a machine learning workload, we want to assign it to any node equipped with a GPU). Just as deployments and services identify the pods they are responsible for using labels and selectors, pods can identify the nodes they can be assigned to using nodeSelector in the pod's specification, and they can then be scheduled to any node with matching labels. We can label nodes using kubectl label nodes <node-name> <label-name>=<label-value>. The mechanism is intuitive and easy to use (see the example below), but it cannot enforce complex scheduling logic.
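For instance, a minimal pod specification using nodeSelector might look like the following sketch (the gpu label, pod name, and image are assumptions for illustration):
apiVersion: v1
kind: Pod
metadata:
  name: gpu-workload          # hypothetical pod name
spec:
  # Schedule this pod only onto nodes labeled gpu=true,
  # e.g., after running: kubectl label nodes <node-name> gpu=true
  nodeSelector:
    gpu: "true"
  # Alternatively, nodeName: <node-name> pins the pod to one specific node
  # and bypasses the scheduler entirely.
  containers:
  - name: app
    image: my-image           # placeholder image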
Taints & Tolerations
While node selectors specify which nodes pods should be assigned to, taints and tolerations control which pods a node keeps away. Specifically, a node can be tainted so that new pods without a toleration for the taint cannot be scheduled to the node (NoSchedule), cannot be scheduled to or keep running on the node (NoExecute), or are avoided by the scheduler as a soft preference rather than a guarantee (PreferNoSchedule). For example, we can taint a GPU-equipped node with kubectl taint node cluster-worker1 gpu=true:NoExecute, which prevents pods without a toleration for the taint gpu=true from being scheduled to or executed on that node.
# ...
spec:
  # ...
  tolerations:
  - key: "gpu"
    operator: "Equal"
    value: "true"
    effect: "NoExecute"
To let a pod tolerate a specific taint, we add a toleration as shown above. While taints and tolerations can guarantee that certain pods are not scheduled or executed on certain nodes, they do not guarantee that pods tolerating the taints are scheduled only to nodes with those taints. Hence, they are often used together with node selectors to make scheduling behave in a particular desired way, as in the sketch below.
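As a rough sketch of that combination (reusing the assumed gpu=true label and taint from above), the pod both tolerates the taint and restricts itself to the labeled nodes:
# ...
spec:
  # ...
  # The toleration allows the pod onto nodes tainted gpu=true:NoExecute,
  # while the nodeSelector keeps it off nodes without the gpu=true label.
  tolerations:
  - key: "gpu"
    operator: "Equal"
    value: "true"
    effect: "NoExecute"
  nodeSelector:
    gpu: "true"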
Node Affinity
While node selectors are appealing for their simplicity, they cannot express complex logic, such as assigning pods to nodes where one label matches either of two values and another label matches a particular value. For example, node selectors cannot express "schedule this pod to a node with an SSD or an HDD that also has a GPU." Node affinity allows us to do exactly that, and it can also express preferences instead of hard requirements.
# ...
spec:
  # ...
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: "disk"
            operator: "In"
            values:
            - "ssd"
            - "hdd"
          - key: "gpu"
            operator: "In"
            values:
            - "true"
The above example demonstrates node affinity in a pod's specification. It requires the node label disk to match either ssd or hdd, and gpu to match true. We can also use preferredDuringSchedulingIgnoredDuringExecution to specify that pods prefer nodes with matching labels with a given weight, while still allowing them to be assigned to other nodes if no preferred node is available; an example follows below. See the official Kubernetes documentation (cited below) for more information. Though node affinity is a great substitute for node selectors, it can be unnecessarily complicated and less intuitive for some use cases, so it's important to choose between the two wisely.
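For example, a preference-based rule might look like the following minimal sketch (the weight and the disk label are assumptions for illustration):
# ...
spec:
  # ...
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 80            # 1-100; a higher weight means a stronger preference
        preference:
          matchExpressions:
          - key: "disk"
            operator: "In"
            values:
            - "ssd"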
Requests & Limits
As we have seen in previous discussions, we often pin pods to particular nodes because of the pods' resource requirements. For memory and CPU, Kubernetes provides requests and limits, which let us set the minimum resources a pod needs and the maximum it may consume. This prevents a pod from exhausting a node's memory (OOM) and making the node unavailable, and it prevents pods with high resource requirements from being scheduled to nodes that cannot satisfy them.
# ...
spec:
  containers:
  - name: container
    image: my-image
    resources:
      requests:
        memory: "100Mi" # 100 MiB
        cpu: "250m"     # 0.25 CPU (25% of a CPU core)
      limits:
        memory: "200Mi" # 200 MiB
        cpu: "500m"     # 0.5 CPU (50% of a CPU core)
Since the processes with resource requirements run inside the containers, we set requests and limits per container, as shown above. The requests above reserve at least 100 MiB of memory and 0.25 CPU, so the pod is not scheduled to nodes that cannot satisfy them, while the limits cap consumption at 200 MiB of memory and 0.5 CPU: a container that exceeds its memory limit is OOM-killed instead of taking down the whole node, and CPU usage above the limit is throttled.
Horizontal Pod Autoscaling
So far, we have scaled the cluster by manually changing the number of pods in a deployment. However, we cannot monitor the cluster every second to keep the number of replicas appropriate for both cost efficiency and availability. Hence, Kubernetes allows us to autoscale by adjusting the number of pods in a deployment between defined bounds, based on CPU utilization. This is called horizontal pod autoscaling (HPA), and it can be set up with a command like kubectl autoscale deployment <deployment-name> --cpu-percent=50 --min=1 --max=10, which creates an HPA that adjusts the number of pods between 1 and 10 to keep average CPU utilization around 50%. We can inspect the HPA, including its current CPU utilization and replica count (as opposed to the targets), using kubectl get hpa; a declarative HPA manifest is sketched at the end of this section. Cluster autoscaling (automatically adding or removing nodes) is another way to horizontally scale the cluster, though its implementation depends on the cloud provider. Vertical pod autoscaling and vertical node autoscaling also exist for limited use cases, but they are generally less well-suited to Kubernetes clusters and are outside the scope of this article.
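The HPA created by the kubectl autoscale command above can also be defined declaratively. The following is a minimal sketch (the deployment and HPA names are placeholders); note that HPA relies on resource metrics, typically provided by the metrics-server add-on:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa            # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app              # the deployment to scale (placeholder)
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50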
Conclusion
In this article, we covered the basics of how we can configure scheduling and autoscaling to meet the resource requirements of pods, prevent node crashes, and maintain the cluster easily. Though we have covered many foundational concepts for setting up a cluster so far, there are still many other important concepts remaining, especially for setting up a cluster for production. Hence, we will continue the discussion on some of those concepts in the next article.
Resources
- Kubernetes. n.d. Kubernetes Documentation. Kubernetes.
- Tech Tutorial with Piyush. 2024. Day 13/40 - Static Pods, Manual Scheduling, Labels, and Selectors in Kubernetes. YouTube.
- Tech Tutorial with Piyush. 2024. Day 14/40 - Taints and Tolerations in Kubernetes. YouTube.
- Tech Tutorial with Piyush. 2024. Day 15/40 - Kubernetes Node Affinity Explained | How Node Affinity Works. YouTube.
- Tech Tutorial with Piyush. 2024. Day 16/40 - Kubernetes Requests and Limits - CKA Full Course 2025. YouTube.
- Tech Tutorial with Piyush. 2024. Day 17/40 - Kubernetes Autoscaling Explained | HPA Vs VPA. YouTube.