- Automatically adjust pod resource levels with the vertical pod autoscaler
Automatically adjust pod resource levels with the vertical pod autoscaler
The OKD Vertical Pod Autoscaler Operator (VPA) automatically reviews the historic and current CPU and memory resources for containers in pods and can update the resource limits and requests based on the usage values it learns. The VPA uses individual custom resources (CR) to update all of the pods associated with a workload object, such as a Deployment
, DeploymentConfig
, StatefulSet
, Job
, DaemonSet
, ReplicaSet
, or ReplicationController
, in a project.
The VPA helps you to understand the optimal CPU and memory usage for your pods and can automatically maintain pod resources through the pod lifecycle.
About the Vertical Pod Autoscaler Operator
The Vertical Pod Autoscaler Operator (VPA) is implemented as an API resource and a custom resource (CR). The CR determines the actions the Vertical Pod Autoscaler Operator should take with the pods associated with a specific workload object, such as a daemon set, replication controller, and so forth, in a project.
You can use the default recommender or use your own alternative recommender to autoscale based on your own algorithms.
The default recommender automatically computes historic and current CPU and memory usage for the containers in those pods and uses this data to determine optimized resource limits and requests to ensure that these pods are operating efficiently at all times. For example, the default recommender suggests reduced resources for pods that are requesting more resources than they are using and increased resources for pods that are not requesting enough.
The VPA then automatically deletes any pods that are out of alignment with these recommendations one at a time, so that your applications can continue to serve requests with no downtime. The workload objects then re-deploy the pods with the original resource limits and requests. The VPA uses a mutating admission webhook to update the pods with optimized resource limits and requests before the pods are admitted to a node. If you do not want the VPA to delete pods, you can view the VPA resource limits and requests and manually update the pods as needed.
By default, workload objects must specify a minimum of two replicas in order for the VPA to automatically delete their pods. Workload objects that specify fewer replicas than this minimum are not deleted. If you manually delete these pods, when the workload object redeploys the pods, the VPA does update the new pods with its recommendations. You can change this minimum by modifying the |
For example, if you have a pod that uses 50% of the CPU but only requests 10%, the VPA determines that the pod is consuming more CPU than requested and deletes the pod. The workload object, such as replica set, restarts the pods and the VPA updates the new pod with its recommended resources.
For developers, you can use the VPA to help ensure your pods stay up during periods of high demand by scheduling pods onto nodes that have appropriate resources for each pod.
Administrators can use the VPA to better utilize cluster resources, such as preventing pods from reserving more CPU resources than needed. The VPA monitors the resources that workloads are actually using and adjusts the resource requirements so capacity is available to other workloads. The VPA also maintains the ratios between limits and requests that are specified in initial container configuration.
If you stop running the VPA or delete a specific VPA CR in your cluster, the resource requests for the pods already modified by the VPA do not change. Any new pods get the resources defined in the workload object, not the previous recommendations made by the VPA. |
Installing the Vertical Pod Autoscaler Operator
You can use the OKD web console to install the Vertical Pod Autoscaler Operator (VPA).
Prerequisites
Ensure that you have downloaded the pull secret from the Red Hat OpenShift Cluster Manager as shown in Obtaining the installation program in the installation documentation for your platform.
If you have the pull secret, add the
redhat-operators
catalog to the OperatorHub custom resource (CR) as shown in Configuring OKD to use Red Hat Operators.
Procedure
In the OKD web console, click Operators → OperatorHub.
Choose VerticalPodAutoscaler from the list of available Operators, and click Install.
On the Install Operator page, ensure that the Operator recommended namespace option is selected. This installs the Operator in the mandatory
openshift-vertical-pod-autoscaler
namespace, which is automatically created if it does not exist.Click Install.
Verify the installation by listing the VPA Operator components:
Navigate to Workloads → Pods.
Select the
openshift-vertical-pod-autoscaler
project from the drop-down menu and verify that there are four pods running.Navigate to Workloads → Deployments to verify that there are four deployments running.
Optional. Verify the installation in the OKD CLI using the following command:
$ oc get all -n openshift-vertical-pod-autoscaler
The output shows four pods and four deplyoments:
Example output
NAME READY STATUS RESTARTS AGE
pod/vertical-pod-autoscaler-operator-85b4569c47-2gmhc 1/1 Running 0 3m13s
pod/vpa-admission-plugin-default-67644fc87f-xq7k9 1/1 Running 0 2m56s
pod/vpa-recommender-default-7c54764b59-8gckt 1/1 Running 0 2m56s
pod/vpa-updater-default-7f6cc87858-47vw9 1/1 Running 0 2m56s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/vpa-webhook ClusterIP 172.30.53.206 <none> 443/TCP 2m56s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/vertical-pod-autoscaler-operator 1/1 1 1 3m13s
deployment.apps/vpa-admission-plugin-default 1/1 1 1 2m56s
deployment.apps/vpa-recommender-default 1/1 1 1 2m56s
deployment.apps/vpa-updater-default 1/1 1 1 2m56s
NAME DESIRED CURRENT READY AGE
replicaset.apps/vertical-pod-autoscaler-operator-85b4569c47 1 1 1 3m13s
replicaset.apps/vpa-admission-plugin-default-67644fc87f 1 1 1 2m56s
replicaset.apps/vpa-recommender-default-7c54764b59 1 1 1 2m56s
replicaset.apps/vpa-updater-default-7f6cc87858 1 1 1 2m56s
About Using the Vertical Pod Autoscaler Operator
To use the Vertical Pod Autoscaler Operator (VPA), you create a VPA custom resource (CR) for a workload object in your cluster. The VPA learns and applies the optimal CPU and memory resources for the pods associated with that workload object. You can use a VPA with a deployment, stateful set, job, daemon set, replica set, or replication controller workload object. The VPA CR must be in the same project as the pods you want to monitor.
You use the VPA CR to associate a workload object and specify which mode the VPA operates in:
The
Auto
andRecreate
modes automatically apply the VPA CPU and memory recommendations throughout the pod lifetime. The VPA deletes any pods in the project that are out of alignment with its recommendations. When redeployed by the workload object, the VPA updates the new pods with its recommendations.The
Initial
mode automatically applies VPA recommendations only at pod creation.The
Off
mode only provides recommended resource limits and requests, allowing you to manually apply the recommendations. Theoff
mode does not update pods.
You can also use the CR to opt-out certain containers from VPA evaluation and updates.
For example, a pod has the following limits and requests:
resources:
limits:
cpu: 1
memory: 500Mi
requests:
cpu: 500m
memory: 100Mi
After creating a VPA that is set to auto
, the VPA learns the resource usage and deletes the pod. When redeployed, the pod uses the new resource limits and requests:
resources:
limits:
cpu: 50m
memory: 1250Mi
requests:
cpu: 25m
memory: 262144k
You can view the VPA recommendations using the following command:
$ oc get vpa <vpa-name> --output yaml
After a few minutes, the output shows the recommendations for CPU and memory requests, similar to the following:
Example output
...
status:
...
recommendation:
containerRecommendations:
- containerName: frontend
lowerBound:
cpu: 25m
memory: 262144k
target:
cpu: 25m
memory: 262144k
uncappedTarget:
cpu: 25m
memory: 262144k
upperBound:
cpu: 262m
memory: "274357142"
- containerName: backend
lowerBound:
cpu: 12m
memory: 131072k
target:
cpu: 12m
memory: 131072k
uncappedTarget:
cpu: 12m
memory: 131072k
upperBound:
cpu: 476m
memory: "498558823"
...
The output shows the recommended resources, target
, the minimum recommended resources, lowerBound
, the highest recommended resources, upperBound
, and the most recent resource recommendations, uncappedTarget
.
The VPA uses the lowerBound
and upperBound
values to determine if a pod needs to be updated. If a pod has resource requests below the lowerBound
values or above the upperBound
values, the VPA terminates and recreates the pod with the target
values.
Changing the VPA minimum value
By default, workload objects must specify a minimum of two replicas in order for the VPA to automatically delete and update their pods. As a result, workload objects that specify fewer than two replicas are not automatically acted upon by the VPA. The VPA does update new pods from these workload objects if the pods are restarted by some process external to the VPA. You can change this cluster-wide minimum value by modifying the minReplicas
parameter in the VerticalPodAutoscalerController
custom resource (CR).
For example, if you set minReplicas
to 3
, the VPA does not delete and update pods for workload objects that specify fewer than three replicas.
If you set |
Example VerticalPodAutoscalerController
object
apiVersion: autoscaling.openshift.io/v1
kind: VerticalPodAutoscalerController
metadata:
creationTimestamp: "2021-04-21T19:29:49Z"
generation: 2
name: default
namespace: openshift-vertical-pod-autoscaler
resourceVersion: "142172"
uid: 180e17e9-03cc-427f-9955-3b4d7aeb2d59
spec:
minReplicas: 3 (1)
podMinCPUMillicores: 25
podMinMemoryMb: 250
recommendationOnly: false
safetyMarginFraction: 0.15
1 | Specify the minimum number of replicas in a workload object for the VPA to act on. Any objects with replicas fewer than the minimum are not automatically deleted by the VPA. |
Automatically applying VPA recommendations
To use the VPA to automatically update pods, create a VPA CR for a specific workload object with updateMode
set to Auto
or Recreate
.
When the pods are created for the workload object, the VPA constantly monitors the containers to analyze their CPU and memory needs. The VPA deletes any pods that do not meet the VPA recommendations for CPU and memory. When redeployed, the pods use the new resource limits and requests based on the VPA recommendations, honoring any pod disruption budget set for your applications. The recommendations are added to the status
field of the VPA CR for reference.
By default, workload objects must specify a minimum of two replicas in order for the VPA to automatically delete their pods. Workload objects that specify fewer replicas than this minimum are not deleted. If you manually delete these pods, when the workload object redeploys the pods, the VPA does update the new pods with its recommendations. You can change this minimum by modifying the |
Example VPA CR for the Auto
mode
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
name: vpa-recommender
spec:
targetRef:
apiVersion: "apps/v1"
kind: Deployment (1)
name: frontend (2)
updatePolicy:
updateMode: "Auto" (3)
1 | The type of workload object you want this VPA CR to manage. |
2 | The name of the workload object you want this VPA CR to manage. |
3 | Set the mode to Auto or Recreate :
|
There must be operating pods in the project before the VPA can determine recommended resources and apply the recommendations to new pods. |
Automatically applying VPA recommendations on pod creation
To use the VPA to apply the recommended resources only when a pod is first deployed, create a VPA CR for a specific workload object with updateMode
set to Initial
.
Then, manually delete any pods associated with the workload object that you want to use the VPA recommendations. In the Initial
mode, the VPA does not delete pods and does not update the pods as it learns new resource recommendations.
Example VPA CR for the Initial
mode
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
name: vpa-recommender
spec:
targetRef:
apiVersion: "apps/v1"
kind: Deployment (1)
name: frontend (2)
updatePolicy:
updateMode: "Initial" (3)
1 | The type of workload object you want this VPA CR to manage. |
2 | The name of the workload object you want this VPA CR to manage. |
3 | Set the mode to Initial . The VPA assigns resources when pods are created and does not change the resources during the lifetime of the pod. |
There must be operating pods in the project before a VPA can determine recommended resources and apply the recommendations to new pods. |
Manually applying VPA recommendations
To use the VPA to only determine the recommended CPU and memory values, create a VPA CR for a specific workload object with updateMode
set to off
.
When the pods are created for that workload object, the VPA analyzes the CPU and memory needs of the containers and records those recommendations in the status
field of the VPA CR. The VPA does not update the pods as it determines new resource recommendations.
Example VPA CR for the Off
mode
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
name: vpa-recommender
spec:
targetRef:
apiVersion: "apps/v1"
kind: Deployment (1)
name: frontend (2)
updatePolicy:
updateMode: "Off" (3)
1 | The type of workload object you want this VPA CR to manage. |
2 | The name of the workload object you want this VPA CR to manage. |
3 | Set the mode to Off . |
You can view the recommendations using the following command.
$ oc get vpa <vpa-name> --output yaml
With the recommendations, you can edit the workload object to add CPU and memory requests, then delete and redeploy the pods using the recommended resources.
There must be operating pods in the project before a VPA can determine recommended resources. |
Exempting containers from applying VPA recommendations
If your workload object has multiple containers and you do not want the VPA to evaluate and act on all of the containers, create a VPA CR for a specific workload object and add a resourcePolicy
to opt-out specific containers.
When the VPA updates the pods with recommended resources, any containers with a resourcePolicy
are not updated and the VPA does not present recommendations for those containers in the pod.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
name: vpa-recommender
spec:
targetRef:
apiVersion: "apps/v1"
kind: Deployment (1)
name: frontend (2)
updatePolicy:
updateMode: "Auto" (3)
resourcePolicy: (4)
containerPolicies:
- containerName: my-opt-sidecar
mode: "Off"
1 | The type of workload object you want this VPA CR to manage. |
2 | The name of the workload object you want this VPA CR to manage. |
3 | Set the mode to Auto , Recreate , or Off . The Recreate mode should be used rarely, only if you need to ensure that the pods are restarted whenever the resource request changes. |
4 | Specify the containers you want to opt-out and set mode to Off . |
For example, a pod has two containers, the same resource requests and limits:
# ...
spec:
containers:
- name: frontend
resources:
limits:
cpu: 1
memory: 500Mi
requests:
cpu: 500m
memory: 100Mi
- name: backend
resources:
limits:
cpu: "1"
memory: 500Mi
requests:
cpu: 500m
memory: 100Mi
# ...
After launching a VPA CR with the backend
container set to opt-out, the VPA terminates and recreates the pod with the recommended resources applied only to the frontend
container:
...
spec:
containers:
name: frontend
resources:
limits:
cpu: 50m
memory: 1250Mi
requests:
cpu: 25m
memory: 262144k
...
name: backend
resources:
limits:
cpu: "1"
memory: 500Mi
requests:
cpu: 500m
memory: 100Mi
...
Using an alternative recommender
You can use your own recommender to autoscale based on your own algorithms. If you do not specify an alternative recommender, OKD uses the default recommender, which suggests CPU and memory requests based on historical usage. Because there is no universal recommendation policy that applies to all types of workloads, you might want to create and deploy different recommenders for specific workloads.
For example, the default recommender might not accurately predict future resource usage when containers exhibit certain resource behaviors, such as cyclical patterns that alternate between usage spikes and idling as used by monitoring applications, or recurring and repeating patterns used with deep learning applications. Using the default recommender with these usage behaviors might result in significant over-provisioning and Out of Memory (OOM) kills for your applications.
Instructions for how to create a recommender are beyond the scope of this documentation, |
Procedure
To use an alternative recommender for your pods:
Create a service account for the alternative recommender and bind that service account to the required cluster role:
apiVersion: v1 (1)
kind: ServiceAccount
metadata:
name: alt-vpa-recommender-sa
namespace: <namespace_name>
---
apiVersion: rbac.authorization.k8s.io/v1 (2)
kind: ClusterRoleBinding
metadata:
name: system:example-metrics-reader
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: system:metrics-reader
subjects:
- kind: ServiceAccount
name: alt-vpa-recommender-sa
namespace: <namespace_name>
---
apiVersion: rbac.authorization.k8s.io/v1 (3)
kind: ClusterRoleBinding
metadata:
name: system:example-vpa-actor
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: system:vpa-actor
subjects:
- kind: ServiceAccount
name: alt-vpa-recommender-sa
namespace: <namespace_name>
---
apiVersion: rbac.authorization.k8s.io/v1 (4)
kind: ClusterRoleBinding
metadata:
name: system:example-vpa-target-reader-binding
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: system:vpa-target-reader
subjects:
- kind: ServiceAccount
name: alt-vpa-recommender-sa
namespace: <namespace_name>
1 Creates a service accocunt for the recommender in the namespace where the recommender is deployed. 2 Binds the recommender service account to the metrics-reader
role. Specify the namespace where the recommender is to be deployed.3 Binds the recommender service account to the vpa-actor
role. Specify the namespace where the recommender is to be deployed.4 Binds the recommender service account to the vpa-target-reader
role. Specify the namespace where the recommender is to be deployed.To add the alternative recommender to the cluster, create a Deployment object similar to the following:
apiVersion: apps/v1
kind: Deployment
metadata:
name: alt-vpa-recommender
namespace: <namespace_name>
spec:
replicas: 1
selector:
matchLabels:
app: alt-vpa-recommender
template:
metadata:
labels:
app: alt-vpa-recommender
spec:
containers: (1)
- name: recommender
image: quay.io/example/alt-recommender:latest (2)
imagePullPolicy: Always
resources:
limits:
cpu: 200m
memory: 1000Mi
requests:
cpu: 50m
memory: 500Mi
ports:
- name: prometheus
containerPort: 8942
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
seccompProfile:
type: RuntimeDefault
serviceAccountName: alt-vpa-recommender-sa (3)
securityContext:
runAsNonRoot: true
1 Creates a container for your alternative recommender. 2 Specifies your recommender image. 3 Associates the service account that you created for the recommender. A new pod is created for the alternative recommender in the same namespace.
$ oc get pods
Example output
NAME READY STATUS RESTARTS AGE
frontend-845d5478d-558zf 1/1 Running 0 4m25s
frontend-845d5478d-7z9gx 1/1 Running 0 4m25s
frontend-845d5478d-b7l4j 1/1 Running 0 4m25s
vpa-alt-recommender-55878867f9-6tp5v 1/1 Running 0 9s
Configure a VPA CR that includes the name of the alternative recommender
Deployment
object.Example VPA CR to include the alternative recommender
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
name: vpa-recommender
namespace: <namespace_name>
spec:
recommenders:
- name: alt-vpa-recommender (1)
targetRef:
apiVersion: "apps/v1"
kind: Deployment (2)
name: frontend
1 Specifies the name of the alternative recommender deployment. 2 Specifies the name of an existing workload object you want this VPA to manage.
Using the Vertical Pod Autoscaler Operator
You can use the Vertical Pod Autoscaler Operator (VPA) by creating a VPA custom resource (CR). The CR indicates which pods it should analyze and determines the actions the VPA should take with those pods.
Prerequisites
The workload object that you want to autoscale must exist.
If you want to use an alternative recommender, a deployment including that recommender must exist.
Procedure
To create a VPA CR for a specific workload object:
Change to the project where the workload object you want to scale is located.
Create a VPA CR YAML file:
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
name: vpa-recommender
spec:
targetRef:
apiVersion: "apps/v1"
kind: Deployment (1)
name: frontend (2)
updatePolicy:
updateMode: "Auto" (3)
resourcePolicy: (4)
containerPolicies:
- containerName: my-opt-sidecar
mode: "Off"
recommenders: (5)
- name: my-recommender
1 Specify the type of workload object you want this VPA to manage: Deployment
,StatefulSet
,Job
,DaemonSet
,ReplicaSet
, orReplicationController
.2 Specify the name of an existing workload object you want this VPA to manage. 3 Specify the VPA mode: auto
to automatically apply the recommended resources on pods associated with the controller. The VPA terminates existing pods and creates new pods with the recommended resource limits and requests.recreate
to automatically apply the recommended resources on pods associated with the workload object. The VPA terminates existing pods and creates new pods with the recommended resource limits and requests. Therecreate
mode should be used rarely, only if you need to ensure that the pods are restarted whenever the resource request changes.initial
to automatically apply the recommended resources when pods associated with the workload object are created. The VPA does not update the pods as it learns new resource recommendations.off
to only generate resource recommendations for the pods associated with the workload object. The VPA does not update the pods as it learns new resource recommendations and does not apply the recommendations to new pods.
4 Optional. Specify the containers you want to opt-out and set the mode to Off
.5 Optional. Specify an alternative recommender. Create the VPA CR:
$ oc create -f <file-name>.yaml
After a few moments, the VPA learns the resource usage of the containers in the pods associated with the workload object.
You can view the VPA recommendations using the following command:
$ oc get vpa <vpa-name> --output yaml
The output shows the recommendations for CPU and memory requests, similar to the following:
Example output
...
status:
...
recommendation:
containerRecommendations:
- containerName: frontend
lowerBound: (1)
cpu: 25m
memory: 262144k
target: (2)
cpu: 25m
memory: 262144k
uncappedTarget: (3)
cpu: 25m
memory: 262144k
upperBound: (4)
cpu: 262m
memory: "274357142"
- containerName: backend
lowerBound:
cpu: 12m
memory: 131072k
target:
cpu: 12m
memory: 131072k
uncappedTarget:
cpu: 12m
memory: 131072k
upperBound:
cpu: 476m
memory: "498558823"
...
1 lowerBound
is the minimum recommended resource levels.2 target
is the recommended resource levels.3 upperBound
is the highest recommended resource levels.4 uncappedTarget
is the most recent resource recommendations.
Uninstalling the Vertical Pod Autoscaler Operator
You can remove the Vertical Pod Autoscaler Operator (VPA) from your OKD cluster. After uninstalling, the resource requests for the pods already modified by an existing VPA CR do not change. Any new pods get the resources defined in the workload object, not the previous recommendations made by the Vertical Pod Autoscaler Operator.
You can remove a specific VPA CR by using the |
After removing the VPA Operator, it is recommended that you remove the other components associated with the Operator to avoid potential issues.
Prerequisites
- The Vertical Pod Autoscaler Operator must be installed.
Procedure
In the OKD web console, click Operators → Installed Operators.
Switch to the openshift-vertical-pod-autoscaler project.
For the VerticalPodAutoscaler Operator, click the Options menu and select Uninstall Operator.
Optional: To remove all operands associated with the Operator, in the dialog box, select Delete all operand instances for this operator checkbox.
Click Uninstall.
Optional: Use the OpenShift CLI to remove the VPA components:
Delete the VPA namespace:
$ oc delete namespace openshift-vertical-pod-autoscaler
Delete the VPA custom resource definition (CRD) objects:
$ oc delete crd verticalpodautoscalercheckpoints.autoscaling.k8s.io
$ oc delete crd verticalpodautoscalercontrollers.autoscaling.openshift.io
$ oc delete crd verticalpodautoscalers.autoscaling.k8s.io
Deleting the CRDs removes the associated roles, cluster roles, and role bindings.
This action removes from the cluster all user-created VPA CRs. If you re-install the VPA, you must create these objects again.
Delete the VPA Operator:
$ oc delete operator/vertical-pod-autoscaler.openshift-vertical-pod-autoscaler