Horizontal Pod Autoscaler (HPA)
The Horizontal Pod Autoscaler (HPA) is a Kubernetes resource that automatically adjusts the number of replicas of a deployment in response to the workload's resource demand.
HPA is a feature available on CloudFlow Enterprise Accounts.
For more information on Kubernetes and the HPA, see the Kubernetes documentation.
How to use the Horizontal Pod Autoscaler resource with CloudFlow?
After you have created a Project in CloudFlow, you can use the Horizontal Pod Autoscaler to automatically scale the number of replicas of the deployment.
- Create a YAML file, such as hpa.yaml, with the following content:
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 50
- Apply it to your application:
kubectl apply -f hpa.yaml
- See your HPA object running on CloudFlow:
kubectl get hpa.v2beta2.autoscaling
See other supported kubectl commands you can use with the HPA resource.
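To get a feel for what the 50% targets in the example manifest mean in practice, the following is a minimal sketch of the replica calculation described in the upstream Kubernetes documentation: each metric proposes `ceil(currentReplicas * currentValue / targetValue)`, the largest proposal wins, and the result is bounded by minReplicas and maxReplicas. It assumes CloudFlow follows the upstream algorithm and its default 10% tolerance; the function name `desired_replicas` and the sample utilization figures are hypothetical, and real controllers also account for pod readiness and stabilization windows, which are left out here.

```python
import math


def desired_replicas(current_replicas, utilizations, targets,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Estimate the replica count with the upstream Kubernetes HPA formula.

    utilizations and targets map a resource name to an average utilization (%).
    This is an illustrative sketch, not CloudFlow's implementation.
    """
    proposals = []
    for name, target in targets.items():
        ratio = utilizations[name] / target
        if abs(ratio - 1.0) <= tolerance:
            # Close enough to the target: this metric proposes no change.
            proposals.append(current_replicas)
        else:
            proposals.append(math.ceil(current_replicas * ratio))
    # With several metrics, the largest proposal wins; then the bounds apply.
    return max(min_replicas, min(max_replicas, max(proposals)))


# Targets taken from the example manifest: 50% CPU and 50% memory.
targets = {"cpu": 50, "memory": 50}
print(desired_replicas(2, {"cpu": 90, "memory": 40}, targets))   # 4
print(desired_replicas(4, {"cpu": 20, "memory": 30}, targets))   # 3
print(desired_replicas(8, {"cpu": 100, "memory": 50}, targets))  # 10 (capped by maxReplicas)
```

In the second call, CPU alone would allow scaling down to 2 replicas, but memory still needs 3, so the higher value is kept.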
What parts of the Horizontal Pod Autoscaler spec are supported by CloudFlow?
- CloudFlow supports the autoscaling/v2beta2 version of the Horizontal Pod Autoscaler API object.
- The following fields (including subfields) of the Horizontal Pod Autoscaler spec are supported:
  - scaleTargetRef
  - minReplicas
  - maxReplicas
  - metrics
    - type: Resource
- When using a Resource metric, scaling is only supported based on the cpu and memory resources.
- The maxReplicas field can have a maximum value of 20.
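The restrictions listed above can be checked before a manifest is applied. The following is a rough sketch of such a pre-flight check, operating on a manifest already loaded into a Python dictionary. The helper `check_hpa` and its messages are hypothetical and not part of any CloudFlow tooling; treating every spec field outside the supported list as a violation is an assumption made for illustration.

```python
SUPPORTED_API_VERSION = "autoscaling/v2beta2"
SUPPORTED_SPEC_FIELDS = {"scaleTargetRef", "minReplicas", "maxReplicas", "metrics"}
SUPPORTED_RESOURCES = {"cpu", "memory"}
MAX_REPLICAS_LIMIT = 20


def check_hpa(manifest):
    """Return a list of problems; an empty list means none were found."""
    problems = []
    if manifest.get("apiVersion") != SUPPORTED_API_VERSION:
        problems.append(f"apiVersion must be {SUPPORTED_API_VERSION}")
    spec = manifest.get("spec", {})
    # Assumption: any spec field not listed as supported is reported.
    for field in sorted(set(spec) - SUPPORTED_SPEC_FIELDS):
        problems.append(f"unsupported spec field: {field}")
    if spec.get("maxReplicas", 0) > MAX_REPLICAS_LIMIT:
        problems.append(f"maxReplicas exceeds {MAX_REPLICAS_LIMIT}")
    for metric in spec.get("metrics", []):
        if metric.get("type") != "Resource":
            problems.append(f"unsupported metric type: {metric.get('type')}")
            continue
        name = metric.get("resource", {}).get("name")
        if name not in SUPPORTED_RESOURCES:
            problems.append(f"unsupported resource: {name}")
    return problems


good = {
    "apiVersion": "autoscaling/v2beta2",
    "kind": "HorizontalPodAutoscaler",
    "spec": {
        "scaleTargetRef": {"apiVersion": "apps/v1", "kind": "Deployment", "name": "nginx"},
        "minReplicas": 1,
        "maxReplicas": 10,
        "metrics": [{"type": "Resource", "resource": {"name": "cpu"}}],
    },
}
# Same manifest, but over the replica limit and with an unlisted field.
bad = dict(good, spec={**good["spec"], "maxReplicas": 50, "behavior": {}})

print(check_hpa(good))  # []
print(check_hpa(bad))   # ['unsupported spec field: behavior', 'maxReplicas exceeds 20']
```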
Adaptive Edge Engine (AEE) and Horizontal Pod Autoscaler (HPA)
The AEE and the HPA work together to provide a container deployment that scales both across the globe and within a particular edge location.
The AEE places the deployment in new edge locations according to the traffic requirements of a region, while the HPA scales the number of replicas within a given edge location based on resource (CPU and/or memory) demand.