Using KEDA for Autoscaling Pods using Prometheus Metrics
Author: Ruan Bekker (@ruanbekker)
In our previous KEDA post we covered an introduction to KEDA. We explored how KEDA can dynamically scale containers based on various triggers, including messages in queues, Kafka topics, HTTP requests, and custom metrics.
In this post, we will leverage this flexibility and integrate Prometheus with KEDA to achieve autoscaling based on application-specific metrics, such as http_requests_per_minute, allowing us to scale our pods based on the number of requests they receive.
What will we be doing?
To get started, we'll deploy a Kubernetes cluster using KinD for demonstration purposes. Then, we'll leverage Helm to deploy both KEDA and Prometheus. Once the environment is set up, we'll deploy a sample application that exposes Prometheus metrics. Finally, we'll define a ScaledObject to automatically scale our deployment pods based on HTTP requests per minute.
Kubernetes Environment Setup
If you don't have KinD installed, you can follow the KinD installation documentation first. Once it's installed, we can define the kind-config.yaml:
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
  image: kindest/node:v1.29.4@sha256:3abb816a5b1061fb15c6e9e60856ec40d56b7b52bcea5f5f1350bc6e2320b6f8
  kubeadmConfigPatches:
  - |
    kind: InitConfiguration
    nodeRegistration:
      kubeletExtraArgs:
        node-labels: "ingress-ready=true"
  extraPortMappings:
  - containerPort: 80
    hostPort: 80
    protocol: TCP
    listenAddress: "0.0.0.0"
This config defines one control-plane node with port 80 exposed. Go ahead and create the cluster:
kind create cluster --name workshop --config kind-config.yaml
Ingress Controller
Now let's deploy Ingress-Nginx as our Kubernetes Ingress Controller to the kube-system namespace:
helm repo add nginx https://kubernetes.github.io/ingress-nginx
helm repo update
helm upgrade --install nginx-public nginx/ingress-nginx \
  --version 4.7.3 \
  --namespace kube-system \
  --set controller.admissionWebhooks.enabled=false \
  --set controller.hostPort.enabled=true \
  --set controller.ingressClass=nginx \
  --set controller.service.type=NodePort
KEDA
Next, we need to deploy KEDA; I will deploy it to the keda namespace:
helm repo add kedacore https://kedacore.github.io/charts
helm repo update
helm upgrade --install keda kedacore/keda --namespace keda --create-namespace --version 2.15.0
You can view the helm chart values for more configuration options.
Prometheus Stack
Next up, we need to deploy Prometheus; I will use the kube-prometheus-stack helm chart from the prometheus-community repository. Save the following as prometheus-values.yaml:
prometheus:
  prometheusSpec:
    serviceMonitorSelector:
      matchLabels:
        release: kube-prometheus-stack
  ingress:
    enabled: true
    ingressClassName: nginx
    pathType: ImplementationSpecific
    hosts:
      - prometheus.127.0.0.1.nip.io
    paths:
      - /
Then we can proceed to deploy Prometheus to the prometheus namespace:
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm upgrade --install kube-prometheus-stack prometheus-community/kube-prometheus-stack \
  --version 61.7.0 \
  --namespace prometheus \
  --create-namespace \
  --values prometheus-values.yaml
At this point in time, you should have Ingress-Nginx, KEDA and Prometheus deployed to your Kubernetes cluster.
Application Deployment
The application that we are going to deploy exposes Prometheus metrics, which we can use to autoscale on. We will rely on the http_requests_per_minute metric to determine when we want to scale.
I will chain the Deployment, Service, ServiceMonitor and Ingress resources into one application-deployment.yaml manifest:
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
  namespace: default
spec:
  replicas: 2
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
      - env:
        - name: API_VERSION
          value: v5
        image: ruanbekker/golang-prometheus-task-async
        imagePullPolicy: IfNotPresent
        name: myapp
        ports:
        - containerPort: 8080
          name: http
          protocol: TCP
        resources:
          limits:
            cpu: 500m
            memory: 256Mi
          requests:
            cpu: 100m
            memory: 32Mi
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  labels:
    app.kubernetes.io/name: myapp
  name: myapp
  namespace: default
spec:
  ingressClassName: nginx
  rules:
  - host: myapp.127.0.0.1.nip.io
    http:
      paths:
      - backend:
          service:
            name: myapp
            port:
              name: http
        path: /
        pathType: Prefix
---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    app: myapp
    release: kube-prometheus-stack
  name: myapp
  namespace: default
spec:
  endpoints:
  - path: /metrics
    port: http
    scheme: http
  namespaceSelector:
    matchNames:
    - default
  selector:
    matchLabels:
      app: myapp
      release: kube-prometheus-stack
---
apiVersion: v1
kind: Service
metadata:
  name: myapp
  namespace: default
  labels:
    app: myapp
    release: kube-prometheus-stack
spec:
  ports:
  - name: http
    port: 80
    protocol: TCP
    targetPort: 8080
  selector:
    app: myapp
  type: ClusterIP
Once the manifest is written to disk, we can deploy the sample application to the default namespace:
kubectl apply -f application-deployment.yaml
Since we defined an ingress for our application, we can make HTTP requests against it:
- Application Endpoint: http://myapp.127.0.0.1.nip.io/
- Metrics Endpoint: http://myapp.127.0.0.1.nip.io/metrics
We have also defined a ServiceMonitor so that Prometheus registers the application as a scrape target and collects its metrics.
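For intuition on what Prometheus scrapes from that endpoint, here is a small sketch that parses the Prometheus text exposition format. The sample payload and its label sets below are made up for illustration and will differ from the app's real /metrics output:

```python
# Minimal sketch: extract counter samples from Prometheus' text exposition format.
# NOTE: the payload below is illustrative; the real /metrics output of the app
# will have different metric names, labels, and values.
sample = """\
# HELP http_requests_total The total number of HTTP requests.
# TYPE http_requests_total counter
http_requests_total{path="/",method="GET"} 1027
http_requests_total{path="/metrics",method="GET"} 52
"""

def parse_counters(payload: str, metric: str) -> dict:
    """Return {metric-with-labels: value} for every sample of the given metric."""
    samples = {}
    for line in payload.splitlines():
        if line.startswith("#") or not line.strip():
            continue  # skip HELP/TYPE comments and blank lines
        name_labels, _, value = line.rpartition(" ")
        if name_labels.split("{")[0] == metric:
            samples[name_labels] = float(value)
    return samples

for series, value in parse_counters(sample, "http_requests_total").items():
    print(series, value)
```

Each labelled combination (such as path="/") becomes its own time series, which is why the PromQL queries later in this post filter on the path label.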
AutoScaling our Application
In order to set up autoscaling, we need to define a ScaledObject, which is a Kubernetes CRD from KEDA for defining and managing autoscaling rules for applications based on various events or metrics. It acts as a bridge between the application and the Kubernetes autoscaling mechanism, allowing for more flexible and granular control over scaling behavior. Save the following as application-scaledobject.yaml:
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: prometheus-scaledobject
  namespace: default
spec:
  scaleTargetRef:
    name: myapp
  minReplicaCount: 2
  maxReplicaCount: 10
  pollingInterval: 10
  cooldownPeriod: 30
  triggers:
  - type: prometheus
    metadata:
      serverAddress: http://prometheus-operated.prometheus.svc.cluster.local:9090
      metricName: http_requests_per_minute
      query: sum(rate(http_requests_total{job="myapp", path="/"}[1m]) * 60) by (service)
      threshold: '60'
This ScaledObject will use Prometheus metrics to automatically scale a deployment named myapp based on HTTP requests per minute. Here are some of the configuration parameters in detail:
- scaleTargetRef: Reference to the Kubernetes object to be scaled (deployment named myapp).
- minReplicaCount: Sets the minimum number of replicas for the myapp deployment (2 in this case).
- maxReplicaCount: Defines the maximum number of replicas the deployment can scale to (10 in this case).
- pollingInterval: Specifies the interval (10 seconds) at which KEDA checks the Prometheus metrics.
- cooldownPeriod: The period (30 seconds) KEDA waits after the last trigger reported active before scaling the resource back down.
- triggers: An array containing the trigger definitions for scaling.
- type: Sets the trigger type to prometheus.
- metadata: Additional information for the trigger.
- serverAddress: Specifies the address of the Prometheus server (http://prometheus-operated.prometheus.svc.cluster.local:9090).
- metricName: Defines the human-readable name of the metric to monitor (http_requests_per_minute).
- query: The PromQL query to fetch the actual metric value. This query calculates the rate of HTTP requests per minute for the / path of the myapp service in the last minute.
- threshold: The target value (60) for the metric; KEDA treats this as a per-replica average, scaling out when the average value per pod exceeds it.
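Under the hood, KEDA hands this metric to the Horizontal Pod Autoscaler, which treats the threshold as a desired per-pod average. Here is a simplified sketch of that calculation (the real HPA additionally applies a tolerance and stabilization windows):

```python
import math

def desired_replicas(current_replicas: int, avg_metric_per_pod: float, target: float) -> int:
    """Simplified HPA formula: desired = ceil(current * currentAverage / target)."""
    return math.ceil(current_replicas * (avg_metric_per_pod / target))

# 240 requests/minute spread over 2 pods is an average of 120 per pod;
# against a target of 60, the autoscaler aims for 4 replicas
# (always clamped between minReplicaCount and maxReplicaCount).
total_rpm, pods = 240, 2
print(desired_replicas(pods, total_rpm / pods, 60))  # -> 4
```

This is why doubling the request rate roughly doubles the replica count, until maxReplicaCount caps it.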
Now we can deploy the ScaledObject resource:
kubectl apply -f application-scaledobject.yaml
We can now access the Prometheus frontend on http://prometheus.127.0.0.1.nip.io/ and use the following query to monitor the HTTP requests:
sum(rate(http_requests_total{job="myapp", path="/"}[1m]) * 60) by (service)
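To see what this query computes, here is a toy version of rate(...[1m]) * 60 using two counter samples. It is only an approximation: Prometheus's real rate() extrapolates to the window boundaries and handles counter resets, and the sample values here are made up:

```python
# Toy approximation of: rate(http_requests_total[1m]) * 60
# Two counter samples taken 60 seconds apart (values are made up):
t0, v0 = 0, 1000   # counter at the start of the window
t1, v1 = 60, 1090  # counter at the end of the window

per_second = (v1 - v0) / (t1 - t0)  # rate(): per-second increase of the counter
per_minute = per_second * 60        # the requests-per-minute value compared to the threshold

print(per_minute)  # -> 90.0
```

At 90 requests per minute against a threshold of 60, the ScaledObject above would trigger a scale-out.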
Then we can generate 60 requests per minute:
while true; do sleep 1; curl myapp.127.0.0.1.nip.io; done
We can view the scaledobject resource:
kubectl get scaledobject -w
# NAME SCALETARGETKIND SCALETARGETNAME MIN MAX TRIGGERS AUTHENTICATION READY ACTIVE FALLBACK PAUSED AGE
# prometheus-scaledobject apps/v1.Deployment myapp 2 10 prometheus True True False Unknown 16m
Then when we look at our pods after some time:
kubectl get pods -n default
We can see that the number of pods has increased:
NAME READY STATUS RESTARTS AGE
myapp-fd69cc685-47m9x 1/1 Running 0 67m
myapp-fd69cc685-758m9 1/1 Running 0 61s
myapp-fd69cc685-9nfcl 1/1 Running 0 46s
myapp-fd69cc685-f9tc5 1/1 Running 0 46s
myapp-fd69cc685-h7ng4 1/1 Running 0 61s
myapp-fd69cc685-hkk8r 1/1 Running 0 67m
myapp-fd69cc685-kmb55 1/1 Running 0 46s
myapp-fd69cc685-p88vw 1/1 Running 0 61s
myapp-fd69cc685-qk646 1/1 Running 0 46s
myapp-fd69cc685-x5wxm 1/1 Running 0 4m16s
And if we stop our loop from generating requests, after some time we can look at our pods again:
NAME READY STATUS RESTARTS AGE
myapp-fd69cc685-47m9x 1/1 Running 0 70m
myapp-fd69cc685-758m9 1/1 Running 0 4m30s
myapp-fd69cc685-9nfcl 1/1 Running 0 4m15s
myapp-fd69cc685-f9tc5 1/1 Running 0 4m15s
myapp-fd69cc685-h7ng4 1/1 Running 0 4m30s
myapp-fd69cc685-hkk8r 1/1 Running 0 70m
myapp-fd69cc685-kmb55 1/1 Running 0 4m15s
myapp-fd69cc685-p88vw 1/1 Running 0 4m30s
myapp-fd69cc685-qk646 1/1 Running 0 4m15s
myapp-fd69cc685-x5wxm 1/1 Running 0 7m45s
myapp-fd69cc685-qk646 1/1 Terminating 0 6m16s
myapp-fd69cc685-f9tc5 1/1 Terminating 0 6m16s
myapp-fd69cc685-kmb55 1/1 Terminating 0 6m16s
myapp-fd69cc685-758m9 1/1 Terminating 0 6m31s
myapp-fd69cc685-qk646 0/1 Terminating 0 6m16s
myapp-fd69cc685-kmb55 0/1 Terminating 0 6m16s
myapp-fd69cc685-f9tc5 0/1 Terminating 0 6m16s
We can also view the hpa resource, as KEDA uses the Horizontal Pod Autoscaler (HPA) under the hood:
kubectl get hpa -n default
This will show us the HPA that is being used:
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
keda-hpa-prometheus-scaledobject Deployment/myapp 0/60 (avg) 2 10 2 29m
Up Next?
In the next post in our KEDA series, we will use the RabbitMQ scaler, so that we can scale applications based on RabbitMQ queues.
Resources
- https://keda.sh/docs/2.15/deploy/
- https://keda.sh/docs/2.15/scalers/prometheus/
- https://github.com/ruanbekker/golang-prometheus-task-async
Thank You
Thanks for reading. If you like my content, feel free to check out my website, subscribe to my newsletter, or follow me at @ruanbekker on Twitter.
- Linktree: https://go.ruan.dev/links
- Patreon: https://go.ruan.dev/patreon