Event-driven autoscaling in Kubernetes with KEDA
Learn how to implement event-driven autoscaling in Kubernetes using KEDA, covering the basics, benefits, and step-by-step guides on scaling workloads based on external events.
Written by
Technical Writer @ Civo
In November 2015, Kubernetes 1.1 introduced a new feature called the horizontal pod autoscaler (HPA), designed to help users scale their workloads out dynamically based on CPU and memory usage.
Fast forward to Kubernetes 1.8, and the vertical pod autoscaler (VPA) was introduced as a way to dynamically resize the CPU and memory allocated to existing pods. Both features saw mass adoption within the Kubernetes community because they solved a problem modern applications face: scaling up or out as load increases.
Not all applications are created equal, however, and scaling on CPU and memory usage isn't always the answer. Modern applications are decoupled, so an increase in load on one of your dependencies, rather than on the application itself, can be what signals the need to scale.
In this tutorial, we will take a look at how to use KEDA to scale workloads in an event-driven manner.
What is KEDA?
Kubernetes Event-Driven Autoscaling, or KEDA for short, was born out of a joint effort between Microsoft and Red Hat in 2019. It was initially geared toward better supporting Azure Functions on OpenShift, but being open source, the community quickly expanded its use cases far beyond the original scope.
KEDA is an open-source project under the Cloud Native Computing Foundation (CNCF) that helps scale workloads based on external events or custom metrics. This is useful for responding to load on external systems, such as a cache that your application depends on.
KEDA vs. HPA vs. VPA
Now that you know what KEDA is, you might be wondering how it compares to the first-party autoscalers. Here is a quick table that details the differences between the three:

| | HPA | VPA | KEDA |
|---|---|---|---|
| Scaling direction | Horizontal (adds/removes pods) | Vertical (resizes pod CPU/memory) | Horizontal, via a managed HPA |
| Scaling signal | CPU and memory usage | CPU and memory usage | External events and custom metrics |
| Scale to zero | No | No | Yes |
| Ships with Kubernetes | Yes | No (separate project) | No (CNCF project) |
A quick way to remember the differences between HPA, VPA, and KEDA: HPA scales your application out based on CPU and memory, VPA scales your application up using the same metrics, and KEDA triggers horizontal scaling based on external event sources.
How does KEDA work?
KEDA differs from first-party auto scalers (HPA and VPA) by introducing scalers, components that connect to external event sources and retrieve metrics. Scalers support various systems, such as message queues (RabbitMQ, Apache Kafka), databases (Redis, PostgreSQL), and custom HTTP endpoints. Each scaler is designed to understand the specific protocol and metrics format of its target system.
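Conceptually, a scaler answers two questions about its external system: is there any load at all, and how much? The sketch below illustrates that contract in Python; it is a toy stand-in (KEDA's real scalers are written in Go against its internal interface), and the plain list here represents an external queue:

```python
class QueueLengthScaler:
    """Toy stand-in for a KEDA scaler: reports a metric from an external system."""

    def __init__(self, queue, target_length):
        self.queue = queue                   # the external system (here: a plain list)
        self.target_length = target_length   # desired load per replica

    def is_active(self):
        # KEDA uses a check like this to decide whether to scale up from zero
        return len(self.queue) > 0

    def get_metric(self):
        # The current load reported to KEDA's scaling engine
        return len(self.queue)

queue = ["job-1", "job-2", "job-3"]
scaler = QueueLengthScaler(queue, target_length=5)
print(scaler.is_active())   # True
print(scaler.get_metric())  # 3
```

A real scaler additionally knows how to authenticate against its target system and how to translate its native protocol into this simple metric shape.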
KEDA's scaling engine uses the metrics retrieved by scalers to evaluate whether a workload needs to be scaled. This is configured through KEDA's primary custom resource definition (CRD), the ScaledObject.
When you create a ScaledObject, KEDA automatically generates and manages an HPA resource behind the scenes.
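That generated HPA applies the standard Kubernetes calculation for AverageValue metrics, which is what most KEDA scalers report: spread the total metric across replicas so each handles at most the target value, then clamp to the configured bounds. A rough Python sketch of the arithmetic (not KEDA's actual code):

```python
import math

def desired_replicas(total_metric, target_per_replica, min_replicas, max_replicas):
    # HPA calculation for AverageValue metrics: enough replicas so each
    # handles at most target_per_replica, clamped to the configured bounds
    desired = math.ceil(total_metric / target_per_replica)
    return max(min_replicas, min(desired, max_replicas))

print(desired_replicas(12, 5, 1, 5))  # ceil(12/5) = 3 replicas
print(desired_replicas(40, 5, 1, 5))  # capped at maxReplicaCount = 5
print(desired_replicas(0, 5, 1, 5))   # floor at minReplicaCount = 1
```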
Prerequisites
This portion of the tutorial assumes some working knowledge of Kubernetes. Additionally, you will need the following installed in order to follow along:
- Helm (used to install KEDA)
- Kubernetes cluster
- kubectl (installed and configured to interact with the cluster)
- A Civo account
- Civo CLI (optional)
Install metrics server
The metrics server is a crucial component when using KEDA, as the underlying HPA (or VPA) relies on it to retrieve the resource metrics used to scale pods.
If you do not have a metrics server deployed in your cluster already, run the following command to install:
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
This will install the most recent version of the metrics server, which is compatible with Kubernetes 1.19+.
Deploy KEDA
With the metrics server installed, you can deploy KEDA using Helm:
Add the chart:
helm repo add kedacore https://kedacore.github.io/charts
Update your local repository:
helm repo update
Install KEDA:
helm install keda kedacore/keda --namespace keda --create-namespace
Scaling a time-based application
To kick things off, let's discuss time-based scaling. This occurs when you have predictable spikes in traffic, such as a lunch rush for a food delivery app, or events that happen at specific times, like a Black Friday sale or the release of a popular clothing item.
These types of predictable and time-bound events are good candidates for cron-based scaling.
First, let's deploy a simple nginx application:
kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cron-scaled-app
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cron-scaled-app
  template:
    metadata:
      labels:
        app: cron-scaled-app
    spec:
      containers:
        - name: app
          image: nginx:alpine
          ports:
            - containerPort: 80
          resources:
            requests:
              cpu: 100m
              memory: 128Mi
            limits:
              cpu: 200m
              memory: 256Mi
EOF
Now for the ScaledObject that enables time-based scaling:
kubectl apply -f - <<EOF
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: cron-scaledobject
  namespace: default
spec:
  scaleTargetRef:
    name: cron-scaled-app
  minReplicaCount: 1
  maxReplicaCount: 6
  triggers:
    # Scale up to 6 replicas every 10 minutes (for testing)
    - type: cron
      metadata:
        timezone: UTC
        start: "*/10 * * * *"   # Every 10 minutes
        end: "2-4/10 * * * *"   # End 2-4 minutes later (scale down)
        desiredReplicas: "6"
EOF
The ScaledObject uses standard cron syntax for scheduling scaling events. The key component here is spec.scaleTargetRef, which tells KEDA which deployment to scale, in this case, our cron-scaled-app deployment. Every 10 minutes, KEDA will scale the deployment up to 6 replicas, then scale it back down to the minimum of 1 replica a few minutes later.
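The effect of the two cron expressions can be pictured as a repeating window: replicas jump to desiredReplicas when the start expression fires and fall back to minReplicaCount when the end expression fires. A simplified simulation (minute arithmetic only, not a real cron parser) of that schedule:

```python
def replicas_at_minute(minute, desired=6, minimum=1):
    # start: "*/10 ..." fires at minutes 0, 10, 20, ...
    # end:   "2-4/10 ..." fires at minute 2 of each 10-minute block,
    # so the scaled-up window covers the first couple of minutes of every block
    return desired if minute % 10 < 2 else minimum

print([replicas_at_minute(m) for m in range(0, 12)])
# [6, 6, 1, 1, 1, 1, 1, 1, 1, 1, 6, 6]
```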
To verify that the scaling rule worked, run:
kubectl get pods -w
This will list the pods in the default namespace and watch for changes. After ten minutes, you should see the cron-scaled-app deployment scale up to additional replicas, then back down a few minutes later.
Scaling cache-dependent applications
Cron-based scaling is great, but not all applications have time-based spikes. More often than not, developers use a caching layer like Redis. In this next example, let's take a look at how we can scale a Redis worker based on the length of a queue.
First, we need to deploy Redis:
kubectl apply -f - <<EOF
---
# Redis Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      labels:
        app: redis
    spec:
      containers:
        - name: redis
          image: redis:7-alpine
          ports:
            - containerPort: 6379
---
# Redis Service
apiVersion: v1
kind: Service
metadata:
  name: redis-service
  namespace: default
spec:
  selector:
    app: redis
  ports:
    - port: 6379
      targetPort: 6379
  type: ClusterIP
EOF
Next, deploy a worker that will simulate processing jobs from the Redis queue:
kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis-scaled-worker
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: redis-scaled-worker
  template:
    metadata:
      labels:
        app: redis-scaled-worker
    spec:
      containers:
        - name: worker
          image: python:3.9-slim
          command: ["/bin/sh"]
          args:
            - -c
            - |
              pip install "redis[hiredis]"
              python -c "
              import redis, time, random
              r = redis.Redis(host='redis-service', port=6379, decode_responses=True)
              while True:
                  try:
                      job = r.blpop('job-queue', timeout=5)
                      if job:
                          print(f'Processing job: {job[1]}')
                          time.sleep(random.randint(1, 5))  # Simulate work
                      else:
                          print('No jobs, waiting...')
                  except Exception as e:
                      print(f'Error: {e}')
                      time.sleep(1)
              "
          resources:
            requests:
              cpu: 100m
              memory: 128Mi
            limits:
              cpu: 200m
              memory: 256Mi
EOF
Now, we will add a producer to add jobs to the queue:
kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis-job-producer
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: redis-job-producer
  template:
    metadata:
      labels:
        app: redis-job-producer
    spec:
      containers:
        - name: producer
          image: python:3.9-slim
          command: ["/bin/sh"]
          args:
            - -c
            - |
              pip install "redis[hiredis]"
              python -c "
              import redis, time, random, json, os
              r = redis.Redis(host='redis-service', port=6379, decode_responses=True)
              producer_id = os.environ.get('HOSTNAME', 'producer')
              counter = 0
              print(f'[{producer_id}] Starting job producer...')
              while True:
                  try:
                      # Simulate varying load - sometimes burst, sometimes quiet
                      jobs_to_add = random.randint(1, 10)
                      for i in range(jobs_to_add):
                          job_data = {
                              'id': counter,
                              'task': f'process_data_{counter}',
                              'producer': producer_id,
                              'timestamp': time.time()
                          }
                          r.rpush('job-queue', json.dumps(job_data))
                          counter += 1
                      current_queue_length = r.llen('job-queue')
                      print(f'[{producer_id}] Added {jobs_to_add} jobs. Queue length now: {current_queue_length}')
                      # Variable sleep to create realistic load patterns
                      sleep_time = random.randint(3, 12)
                      time.sleep(sleep_time)
                  except Exception as e:
                      print(f'[{producer_id}] Error: {e}')
                      time.sleep(5)  # Wait before retrying
              "
          resources:
            requests:
              cpu: 50m
              memory: 64Mi
            limits:
              cpu: 100m
              memory: 128Mi
          env:
            - name: REDIS_HOST
              value: "redis-service"
            - name: REDIS_PORT
              value: "6379"
EOF
Finally, a ScaledObject that monitors the Redis queue and scales the worker:
kubectl apply -f - <<EOF
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: redis-scaledobject
  namespace: default
spec:
  scaleTargetRef:
    name: redis-scaled-worker  # This is the deployment to scale
  minReplicaCount: 1
  maxReplicaCount: 5
  triggers:
    - type: redis
      metadata:
        address: redis-service.default.svc.cluster.local:6379
        listName: job-queue
        listLength: "5"
EOF
This ScaledObject uses the Redis scaler to monitor the job-queue list. When the queue length exceeds 5 items, KEDA will scale up the redis-scaled-worker deployment. The spec.scaleTargetRef points to our worker deployment, and KEDA will automatically add more worker pods to handle the increased queue load, then scale back down when the queue shrinks.
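As a rough illustration, here is a Python sketch (not KEDA's actual code) of how queue length maps to worker replicas with listLength: "5", minReplicaCount: 1, and maxReplicaCount: 5:

```python
import math

def worker_replicas(queue_length, list_length=5, min_replicas=1, max_replicas=5):
    # The redis scaler targets list_length items per replica; the generated
    # HPA clamps the result between minReplicaCount and maxReplicaCount
    desired = math.ceil(queue_length / list_length)
    return max(min_replicas, min(desired, max_replicas))

for qlen in [0, 3, 7, 18, 60]:
    print(qlen, "->", worker_replicas(qlen))
# 0 -> 1, 3 -> 1, 7 -> 2, 18 -> 4, 60 -> 5
```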
After a few minutes, run:
kubectl get pods
Your output should be similar to:
redis-scaled-worker-5fbc5475b8-f2hnv   1/1   Running   0   7m44s
redis-job-producer-7d5cdfb97b-z5ztb    1/1   Running   0   3m23s
redis-scaled-worker-5fbc5475b8-tmxlv   1/1   Running   0   103s
redis-scaled-worker-5fbc5475b8-nwgrl   1/1   Running   0   88s
redis-scaled-worker-5fbc5475b8-jnvv4   1/1   Running   0   73s
redis-scaled-worker-5fbc5475b8-lpsct   1/1   Running   0   73s
redis-scaled-worker-5fbc5475b8-bxg2b   1/1   Running   0   28s
Pro tip: Debugging
If you are trying to figure out why a ScaledObject or cron trigger isn't working, a good place to start is describing the ScaledObject and checking events using kubectl:
# Describe the ScaledObject to see its current status
kubectl describe scaledobject redis-scaledobject

# Check events for any scaling-related issues
kubectl get events --sort-by='.lastTimestamp' | grep -i scale
Please note that Kubernetes only retains events for a limited time (one hour by default), so this command may not surface anything useful if a significant amount of time has passed since the scaling activity.
Closing thoughts
KEDA is a fantastic project that allows you to scale workloads based on more dynamic criteria, such as application dependencies or time of day.
While this tutorial covered the cron and Redis scalers, KEDA supports many more, such as PostgreSQL and Prometheus. If you are looking to scale your cluster nodes on Civo, check out this section of the docs, and happy scaling!

Jubril Oyetunji is a DevOps engineer and technical writer with a strong focus on cloud-native technologies and open-source tools. His work centers on creating practical tutorials that help developers better understand platforms such as Kubernetes, NGINX, Rust, and Go.
As a contract technical writer, Jubril authored an extensive library of technical guides covering cloud-native infrastructure and modern development workflows. Many of his tutorials achieved strong search rankings, helping developers around the world learn and adopt emerging technologies.