January 7, 2025

Guide to KEDA (Kubernetes Event-Driven Autoscaler)

Tania Duggal
Technical Writer

In this article, we will introduce KEDA (Kubernetes Event-Driven Autoscaler), and walk through an example of using KEDA for Cron-based scaling.

But first, let’s talk about load-based scaling vs event-driven scaling. Kubernetes provides the scalability and flexibility necessary to handle workloads of different sizes, but choosing the right scaling strategy is difficult.

Load-based Scaling

Load-based scaling is the traditional approach, where Kubernetes adjusts the number of replicas in a deployment based on metrics like CPU or memory usage. For example, the Horizontal Pod Autoscaler (HPA) increases or decreases the number of pods to maintain a target CPU utilization percentage. Load-based scaling has the advantage of relying on metrics, such as CPU and memory usage, that are familiar to most system administrators, and it adjusts workloads in real time, dynamically responding to changes in demand. However, this approach has its limitations. It is restricted to a narrow set of triggers and cannot scale based on custom metrics or external events. Furthermore, because resource usage only rises after load has already arrived, load-based scaling can lag behind sudden spikes or drops in highly dynamic workloads.
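For reference, here is a minimal sketch of a load-based HPA manifest; the Deployment name web-app is hypothetical, and the target is 70% average CPU utilization:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app          # hypothetical Deployment to scale
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # add replicas when average CPU exceeds 70%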

Event-Driven Scaling

Event-driven scaling adjusts the number of replicas based on external events or custom metrics. It’s ideal for scenarios like processing messages from a queue, where the number of items dictates the scaling needs. The event-driven approach works well for scale-to-zero scenarios, such as Function-as-a-Service (FaaS) applications, where workloads scale down to zero when not in use. This is where KEDA in Kubernetes comes in.

| | Load-Based Scaling | Event-Driven Scaling |
|---|---|---|
| Trigger Mechanism | Adjusts the number of pod replicas based on resource usage metrics like CPU or memory utilization. | Adjusts the number of pod replicas based on external events or custom metrics, such as message queue length or HTTP request count. |
| Use Cases | Suitable for applications with predictable workloads where resource consumption correlates with demand. | Ideal for event-driven architectures, such as processing tasks from a message queue or handling scheduled jobs. |
| Scalability | Scales between a minimum and maximum number of replicas but doesn't inherently scale to zero. | Can scale applications down to zero replicas when there are no events to process, conserving resources. |
| Configuration | Configured using Kubernetes' Horizontal Pod Autoscaler (HPA), focusing on resource utilization thresholds. | Configured using tools like KEDA, which support various event sources and custom metrics for scaling decisions. |
| Resource Efficiency | May lead to over-provisioning if resource usage doesn't accurately represent workload demand. | Enhances resource efficiency by scaling precisely based on event load, reducing unnecessary resource consumption. |

What is KEDA (Kubernetes Event-Driven Autoscaler)?

KEDA (Kubernetes Event-Driven Autoscaler) is an open-source project that extends Kubernetes' scaling capabilities by enabling event-driven scaling for any container workload.

KEDA was created by Microsoft and Red Hat to bridge the gap between Kubernetes' native scaling capabilities and event-driven architectures. Since its launch, KEDA has become a CNCF project, with a growing community and wide adoption in production environments.

KEDA supports over 65 scalers, including Azure Service Bus, AWS SQS Queue, Kafka, Prometheus, and more, allowing it to handle a wide range of event sources. It integrates with Kubernetes' Horizontal Pod Autoscaler (HPA), increasing its capabilities without introducing complexity. KEDA is lightweight and introduces minimal overhead, making it an efficient choice for production environments.

How KEDA - Kubernetes Event-Driven Autoscaler Works

Event Detection: KEDA monitors various event sources in Kubernetes (e.g., message queues, databases) using components called scalers. Each scaler is designed for a specific event source and knows how to query it for metrics.

Metric Evaluation: When an event occurs, the scaler evaluates its associated metrics. For example, a scaler monitoring a message queue might check the number of pending messages.

Scaling Decision: Based on the metrics, KEDA determines whether to adjust the number of application instances (pods). If the metric exceeds a defined threshold, KEDA instructs Kubernetes to scale the application up; if it falls below the threshold, it scales down.

Integration with Kubernetes: KEDA acts as a metrics server within Kubernetes, providing these event-based metrics to the Horizontal Pod Autoscaler (HPA). This integration allows Kubernetes to make informed scaling decisions based on both traditional metrics (like CPU usage) and external event metrics.
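A quick way to see this integration for yourself (assuming KEDA is already installed in the keda namespace) is to check the APIService through which KEDA serves external metrics to the HPA:

kubectl get apiservice v1beta1.external.metrics.k8s.io

The entry should point at KEDA's metrics API server service; the HPA controller queries this endpoint for external metrics.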

>> Take a Look at How HPA works and its Best Practices.

KEDA Architecture in Kubernetes

KEDA's architecture consists of several components working together:

[Diagram: KEDA architecture. Source: KEDA]

Scalers: These are responsible for connecting to external event sources and retrieving metrics. KEDA supports a wide range of scalers for different event sources, including message queues, databases, and monitoring systems.

Metrics Server: Kubernetes Event-Driven Autoscaler includes a metrics server that exposes the retrieved metrics to Kubernetes. This allows the HPA to access these metrics and make scaling decisions based on them.

Operator: The KEDA operator manages the lifecycle of the scalers and ensures they are properly configured and running. It also handles scaling the application up or down based on the metrics provided by the scalers.

Admission Webhooks: Kubernetes Event-Driven Autoscaler uses admission webhooks to validate resource changes and prevent misconfigurations. For example, it makes sure that multiple ScaledObject resources do not target the same application, which could lead to conflicting scaling behaviors.

KEDA Kubernetes Scalers

KEDA scalers define the events or custom metrics that trigger scaling. Some example scalers include:

Message queues: scale workloads based on the number of messages in an Azure Service Bus queue or Kafka topic.

Database metrics: adjust the number of replicas based on the length of a Redis stream or the result of a MySQL query.

Cron-based: scale workloads up or down at specific times using cron expressions.

Each scaler is configured using simple YAML, allowing users to define scaling behavior in a declarative manner.
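For example, here is a hedged sketch of a Kafka-based ScaledObject (the Deployment, broker address, consumer group, and topic names are hypothetical); KEDA adds replicas as consumer lag grows past lagThreshold:

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: kafka-consumer-scaler
spec:
  scaleTargetRef:
    name: kafka-consumer            # hypothetical consumer Deployment
  minReplicaCount: 0                # scale to zero when there is no lag
  maxReplicaCount: 20
  triggers:
  - type: kafka
    metadata:
      bootstrapServers: my-kafka.default.svc:9092   # hypothetical broker address
      consumerGroup: my-consumer-group              # hypothetical consumer group
      topic: orders                                 # hypothetical topic
      lagThreshold: "50"            # target lag per replica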

>> Want to Maximize Cost Savings by Putting Your Kubernetes Resources to Sleep During Off-Hours? Check out this post.

Tutorial: Cron-based Scaling with KEDA Kubernetes

Let’s consider a scenario where you need to scale a job to run only during business hours. This can be achieved using KEDA’s Cron scaler.

Step 1: Install KEDA

Install KEDA in your cluster using Helm (other installation methods are covered in the KEDA docs):

helm repo add kedacore https://kedacore.github.io/charts
helm repo update
helm install keda kedacore/keda --namespace keda --create-namespace
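To verify the installation, check that the KEDA pods are running; with a default Helm install you should see the operator, the metrics API server, and the admission webhooks in the keda namespace:

kubectl get pods -n keda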

Step 2: Define the Deployment and the Cron Scaler

Define the Deployment. Save the following manifest as keda-deployment.yaml, then apply it with kubectl apply -f keda-deployment.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-cron-job
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-cron-job
  template:
    metadata:
      labels:
        app: my-cron-job
    spec:
      containers:
        - name: busybox
          image: busybox
          command: ["sleep", "3600"]

Then, create a `ScaledObject` that defines the scaling behavior based on a cron schedule. Save the following as scaledobject.yaml and apply it with kubectl apply -f scaledobject.yaml:
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: cron-scaler
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-cron-job
  triggers:
  - type: cron
    metadata:
      timezone: Asia/Kolkata
      start: 0 6 * * * # At 6:00 AM
      end: 0 20 * * * # At 8:00 PM
      desiredReplicas: "10"

With this configuration, the `my-cron-job` deployment will scale to ten replicas from 6 AM to 8 PM in the Asia/Kolkata timezone and scale down to zero outside these hours.
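To confirm the schedule is taking effect (the output depends on the time of day), inspect the ScaledObject and the HPA that KEDA creates under the hood; KEDA names the HPA keda-hpa-<scaledobject-name>:

kubectl get scaledobject cron-scaler -n default
kubectl get hpa keda-hpa-cron-scaler -n default
kubectl get deployment my-cron-job -n default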

Want to know more about event-driven autoscaling with KEDA? Join us for an engaging session with Zbyněk Roubalík, CTO of Kedify and KEDA project maintainer.

KEDA is an excellent choice for dynamic and event-heavy workloads. Under the hood, KEDA uses the Horizontal Pod Autoscaler (HPA) to implement its scaling decisions. While HPA traditionally depends on metrics like CPU and memory usage, KEDA introduces event-driven triggers, eliminating the need to monitor resource utilization directly.

In PerfectScale, you can easily jump to the HPA view using the switcher above the table.

[Screenshot: PerfectScale HPA View]

The HPA view provides a clear overview of workloads utilizing the Horizontal Pod Autoscaler (HPA). It lets users quickly identify the workloads where HPA has been introduced and adjust HPA thresholds with the help of informative tooltips that offer tailored recommendations. These recommendations help optimize scaling decisions, minimize resource waste, and keep workloads running efficiently.

| Column | Description |
|---|---|
| HPA | Indicates whether HPA has been introduced for the workload. You can sort the column by clicking the header or apply specific filters. |
| CPU (%) | Displays the HPA trigger threshold for CPU. A red indicator signifies a threshold below 60%, indicating potential significant CPU waste; a yellow indicator signifies a threshold between 60% and 80%, pointing to potential moderate CPU waste. |
| Memory (%) | Displays the HPA trigger threshold for Memory. A red indicator signifies a threshold below 60%, indicating potential significant Memory waste; a yellow indicator signifies a threshold between 60% and 80%, pointing to potential moderate Memory waste. |
| Custom Metric | Indicates whether a custom metric has been detected. |

At PerfectScale by DoiT, we discover HPA configurations for workloads in clusters and optimize the pod resource requests and limits while accounting for horizontal scaling behaviors. With KEDA, this process becomes even more seamless, as scaling decisions are decoupled from resource metrics, focusing solely on external events or custom triggers. This enables teams to achieve cost efficiency and performance consistency without over-provisioning or under-utilizing resources.

FAQ: KEDA Kubernetes

In January 2025, Ant Weiss, our Chief Cluster Whisperer, hosted a webinar with Zbynek Roubalik, the CTO of Kedify and a maintainer of KEDA. The session was packed with an overview of Kubernetes Event-Driven Autoscaling (KEDA): why event-driven autoscaling matters, what benefits it brings to the table, and what challenges the community still needs to solve.

During the webinar, there were a lot of questions from the audience. We felt the answers provided by Zbynek and Ant could benefit a wider audience; therefore, we’re summarizing them in this post.

So, let's dig in!

Question: What are the pros and cons of using KEDA versus HPA based on CPU and memory?

Answer: There are certain workloads where CPU and memory metrics are enough. But once you need to depend on some external metrics—when the application is consuming something from an external system—then it’s much better to use KEDA.

KEDA builds on top of HPA, so you don’t lose the capabilities of HPA when using KEDA. Scaling based on resources is kind of reactive because you see the increase in resource usage and then trigger the scale-out process. With event-driven scaling, you have events that indicate an increase in resource demand, allowing you to scale more proactively.

Question: Is it possible to use KEDA instead of HPA and VPA?

Answer: Yes, you can use KEDA instead of HPA because KEDA, under the hood, creates a scaled object and opens the necessary connections. In the end, KEDA also generates an HPA resource. So, you can define the same CPU and memory resource scalers that are used with HPA. It’s essentially a one-to-one replacement for HPA because it still uses HPA under the hood.
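As a minimal sketch of that one-to-one replacement (the Deployment name web-app is hypothetical), the same resource-based scaling can be expressed as ScaledObject triggers:

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: cpu-memory-scaler
spec:
  scaleTargetRef:
    name: web-app                 # hypothetical Deployment
  minReplicaCount: 1              # the cpu/memory scalers cannot scale to zero
  maxReplicaCount: 10
  triggers:
  - type: cpu
    metricType: Utilization
    metadata:
      value: "70"                 # target average CPU utilization, as with HPA
  - type: memory
    metricType: Utilization
    metadata:
      value: "80"                 # target average memory utilization

Note that the cpu and memory scalers cannot scale a workload to zero, which is why minReplicaCount stays at 1 here.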

For VPA, the answer is No. Mixing VPA and HPA for a single deployment is not a good practice. You shouldn't do that because HPA will scale the workload horizontally based on a specific metric, while VPA will try to scale it vertically based on the same metric, causing a conflict. So, you shouldn’t use them together. If you still want to achieve vertical pod autoscaling for HPA-based workloads, look into PerfectScale by DoiT because we know how to do it.

Question: What's the difference between KEDA vs. HPA with custom metrics using Prometheus?

Answer: First is the ease of setup; it is much easier to integrate KEDA with Prometheus. In addition, once you set up KEDA, you gain access to 65+ additional scalers beyond just Prometheus. The Prometheus adapter uses a single configuration, possibly a ConfigMap, to define all scaling settings for the entire cluster. With KEDA, scaling settings can be defined per workload, offering more flexibility.
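To make the per-workload point concrete, here is a hedged sketch of the triggers section of a ScaledObject using the Prometheus scaler (the server address and query are hypothetical):

triggers:
- type: prometheus
  metadata:
    serverAddress: http://prometheus.monitoring.svc:9090        # hypothetical in-cluster Prometheus
    query: sum(rate(http_requests_total{app="web-app"}[2m]))    # hypothetical PromQL query
    threshold: "100"              # target value per replica; more traffic means more replicas

Each workload carries its own query and threshold, instead of one cluster-wide adapter configuration.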

Question: Is there a single KEDA operator that can be active at a given time?

Answer: Yes, there is only a single controller handling the requests. The limitation itself is with the metrics adapter component. It all ties back to HPA and how it communicates through the Kubernetes API. This interface is singular within the cluster. It works through the Kubernetes API extension, and it uses the external metrics endpoint.

Question: Is KEDA free to use? How is it different from KEDA Enterprise (Kedify)?

Answer: Yes, KEDA is an open-source project. It's free to use. KEDA Enterprise (Kedify) is basically built on top of KEDA. It has a bunch of enterprise features, including support, dashboards, and security fixes. The goal is to continue supporting the open-source project while also serving customers who have specific needs, larger deployments, or require certain features. When we started the project, the team's aim was to make Kubernetes event-driven autoscaling simple. The team wanted to solve a single problem with KEDA rather than create a tool that tries to do ten different things but doesn’t excel at any of them. The philosophy is similar to Unix utilities—each tool should do one thing well.

Question: When will the HTTPScaledObject be available for production use in KEDA?

Answer: To handle incoming traffic, gather metrics in real-time, and scale the application accordingly, there is an HTTP add-on for KEDA in the open-source version. This add-on has been in development for a long time, but it is not currently in very active development. There is not much traffic or contribution to it.

The team does not recommend it for production use. Some users have successfully used it, and for certain use cases it works well; if you don’t have a high load, it may be fine. However, it is still in beta.

The add-on uses a different resource called HTTPScaledObject, where you define various scaling options. It relies on an interceptor component, which directs all incoming traffic through it before reaching the workload. Based on the metrics gathered by the interceptor component, the workload can scale out or even scale to zero. However, the interceptor component is also a problematic part because it has to handle a lot of complexity. At the moment, it is not performing well enough for reliable production use.

Therefore, we recommend using Kedify’s HTTP Scaler, which has been tested for production use.

Question: Is KEDA only for Kubernetes Jobs? For example, a CI/CD pipeline requiring a pod to be created dynamically for doing CI?

Answer: It's not only for Jobs; it also works with other kinds of workloads, such as Deployments and StatefulSets.

Question: What if, in an organization, we don't use a message broker service like Kafka or any other? What other conventional sources or services can KEDA support?

Answer: You can find the list of all available scalers in the KEDA documentation, and you can also add your own.

Question: How is KEDA different from Karpenter?

Answer: Karpenter is for node scaling; KEDA is for pods.

Question: KEDA Configuration for Time-based Autoscaling with Locked Resources?

Scenario: I want to lock resources and prevent autoscaling from scaling down pods during specific tax months, even if resource utilization drops below the 80% threshold, so that scale-downs don't happen while the process is running and the application stays available and stable. How can I implement this with KEDA?

Answer: Time-based scheduling is best achieved using the Cron scaler in KEDA.
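As a hedged sketch for the scenario above (the timezone, dates, and replica count are illustrative), a cron trigger can hold a replica floor across whole months. Between start and end, KEDA keeps the workload at desiredReplicas, and because the HPA underneath takes the maximum across metrics, other triggers can still scale the workload higher during the window, but not below that floor:

triggers:
- type: cron
  metadata:
    timezone: America/New_York    # hypothetical timezone
    start: 0 0 1 3 *              # 00:00 on March 1
    end: 0 0 1 5 *                # 00:00 on May 1
    desiredReplicas: "10"         # replica floor held for the whole window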

Question: Can KEDA be integrated with Confluent Kafka?

Answer: Yes. There’s a Kafka scaler.

Question: How do you see KEDA benefits when it comes to nodepool types/optimization of nodepools?

Answer: KEDA can optimize node utilization when combined with a smart dynamic node autoscaler like Karpenter or NAP. KEDA will change the number of pods based on events, while the node scaler will provision nodes to schedule these pods on. This requires, of course, correctly defining the resource requests for the scaled pods—something PerfectScale can do for you. 

Question: Can KEDA be used with all major cloud hyperscalers' K8s deployments, except OCI (Oracle Cloud Infrastructure)?

Answer: While KEDA can definitely be installed and used on Kubernetes clusters on Oracle Cloud, there are currently no community-provided scalers for integrating with Oracle cloud services. There are scalers for consuming events from all other cloud hyperscalers.

Question: Is KEDA supported with AWS SQS?

Answer: Yes, it is supported.
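As a hedged sketch (the queue URL and region are hypothetical, and credentials would typically come from a TriggerAuthentication or pod identity), an SQS trigger looks like this:

triggers:
- type: aws-sqs-queue
  metadata:
    queueURL: https://sqs.us-east-1.amazonaws.com/123456789012/my-queue   # hypothetical queue
    queueLength: "5"              # target number of messages per replica
    awsRegion: us-east-1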

Question: What are some questions you ask customers to better understand their challenges when using K8s, so you can position KEDA?

Answer: KEDA is a solution for when resource-based horizontal autoscaling doesn't work well: when your current HPA implementation can't provide the required QoS with regard to latency, reliability, or throughput, or when performance is fine but you're paying too much for the infrastructure. In either case, it's time to review your autoscaling practices and evaluate scaling based on events rather than resources.

>> Take a look at How to Save Costs by Sleeping Kubernetes Resources During Off-Hours with KEDA?

Question: In the future, is there a plan to move KEDA to an enterprise version?

Answer: KEDA will remain an open-source project, free to use. KEDA Enterprise (Kedify) is built on top of it with enterprise features such as support, dashboards, and security fixes; see the earlier answer on KEDA versus Kedify for details.

Question: Can KEDA know how much resource is required to be assigned to a pod when the incoming event has unpredictable workloads and different requirements for requests? Or does it only scale horizontally?

Answer: KEDA only scales horizontally. For vertical autoscaling, check out PerfectScale; we have a free community offering.

Final Thoughts

KEDA makes event-driven autoscaling in Kubernetes easier, more flexible, and more efficient, especially when resource-based scaling falls short. Whether you’re dealing with unpredictable workloads, integrating external services, or optimizing your infrastructure costs, KEDA provides a powerful solution to scale your applications dynamically.

If you’re new to KEDA, start by exploring the KEDA docs to understand how it can fit into your Kubernetes environment. And if you're looking for a way to optimize both horizontal and vertical scaling, give PerfectScale a try. Start your free trial today and see how it can fine-tune your autoscaling strategy!
