
Building in-house Kubernetes right-sizing means combining open-source tools with custom automation scripts. Here’s what that stack usually looks like:
VPA can help with basic right-sizing by generating CPU and memory recommendations for containers.
It lacks impact and context awareness, so its recommendations need manual review to reduce production risk. It also doesn’t work well with HPA, particularly when both act on the same CPU or memory metrics. And because it relies heavily on recent usage data, it is a poor fit for highly dynamic or short-lived workloads, especially AI/ML applications where usage patterns change fast.
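To keep a human in the loop, VPA is commonly run in recommendation-only mode. The sketch below builds such a manifest for review; the workload name `checkout` and namespace `shop` are hypothetical placeholders, and the helper `build_vpa_manifest` is illustrative rather than part of any tool.

```python
import json


def build_vpa_manifest(name, namespace, kind="Deployment"):
    """Build a VPA object that only recommends and never evicts pods."""
    return {
        "apiVersion": "autoscaling.k8s.io/v1",
        "kind": "VerticalPodAutoscaler",
        "metadata": {"name": f"{name}-vpa", "namespace": namespace},
        "spec": {
            # The workload whose containers VPA should observe.
            "targetRef": {"apiVersion": "apps/v1", "kind": kind, "name": name},
            # "Off" means: compute recommendations, apply nothing.
            "updatePolicy": {"updateMode": "Off"},
        },
    }


# Hypothetical workload, used only for illustration.
manifest = build_vpa_manifest("checkout", "shop")
print(json.dumps(manifest, indent=2))
```

The printed JSON can be saved to a file and applied with `kubectl apply -f`. The recommendations then appear in the object's `status.recommendation` field, where someone still has to decide whether to act on them.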
Goldilocks creates VPA objects automatically for labeled namespaces and shows the resulting recommendations in a dashboard.
It makes VPA easier to use at scale, but the visibility stays limited. You can see what VPA recommends, but not the context behind it: workload behavior over time, revision awareness, business or performance impact, or whether applying the recommendation actually makes sense.
KRR analyzes Prometheus data and recommends CPU and memory requests for workloads.
It helps reduce right-sizing guesswork, but it still depends on metrics history and gives limited context around workload changes and production risk, so its recommendations still require manual review.
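The dependence on metrics history can be illustrated with a generic percentile-based recommender. This is a minimal sketch, not any specific tool's algorithm: the 95th-percentile CPU target, the 15% memory buffer, and the function names are all assumptions chosen for illustration.

```python
def percentile(samples, pct):
    """Nearest-rank percentile over integer samples (pct is an integer 1..100)."""
    if not samples:
        raise ValueError("no samples in the history window")
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100 avoids floating-point rounding issues.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def recommend_requests(cpu_millicores, memory_mib, cpu_pct=95, mem_buffer_pct=15):
    """Suggest requests from usage history (assumed policy, for illustration)."""
    cpu = percentile(cpu_millicores, cpu_pct)
    # Memory: peak usage plus a safety buffer, rounded up to a whole MiB.
    mem = (max(memory_mib) * (100 + mem_buffer_pct) + 99) // 100
    return {"cpu_millicores": cpu, "memory_mib": mem}


# Hypothetical history: steady CPU usage with a single spike.
cpu_history = [100] * 19 + [900]
mem_history = [300, 400, 350]
print(recommend_requests(cpu_history, mem_history))
```

With this history the single CPU spike falls above the 95th percentile and is ignored. Whether that is safe depends on what the spike was, which is exactly the context the numbers alone do not carry.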
Prometheus provides the metrics layer for the DIY stack, while Grafana dashboards and custom scripts turn those metrics into actions.
Relying on Prometheus metrics adds extra effort and increases the cluster footprint, with ongoing maintenance around storage, retention, query performance, scripts, PR automation, and breakages when Prometheus or Kubernetes changes.
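A typical piece of that custom glue is a script that asks Prometheus for a usage percentile. The sketch below only builds the request URL for the Prometheus HTTP API instant-query endpoint (`/api/v1/query`); the Prometheus address, namespace, and pod pattern are hypothetical, and the metric and label names assume the usual cAdvisor container metrics are being scraped.

```python
from urllib.parse import urlencode


def cpu_p95_query(namespace, pod_regex, window="7d", step="5m"):
    """PromQL: 95th percentile of per-container CPU usage rate over a window."""
    selector = (
        f'container_cpu_usage_seconds_total{{namespace="{namespace}", '
        f'pod=~"{pod_regex}", container!=""}}'
    )
    # Subquery: evaluate the 5m rate at every step across the whole window.
    return f"quantile_over_time(0.95, rate({selector}[5m])[{window}:{step}])"


def build_query_url(base_url, promql):
    """Build an instant-query URL for the Prometheus HTTP API."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


# Hypothetical in-cluster Prometheus address and workload.
url = build_query_url(
    "http://prometheus.monitoring.svc:9090",
    cpu_p95_query("shop", "checkout-.*"),
)
print(url)
```

Every assumption baked into this script (label names, retention long enough to cover the window, query cost on a large cluster) is something the team owning the DIY stack has to keep valid over time.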
Understand the differences between PerfectScale and a typical DIY optimization stack.

Homegrown tools are not free: they’re paid for in engineering time, the most expensive resource you have.
It takes sustained engineering work to integrate VPA, Goldilocks, KRR, Prometheus, and custom scripts into a solution that produces actionable, automated results across the entire Kubernetes stack.
Each new Kubernetes or tool release can break VPA CRDs, Prometheus exporters, and your custom scripts, so someone has to keep watching for it.
PerfectScale customers consistently find additional savings on top of what their existing tooling captured, plus the avoided cost of resilience incidents that were mitigated.
Install in minutes. See actionable intelligence on day one.