Following Kubernetes cluster size best practices helps you establish and continually maintain the proper Kubernetes requests and limits to optimize application performance, reliability, and cost.
These 8 pivotal guidelines will not only help you set up your cluster effectively but also ensure its continual optimization as your needs evolve. By adhering to these best practices, you can strike the perfect balance between resource allocation and application demands, paving the way for a robust and scalable Kubernetes environment.
Setting Resource Requests and Limits
Two types of resource configuration can be set on each container of a Kubernetes pod: requests and limits. Resource requests are the amount of CPU or memory that Kubernetes reserves for a container and that the scheduler uses to place the pod on a node. Resource limits cap what a container may consume: a container that tries to use more CPU than its limit is throttled, and a container that uses more memory than its limit is terminated (OOM-killed).
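As a rough illustration, the sketch below models one container's requests and limits as a Python dictionary that mirrors the `resources` section of a pod manifest, and checks that no request exceeds its limit. The container name, image, and the helper functions `parse_cpu`, `parse_memory`, and `check_resources` are hypothetical, and the parsers cover only a subset of the Kubernetes quantity suffixes.

```python
# Subset of the suffixes Kubernetes accepts for memory quantities.
MEMORY_UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "k": 10**3, "M": 10**6, "G": 10**9}


def parse_cpu(quantity):
    """Return CPU in millicores: '500m' -> 500, '2' -> 2000, '0.5' -> 500."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return round(float(quantity) * 1000)


def parse_memory(quantity):
    """Return memory in bytes: '256Mi' -> 268435456. A bare number is bytes."""
    # Try two-letter suffixes first so "Mi" is not mistaken for "M".
    for suffix in sorted(MEMORY_UNITS, key=len, reverse=True):
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * MEMORY_UNITS[suffix]
    return int(quantity)


def check_resources(resources):
    """Return a list of problems where a request is larger than its limit."""
    problems = []
    requests = resources.get("requests", {})
    limits = resources.get("limits", {})
    for name, parse in (("cpu", parse_cpu), ("memory", parse_memory)):
        if name in requests and name in limits:
            if parse(requests[name]) > parse(limits[name]):
                problems.append(
                    f"{name} request {requests[name]} exceeds limit {limits[name]}"
                )
    return problems


# Hypothetical container: reserves 250m CPU / 256Mi, capped at 500m CPU / 512Mi.
container = {
    "name": "web",
    "image": "example/web:1.0",
    "resources": {
        "requests": {"cpu": "250m", "memory": "256Mi"},
        "limits": {"cpu": "500m", "memory": "512Mi"},
    },
}

print(check_resources(container["resources"]))
```

The same two keys, `requests` and `limits`, appear under `resources` for every container in a real manifest; only the values change per workload.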
Setting limits is helpful because it reduces overcommitment of resources and protects other deployments on the same node from “starving.” Applications differ in how much memory they use, but the scheduler treats them all the same way: it checks each pod's requests and locates the “best” node with enough unreserved capacity for that pod, much like a game of Tetris.
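The Tetris analogy can be sketched in a few lines. This is a simplified model, not the real scheduler: it filters out nodes whose unreserved capacity is smaller than the pod's requests, then picks the node that would be left with the least spare capacity. The function `pick_node`, the node names, and the tight-packing score are assumptions made for illustration; the actual Kubernetes scheduler runs many filter and scoring plugins, and its default scoring differs.

```python
def pick_node(pod_request, nodes):
    """Return the name of the best-fitting node, or None if no node fits.

    pod_request: {"cpu": millicores, "memory": MiB}
    nodes: name -> {"allocatable": {...}, "requested": {...}} in the same units
    """
    best_name, best_leftover = None, None
    for name, node in nodes.items():
        # Capacity not yet reserved by the requests of pods already on the node.
        free = {r: node["allocatable"][r] - node["requested"][r] for r in pod_request}
        if any(free[r] < pod_request[r] for r in pod_request):
            continue  # the pod's requests do not fit here
        # Score: fraction of the node left unreserved after placing the pod.
        leftover = sum(
            (free[r] - pod_request[r]) / node["allocatable"][r] for r in pod_request
        )
        if best_leftover is None or leftover < best_leftover:
            best_name, best_leftover = name, leftover
    return best_name


# Hypothetical cluster state.
nodes = {
    "node-a": {"allocatable": {"cpu": 4000, "memory": 8192},
               "requested": {"cpu": 3800, "memory": 2048}},
    "node-b": {"allocatable": {"cpu": 4000, "memory": 8192},
               "requested": {"cpu": 1000, "memory": 1024}},
    "node-c": {"allocatable": {"cpu": 2000, "memory": 4096},
               "requested": {"cpu": 1000, "memory": 2048}},
}

print(pick_node({"cpu": 500, "memory": 512}, nodes))
```

Note that the decision uses requests only. Actual usage on the node plays no part, which is why inaccurate requests lead directly to poor placement.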
If you set resource limits without a request, Kubernetes automatically sets the memory and CPU requests equal to the specified limits. This conservative approach can help you gain control over a Kubernetes cluster and avoid these problems:
- Pod Out of Memory (OOM): when a container's memory usage reaches its memory limit, the container is terminated (OOM-killed) to protect the node hosting the pod.
- Node Out of Memory (OOM): when a node dies of memory starvation, cluster stability is affected.
- Pod Eviction: if a node runs low on resources such as memory or disk, it may unexpectedly evict pods to reclaim them.
- CPU Starvation: when applications are forced to share a limited amount of CPU, some applications on the same node may not receive enough.
- Overspending: clusters without proper requests and limits lead to over-provisioning, which means spending money on resources you don’t use.
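The defaulting rule mentioned before this list (a limit without a request produces a request equal to the limit) can be modeled as follows. This is a minimal sketch of the rule itself, assuming no other mechanism such as a LimitRange has already supplied a default request; the function name `apply_request_defaults` is hypothetical.

```python
import copy


def apply_request_defaults(resources):
    """Return a copy in which every limited resource without a request
    receives a request equal to its limit."""
    result = copy.deepcopy(resources)
    requests = result.setdefault("requests", {})
    for name, limit in result.get("limits", {}).items():
        # An explicitly set request is kept; only missing ones are filled in.
        requests.setdefault(name, limit)
    return result


limits_only = {"limits": {"cpu": "500m", "memory": "256Mi"}}
print(apply_request_defaults(limits_only))
```

Because the request then equals the limit, the pod reserves everything it is ever allowed to use, which is what makes this approach conservative.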
Finding ways to help engineers and DevOps teams accurately establish Kubernetes requests and limits on their own is a verified best practice for keeping Kubernetes clusters and applications healthy. But it can be tricky.
If values are set too high through over-provisioning, capacity is reserved but never used, so your nodes fill up sooner and you pay for more of them. Set them too low through under-provisioning and you risk throttling, OOM kills, and poor application performance. Setting these values effectively is important because it ensures that the proper amount of resources is available and that your Kubernetes cluster has the right number of nodes for its services. One of the best ways to establish effective requests and limits for an application is to observe its behavior at runtime.
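One way to turn runtime observations into numbers is sketched below: take usage samples for a container, set the request near a high percentile of typical usage, and set the limit some margin above the observed peak. The 90th percentile, the 20% headroom, the sample values, and the function names are all assumptions for illustration, not recommended defaults.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def recommend(samples, request_pct=90, headroom_pct=20):
    """Suggest a request near typical usage and a limit above the observed peak."""
    request = percentile(samples, request_pct)
    limit = max(samples) * (100 + headroom_pct) / 100
    return {"request": math.ceil(request), "limit": math.ceil(limit)}


# Hypothetical memory samples in MiB, including one short spike to 400.
memory_mib = [120, 150, 130, 400, 140, 135, 160, 145, 155, 125]
print(recommend(memory_mib))
```

With these samples the request lands at 160 MiB, close to steady-state usage, while the limit of 480 MiB leaves room for the spike. The longer the observation window, the more trustworthy the result.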
Avoiding CPU Throttling
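One widely used technique to avoid CPU throttling is removing CPU limits completely. Note that this technique should only be used for CPU and not for memory. CPU is a compressible resource, so a container that cannot get more of it simply runs slower, whereas a container that runs out of memory is killed.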
When removing CPU limits, it becomes even more important to define the right request. When the CPU on a node is contended, CPU time is divided among containers in proportion to their requests, so an application whose request is too low can be starved by its neighbors during periods of high CPU consumption.
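To see why the request carries so much weight once the limit is gone, here is a simplified model of how a contended node divides CPU when no limits are set. It assumes CPU time is shared in proportion to requests and that capacity a workload does not need is passed on to the others. The function `share_cpu` and the workload names are hypothetical, and real kernel scheduling is more nuanced than this sketch.

```python
def share_cpu(capacity_m, workloads):
    """Divide node CPU among workloads that have no CPU limit.

    capacity_m: node CPU in millicores
    workloads: name -> (request_m, demand_m), requests must be positive
    Returns name -> granted millicores.
    """
    if any(request <= 0 for request, _ in workloads.values()):
        raise ValueError("every workload needs a positive CPU request")
    granted = {name: 0.0 for name in workloads}
    active = set(workloads)
    remaining = float(capacity_m)
    while active and remaining > 1e-9:
        total_weight = sum(workloads[n][0] for n in active)
        # Offer each active workload a slice proportional to its request.
        offers = {n: remaining * workloads[n][0] / total_weight for n in active}
        satisfied = {n for n in active if workloads[n][1] - granted[n] <= offers[n]}
        if not satisfied:
            # Everyone wants more than their slice: hand out the slices and stop.
            for n in active:
                granted[n] += offers[n]
            break
        # Fully serve the workloads that need less than their slice,
        # then redistribute what is left among the rest.
        for n in satisfied:
            need = workloads[n][1] - granted[n]
            granted[n] += need
            remaining -= need
        active -= satisfied
    return granted


# Same demand in both cases (api wants 1500m, batch wants 2000m on a 2000m node);
# only the requests differ.
print(share_cpu(2000, {"api": (1000, 1500), "batch": (250, 2000)}))
print(share_cpu(2000, {"api": (250, 1500), "batch": (1000, 2000)}))
```

In the first case the `api` workload gets all 1500m it needs. In the second, where its request is too low, it gets only 400m while the batch job takes the rest.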
Rightsizing Your Containers
When working with Kubernetes, rightsizing your instances and workloads to optimize your compute is critical, as it helps maximize the business value of Kubernetes investments. By following best practices around rightsizing, users can repurpose freed capacity for additional workloads and/or significantly lower their cloud footprint.
The challenge, however, is knowing when you should rightsize a workload, which means you need the right metrics. When building applications for scale, nodes have to be the right size too. Many smaller nodes and a few larger ones sit at either end of the spectrum. While it might seem reasonable to choose something in the middle, your decision should be based on the measured resource needs of your workloads rather than on guesswork.
Even though Kubernetes’ autoscaling capabilities offer basic optimization by adding or removing resources based on demand, the problem of over- and under-provisioning remains. Metrics collection and rightsizing happen at the container level, and your autoscaler then launches, bin packs, and retires nodes to meet your newly rightsized resource requirements. Once you have visibility into your cluster utilization, you will be able to rightsize your tasks and pods while the autoscaling feature addresses the underlying infrastructure.
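The link between container-level rightsizing and node count can be illustrated with a small bin-packing estimate. The sketch below uses first-fit decreasing, a common packing heuristic chosen here for simplicity; it is not the algorithm any particular autoscaler uses. The function `nodes_needed`, the pod sizes, and the node size are hypothetical, and system-reserved capacity is ignored.

```python
def nodes_needed(pods, node_cpu, node_mem):
    """Estimate node count with first-fit decreasing bin packing.

    pods: list of (cpu_millicores, memory_mib) requests
    node_cpu, node_mem: allocatable capacity of one node, same units
    """
    nodes = []  # each entry is [free_cpu, free_mem]
    for cpu, mem in sorted(pods, reverse=True):
        if cpu > node_cpu or mem > node_mem:
            raise ValueError("pod does not fit on an empty node")
        for node in nodes:
            if node[0] >= cpu and node[1] >= mem:
                node[0] -= cpu
                node[1] -= mem
                break
        else:
            # No existing node has room, so a new one is launched.
            nodes.append([node_cpu - cpu, node_mem - mem])
    return len(nodes)


# Six replicas before and after halving their requests, on 4-core / 8 GiB nodes.
before = [(1000, 2048)] * 6
after = [(500, 1024)] * 6
print(nodes_needed(before, 4000, 8192), nodes_needed(after, 4000, 8192))
```

In this example, halving the requests lets all six replicas fit on one node instead of two, which is the saving the autoscaler can only realize after the requests have been corrected.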
Kubernetes Cluster Size Best Practices To Remember
Although it’s tempting to begin rightsizing your most costly applications and environments, the best way to get quick results is to begin with lower environments that can be easily identified as over-provisioned. This type of low-hanging fruit offers a great first testing opportunity. With increased understanding comes the willingness to allocate additional time and resources to rightsizing your workloads.
1. Be cautious by over-provisioning (setting generous limits or requests) the first time you deploy to production. You can always lower them after you understand your true needs.
2. Go small by using numerous small pods instead of a few large ones. This provides higher availability for your applications by default.
3. Don’t run too many pods. Doing so can lead to resource exhaustion or overload from too many open connections on your server, which makes troubleshooting and debugging harder and can slow down application deployment.
4. Review your past resource usage periodically and take corrective action where necessary. Measuring and analyzing capacity utilization over time is the best way to avoid consuming too many resources.
5. Test workload performance on rightsized instances to ensure performance does not suffer. If you have high and low environments for the same workload, rightsize the lower ones first, then use load testing tools to evaluate performance.
6. Prioritize and remediate issues (over- and under-provisioning).
7. Complete the feedback loop by communicating regularly with developers. This can help them provide more accurate capacity requirements in the future.
8. Repeat steps 4-7 on a regular basis. Both utilization and demand can change over time, meaning what is rightsized today may not be in three months.
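The review and prioritization steps above can be sketched as a small report: compare each workload's request with its observed peak usage, flag the outliers, and sort them so the riskiest and most wasteful come first. The thresholds (below 50% of the request counts as over-provisioned, above 90% as under-provisioned), the ordering rule, and the workload names are assumptions for illustration only.

```python
def classify(request, peak_usage, low=0.5, high=0.9):
    """Label a workload by how much of its request it uses at peak."""
    ratio = peak_usage / request
    if ratio < low:
        return "over-provisioned"
    if ratio > high:
        return "under-provisioned"
    return "ok"


def prioritize(workloads):
    """workloads: name -> (request, peak_usage) in the same unit.

    Returns (name, status) pairs: under-provisioned workloads first
    (tightest first), then over-provisioned ones (largest waste first).
    """
    under, over = [], []
    for name, (request, peak) in workloads.items():
        status = classify(request, peak)
        if status == "under-provisioned":
            under.append((peak / request, name, status))
        elif status == "over-provisioned":
            over.append((request - peak, name, status))
    under.sort(reverse=True)
    over.sort(reverse=True)
    return [(name, status) for _, name, status in under + over]


# Hypothetical CPU requests and peak usage in millicores.
usage = {
    "checkout": (1000, 200),
    "search": (500, 480),
    "auth": (400, 300),
    "reports": (2000, 600),
}
for name, status in prioritize(usage):
    print(name, status)
```

Running a report like this on a schedule, and sharing it with the developers who own each workload, is one lightweight way to close the feedback loop.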
Remember that Kubernetes environments are ephemeral. Even after your settings are established, you will need to regularly monitor your containerized environment. Do you have all the tools you need to empower DevOps teams with this level of visibility? Configuration and ongoing optimization are key to keeping resource usage efficient and secure, and to ensuring that your Kubernetes workloads have just the right amount of resources for success.
PerfectScale makes it easy for DevOps and SRE professionals to govern, rightsize, and scale Kubernetes to continually meet customer demand. By comparing usage patterns over time with resource configurations, we provide actionable recommendations that improve performance while eliminating wasted compute resources. Get the data-driven intelligence needed to ensure peak Kubernetes performance at the lowest possible cost with PerfectScale.