Chief Technology & Product Officer, CloudBolt Software.
Platform engineering continues to gain momentum in modern engineering contexts, and it’s not hard to see why. The ability to build declarative, elastic and portable infrastructures that accelerate time to value and lower operational toil is transforming DevOps patterns—and raising the bar for how we build, deploy and manage cloud applications and infrastructure.
It’s not surprising, then, that Gartner predicts 80% of large software engineering organizations will establish platform teams as internal providers of reusable services, components and tools for application delivery by 2026.
The result is a clear shift from complexity to simplicity, from bespoke to well-defined platform engineering pathways. These pathways streamline lead time, mitigate risk and facilitate smoother software design and engineering processes. Numerous disciplines—such as continuous integration and continuous delivery (CI/CD), security engineering (DevSecOps), self-service capabilities and site reliability engineering (SRE) Day 2 operations—have embraced these patterns.
Despite these advancements, many platform engineering teams grapple with breaking the cycle of escalating costs and inefficient manual cost management.
Kubernetes: The De-Facto Cloud-Native Landing Zone
Kubernetes-based platforms are the go-to choice for building and deploying modern applications: as of 2021, 96% of global organizations were using or evaluating Kubernetes for container orchestration. Regardless of what infrastructure or architecture your applications span, Kubernetes offers a common language, letting you use the same configuration across any environment in an essentially “write once, deploy anywhere” approach.
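To make that portability concrete, here is a minimal sketch using the official Kubernetes Python client that applies one hypothetical Deployment spec to two different clusters simply by switching kubeconfig contexts. The context names and the manifest are illustrative assumptions, not part of any real environment.

```python
# Illustrative only: the same declarative Deployment spec applied to two
# different clusters (kubeconfig contexts). The manifest and context names
# are hypothetical examples of "write once, deploy anywhere."
from kubernetes import config, utils

deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "hello", "labels": {"app": "hello"}},
    "spec": {
        "replicas": 2,
        "selector": {"matchLabels": {"app": "hello"}},
        "template": {
            "metadata": {"labels": {"app": "hello"}},
            "spec": {"containers": [{"name": "hello",
                                     "image": "nginx:1.27",
                                     "resources": {"requests": {"cpu": "100m"}}}]},
        },
    },
}

for context in ("on-prem-cluster", "cloud-cluster"):   # hypothetical contexts
    api_client = config.new_client_from_config(context=context)
    utils.create_from_dict(api_client, deployment, namespace="default")
```

The spec never changes; only the target context does.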
With the broad adoption and deployment of Kubernetes, however, comes cost management and optimization challenges. Monthly Kubernetes costs can swiftly escalate into tens or even hundreds of thousands of dollars for medium- to large-sized enterprises. Without proper monitoring mechanisms in place, organizations often remain oblivious to their actual spend and over-allocate resources as a result. In fact, as of 2021, a staggering 68% of organizations either neglected to monitor Kubernetes spending altogether or relied solely on monthly estimates.
Optimizing container use and costs without impacting related applications is challenging. Manually managing the process is both labor-intensive and error-prone, offering little insight into the underlying causes driving cost escalations. The findings from the FinOps Foundation’s 2023 State of FinOps survey highlighted this dilemma, with about half of respondents expressing their intent to automate container rightsizing. Surprisingly, a year later, only 28% of organizations are actively engaged in container optimization, while 26% have opted to forgo such efforts altogether.
Incorporating FinOps principles into platform engineering pathways holds promise for streamlining these operations.
A Declarative Path Forward: FinOps Strategies For Platform Engineers
A mature FinOps practice offers a strategic approach to optimizing costs in containerized environments. Here are a few strategies.
1. Create “paved roads.”
A core principle of platform engineering is codifying a declarative event sequence that includes all of the configuration, dependencies, quality and security checks and any other business criteria necessary to deliver artifacts (code, apps, infrastructure or actions) into an environment.
In other words, take what is commonly a Rube Goldberg-like adventure of opaque, toilsome, manual operations and turn it into a clearly defined, repeatable, automated action. Because these “paved roads” are a single, well-defined control point, they are the right place to enforce critical business rules and metadata. Miss this opportunity and you’re relegated to a cleanup of technical debt from which it is difficult to recover.
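As a rough illustration of what such enforcement might look like, here is a minimal sketch of a policy gate a pipeline step could run against a Kubernetes manifest before promotion. The specific rules and label names are illustrative assumptions, not a prescribed standard.

```python
# A minimal sketch of a "paved road" gate, assuming the pipeline passes it a
# Kubernetes manifest file. The rules below are examples of business,
# quality and cost checks, not a definitive policy set.
import sys
import yaml  # pip install pyyaml

REQUIRED_LABELS = {"team", "cost-center", "environment"}  # example taxonomy


def check(doc: dict) -> list[str]:
    """Return a list of rule violations for a single manifest document."""
    problems = []
    labels = (doc.get("metadata") or {}).get("labels") or {}
    missing = REQUIRED_LABELS - labels.keys()
    if missing:  # business-metadata rule
        problems.append(f"missing labels: {', '.join(sorted(missing))}")
    containers = (doc.get("spec", {}).get("template", {})
                     .get("spec", {}).get("containers", []))
    for c in containers:
        if c.get("image", "").endswith(":latest"):  # quality rule
            problems.append(f"container '{c.get('name')}' uses a :latest image tag")
        if not c.get("resources", {}).get("requests"):  # cost rule
            problems.append(f"container '{c.get('name')}' declares no resource requests")
    return problems


if __name__ == "__main__":
    failures = []
    with open(sys.argv[1]) as f:
        for doc in yaml.safe_load_all(f):
            if doc:
                name = f"{doc.get('kind')}/{(doc.get('metadata') or {}).get('name')}"
                failures += [f"{name}: {p}" for p in check(doc)]
    print("\n".join(failures) or "paved-road checks passed")
    sys.exit(1 if failures else 0)  # a nonzero exit blocks the pipeline stage
```

Because every deployment travels the same road, the gate runs once, consistently, instead of being reimplemented team by team.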
2. Enforce a label taxonomy.
One of the critical FinOps concerns on a paved road is tagging enforcement, the most effective way to align infrastructure and workloads with the business context needed for accurate ownership, cost attribution, optimization and unit cost tracking.
You must enforce a tagging standard on the paved road if you want to be able to answer questions like, “What is application X’s contribution margin?” or “What teams are deploying our most efficient workloads?” or “Where in the business do we have optimization opportunities?”
Often, the tagging taxonomy stops at virtual machines, but you must carry a clear labeling (tagging) taxonomy into your clusters to attribute costs at the workload level. That is what gives you the granularity to monitor efficiency and attribute spend based on actual utilization.
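As one way to audit that taxonomy, the sketch below uses the official Kubernetes Python client to flag Deployments that are missing required labels. The label keys are hypothetical; they should match whatever taxonomy your business actually defines.

```python
# A minimal sketch, not a drop-in tool: scan Deployments across the cluster
# and report any missing the labels your taxonomy requires.
# Assumes the official `kubernetes` Python client and kubeconfig access.
from kubernetes import client, config

REQUIRED_LABELS = {"team", "cost-center", "app", "environment"}  # hypothetical taxonomy

config.load_kube_config()          # or load_incluster_config() inside a cluster
apps = client.AppsV1Api()

for dep in apps.list_deployment_for_all_namespaces().items:
    labels = dep.metadata.labels or {}
    missing = REQUIRED_LABELS - labels.keys()
    if missing:
        print(f"{dep.metadata.namespace}/{dep.metadata.name} "
              f"is missing labels: {', '.join(sorted(missing))}")
```

Run regularly, a report like this shows where cost attribution will silently break before the bill arrives.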
3. Shine the light on baseline efficiency.
Once you can accurately attribute costs, you can baseline your efficiency rating, typically a measure of utilization (actual or reserved) against the provisioned capacity that drives your cost basis. Use this baseline to make the case for continuous optimization.
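As a back-of-the-envelope illustration with entirely hypothetical numbers, an efficiency rating can be as simple as average utilization divided by requested capacity:

```python
# Illustrative only: a simple efficiency rating per workload, computed as
# average CPU used divided by CPU requested. The workload figures are
# hypothetical sample data, not measurements.

workloads = {
    # name: (avg CPU cores used, CPU cores requested)
    "checkout-api": (0.4, 2.0),
    "search-service": (1.5, 2.0),
    "batch-reports": (0.2, 4.0),
}

for name, (used, requested) in workloads.items():
    efficiency = used / requested
    print(f"{name}: {efficiency:.0%} CPU efficiency "
          f"({used} of {requested} cores used)")
```

Even a crude ratio like this makes over-provisioning visible and gives teams a number to improve against.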
4. Look for ways to automate optimization.
If you lean on static governance policies for requests and limits, you’ll bounce between inefficient utilization and performance problems. And if you lean on humans to tune manually, your platform teams will likely burn out. Kubernetes was designed for automation, so apply the same principle to drive workload efficiency into your clusters in Day 2 operations.
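Here is a minimal sketch of what that automation might look like, assuming observed peak usage has already been collected from your metrics pipeline and using the official Kubernetes Python client to patch requests with some headroom. The names and figures are hypothetical.

```python
# A minimal rightsizing sketch, not a production controller: given observed
# peak CPU usage per container (gathered elsewhere), patch the Deployment's
# CPU requests with headroom. Assumes the official `kubernetes` Python client.
from kubernetes import client, config

HEADROOM = 1.3  # request 30% above observed peak usage

# Hypothetical observed peaks, in cores, keyed by (namespace, deployment, container).
observed_peaks = {("shop", "checkout-api", "api"): 0.35}

config.load_kube_config()
apps = client.AppsV1Api()

for (namespace, deployment, container), peak_cores in observed_peaks.items():
    new_request = f"{int(peak_cores * HEADROOM * 1000)}m"   # e.g. "455m"
    patch = {"spec": {"template": {"spec": {"containers": [
        {"name": container, "resources": {"requests": {"cpu": new_request}}}
    ]}}}}
    apps.patch_namespaced_deployment(name=deployment, namespace=namespace, body=patch)
    print(f"{namespace}/{deployment}:{container} cpu request -> {new_request}")
```

In practice, many teams reach for the Kubernetes Vertical Pod Autoscaler or a commercial optimizer rather than hand-rolled scripts; the point is that the tuning loop is automated and continuous rather than manual. Note that patching a Deployment’s pod template triggers a rolling restart, so schedule changes accordingly.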
The Price of Kubernetes: Balancing Value With Spend
As platform engineering reshapes DevOps, best practices such as “paved roads” and tagging taxonomies are a means to expedite value delivery and alleviate operational burdens. Yet, the journey is incomplete without the integration of FinOps strategies to mitigate the escalating costs of Kubernetes adoption.
By leveraging a blend of strategies, tech leaders can more effectively navigate the complexities of Kubernetes cost management and pave the way toward a more streamlined cloud-native future.