Cloud-native technology and AI may go together like chocolate and peanut butter. Or maybe like coffee and pain au chocolat.
After gathering with distributed computing and cloud-native infrastructure experts at KubeCon + CloudNativeCon Europe 2024 in Paris last week, we discovered something interesting: Kubernetes and other cloud-native technologies are likely to see rising interest as ways to meet the distributed demands of AI.
All of this provided an upbeat vibe for the record 12,000 attendees and hundreds of companies. Brilliant spring sunshine in Paris and new all-time highs in the stock market also helped.
“It feels like a massive uptick from Chicago,” Bassam Tabbara, the founder and CEO of Upbound, told me (KubeCon Fall was held in the United States in Chicago).
Hot topics in Paris included platform engineering, data observability, cloud cost management, FinOps, and, of course, AI. Open-source projects such as OpenTelemetry, Cilium, eBPF, Prometheus, and Crossplane are gaining momentum, based on project data and comments from attendees.
Raising Questions About AI Compute
AI technology, as always, was highlighted in keynotes, presentations, and media roundtables. In an interesting twist, KubeCon was held the same week that Nvidia was creating a media frenzy around its own GTC conference, halfway around the world in San Jose.
With AI chips in scarce supply and many of the benefits currently accruing mostly to the companies with the deepest pockets, it’s clear that it’s still early in the AI game, and better economics are needed to spread the wealth. Cloud-native technologies can serve this role by bringing down costs and democratizing AI.
In a KubeCon media roundtable on AI, participants questioned the dominance of GPUs, as well as challenges for AI, such as security, data management, and energy consumption.
“With [AI] superclusters… are we thinking about the energy consumption and what that means if we continue?” asked Sudha Raghavan, senior vice president, Oracle Cloud Infrastructure Developer Platform.
Raghavan asked whether Kubernetes could help diversify compute away from scarce and expensive GPUs over time. “It’s not all about GPUs,” she said. “It can run on CPUs. The demand is so high for GPUs. Innovation is behind that demand. If we can put that innovation on things that we have, it can go faster.”
“Kubernetes is running massive AI workloads today,” said Lachlan Evenson, principal program manager, Microsoft, and governing board member, Cloud Native Computing Foundation (CNCF). “It’s ubiquitous and incredibly flexible. It’s not easy. You have to be one of these big companies. We have to make it easier to run for everybody. Each innovation cycle is much more rapid than the previous one.”
Zemlin, CNCF Leaders Address AI
AI was also addressed at the highest level of the conference in keynotes, as well as by CNCF executives (CNCF is the KubeCon producer).
The CNCF’s AI Working Group launched its Cloud Native AI white paper, which positions cloud-native technologies for AI and points to the largest challenges, including managing large data sizes, managing data during development and deployment, and adhering to data governance and security policies.
Many of the keynote speakers pointed out that Kubernetes and cloud-native infrastructure will play a huge role in AI, with large language models (LLMs) and inference for AI demanding more data, storage, and compute across the spectrum.
Jim Zemlin, executive director of the Linux Foundation, addressed the ironic overlap of KubeCon with Nvidia’s GTC conference:
“At the CPU layer, we definitely see a lot of concentration around Nvidia, which is clearly the market leader. And the [Nvidia] GTC Conference is going on in San Jose. Unfortunately, we were the largest event this week until Jensen [Huang, Nvidia CEO] decided to do his in the same week – and that is so much bigger, and deservedly so.”
Zemlin said the technology industry needs to push for more open data models for AI to democratize the technology:
“But if we take it one more layer up to the foundation models themselves, and particularly to the development of frontier models, you have a mix of open and closed, with OpenAI being the most advanced frontier foundation model at present,” said Zemlin. “… but open-source foundation models like Mistral and Llama are really nipping at their heels. And with many more to come, I might add, meeting that same level of performance.”
According to Priyanka Sharma, executive director of the CNCF:
“Gen-AI is prompting cloud-native to rethink infrastructure paradigms to accommodate AI workloads, improve platform engineering’s focus with AI insights, and ensure AI-ready systems. This integration represents a significant shift in how we design, deploy, and manage cloud-native solutions.”
Data Demands and Costs Loom Large
Another output of the AI boom is data. Massive amounts of data. This is fueling some of the pushback on AI, raising questions of how organizations retain control of and rights to the data used in AI models, how to transport AI data securely, and how to manage data costs as volumes explode.
Martin Mao, the cofounder and CEO of data observability company Chronosphere, said that customers are increasingly scrutinizing the costs of the data they are monitoring and looking for places to cut.
“When you shift [to cloud native], the volume of observability data grows and your observability bill grows,” said Mao. “People are generally complaining about the efficiency of observability tools. They’re worried they’re getting worse results and not getting more out of the tools. That’s part of the move to cloud native.”
Chronosphere is based on the open-source M3 data observability system created by Uber, to which it adds enhancements to create metrics for containerized infrastructure, microservices applications, and business services. Customers include Robinhood, Snap, Obsidian Security, DoorDash, Zillow, and Visa.
KubeCon was a hotbed of discussion about tools for both data observability and cloud cost management. I spoke with additional cloud optimization vendors such as CAST AI, PerfectScale, and Zesty, all of which said surging adoption was driving more customer interest.
As data costs mount, AI scales, and cloud-native technologies are adopted, it’s clear that these areas will become increasingly intertwined. After all, AI will only accelerate the mountains of data that companies are trying to secure, transport, store, and optimize. Cloud-native technology, including Kubernetes, is going to play a key role.