Nvidia will remain the gold standard for AI training chips, CEO Jensen Huang told investors, even as rivals push to cut into his market share and one of Nvidia’s major suppliers gave a subdued forecast for AI chip sales.
Everyone from OpenAI to Elon Musk’s Tesla relies on Nvidia semiconductors to run their large language or computer vision models. The rollout of Nvidia’s “Blackwell” system later this year will only cement that lead, Huang said at the company’s annual shareholder meeting on Wednesday.
Unveiled in March, Blackwell is the next generation of AI training processors to follow Nvidia’s flagship “Hopper” line of H100 chips, one of the most prized possessions in the tech industry, fetching prices in the tens of thousands of dollars each.
“The Blackwell architecture platform will likely be the most successful product in our history and even in the entire computer history,” Huang said.
Nvidia briefly eclipsed Microsoft and Apple this month to become the world’s most valuable company in a remarkable rally that has fueled much of this year’s gains in the S&P 500 index. At more than $3 trillion, Huang’s company was at one point worth more than entire economies and stock markets, only to suffer a record loss in market value as investors locked in profits.
Yet as long as Nvidia chips remain the benchmark for AI training, there’s little reason to believe the longer-term outlook is cloudy, and the fundamentals continue to look robust.
One of Nvidia’s key advantages is a sticky AI ecosystem known as CUDA, short for Compute Unified Device Architecture. Much like how everyday consumers are loath to switch from their Apple iOS device to a Samsung phone using Google Android, an entire cohort of developers has been working with CUDA for years and feels so comfortable that there is little reason to consider using another software platform. Much like the hardware, CUDA effectively has become a standard of its own.
“The Nvidia platform is broadly available through every major cloud provider and computer maker, creating a large and attractive base for developers and customers, which makes our platform more valuable to our customers,” Huang added on Wednesday.
Micron’s in-line guidance for next quarter revenue not enough for bulls
The AI trade did take a recent hit after Micron Technology, a supplier of high-bandwidth memory (HBM) chips to companies like Nvidia, forecast fiscal fourth-quarter revenue that would only match market expectations of around $7.6 billion.
Shares in Micron plunged 7%, sharply underperforming the broader tech-heavy Nasdaq Composite, which posted a slight gain.
In the past, Micron and its Korean rivals Samsung and SK Hynix have seen the cyclical booms and busts common to the memory chip market, long considered a commodity business when compared with logic chips such as graphics processors.
But excitement has surged along with demand for its chips, which are needed for AI training. Micron’s stock more than doubled over the past 12 months, meaning investors have already priced in much of management’s predicted growth.
“The guidance was basically in line with expectations and in the AI hardware world if you guide in line that’s considered a slight disappointment,” says Gene Munster, a tech investor with Deepwater Asset Management. “Momentum investors just didn’t see that incremental reason to be more positive about the story.”
Analysts closely track demand for high bandwidth memory as a leading indicator for the AI industry because it is so crucial for solving the biggest economic constraint facing AI training today—the issue of scaling.
HBM chips address scaling problem in AI training
Costs crucially do not rise in line with a model’s complexity—the number of parameters it has, which can number into the billions—but rather grow exponentially. This results in diminishing returns in efficiency over time.
Even if revenue grows at a consistent rate, losses risk ballooning into the billions or even tens of billions a year as a model gets more advanced. This threatens to overwhelm any company that doesn’t have a deep-pocketed investor like Microsoft capable of ensuring an OpenAI can still “pay the bills,” as CEO Sam Altman phrased it recently.
A key reason for the diminishing returns is the growing gap between the two factors that dictate AI training performance. The first is a logic chip’s raw compute power, measured in FLOPS, or floating-point operations per second. The second is the memory bandwidth needed to quickly feed it data, often expressed in millions of transfers per second, or MT/s.
Since they work in tandem, scaling one without the other simply leads to waste and cost inefficiency. That’s why FLOPS utilization, or how much of the compute can actually be brought to bear, is a key metric when judging the cost efficiency of AI models.
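To make the interplay concrete, here is a minimal sketch in Python based on the roofline model, a common simplification that the sources quoted here do not themselves use. It assumes usable compute is capped by whichever is lower: the chip's peak rate or the rate at which memory can feed it. The function names and all of the figures are hypothetical, chosen only for illustration, and do not describe any real chip.

```python
def attainable_flops(peak_flops, mem_bandwidth_bytes, flops_per_byte):
    """Roofline-style estimate: usable compute is capped by whichever is
    lower, the chip's peak rate or the rate memory can feed it."""
    return min(peak_flops, mem_bandwidth_bytes * flops_per_byte)


def flops_utilization(peak_flops, mem_bandwidth_bytes, flops_per_byte):
    """Fraction of peak compute that can actually be brought to bear."""
    usable = attainable_flops(peak_flops, mem_bandwidth_bytes, flops_per_byte)
    return usable / peak_flops


# Hypothetical figures for illustration only (not any real chip's specs).
PEAK = 1000e12      # 1,000 teraFLOPS of raw compute
BANDWIDTH = 2e12    # 2 terabytes per second of memory bandwidth
INTENSITY = 100     # workload performs 100 operations per byte fetched

base = flops_utilization(PEAK, BANDWIDTH, INTENSITY)
more_compute = flops_utilization(2 * PEAK, BANDWIDTH, INTENSITY)
more_both = flops_utilization(2 * PEAK, 2 * BANDWIDTH, INTENSITY)
print(base, more_compute, more_both)
```

Under these made-up numbers, utilization starts at 20%, falls to 10% when compute doubles on its own, and returns to 20% only when memory bandwidth doubles alongside it.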
Sold out through the end of next year
As Micron points out, data transfer rates have been unable to keep pace with rising compute power. The resulting bottleneck, often referred to as the “memory wall,” is a leading cause of today’s inherent inefficiency when scaling AI training models.
That explains why the U.S. government focused heavily on memory bandwidth when deciding which specific Nvidia chips needed to be banned from export to China in order to weaken Beijing’s AI development program.
On Wednesday, Micron said its HBM business was “sold out” all the way through the end of the next calendar year, which trails its fiscal year by one quarter, echoing similar comments from Korean competitor SK Hynix.
“We expect to generate several hundred million dollars of revenue from HBM in FY24 and multiple [billions of dollars] in revenue from HBM in FY25,” Micron said on Wednesday.