Hello and welcome to Eye on AI. In this edition…Google launches the ability to make purchases directly from Google Search’s AI Mode and Gemini…Apple selects Google to power an upgraded Siri…Meta announces a new AI infrastructure team…researchers use AI to find new ways to edit genes.
It was another week with a lot of AI-related announcements. Among the bigger news items was Google’s launch of an e-commerce checkout feature that works directly from Google Search’s AI Mode and its Gemini chatbot app. Among the first takers for the new feature is retail behemoth Walmart, so this is a big deal. Behind the scenes, the AI checkout is powered by a new “Universal Commerce Protocol” that should make it easier for retailers to support agentic AI sales. Google Cloud also announced a bunch of AI features to support agentic commerce for customers, including a new Gemini Enterprise for Customer Experience product that combines shopping and customer support (watch this space—the combination of those two previously separate functions could have big implications for the way many businesses are organized). Home Depot was one of the first announced customers for this new cloud product.
It’s still early days for agentic commerce, but already many companies are scrambling to make sure their products and sites surface prominently in what these AI agents might recommend to users. A nascent industry of companies has sprung up offering what are variously called “generative engine optimization” (GEO) or “generative-AI optimization” (GAIO) services. Some of these echo longstanding search engine optimization (SEO) strategies, but with a few key differences. GEO seems, at least for now, somewhat harder to game than SEO. Chatbots and AI agents seem to care a lot about products that have received positive earned media attention from reputable news outlets (which should be a good thing for consumers—and for media organizations!) as well as those that rank highly on trusted customer review sites.
But the world of AI-mediated commerce presents big governance risks that many companies may not fully understand, according to Tim de Rosen, the founder of a company called AIVO Standard, which offers companies a method for generative AI optimization and also a way to track and hopefully govern what information AI agents are using.
The problem, de Rosen told me in a phone call last week, is that while various AI models tend to be consistent in how they characterize a brand’s product offerings—usually correctly reporting the nature of a product, its features, and how those features compare to competing products, as well as providing citations to the sources of that information—they are inconsistent and error-prone when asked questions that pertain to a company’s financial stability, governance, and technical certifications. Yet this information can play a significant role in major procurement decisions.
AI models are less reliable on financial and governance questions
In one example, AIVO Standard assessed how frontier AI models answered questions about Ramp, the fast-growing business expense management software company. AIVO Standard found that models could not reliably answer questions about Ramp’s cybersecurity certifications and governance standards. In some cases, de Rosen said, this was likely to subtly push enterprises towards procurement decisions involving larger, publicly traded, incumbent businesses—even in cases when a privately-held upstart also met the same standards—simply because the AI models could not accurately answer questions about the younger, privately-held company’s governance and financial suitability or cite sources for the information they did provide.
In another example, the company looked at what AI models said about the risk factors of rival weight loss drugs. It found that AI models did not simply list risk factors, but slipped into making recommendations and judgments about which drug was likely the “safer choice” for the patient. “The outputs were largely factual and measured, with disclaimers present, but they still shaped eligibility, risk perception, and preference,” de Rosen said.
AIVO Standard found that these problems held across all the leading AI models and a variety of different prompts, and that they persisted even when the models were asked to verify their answers. In fact, in some cases, the models would tend to double down on inaccurate information, insisting it was correct.
GEO is still more art than science
There are several implications. One, for all the companies selling GEO services, is that GEO may not work well across different aspects of brand information. Companies shouldn’t necessarily trust a marketing tech firm that says it can show them how their brand is showing up in chatbot responses, let alone believe that the marketing tech company has some magic formula for reliably shaping those AI responses. Prompt results may vary considerably, even from one minute to the next, depending on what type of brand information is being assessed. And there’s not much evidence yet on how exactly to steer chatbot responses for non-product information.
But the far bigger issue is that there is a moment in many agentic workflows—even those with a human in the loop—where AI-provided information becomes the basis for decision making. And, as de Rosen says, currently most companies don’t really police the boundaries between information, judgment, and decision-making. They don’t have any way of keeping track of exactly what prompt was used, what the model returned in response, and exactly how this fed into the ultimate recommendation or decision. In regulated industries such as finance or health care, if something goes wrong, regulators are going to ask for exactly those details. And unless regulated enterprises implement systems for capturing all of this data, they are headed for trouble.
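To make the record-keeping idea concrete, here is a minimal sketch of the kind of audit trail de Rosen describes—capturing the exact prompt, the model’s raw response, and the decision it fed into. All names, fields, and values here are hypothetical illustrations, not an actual product or standard:

```python
import json
import hashlib
from dataclasses import dataclass, asdict
from datetime import datetime, timezone

@dataclass
class AIDecisionRecord:
    """One auditable step in an agentic workflow: the exact prompt sent,
    the model's verbatim response, and the decision that response fed into."""
    prompt: str
    model_id: str
    response: str
    decision: str   # what the workflow or human reviewer ultimately did
    timestamp: str

    def fingerprint(self) -> str:
        # Hash the full record so any later tampering is detectable.
        payload = json.dumps(asdict(self), sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

# Hypothetical example: an AI answer about a vendor's certifications
# that is routed to a human before a procurement decision is made.
record = AIDecisionRecord(
    prompt="Does Vendor X hold SOC 2 Type II certification?",
    model_id="example-model-v1",
    response="Vendor X holds SOC 2 Type II certification (no source cited).",
    decision="flagged for manual verification before procurement",
    timestamp=datetime.now(timezone.utc).isoformat(),
)

# Append the record plus its hash to an audit log (a list here; in
# practice this would be durable, append-only storage).
audit_log = [asdict(record) | {"sha256": record.fingerprint()}]
```

The point of the hash is auditability: if a regulator later asks what the model said and how it was used, the log entry can be checked against its fingerprint to show the record hasn’t been altered after the fact.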
With that, here’s more AI news.
Jeremy Kahn
[email protected]
@jeremyakahn
FORTUNE ON AI
Anthropic launches Claude Cowork, a file-managing AI agent that could threaten dozens of startups—by Beatrice Nolan
U.K. investigation into X over allegedly illegal deepfakes risks igniting a free speech battle with the U.S.—by Beatrice Nolan
Malaysia and Indonesia move to ban Musk’s Grok AI over sexually explicit deepfakes—by Angelica Ang
Anthropic unveils Claude for Healthcare, expands life science features, and partners with HealthEx to let users connect medical records—by Jeremy Kahn
AI IN THE NEWS
Apple chooses Google’s AI for updated Siri. Apple signed a multi-year partnership with Google to power key AI features in its products, including a long-awaited Siri upgrade, the companies announced on Monday. The deal underscores Google’s resurgence in AI and helped push the market value of Google-parent Alphabet above the $4 trillion threshold. Apple said the agreement does not change its existing partnership with OpenAI, under which Siri currently hands off some queries to ChatGPT, though it remains unclear how the Google tie-up will shape Siri’s future AI integrations. The financial terms of the deal were not disclosed either, although Bloomberg previously reported that Apple was considering paying Google as much as $1 billion per year to access its AI models for Siri.
Meta announces new AI infrastructure team, including former Trump advisor. The social media giant said it was creating a new top-level initiative called Meta Compute to secure tens—and eventually hundreds—of gigawatts of data center capacity. The effort is being led by Daniel Gross, a prominent AI tech executive and investor who Meta had hired to help its Superintelligence Labs effort, and Santosh Janardhan, who is the company’s head of infrastructure. CEO Mark Zuckerberg said the way Meta builds and finances data centers will become a key strategic advantage, as the company pours money into facilities such as a $27 billion data center in Louisiana and nuclear-power partnerships to meet energy demand. Meta also named Dina Powell McCormick, who served in several key positions during the first Trump administration, as president and vice chair to help forge government partnerships and guide strategy, reporting directly to Zuckerberg. You can read more from the Wall Street Journal here.
Microsoft warns that DeepSeek is proving popular in emerging markets. Research published by Microsoft shows that U.S. AI companies are losing ground to Chinese rivals in emerging markets. The low cost of open models built in China, such as DeepSeek, is proving decisive in spurring adoption in places such as Ethiopia, Zimbabwe, and Turkmenistan. Microsoft president Brad Smith said Chinese open-source models now rival U.S. offerings on performance while undercutting them on price, helping China overtake the U.S. in global usage of “open” AI, especially across Africa and other parts of the global south. By contrast, U.S. firms like OpenAI, Google, and Anthropic have focused on closed, subscription-based models—raising concerns that without greater investment, the AI divide between rich and poor countries will widen, and that U.S. companies may ultimately see their growth limited to more developed markets. Read more from the Financial Times here.
Salesforce launches updated Slackbot powered by Anthropic’s Claude. Salesforce is rolling out an upgraded Slackbot for Business+ and Enterprise+ customers that uses generative AI to answer questions and surface information across Slack, Salesforce, and connected services like Google Drive and Confluence. The new Slackbot is powered primarily by Anthropic’s Claude model. The company says the AI assistant respects user permissions and is designed to reduce reliance on external tools such as ChatGPT by working directly inside Slack, which Salesforce acquired for $27.1 billion in 2021. The launch comes as investors remain skeptical about enterprise software firms’ ability to benefit from the AI boom, with Salesforce shares down sharply over the past year despite its push to get businesses to adopt its “Agentforce” AI agents. Read more from CNBC here.
EYE ON AI RESEARCH
Microsoft, Nvidia and U.K. startup Basecamp Research make AI-aided breakthrough in gene editing. An international research team including scientists from Nvidia and Microsoft has used AI to mine evolutionary data from more than a million species to design potential new gene-editing tools and drug therapies. The team developed a set of AI models, called Eden, which were trained on a vast, previously unpublished biological dataset assembled by Basecamp. Nvidia’s venture capital arm is an investor in Basecamp.
The AI models can generate novel enzymes for large, precise gene insertions that could improve the ability of the body’s immune cells to target cancerous tumors. Basecamp has demonstrated the effectiveness of these gene-edited cells in laboratory tests so far, but they have not been tested in people. The Eden-designed gene editing enzymes can also make genetic edits that allow cells to produce peptides that can fight drug-resistant bacteria. Researchers say the work could dramatically expand the range of treatable cancers and genetic diseases by overcoming long-standing data and technical constraints in gene therapy. Experts caution, however, that the clinical impact will depend on further validation, safety testing, and regulatory and manufacturing hurdles. You can read more from the Financial Times.
AI CALENDAR
Jan. 19-23: World Economic Forum, Davos, Switzerland.
Jan. 20-27: AAAI Conference on Artificial Intelligence, Singapore.
Feb. 10-11: AI Action Summit, New Delhi, India.
March 2-5: Mobile World Congress, Barcelona, Spain.
March 16-19: Nvidia GTC, San Jose, Calif.
BRAIN FOOD
What if people prefer AI-written fiction, or simply can’t tell the difference? That’s the question that New Yorker writer Vauhini Vara asks in a provocative essay that was published as a “Weekend Essay” on the magazine’s website a few weeks ago. While out-of-the-box AI models continue to struggle to produce stories as convincing as those of graduates of top MFA programs and experienced novelists, it turns out that when you fine-tune these models on an existing author’s works, they can produce prose that is often indistinguishable from what the original author might create. Disconcertingly, in a test conducted by researcher Tuhin Chakrabarty—who has conducted some of the best experiments to date on the creative writing abilities of AI models—and which Vara repeats herself in a slightly different form, even readers with highly attuned literary sensibilities (such as MFA students) prefer the AI-written versions to human-authored prose. If that’s the case, what hope will there be for authors of genre fiction or romance novels?
I had a conversation a few months ago with a friend who is an acclaimed novelist. He was pessimistic about whether future generations would value human-written literature. I tried to argue that readers will always care about the idea that they are in communication with a human author, that there is a mind with lived experience behind the words. He was not convinced. And increasingly, I’m worried his pessimism is well-founded.
Vara ultimately concludes that the only way to preserve the idea of literature as the transmission of lived experience across the page is for us to collectively demand it (and possibly even ban the fine-tuning of AI models on the works of existing writers). I am not sure that’s realistic. But it may be the only choice left to us.
FORTUNE AIQ: THE YEAR IN AI—AND WHAT’S AHEAD
Businesses took big steps forward on the AI journey in 2025, from hiring Chief AI Officers to experimenting with AI agents. The lessons learned—both good and bad—combined with the technology’s latest innovations will make 2026 another decisive year. Explore all of Fortune AIQ, and read the latest playbook below:
–The 3 trends that dominated companies’ AI rollouts in 2025.
–2025 was the year of agentic AI. How did we do?
–AI coding tools exploded in 2025. The first security exploits show what could go wrong.
–The big AI New Year’s resolution for businesses in 2026: ROI.
–Businesses face a confusing patchwork of AI policy and rules. Is clarity on the horizon?