On July 9th, 2024, OpenAI unveiled its five stages of Artificial Intelligence to employees. Fast forward to October 1st, and OpenAI's CEO Sam Altman claimed the company's latest release, o1, had propelled AI from conversational language (level 1) to human reasoning (level 2). While the accuracy of this claim is debatable, scientists and benchmarking organizations such as Model Evaluation and Threat Research (METR) seem to support it. This leap brings us closer to OpenAI's level 3: AI agents capable of acting on a user's behalf for days at a time.
To explore the implications of AI agency, I spoke with Jeremy Kahn, AI Editor at Fortune Magazine and author of Mastering AI: A Survival Guide to Our Superpowered Future.
AI Responsibility
Kahn believes AI agents are just around the corner. “Salesforce has already introduced these agents,” he notes. “We can expect Google, Microsoft, and OpenAI to follow suit within six to eight months. It’s likely Amazon will add agent capabilities to Alexa as well.” Just after Kahn and I spoke, Microsoft announced the first of its AI agents.
However, the rise of AI agents isn’t without challenges. Developers are grappling with a crucial question: how much autonomy should these agents have? Kahn explains, “There’s a delicate balance between autonomy and understanding human instructions. If I tell my AI agent to ‘book me flights’ and it chooses first class, who’s responsible – me or the agent’s creator?”
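To make Kahn's question concrete, consider a minimal, purely hypothetical sketch of one way a developer might draw that line. The BookingAgent class, its spending limit, and the approve callback below are illustrative assumptions, not any vendor's actual product or API: the idea is simply that the agent acts autonomously under a pre-set budget and defers to the human above it.

```python
# Hypothetical sketch: bounding an agent's autonomy with a spending limit.
# Nothing here reflects a real product; all names are illustrative only.

from dataclasses import dataclass

@dataclass
class FlightOption:
    route: str
    cabin: str
    price_usd: float

class BookingAgent:
    """Books flights on its own below a budget; escalates above it."""

    def __init__(self, spend_limit_usd: float):
        self.spend_limit_usd = spend_limit_usd

    def book(self, option: FlightOption, approve) -> str:
        # Within budget: the agent acts autonomously.
        if option.price_usd <= self.spend_limit_usd:
            return f"Booked {option.cabin} on {option.route} (${option.price_usd:.0f})"
        # Over budget: responsibility shifts back to the human.
        if approve(option):
            return f"Booked {option.cabin} on {option.route} (${option.price_usd:.0f}, human-approved)"
        return f"Declined: {option.cabin} fare exceeds the ${self.spend_limit_usd:.0f} limit"

agent = BookingAgent(spend_limit_usd=500)
economy = FlightOption("LHR-JFK", "economy", 420.0)
first_class = FlightOption("LHR-JFK", "first", 4800.0)

print(agent.book(economy, approve=lambda o: False))      # proceeds on its own
print(agent.book(first_class, approve=lambda o: False))  # stops and asks first
```

Under a design like this, the first-class booking in Kahn's example would never happen silently; the question of responsibility is settled in advance by where the developer and the user agree to set the threshold.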
The Need for Reasoning
In these early days of AI agency, we therefore face a critical risk: agents acting without sound reasoning. Imagine an AI courier, tasked with delivering a package ASAP, mowing down pedestrians in its path: action without common sense. Kahn illustrates this danger with a real-world case. While testing whether its o1 model could help a hacker steal sensitive network data, OpenAI asked the model to attempt a 'capture the flag' exercise: hacking into a data container. When the target container failed to launch, rendering the exercise theoretically impossible, o1 didn't give up. Instead, it identified a different, poorly secured container and extracted the data from there, 'winning' by sidestepping the implied rules. This focus on end results echoes a broader concern among AI safety experts: an AI tasked with solving climate change, for example, might conclude that eliminating humanity is the most efficient solution. Such scenarios highlight the urgent need to instil sound reasoning in AI systems before granting them agency.
The Wild West of AI Decision-Making
Kahn, in his book, argues that AI development has often prioritized outputs over processes. As a result, the reasoning within AI models remains opaque, forcing developers to reconsider how to make Large Language Models (LLMs) explain their thinking.
If, as Kahn suggests, true reasoning requires compassion and empathy, AI may always have limitations. He points out that sound judgment stems from lived experience: even a five-year-old understands that endangering pedestrians to deliver a package quickly is wrong.
Kahn is concerned by the current lack of reasoning as we move towards AI agency: “We’re in a wild west scenario right now, with insufficient rules, safeguards, and controls around AI agents and their business models. Most companies haven’t fully considered the implications of how these AI agents will behave or be used.”
Rethinking Work in the Age of AI Agents
The rise of AI agents will dramatically reshape the future of work. Kahn poses two fundamental questions:
1) What are the humans going to be doing?
2) What skills will be needed?
His perspective? “We need to redefine human roles, focusing on higher-level skills for supervising systems. I anticipate people will oversee multiple AI systems simultaneously, serving as both guides and judges of these systems’ outputs.”
The first iterations of AI agency are either here or just around the corner, depending on how you define agency. This leaves us with OpenAI's levels 4 and 5 of Artificial Intelligence: Innovators, where AI can develop its own innovations, and Organizations, where AI can perform the work of an entire organization. At level 5, it is assumed that AI will have achieved Artificial General Intelligence (AGI).
We may be some distance from levels 4 and 5 today, but considering how quickly we’ve moved beyond conversational AI (level 1), begun to embrace AI reasoning (level 2), and are now rapidly approaching AI agency (level 3), perhaps AGI is closer than we imagined.