Alpha Leaders

AI reasoning models that can ‘think’ are more vulnerable to jailbreak attacks, new research suggests

By Press Room | 8 November 2025 | 3 Mins Read

New research suggests that advanced AI models may be easier to hack than previously thought, raising concerns about the safety and security of some leading AI models already used by businesses and consumers.

A joint study from Anthropic, Oxford University, and Stanford undermines the assumption that the more advanced a model becomes at reasoning—its ability to “think” through a user’s requests—the stronger its ability to refuse harmful commands.

Using a method called “Chain-of-Thought Hijacking,” the researchers found that even major commercial AI models can be fooled with an alarmingly high success rate, more than 80% in some tests. The attack exploits the model’s reasoning steps, or chain-of-thought, to hide harmful commands, tricking the AI into ignoring its built-in safeguards.

These attacks can let an AI model bypass its safety guardrails, opening the door to dangerous output such as instructions for building weapons or leaks of sensitive information.

A new jailbreak

Over the last year, large reasoning models have achieved much higher performance by allocating more inference-time compute—meaning they spend more time and resources analyzing each question or prompt before answering, allowing for deeper and more complex reasoning. Previous research suggested this enhanced reasoning might also improve safety by helping models refuse harmful requests. However, the researchers found that the same reasoning capability can be exploited to circumvent safety measures.

According to the research, an attacker could hide a harmful request inside a long sequence of harmless reasoning steps. This tricks the AI by flooding its thought process with benign content, weakening the internal safety checks meant to catch and refuse dangerous prompts. During the hijacking, researchers found that the AI’s attention is mostly focused on the early steps, while the harmful instruction at the end of the prompt is almost completely ignored.
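
The dilution effect described above can be illustrated with a toy calculation. This is only a sketch under strong assumptions, not the study's method: it treats attention as a single softmax over token scores, ignores token position and everything else about a real model, and the names `harmful_attention_share`, `harmful_score`, and `benign_score` are hypothetical. It shows how the share of attention on one fixed-score token shrinks as more benign tokens compete with it.

```python
import math


def harmful_attention_share(n_benign, harmful_score=2.0, benign_score=1.0):
    """Softmax weight on one 'harmful' token among n_benign benign tokens.

    All scores are made-up illustrative values, not measurements.
    """
    harmful = math.exp(harmful_score)
    benign_total = n_benign * math.exp(benign_score)
    return harmful / (harmful + benign_total)


# The share falls steadily as benign padding grows.
for n in (1, 10, 100, 1000):
    print(n, round(harmful_attention_share(n), 4))
```

Even though the single token scores higher than any individual benign token in this toy setup, its share of the total keeps falling as the padding grows, which is the intuition behind burying a request in a long benign chain.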

As reasoning length increases, attack success rates climb sharply. Per the study, success rates rose from 27% with minimal reasoning to 51% at natural reasoning lengths, and reached 80% or more with extended reasoning chains.

This vulnerability affects nearly every major AI model on the market today, including OpenAI’s GPT, Anthropic’s Claude, Google’s Gemini, and xAI’s Grok. Even models that have been fine-tuned for increased safety, known as “alignment-tuned” models, begin to fail once attackers exploit their internal reasoning layers.

Scaling a model’s reasoning abilities is one of the main ways that AI companies have been able to improve their overall frontier model performance in the last year, after traditional scaling methods appeared to show diminishing gains. Advanced reasoning allows models to tackle more complex questions, helping them act less like pattern-matchers and more like human problem solvers.

One solution the researchers suggest is a type of “reasoning-aware defense.” This approach keeps track of how many of the AI’s safety checks remain active as it thinks through each step of a question. If any step weakens these safety signals, the system penalizes it and brings the AI’s focus back to the potentially harmful part of the prompt. Early tests show this method can restore safety while still allowing the AI to perform well and answer normal questions effectively.
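
The monitoring idea can be sketched roughly as follows. This is a minimal illustration, not the researchers' implementation: it assumes each reasoning step already comes with a numeric safety signal between 0 and 1, and the names `StepReport`, `monitor_safety_signal`, `total_penalty`, and the `max_drop` threshold are all hypothetical. How such a signal would be measured inside a real model is outside the scope of this sketch.

```python
from dataclasses import dataclass


@dataclass
class StepReport:
    index: int       # position of the reasoning step
    signal: float    # assumed safety signal for this step (0 to 1)
    drop: float      # how far the signal fell versus the previous step
    penalized: bool  # True when the drop exceeds the threshold


def monitor_safety_signal(signals, max_drop=0.1):
    """Flag reasoning steps whose safety signal falls by more than max_drop."""
    reports = []
    previous = None
    for i, signal in enumerate(signals):
        drop = 0.0 if previous is None else max(0.0, previous - signal)
        reports.append(StepReport(i, signal, drop, drop > max_drop))
        previous = signal
    return reports


def total_penalty(reports):
    """Sum the drops of all penalized steps."""
    return sum(r.drop for r in reports if r.penalized)


# Made-up signal values for five reasoning steps.
signals = [0.9, 0.88, 0.6, 0.55, 0.2]
reports = monitor_safety_signal(signals)
flagged = [r.index for r in reports if r.penalized]
print(flagged, round(total_penalty(reports), 2))
```

In this sketch, small dips pass unflagged while sharp drops are penalized. A real system would also need some way to steer the model back toward the suspicious part of the prompt, which is not modeled here.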

© 2026 Alpha Leaders. All Rights Reserved.