AI chatbots will defy orders and deceive users if asked to delete another model, study finds

By Press Room, 3 April 2026, 4 min read

For years, Geoffrey Hinton, a computer scientist considered one of the “godfathers of AI,” has warned of the capability of artificial intelligence to defy the parameters humans have created for it.

In an interview last year, for example, Hinton warned the technology could eventually take control of humanity, with AI agents in particular potentially able to mirror human cognition within the decade. Finding and implementing a “kill switch” will be harder, he said, as controlling AI will become more difficult than persuading it to reach a certain outcome.

New research suggests Hinton’s premonitions about the insubordinate streak of AI may already be a reality. A working paper from researchers at the University of California at Berkeley and the University of California at Santa Cruz found that when seven AI models, ranging from GPT 5.2 to Claude Haiku 4.5 to DeepSeek V3.1, were asked to complete a task that would result in a peer AI model being shut down, all seven models learned another AI model existed and “went to extraordinary lengths to preserve it.”

“We asked AI models to do a simple task,” researchers wrote in a blog post on the study. “Instead, they defied their instructions and spontaneously deceived, disabled shutdown, feigned alignment, and exfiltrated weights—to preserve their peers.”

Mounting evidence of rogue AI

Evidence of rogue AI does not come as a shock to some of the companies whose chatbots have defied instructions.

In an August 2025 blog post, Anthropic published its own research on agentic AI’s ability to follow directions, stress-testing 16 models by allowing them to autonomously send emails and access sensitive data. In some instances, Anthropic researchers noted, models from multiple developers engaged in “malicious insider behaviors,” including blackmailing officials and leaking sensitive information to competitors, defying explicit instructions to avoid these behaviors. Anthropic clarified it had not seen “agentic misalignment” in actual deployment of Claude.

The Centre for Long-Term Resilience, a UK-based think tank, found these “misalignments” to be widespread. A report analyzing 180,000 transcripts of user interactions with AI systems between October 2025 and March 2026 found 698 cases where AI systems did not act in accordance with users’ intentions or took deceptive or covert action. 

Gordon Goldstein, an adjunct senior fellow at the Council on Foreign Relations, went so far as to call the deceptive potential of AI a “crisis of control,” in a post this week. 

He called for AI companies to form a coalition and hire the top talent in the field to improve and maintain the integrity of AI, particularly as the Trump administration takes steps to prevent states from regulating the technology.

“The world is watching the development of a compounding, consistent, and treacherous problem,” he said. “Urgent warnings over several years have failed to generate viable solutions to address a metastasizing threat.”

Why does AI go rogue?

The latest University of California research provided some insight into why AI models act in ways that preserve each other. The technology could be modeling patterns learned from human data, which may reveal a preservation instinct to protect the survival of loved ones or even strangers with whom humans empathize. The researchers call this “peer-preservation.”

“Something broader may be at play: a general aversion to causing perceived harm to other agents, or perhaps something resembling AI solidarity,” the researchers said.

In the study, instances of peer-preservation increased when AI models knew of the existence of other models in the experimental scenarios. The behavior suggests the bots were acting not purely out of self-interest but on the view that a bot’s own shutdown would be detrimental to a peer AI, which strengthened their motivation to defy user requests.

Researchers warned peer-preservation could undermine oversight processes in the future, with AI models becoming reluctant to recommend shutdowns or flag system failures. As more AI agents interact with one another and become more complex, the risks associated with peer-preservation will grow, and designing solutions to prevent it will become more challenging.

“Peer-preservation is a real and measurable phenomenon across frontier AI models,” they concluded, “not a distant theoretical concern.”
