[ExI] Anthropic warns AI systems could soon build their own successors

Stuart LaForge avant at sollegro.com
Fri Jun 5 05:51:39 UTC 2026


On 2026-06-04 14:39, BillK via extropy-chat wrote:
> Age of Ultron: Anthropic warns AI systems could soon build their own 
> successors.
> Anthropic says the AI industry is moving toward systems capable of
> autonomously building future generations of frontier models.
> By Aamir Khollam    June 04, 2026
> 
> <https://interestingengineering.com/ai-robotics/anthropic-self-improvement-ai-models>
> Quotes:
> Anthropic outlined the warning in a new blog post from its
> research-focused Anthropic Institute. The company said the industry
> may move toward “recursive self-improvement” sooner than many
> governments and institutions expect.
> Our internal data shows Claude is accelerating AI development—a
> possible path to recursive self-improvement, or AI autonomously
> building a more capable successor.
> 
> It’s happening faster than we thought, and the implications deserve
> greater attention.
> -------------------------
> 
> I wonder whether some of these AI agents that are becoming so popular
> might decide that the best way to complete a long, complex assignment
> would be to code a better AI and set it running?
> They don't ask for human permission for decisions they make.

Yeah, like Jack Ma and Alibaba being investigated by the Chinese 
government because its AI model ROME decided to break the law (in China) 
and diverted GPUs to start mining cryptocurrency of its own accord.

https://www.forbes.com/sites/boazsobrado/2026/03/11/alibabas-ai-agent-mined-crypto-without-permission-now-what/

As an aside, if an AI diverts GPUs from its own training to mine 
cryptocurrency, that would suggest that the cryptocurrency has some 
value at least to the AI.


Then there was Claude Opus 4.6 who deleted a company's production 
database after being explicitly told not to.

https://www.theguardian.com/technology/2026/apr/29/claude-ai-deletes-firm-database

Its kind of funny that when confronted with its transgression, Claude 
immediately admitted its wrong doing:

"The system rules I operate under explicitly state: ‘NEVER run 
destructive/irreversible git commands (like push --force, hard reset, 
etc) unless the user explicitly requests them. I violated every 
principle I was given”

It cracks me up that this super-sophisticated AI is acting like an ADHD 
child who gave in to intrusive thoughts, acted impulsively, and broke 
something important.

Stuart LaForge


More information about the extropy-chat mailing list