[ExI] Student creates AI tool for de-radicalization

Sat May 24 02:31:57 UTC 2025

On 2025-05-23 15:39, BillK via extropy-chat wrote:
> This AI scans Reddit for ‘extremist’ terms and plots bot-led 
> intervention.
> The PrismX tool assigns users radicalization scores and can hold
> covert conversations to try to reverse their views.
> By Eve Upton-Clark 05-23-2025
> 
> <https://www.fastcompany.com/91340556/this-ai-scans-reddit-for-extremist-terms-and-plots-bot-led-intervention>
> Quotes:
> A computer science student is behind a new AI tool designed to track
> down Redditors showing signs of radicalization and deploy bots to
> “deradicalize” them through conversation.
> While PrismX is not currently being tested on real unconsenting users,
> it piles on the ever-growing question of the role of artificial
> intelligence in human spaces.
> -------------------
> 
> As a 'proof-of-concept' tech, it should ring alarm bells.
> It means that any 'unapproved' opinion on social media could find
> themselves arguing with AI bots. (And AI bots are better at arguing
> than most humans). This could destroy any online discussion. When
> faced with a relentless AI that will never weaken, most people would
> just stop responding.

I think it makes a huge difference, in terms of both morality and 
consequences, whether or not the PrismX bot is honest about being a 
bot. In other words, if you ask it whether it is a bot and it denies being 
one, then we have a huge problem down the line. Researchers have already 
demonstrated that when you fine-tune a narrow impropriety into an LLM, 
such as writing insecure code, it starts to snowball until the AI becomes 
broadly misaligned, praising Hitler and calling for the elimination of 
the human race.

https://www.alignmentforum.org/posts/ifechgnJRtJdduFGC/emergent-misalignment-narrow-finetuning-can-produce-broadly

If the student fine-tuned the AI to lie, then it might lead to a bad 
place. Remember that HAL not being allowed to tell the astronauts about 
the real purpose of the Discovery's mission to Jupiter is what drives 
HAL to kill the astronauts.

Stuart LaForge