Writing backwards can trick an AI into providing a bomb recipe


AI models have safeguards in place to prevent them creating dangerous or illegal output, but a range of jailbreaks have been found to evade them. Now researchers show that writing backwards can trick AI models into revealing bomb-making instructions.

Read this on New Scientist

Read more on:

writing

bomb recipe

Related news:

‘Writing’ with atoms could transform materials fabrication for quantum devices - EurekAlert

Why writing by hand beats typing for thinking and learning: “There’s actually some very important things going on during the embodied experience of writing by hand. It has important cognitive benefits.”

Did the people of Easter Island independently invent writing?