Futurology
Lugh
•
9mo ago
•
77%
Two-faced AI language models learn to hide deception - ‘Sleeper agents’ seem benign during testing but behave differently once deployed. And methods to stop them aren’t working.
www.nature.comComments 9