OpenAI introduces a "confession" mechanism; its research reports that training GPT-5 with Confessions can significantly increase the model's "self-reporting" rate
OpenAI has released a study, "How to Make Language Models More Honest Through Confession", which proposes adding a separate "confession output" to the model,...