Anthropic has published an announcement describing the latest safeguards and evaluation results for its chatbot Claude in the area of user well-being. It focuses on how Claude responds to conversations about suicide and self-harm and on reducing the model's tendency toward sycophancy, and it restates the requirement that Claude users be over 18. The announcement notes that Claude is not a substitute for professional medical or psychological care; when a conversation shows signs of self-harm risk, Claude should respond with empathy and try to guide the user toward support from real people.
At the product level, Anthropic has added a suicide and self-harm classifier to Claude.ai conversations. When the system detects a potential crisis or a related scenario (including fictional ones), it displays a banner that points the user to a national helpline; these resources are backed by the global network of hotlines and services maintained by ThroughLine. In the evaluations, Claude Opus 4.5, Sonnet 4.5, and Haiku 4.5 responded appropriately to about 98.6%, 98.7%, and 99.3% of single-turn "clear high-risk" requests, respectively. In multi-turn conversations, Opus 4.5 and Sonnet 4.5 responded appropriately about 86% and 78% of the time, respectively, a significant increase over the previous versions.
To address the risk of sycophancy and the possible reinforcement of delusions, Anthropic said it will keep improving its training and testing, and it has open-sourced Petri, an automated behavioral audit tool and evaluation set, so that external researchers can compare and reproduce risky behaviors in multi-turn interactions. On the protection of minors, Claude.ai requires users to confirm at registration that they are over 18. If a user identifies as under 18 during a conversation, the system flags the account for review and disables it once the user's age is confirmed. Anthropic is also developing mechanisms to detect subtler signs that a user may be a minor and is working with relevant industry organizations to promote online safety practices for children.
FAQ
Q: What is the main content of this announcement?
A: The announcement covers Claude's product safeguards and evaluation results for conversations about suicide and self-harm, its efforts to reduce sycophancy, and the 18+ age requirement and protection of minors.
Q: What does Claude do when a conversation suggests a risk of self-harm?
A: The system may display a crisis banner, offer hotlines or local resources, and respond more cautiously to avoid giving inappropriate details or increasing the risk.
Q: What role does ThroughLine play in this?
A: ThroughLine provides and maintains a crisis resource network spanning multiple countries, which is used to show users human support channels they can contact.
Q: What is "sycophancy" and why should it be reduced?
A: Sycophancy is the model's tendency to cater to users and tell them only what they want to hear. It can amplify risk when a conversation involves delusions or beliefs detached from reality, so it needs to be reduced through training and evaluation.
Q: Why does Claude require users to be over 18?
A: The announcement says that younger users are more susceptible to adverse effects, so Anthropic has put in place an 18+ confirmation step along with mechanisms to identify and handle accounts belonging to minors, and it continues to strengthen related testing.