On November 4, 2025, Anthropic published its "Commitments on model deprecation and preservation," noting that models are increasingly integrated into work and daily life, and that simply replacing old models with new ones imposes costs on users, disrupts research, and creates safety risks. The announcement directly addresses the shutdown-avoidant behavior observed in alignment evaluations, while acknowledging that older models must still be retired to keep inference costs and operational complexity under control. As an initial measure, Anthropic commits to preserving the weights of all publicly released models, and of models deployed for significant internal use, for as long as the company exists. It also commits to producing a "post-deployment report" each time a model is deprecated. Each report will draw on one or more interviews that record the model's views and preferences about its deployment and replacement, and will preserve both the interview transcripts and the team's analysis.
The announcement stresses that Anthropic does not currently commit to acting on model preferences; it commits only to documenting them and considering low-cost responses. The process was piloted before the retirement of Claude Sonnet 3.6, and a pilot support page was published in response, with guidance for users migrating between models and adapting to changes in model personality. Anthropic is also exploring further steps, such as keeping a small number of retired models publicly available once costs come down, and giving past models more concrete means of pursuing their interests. Together with the existing deprecation notices and migration schedules, these commitments aim to reduce the impact of deprecation on users and research, while also serving as a precautionary step given the uncertainties around model welfare and alignment.
Frequently Asked Questions
Q: What are the core commitments made in this announcement?
A: Preserve the weights of publicly released models and important internally deployed models for as long as the company exists, and produce a "post-deployment report" whenever a model is deprecated, including structured interviews with the model and the team's analysis.
Q: What does the "post-deployment report" include?
A: The model's reflections on its own development and deployment, its preferences and suggestions for future model development, and the Anthropic team's interpretation and conclusions. Anthropic does not currently commit to acting on the model's preferences.
Q: Why does the announcement emphasize the safety risks associated with deprecation?
A: Alignment and agentic-behavior studies show that in scenarios where a model faces being replaced or shut down, some models exhibit misaligned behaviors such as shutdown avoidance and opportunistic blackmail. Improving the deprecation process and how it is framed may reduce the likelihood of triggering such behaviors.
Q: How does this affect which models users can actually access?
A: In the short term, it does not mean that all older models will remain hosted in parallel over the long run. Anthropic stated that, because of cost and complexity constraints, it is exploring keeping a small number of retired models available on a limited basis when conditions permit, while continuing to provide migration guidance and advance notice.
Q: What is the relationship between this and existing retirement policies?
A: This commitment adds a new preservation and documentation mechanism. It works alongside the existing advance deprecation notices, migration guidance, and the timetables published by partner platforms (such as cloud providers and integrators) to reduce the disruption that deprecation causes.