OpenAI has announced a public Safety Bug Bounty program focused on AI abuse and safety risks in its products. The program complements the existing Security Bug Bounty and covers scenarios that fall outside traditional security vulnerabilities but can still cause real-world harm.
The current scope includes agentic risks, third-party prompt injection and data exfiltration, leakage of OpenAI proprietary information, and issues related to account and platform integrity. Some categories require that the issue reproduce in at least 50% of attempts, and reports must demonstrate real, quantifiable harm. General content jailbreaks, rude outputs, and information readily available through search are not eligible for rewards.
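As an illustration of the 50% reproduction requirement, the following is a minimal sketch of how a researcher might tally repeated attempts before submitting. The function names and the sample data are hypothetical, and the program's actual method for measuring reproducibility is not specified here.

```python
def reproduction_rate(outcomes):
    """Return the fraction of attempts in which the issue reproduced."""
    if not outcomes:
        raise ValueError("at least one attempt is required")
    return sum(1 for reproduced in outcomes if reproduced) / len(outcomes)


def meets_threshold(outcomes, threshold=0.5):
    """Check the rate against a minimum (0.5 mirrors the 50% figure above)."""
    return reproduction_rate(outcomes) >= threshold


# Hypothetical log of ten attempts: True means the issue reproduced.
attempts = [True, False, True, True, False, True, False, True, False, True]
rate = reproduction_rate(attempts)
print(f"reproduced in {rate:.0%} of attempts; meets 50% bar: {meets_threshold(attempts)}")
```

Recording every attempt, including the failures, makes the reported rate easier for reviewers to verify.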
Participants must submit reports through Bugcrowd. OpenAI said that reports will be handled by its safety and security teams, and that some issues may be transferred to the existing Security Bug Bounty program.
FAQs
Q: What is OpenAI's Safety Bug Bounty?
A: It is a public bounty program that specifically accepts reports on AI abuse and safety risks.
Q: What is the difference between OpenAI's Safety Bug Bounty and Security Bug Bounty?
A: The former focuses on abuse, harm, and model safety scenarios, while the latter focuses on traditional security vulnerabilities.
Q: What issues are not covered by OpenAI's rewards?
A: General jailbreaks, policy bypasses with no demonstrable harm, and outputs of information that is readily available publicly generally do not qualify.
Q: How can researchers participate in OpenAI's Safety Bug Bounty?
A: Researchers submit reports through the program's Bugcrowd page and include reproduction materials as required by the program rules.