OpenAI introduces safety models that other sites can use to classify harms

OpenAI introduces safety models that other sites can use to classify harms


Sam Altman, CEO of OpenAI, attends the annual Allen and Co. Sun Valley Media and Technology Conference at the Sun Valley Resort in Sun Valley, Idaho, on July 8, 2025.

David A. Grogan | CNBC

OpenAI on Wednesday announced two reasoning models that developers can use to classify a range of online safety harms on their platforms. 

The artificial intelligence models are called gpt-oss-safeguard-120b and gpt-oss-safeguard-20b, and their names reflect their sizes. They are fine-tuned, or adapted, versions of OpenAI’s gpt-oss models, which the company announced in August. 

OpenAI is introducing them as so-called open-weight models, which means their parameters, or the elements that improve the outputs and predictions during training, are publicly available. Open-weight models can offer transparency and control, but they are different from open-source models, whose full source code becomes available for users to customize and modify.

Organizations can configure the new models to their specific policy needs, OpenAI said. And since they are reasoning models that show their work, developers will have more direct insight into how they arrive at a particular output. 

For instance, a product reviews site could develop a policy and use gpt-oss-safeguard models to screen reviews that might be fake, OpenAI said. Similarly, a video game discussion forum could classify posts that discuss cheating.

OpenAI developed the models in partnership with Robust Open Online Safety Tools, or ROOST, an organization dedicated to building safety infrastructure for AI. Discord and SafetyKit also helped test the models. They are initially available in a research preview, and OpenAI said it will seek feedback from researchers and members of the safety community.

As part of the launch, ROOST is establishing a model community for researchers and practitioners that are using AI models in an effort to protect online spaces.

The announcement could help OpenAI placate some critics who have accused the startup of commercializing and scaling too quickly at the expense of AI ethics and safety. The startup is valued at $500 billion, and its consumer chatbot, ChatGPT, has surpassed 800 million weekly active users. 

On Tuesday, OpenAI said it’s completed its recapitalization, cementing its structure as a nonprofit with a controlling stake in its for-profit business. OpenAI was founded in 2015 as a nonprofit lab, but has emerged as the most valuable U.S. tech startup in the years since releasing ChatGPT in late 2022.

“As AI becomes more powerful, safety tools and fundamental safety research must evolve just as fast — and they must be accessible to everyone,” ROOST President Camille François, said in a statement.

Eligible users can download the model weights on Hugging Face, OpenAI said.

WATCH: OpenAI finalizes recapitalization plan

OpenAI finalizes recapitalization plan



Source

Elon Musk’s xAI wants to build a power plant in Mississippi. Regulators plan a key meeting on Election Day
Technology

Elon Musk’s xAI wants to build a power plant in Mississippi. Regulators plan a key meeting on Election Day

Elon Musk waves to the crowd during the 56th annual World Economic Forum (WEF) meeting in Davos, Switzerland, January 22, 2026. Denis Balibouse | Reuters With Elon Musk’s xAI planning to build a massive, natural-gas burning power plant in Southaven, Mississippi, the state’s environmental authority has scheduled a board meeting for Tuesday — Election Day […]

Read More
Top permitting-reform Republican, Democratic senators meeting as talks thaw: API chief
Technology

Top permitting-reform Republican, Democratic senators meeting as talks thaw: API chief

U.S. Sen. Shelley Moore Capito (R-WV) speaks to the media following the weekly policy luncheons at the U.S. Capitol on June 21, 2023 in Washington, DC. Kevin Dietsch | Getty Images Senate Environment and Public Works Committee Chair Shelley Moore Capito and ranking Democrat Sheldon Whitehouse are meeting to discuss reforming the federal energy permitting […]

Read More
OpenAI to buy cybersecurity startup Promptfoo to better safeguard AI agents
Technology

OpenAI to buy cybersecurity startup Promptfoo to better safeguard AI agents

Sam Altman, CEO of OpenAI, at the AI Impact Summit in New Delhi, India, Feb. 19, 2026. Prakash Singh | Bloomberg | Getty Images OpenAI said Monday that it is acquiring the cybersecurity startup Promptfoo, which provides tools to help safeguard and test complex artificial intelligence systems. The Sam Altman-led firm did not disclose the […]

Read More