Summary

Anthropic updated its open-source alignment toolbox to Petri 3.0 and handed its ongoing development to Meridian Labs. The new version separates auditor and target components, adds a realism-focused Dish add-on, and integrates with Bloom for deeper behavior analysis.

What changed

Anthropic shipped Petri 3.0 with a new auditor/target architecture and realism features, integrated it with Bloom, and transferred Petri’s development to Meridian Labs.

Why it matters

Anthropic is moving one of its most visible alignment tools to a third-party home, which could make safety evaluations look less lab-specific and more like shared infrastructure. That matters if regulators, governments, and enterprise buyers want independent model-testing layers instead of vendor-only claims.

Evidence excerpt

Anthropic says Petri 3.0 splits the auditor and target model into separately tweakable components and that it has handed development to Meridian Labs to help keep the tool independent.

Sources