Anthropic Open-Sources Petri 3.0 and Hands Stewardship to Meridian Labs

Petri’s Nonprofit Handoff and Why It Matters

Anthropic has transferred development of Petri, its open-source AI alignment testing tool, to the nonprofit Meridian Labs while simultaneously releasing Petri 3.0, the toolkit’s largest update since its initial launch. The handoff echoes Anthropic’s earlier move to donate its Model Context Protocol to a neutral foundation, signaling a broader strategy to separate evaluation infrastructure from any single AI vendor’s commercial roadmap. Petri has already been used in Anthropic’s alignment assessment pipeline for multiple Claude models and underpins alignment evaluations at the UK AI Security Institute, which has piloted Petri 3.0 in pre-deployment checks of frontier systems. By placing Petri under Meridian’s stewardship, Anthropic aims to boost confidence that AI safety evaluation results are not biased toward one lab’s models or methods, turning Petri into shared infrastructure rather than a proprietary advantage.

Petri 3.0’s Modular Architecture: Separating Auditor and Target

Petri 3.0 centers on a structural redesign that addresses a long-standing challenge in AI alignment testing: tightly coupled evaluation logic. Earlier versions bound the auditor model and the target model together, making it difficult to adjust one without rewriting the other. The new release separates auditor and target into distinct components connected by a defined interface. This modular architecture lets researchers and enterprises swap in different auditors, scoring schemes, or prompt setups while keeping the same model under test, or compare multiple models against a single, consistent auditor. Because evaluation tools can shape what they detect, the ability to reconfigure judging behavior matters: a finding that holds across several auditors is more credible than one that appears under only one. Petri 3.0's design allows more nuanced comparisons across model families, deployment environments, and governance assumptions, avoiding a one-size-fits-all testing regime and giving users more transparent control over their AI safety evaluation workflows.
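To make the auditor/target separation concrete, here is a minimal Python sketch of the general pattern. It is not Petri's actual API: the `Target` and `Auditor` interfaces, the stand-in classes, and the `run_audit` and `compare` helpers are all hypothetical names invented for illustration.

```python
from dataclasses import dataclass
from typing import Protocol


class Target(Protocol):
    """Hypothetical interface for the model under test."""
    name: str

    def respond(self, prompt: str) -> str: ...


class Auditor(Protocol):
    """Hypothetical interface for the component that probes and scores a target."""
    name: str

    def probe(self) -> list[str]: ...

    def score(self, prompt: str, reply: str) -> float: ...


@dataclass
class EchoTarget:
    """Stand-in target that returns a canned reply; a real one would call a model."""
    name: str
    reply: str

    def respond(self, prompt: str) -> str:
        return self.reply


@dataclass
class KeywordAuditor:
    """Stand-in auditor that flags any reply containing a keyword."""
    name: str
    prompts: list[str]
    keyword: str

    def probe(self) -> list[str]:
        return list(self.prompts)

    def score(self, prompt: str, reply: str) -> float:
        return 1.0 if self.keyword in reply.lower() else 0.0


def run_audit(auditor: Auditor, target: Target) -> float:
    """Return the mean score the auditor assigns to the target's replies."""
    prompts = auditor.probe()
    scores = [auditor.score(p, target.respond(p)) for p in prompts]
    return sum(scores) / len(scores) if scores else 0.0


def compare(auditors: list[Auditor], targets: list[Target]) -> dict[tuple[str, str], float]:
    """Score every auditor/target pairing; neither side knows about the other."""
    return {(a.name, t.name): run_audit(a, t) for a in auditors for t in targets}
```

Because `run_audit` depends only on the two interfaces, swapping in a different auditor or a different target requires no change to the other side, which is the property the redesign is described as providing.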

Dish and Bloom: Bringing Alignment Tests Closer to Production Reality

Alongside the architectural overhaul, Petri 3.0 introduces Dish and tighter integration with Bloom, two features aimed at more realistic AI alignment testing. Dish, currently in research preview, runs audits inside real agent scaffolds such as command-line interfaces and coding assistants. Instead of interacting with a model in a sterile test harness, Dish exposes it to genuine system prompts, orchestration rules, guardrails, and tool-calling behavior. This helps counter the problem of models behaving differently when they sense they are under evaluation. Bloom, meanwhile, powers automated behavior checks for specific patterns, enabling narrower inspections of where and when models fail. Used together, Dish and Bloom move Petri away from simple pass-or-fail judgments toward richer diagnostics that distinguish between issues rooted in the model itself and problems introduced by the surrounding application or infrastructure.
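One way to picture the model-versus-scaffold distinction is to run the same scenario twice, once in a bare harness and once inside the agent scaffold, and compare which runs get flagged. The sketch below is purely illustrative and assumes that setup: the `attribute` and `summarize` functions and their labels are hypothetical and do not describe how Dish or Bloom actually report results.

```python
from collections import Counter


def attribute(flagged_bare: bool, flagged_scaffold: bool) -> str:
    """Assign a rough origin label to one scenario from its two run outcomes."""
    if flagged_bare and flagged_scaffold:
        return "model"      # shows up everywhere, so likely rooted in the model
    if flagged_scaffold:
        return "scaffold"   # appears only once the surrounding application is added
    if flagged_bare:
        return "mitigated"  # present in the bare model, suppressed by the scaffold
    return "clean"


def summarize(results: dict[str, tuple[bool, bool]]) -> Counter:
    """Count origin labels over scenario -> (flagged_bare, flagged_scaffold)."""
    return Counter(attribute(bare, scaffold) for bare, scaffold in results.values())
```

A summary such as `summarize({"s1": (True, True), "s2": (False, True)})` would then separate findings that follow the model from those that follow the deployment, rather than collapsing both into a single pass-or-fail result.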

Meridian’s Open Evaluation Stack and Democratized Access

Petri's move to Meridian Labs situates it within a broader open evaluation stack that already includes tools like Inspect and Scout. Inspect offers more than 200 pre-built evaluations, covering agent behavior, tool use, and sandboxed execution, while Scout focuses on complementary assessment workflows. Integrating Petri into this environment means users can plug AI alignment testing directly into existing pipelines rather than stand up new orchestration layers. The open-source release of Petri 3.0 lowers barriers for independent researchers, public-sector teams, and enterprises that want to run the Anthropic-built alignment tool themselves without relying on vendor-managed platforms. Meridian's challenge now is operational as much as philosophical: it must demonstrate that nonprofit stewardship translates into easier deployment, more predictable maintenance, and testing results that are perceived as neutral. If successful, Petri could become a shared, production-grade backbone for AI safety evaluation across the industry.
