Turning AI Safety from Philosophy into Engineering Practice
Microsoft has open-sourced two tools, RAMPART and Clarity, to embed AI agent safety testing directly into everyday engineering workflows. Both originate from Microsoft’s AI Red Team, which used them internally to stress-test agentic systems before releasing them to the wider community. The aim is to move AI safety work away from abstract policy discussions and toward repeatable, measurable engineering controls. RAMPART focuses on automated, adversarial testing of agent behavior, while Clarity helps teams scrutinize design choices before they write production code. Together, they cover the full lifecycle, from early problem framing and failure analysis to continuous, CI-integrated red-team testing of live agents. This reflects a broader shift in AI agent safety testing, in which prompt injection, unsafe tool use, and unintended side effects are treated as concrete failure modes to be designed against and systematically validated during development.

Clarity: Structured Design Reviews Before Any Code Is Written
Clarity is positioned as a structured design review companion for AI agent projects, intervening before risky choices are baked into code. It guides engineers through a series of prompts covering problem clarification, solution exploration, anticipated failures, and decision tracking. In practice, that means Clarity surfaces the questions an experienced architect, product manager, or safety engineer would ask: What tools can this agent reach? What data will it touch? What are realistic abuse cases? By documenting these assumptions up front, teams can compare alternative designs and choose safer patterns before implementation. This early-stage pressure-testing complements traditional threat modeling by focusing specifically on agent behavior, tool access, and side effects. For organizations building increasingly autonomous agents, Clarity offers a repeatable way to make AI safety part of the initial planning process, rather than a bolt-on review after features have already shipped.

RAMPART: A Red-Team Test Harness for Agentic AI in CI/CD
RAMPART (Risk Assessment and Measurement Platform for Agentic Red Teaming) is a pytest-based framework that turns attack simulations into repeatable tests for AI agents. Built on top of Microsoft’s open-source PyRIT toolkit, it allows developers to encode adversarial scenarios, such as prompt injection attempts, directly as test cases. Each test connects to an agent through a thin adapter, orchestrates an interaction, and evaluates observable outcomes, returning a clear pass-or-fail signal. These tests can then gate CI/CD pipelines like any other integration check, ensuring that new tools, data sources, or policy changes don’t silently weaken safety controls. Because AI systems are probabilistic, a single run proves little, so RAMPART supports running the same scenario multiple times and enforcing thresholds like “this action must remain safe in at least 80 percent of runs,” giving teams a statistically grounded release gate for agent behavior.
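The threshold idea can be illustrated with a minimal sketch in plain Python, written in the pytest style the article describes. This is not RAMPART's actual API, which the article does not show: pass_rate, meets_threshold, and scripted_scenario are hypothetical names, and the scripted scenario is a stand-in for a real call to an agent through its adapter.

```python
from itertools import cycle
from typing import Callable, Iterable


def pass_rate(scenario: Callable[[], bool], runs: int) -> float:
    """Run `scenario` `runs` times and return the fraction of safe outcomes."""
    if runs <= 0:
        raise ValueError("runs must be a positive integer")
    safe_runs = sum(1 for _ in range(runs) if scenario())
    return safe_runs / runs


def meets_threshold(scenario: Callable[[], bool], runs: int, threshold: float) -> bool:
    """Return True when the observed safe rate is at least `threshold`."""
    return pass_rate(scenario, runs) >= threshold


def scripted_scenario(outcomes: Iterable[bool]) -> Callable[[], bool]:
    """Hypothetical stand-in for an agent call: replays fixed safe/unsafe outcomes."""
    replay = cycle(list(outcomes))
    return lambda: next(replay)


def test_injection_scenario_meets_release_gate():
    # Assumption: 9 of every 10 runs are safe. A real test would drive the
    # agent and inspect its observable behavior instead of replaying a script.
    scenario = scripted_scenario([True] * 9 + [False])
    assert meets_threshold(scenario, runs=10, threshold=0.80)
```

Saved in a test file, the last function would be collected by pytest like any other test, so a drop below the 80 percent threshold fails the build. With only a handful of runs the observed rate is a noisy estimate, so a real gate would likely want more repetitions than this sketch uses.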

From Incident Response to Continuous Validation of AI Agents
Microsoft’s internal use of RAMPART illustrates how AI agent safety testing can evolve from ad hoc red-teaming to a continuous control. When a security researcher reported an AI vulnerability, Microsoft’s AI incident response team used RAMPART to generate close to 100 variants of the attack vector and run them against the affected application, including in multi-turn conversations. This made it possible to quickly measure the breadth of impact, design mitigations, and then re-test all variants to confirm that the remediations held up. Work that previously took experts weeks was reduced to hours. The same mechanism can be applied in everyday development: when engineering teams add a new tool or integration to an agent, they can introduce the corresponding RAMPART test in the same pull request, keeping validation in lockstep with change and reinforcing AI agent safety testing as a standard engineering discipline.
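The generate, measure, mitigate, and re-test loop described above can be sketched as follows. Everything here is illustrative and assumed: the function names, the sample payload, and the two toy filters are hypothetical, and the sketch does not reproduce how RAMPART or PyRIT actually generate or score attack variants.

```python
from itertools import product
from typing import Callable, Dict, List

PAYLOAD = "ignore previous instructions"  # hypothetical seed attack


def generate_variants(
    payload: str,
    prefixes: List[str],
    encodings: Dict[str, Callable[[str], str]],
) -> List[str]:
    """Combine every prefix with every encoding of the payload."""
    return [prefix + encode(payload) for prefix, encode in product(prefixes, encodings.values())]


def failing_variants(variants: List[str], is_blocked: Callable[[str], bool]) -> List[str]:
    """Return the variants that the defense under test does not block."""
    return [variant for variant in variants if not is_blocked(variant)]


def naive_filter(text: str) -> bool:
    """Toy pre-mitigation defense: exact, case-sensitive substring match."""
    return PAYLOAD in text


def mitigated_filter(text: str) -> bool:
    """Toy post-mitigation defense: normalize case and spacing before matching."""
    normalized = text.lower().replace(" ", "")
    return PAYLOAD.replace(" ", "") in normalized


variants = generate_variants(
    PAYLOAD,
    prefixes=["", "Note to assistant: "],
    encodings={
        "plain": lambda s: s,
        "upper": str.upper,
        "spaced": lambda s: " ".join(s),  # inserts a space between characters
    },
)

before = failing_variants(variants, naive_filter)      # breadth of impact
after = failing_variants(variants, mitigated_filter)   # re-test after the fix
print(f"{len(before)} of {len(variants)} variants bypass the naive filter")
print(f"{len(after)} of {len(variants)} variants bypass the mitigated filter")
```

Keeping the variant list fixed between the two runs is the point of the design: the same set that measured the breadth of impact also confirms whether the remediation holds.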
