What does this repo signal mean?

Microsoft published microsoft/agent-governance-toolkit (Python). This repository signal exposes tooling, eval, infrastructure, or model-adjacent work before it may appear in a launch post. High-signal details: repo microsoft/agent-governance-toolkit · language Python · substantial new repo, high stars, from a major lab. onlylabs links this event to 1 captured evidence page and 6 related repo signals. It also maps to the Infrastructure and Safety and policy categories in the data-business radar.

Microsoft Repo: microsoft/agent-governance-toolkit

Captured source

GitHub · github.com/microsoft/agent-governance-toolkit

microsoft/agent-governance-toolkit repository metadata

published Mar 2, 2026 · seen Jun 8 · captured Jun 11 · http 200 · method plain

microsoft/agent-governance-toolkit

Description: AI Agent Governance Toolkit — Policy enforcement, zero-trust identity, execution sandboxing, and reliability engineering for autonomous AI agents. Covers 10/10 OWASP Agentic Top 10.

Language: Python

License: MIT

Stars: 4196

Forks: 580

Open issues: 73

Created: 2026-03-02T22:11:47Z

Pushed: 2026-06-11T00:42:45Z

Default branch: main

Fork: no

Archived: no

README: 🌍 [English](/README.md) | [日本語](./docs/i18n/README.ja.md) | [简体中文](./docs/i18n/README.zh-CN.md) | [한국어](./docs/i18n/README.ko.md)

![Agent Governance Toolkit](docs/assets/readme-banner.svg)

Agent Governance Toolkit

Ship agents to production without losing sleep

🚀 Quick Start · 📋 Specifications · 📦 PyPI · 📝 Changelog

> [!IMPORTANT]
> Public Preview -- production-quality public preview releases. May have breaking changes before GA.

Policy enforcement, identity, sandboxing, and SRE for autonomous AI agents. One pip install, any framework.

---

The Problem

Your AI agents call tools, browse the web, query databases, and delegate to other agents. Once deployed, they make decisions autonomously. You need answers to three questions:

1. Is this action allowed? An agent with access to send_email and query_database should not be able to drop_table. OAuth scopes and IAM roles control which services an agent can reach, not what it does once connected.

2. Which agent did this? In a multi-agent system, five agents might share a single API key. When something goes wrong, "an agent did it" is not an incident response.

3. Can you prove what happened? Auditors and regulators need tamper-evident records of every decision: what policy was active, what the agent requested, and why it was allowed or denied.
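One common way to make decision records tamper-evident is a hash chain, where each entry commits to the hash of the entry before it. The sketch below is illustrative only and is not taken from AGT: the function names, the record fields, and the `GENESIS` constant are all assumptions made for this example.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry (assumption)


def _digest(prev_hash: str, record: dict) -> str:
    # Canonical JSON (sorted keys) so the same record always hashes the same way.
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_entry(log: list, record: dict) -> None:
    """Append a decision record, chaining it to the previous entry's hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"record": record, "prev": prev_hash, "hash": _digest(prev_hash, record)})


def verify(log: list) -> bool:
    """Recompute every hash; an edited or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash or entry["hash"] != _digest(prev_hash, entry["record"]):
            return False
        prev_hash = entry["hash"]
    return True


audit_log = []
append_entry(audit_log, {"agent": "agent-a", "action": "query_database",
                         "policy": "production-policy", "decision": "allow"})
append_entry(audit_log, {"agent": "agent-b", "action": "drop_table",
                         "policy": "production-policy", "decision": "deny"})
intact = verify(audit_log)

# Simulate tampering: rewrite a past decision after the fact.
audit_log[1]["record"]["decision"] = "allow"
tampered_detected = not verify(audit_log)
```

A chain like this detects edits to existing entries but not truncation of the newest ones; catching that would require storing the latest hash somewhere the writer cannot modify.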

Prompt-level safety ("please follow the rules") is not a control surface. It is a polite request to a stochastic system. OWASP LLM01:2025 states this explicitly: *"it is unclear if there are fool-proof methods of prevention for prompt injection."* The published numbers back this up. Andriushchenko et al. (ICLR 2025) report 100% attack success rate on GPT-4o, GPT-3.5, Claude 3, and Llama-3 using adaptive attacks with logprob access and suffix optimization, evaluated against the JailbreakBench benchmark (Chao et al., NeurIPS 2024). Microsoft's own AI Red Teaming Agent formalizes Attack Success Rate (ASR), the rate of policy violations under adversarial input, as the canonical metric for this class of failure. *Lessons from Red Teaming 100 Generative AI Products* reinforces the point: *"mitigations do not eliminate risk entirely"* and red teaming must be a continuous process because model-layer defenses are probabilistic by construction.

AGT does not try to win that fight inside the prompt. Every tool call, message send, and delegation is intercepted in deterministic application code *before* the model's intent reaches the wire. Actions the AGT kernel denies are not "unlikely." They are structurally impossible. That is the difference between asking an agent to behave and making it incapable of misbehaving.
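The idea of intercepting a call in deterministic code can be sketched with an ordinary wrapper function. This is a minimal illustration under stated assumptions, not AGT's kernel: `intercept`, `ActionDenied`, and `database_tool` are hypothetical names invented for the example.

```python
from functools import wraps


class ActionDenied(Exception):
    """Hypothetical exception for this sketch; not an AGT class."""


def intercept(tool, denied_actions):
    """Wrap `tool` so denied actions are rejected before the tool body runs."""
    denied = frozenset(denied_actions)

    @wraps(tool)
    def guarded(action, **kwargs):
        if action in denied:
            # Plain set-membership check: the outcome does not depend on model output.
            raise ActionDenied(f"action {action!r} is denied")
        return tool(action, **kwargs)

    return guarded


calls = []  # records every time the underlying tool actually executes


def database_tool(action, table):
    calls.append((action, table))
    return {"action": action, "table": table}


safe_db = intercept(database_tool, denied_actions={"drop", "delete", "truncate"})
read_result = safe_db("read", table="users")

try:
    safe_db("drop", table="users")
    drop_blocked = False
except ActionDenied:
    drop_blocked = True
```

Because the check runs before `database_tool` is invoked, the `calls` list never sees the denied `drop`, whatever text the model produced to request it.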

---

Quick Start

Prerequisites: Python 3.10+

pip install agent-governance-toolkit[full]

For Claude Code, add AGT as a plugin marketplace and install the governance plugin:

/plugin marketplace add microsoft/agent-governance-toolkit
/plugin install agt-governance@agent-governance-toolkit

Govern any tool function in two lines:

from agentmesh.governance import govern

safe_tool = govern(my_tool, policy="policy.yaml") # every call checked, logged, enforced

That's it. safe_tool evaluates your YAML policy on every call, logs the decision, and raises GovernanceDenied if the action is blocked.

# policy.yaml
apiVersion: governance.toolkit/v1
name: production-policy
default_action: allow
rules:
  - name: block-destructive
    condition: "action.type in ['drop', 'delete', 'truncate']"
    action: deny
    description: "Destructive operations require human approval"

  - name: require-approval-for-send
    condition: "action.type == 'send_email'"
    action: require_approval
    approvers: ["security-team"]

>>> safe_tool(action="read", table="users")
{'table': 'users', 'rows': 42}

>>> safe_tool(action="drop", table="users")
GovernanceDenied: Action denied by policy rule 'block-destructive':
Destructive operations require human approval
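To make the decision logic of a policy like the one above concrete, here is a small stand-alone sketch. It is not how AGT evaluates policies: the `Rule` and `Decision` classes and the `evaluate` function are hypothetical, Python callables stand in for the YAML condition strings, and first-match-wins ordering is an assumption the excerpt does not confirm.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Rule:
    name: str
    condition: Callable[[str], bool]  # stand-in for the YAML condition string
    action: str                       # "allow", "deny", or "require_approval"
    description: str = ""


@dataclass
class Decision:
    action: str
    rule: str  # name of the matching rule, or "default"


def evaluate(action_type: str, rules: List[Rule], default_action: str = "allow") -> Decision:
    # Assumption: rules are checked in file order and the first match wins.
    for rule in rules:
        if rule.condition(action_type):
            return Decision(rule.action, rule.name)
    return Decision(default_action, "default")


rules = [
    Rule("block-destructive", lambda t: t in ("drop", "delete", "truncate"), "deny",
         "Destructive operations require human approval"),
    Rule("require-approval-for-send", lambda t: t == "send_email", "require_approval"),
]

read_decision = evaluate("read", rules)
drop_decision = evaluate("drop", rules)
send_decision = evaluate("send_email", rules)
```

None of this is needed when calling `govern`; it only illustrates why `read` falls through to the default while `drop` is denied by a named rule.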

Or use the full PolicyEvaluator API for programmatic control:

PolicyEvaluator example

from agent_os.policies import (
    PolicyEvaluator, PolicyDocument, PolicyRule,
    PolicyCondition, PolicyAction, PolicyOperator, PolicyDefaults
)

evaluator = PolicyEvaluator(policies=[PolicyDocument(
    name="my-policy", version="1.0",
    defaults=PolicyDefaults(action=PolicyAction.ALLOW),
    rules=[PolicyRule(
        name="block-dangerous-tools",
        condition=PolicyCondition(
            field="tool_name",
            operator=PolicyOperator.IN,
            value=["execute_code", "delete_file"]
        ),
        action=PolicyAction.DENY, priority=100,
    )],
)])

result = evaluator.evaluate({"tool_name": "web_search"}) # Allowed
result = evaluator.evaluate({"tool_name": "delete_file"}) # Blocked
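The field/operator/value shape and the `priority` argument in the example suggest a structured matcher. The following is a hedged, library-free sketch of that idea: `OPERATORS`, `RuleSpec`, and `evaluate_context` are hypothetical names, and evaluating higher priority first is an assumption, not documented behavior of `PolicyEvaluator`.

```python
import operator
from typing import Any, Dict, List, Tuple

# Hypothetical operator table; the keys echo the example but are not AGT's enum.
OPERATORS = {
    "eq": operator.eq,
    "ne": operator.ne,
    "in": lambda actual, allowed: actual in allowed,
}

# (priority, rule name, field, operator, value, action)
RuleSpec = Tuple[int, str, str, str, Any, str]


def evaluate_context(context: Dict[str, Any], rules: List[RuleSpec], default: str = "allow") -> str:
    # Assumption: rules with a higher priority number are evaluated first.
    for _prio, _name, field, op, value, action in sorted(rules, key=lambda r: r[0], reverse=True):
        if field in context and OPERATORS[op](context[field], value):
            return action
    return default


rules = [
    (10, "allow-named-tools", "tool_name", "ne", "", "allow"),
    (100, "block-dangerous-tools", "tool_name", "in", ["execute_code", "delete_file"], "deny"),
]

search_result = evaluate_context({"tool_name": "web_search"}, rules)
delete_result = evaluate_context({"tool_name": "delete_file"}, rules)
# No rule references a field present in this context, so the default applies.
unknown_result = evaluate_context({"agent_id": "a1"}, rules, default="deny")
```

Here `delete_file` matches both rules, and the deny rule takes effect only because it is sorted ahead of the lower-priority allow rule.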

TypeScript / .NET / Rust / Go examples

TypeScript

import { PolicyEngine } from "@microsoft/agent-governance-sdk";

const engine = new PolicyEngine([
  { action: "web_search", effect: "allow" },
  { action: "shell_exec", effect: "deny" },
]);
engine.evaluate("web_search"); // "allow"
engine.evaluate("shell_exec"); // "deny"

.NET

using AgentGovernance;
using...

Excerpt shown — open the source for the full document.

Notability

notability 7.0/10

Substantial new repo, high stars, from major lab.

Microsoft has a repo signal matching infrastructure, safety and policy.

Infrastructure · Safety and policy