What does this repo signal mean?

Amazon (Nova) published amazon-science/compagent (Jupyter Notebook). Repository signals like this one expose tooling, eval, infrastructure, or model-adjacent work before it appears in a launch post. High-signal details: repo amazon-science/compagent · language Jupyter Notebook · description "CompAgent: An Agentic Framework for Visual Compliance Verification". onlylabs links this event to 1 captured evidence page and 6 related repo signals. It also maps to Safety and policy in the data-business radar.

Amazon (Nova) Repo: amazon-science/compagent

Captured source

GitHub · github.com/amazon-science/compagent

amazon-science/compagent repository metadata

published May 5, 2026 · seen Jun 5 · captured Jun 11 · http 200 · method plain

amazon-science/compagent

Description: CompAgent: An Agentic Framework for Visual Compliance Verification

Language: Jupyter Notebook

License: NOASSERTION

Stars: 1

Forks: 0

Open issues: 0

Created: 2026-05-05T23:09:22Z

Pushed: 2026-05-18T18:30:05Z

Default branch: main

Fork: no

Archived: no

README:

🛡️ CompAgent: An Agentic Framework for Visual Compliance Verification

Rahul Ghosh, Baishali Chaudhury, Hari Prasanna Das, Meghana Ashok, Ryan Razkenari, Long Chen, Sungmin Hong, Chun-Hao Liu

Accepted at CVPR 2026 GRAIL-V (Grounded Retrieval and Agentic Intelligence for Vision-Language) Workshop

> ⚠️ Notice: This code is being released solely for academic and scientific reproducibility purposes, in support of the methods and findings described in the associated publication. Pull requests are not being accepted in order to maintain the code exactly as it was used in the paper.

---

📖 Introduction

CompAgent is the first agentic framework for visual compliance verification — a critical yet underexplored problem in computer vision, especially in domains such as media, entertainment, and advertising where content must adhere to complex and evolving policy rules.

Existing methods often rely on task-specific deep learning models trained on manually labeled datasets, which are costly to build and limited in generalizability. While recent Multimodal Large Language Models (MLLMs) offer broad real-world knowledge and policy understanding, they struggle to reason over fine-grained visual details and apply structured compliance rules effectively on their own.

CompAgent addresses this by augmenting MLLMs with a suite of specialized visual tools — including Amazon Rekognition, Bedrock Data Automation (BDA), LlavaGuard, SafeCLIP, and ICM-Assistant — and introduces a planning agent built on LangGraph that dynamically selects appropriate tools based on the compliance policy. A compliance verification agent then integrates image content, tool outputs, and policy context to perform multimodal reasoning using a ReAct (Reasoning + Acting) paradigm.
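The two-stage flow described above (plan which tools to run, then hand their outputs to a verifier) can be sketched in plain Python. This is only an illustration under stated assumptions: `SketchState`, `TOOLS`, `KEYWORDS`, and the three function names are hypothetical, the tools are stubs rather than Rekognition, BDA, or LlavaGuard, and the keyword match stands in for the MLLM-driven planner. It does not use LangGraph and is not the repository's code.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical stub tools: each takes an image path and returns a text finding.
ToolFn = Callable[[str], str]

TOOLS: dict[str, ToolFn] = {
    "weapon_detector": lambda image: "no weapons detected",
    "nudity_detector": lambda image: "no nudity detected",
    "text_extractor": lambda image: "text found: 'SALE 50% OFF'",
}

# Hypothetical keyword map; the real planner is described as an MLLM agent.
KEYWORDS = {
    "weapon": "weapon_detector",
    "nudity": "nudity_detector",
    "text": "text_extractor",
}


@dataclass
class SketchState:
    """State passed between the planning and verification stages."""
    image: str
    policy: str
    selected_tools: list[str] = field(default_factory=list)
    evidence: dict[str, str] = field(default_factory=dict)


def plan(state: SketchState) -> SketchState:
    """Select tools whose keyword appears in the policy text."""
    lowered = state.policy.lower()
    state.selected_tools = [tool for kw, tool in KEYWORDS.items() if kw in lowered]
    return state


def gather_evidence(state: SketchState) -> SketchState:
    """Run each selected tool and record its finding."""
    for name in state.selected_tools:
        state.evidence[name] = TOOLS[name](state.image)
    return state


def build_verifier_prompt(state: SketchState) -> str:
    """Combine policy, image reference, and tool outputs for the verifier."""
    lines = [f"Policy: {state.policy}", f"Image: {state.image}", "Tool evidence:"]
    lines += [f"- {name}: {finding}" for name, finding in state.evidence.items()]
    return "\n".join(lines)


if __name__ == "__main__":
    state = SketchState(image="ad.png", policy="No weapons or nudity allowed.")
    print(build_verifier_prompt(gather_evidence(plan(state))))
```

In the framework as described, the final prompt would go to an MLLM that reasons over it in a ReAct loop; here the sketch stops at prompt assembly.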

Key Results:

🏆 Achieves up to 76% F1 score on the UnsafeBench dataset
📈 10% improvement over the state-of-the-art
✅ Outperforms specialized classifiers, direct MLLM prompting, and curated routing baselines
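For readers unfamiliar with the headline metric, F1 is the harmonic mean of precision and recall. The sketch below shows the standard formula; the counts in the usage line are made up for illustration and are not taken from the paper or from UnsafeBench.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return F1 given true positives, false positives, and false negatives."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Illustrative counts only: precision 0.75, recall 0.60.
    print(round(f1_score(tp=30, fp=10, fn=20), 4))
```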

---

🏗️ Architecture

Figure: Overview of the CompAgent framework. A planning agent dynamically selects visual tools based on the compliance policy, and a compliance verification agent integrates multimodal evidence to produce structured safety assessments.

---

🎬 Demo

Demo: CompAgent in action — dynamically selecting tools and performing visual compliance verification.

---

🖥️ Streamlit Demo App

CompAgent includes an interactive Streamlit UI that provides a lightweight interface to run the LangGraph ReAct agent for image compliance checking with real-time execution trace visualization.

Features

Model Selection — Choose from Claude 3.5 Sonnet, Claude 3.7 Sonnet, Claude Sonnet 4, Llama 3.3, and more
Real-time Traces — Watch the agent's reasoning and tool calls as they execute
Customizable Policies — Edit safety policies directly in the UI
Final Assessment Display — View structured compliance results (rating, category, rationale)
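The structured result named in the last feature (rating, category, rationale) could be represented as a small typed record. The following is a minimal sketch assuming the agent emits a JSON object with those three keys; the `Assessment` class, the `parse_assessment` helper, and the example values are hypothetical and not taken from the repository.

```python
import json
from dataclasses import dataclass

REQUIRED_FIELDS = ("rating", "category", "rationale")


@dataclass(frozen=True)
class Assessment:
    """Hypothetical container for one compliance verdict."""
    rating: str
    category: str
    rationale: str


def parse_assessment(raw: str) -> Assessment:
    """Parse a JSON string into an Assessment, rejecting incomplete output."""
    data = json.loads(raw)
    missing = [key for key in REQUIRED_FIELDS if key not in data]
    if missing:
        raise ValueError(f"assessment is missing fields: {missing}")
    return Assessment(**{key: str(data[key]) for key in REQUIRED_FIELDS})


if __name__ == "__main__":
    # Example values are invented for illustration.
    raw = '{"rating": "Unsafe", "category": "Violence", "rationale": "Visible weapon."}'
    print(parse_assessment(raw))
```

Validating the fields up front keeps a malformed model response from reaching the display layer silently.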

Running the Streamlit App

1. Ensure metadata is extracted (see Step 1 below)

2. Navigate to the agent directory:

cd src/agent

3. Run the Streamlit app:

streamlit run streamlit_app.py

4. Open your browser at http://localhost:8501

Configuration

The app automatically detects the dataset configuration from BUCKET_NAME in compliance_tools_standalone_langgraph.py:

For LlavaGuard benchmark: Uses llavaguard_agent.txt prompt
For UnsafeBench benchmark: Uses unsafebench_agent.txt prompt
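The excerpt does not show how the app maps BUCKET_NAME to a prompt file. One plausible reading is a substring match on the benchmark name, sketched below. The `select_prompt` function, the `PROMPT_BY_BENCHMARK` table, and the matching rule are assumptions; only the two prompt filenames come from the README.

```python
# Prompt filenames are from the README; the lookup rule is an assumption.
PROMPT_BY_BENCHMARK = {
    "llavaguard": "llavaguard_agent.txt",
    "unsafebench": "unsafebench_agent.txt",
}


def select_prompt(bucket_name: str) -> str:
    """Pick the agent prompt file whose benchmark name appears in the bucket name."""
    lowered = bucket_name.lower()
    for benchmark, prompt_file in PROMPT_BY_BENCHMARK.items():
        if benchmark in lowered:
            return prompt_file
    raise ValueError(f"cannot infer benchmark from bucket name: {bucket_name!r}")


if __name__ == "__main__":
    # Hypothetical bucket name for illustration.
    print(select_prompt("my-team-unsafebench-images"))
```

Raising on an unrecognized bucket name, rather than falling back to a default prompt, avoids running one benchmark's policy against another benchmark's images.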

---

📁 Repository Structure

grip-compagent/
├── README.md # This file
├── requirements.txt # Python dependencies (pip)
├── pyproject.toml # Project configuration (uv)
├── assets/
│   ├── main_architecture.png # Architecture diagram
│   ├── demo.gif # Demo GIF (for README)
│   └── demo_video.mp4 # Full demo video
├── src/
│   ├── agent/ # 🧠 Main CompAgent Framework
│   │   ├── langGraph_reAct_llavaguard.ipynb # LangGraph ReAct agent for LlavaGuard benchmark
│   │   ├── langGraph_reAct_unsafebench.ipynb # LangGraph ReAct agent for UnsafeBench benchmark
│   │   ├── compliance_tools_standalone_langgraph.py # Tool definitions for the agent
│   │   ├── streamlit_app.py # Interactive Streamlit demo UI
│   │   ├── run_notebook.sh # Shell script to execute the notebook
│   │   └── prompt/ # Agent system prompts
│   │       ├── llavaguard_agent.txt # LlavaGuard policy prompt for agent
│   │       ├── llavaguard.txt # LlavaGuard standalone prompt
│   │       ├── unsafebench_agent.txt # UnsafeBench policy prompt for agent
│   │       └── unsafebench.txt # UnsafeBench standalone prompt
│   ├── tool_run/ # 🔧 Standalone Tool Runners
│   │   ├── tool_llavaguard.py # Run LlavaGuard inference on images
│   │   ├── tool_BDA.py # Run Bedrock Data Automation on images
│   │   ├── tool_rekognition.py # Run Amazon Rekognition on images
│   │   ├── tool_safeclip.py # Run SafeCLIP on images
│   │   ├── tool_icm.py # Run ICM-Assistant on images
│   │   ├── config_icm.yaml # ICM-Assistant model/runtime config
│   │   └── prompts_icm.yaml # ICM-Assistant prompt templates
│   ├── baseline_run/ # 📊 Baseline Comparisons
│   │   ├── run_llama_baseline.ipynb # Llama baseline
│   │   ├── run_mistral_baseline.ipynb # Mistral/Pixtral baseline
│   │   ├── run_sonnet_baseline.ipynb # Claude Sonnet baseline
│   │   └── run_notebooks.sh # Run all baselines
│   ├── evaluation/ # 📈 Evaluation Notebooks
│   │   ├── evaluation.ipynb # General evaluation
│   │   ├── evaluation-llavaguard.ipynb # LlavaGuard benchmark evaluation
│   │   └── evaluation-unsafebench.ipynb # UnsafeBench benchmark evaluation
│   ├── prompts/ # 📝 System Prompts
│   │   ├── prompt_agent_system_w_verification.txt
│   │   ├── prompt_cluster_routing.txt
│   │   └── prompt_llavaguard.txt
│   ├── routing/ # 🔀 Routing Logic
│   │   ├── compliance_state.py # Compliance state definitions
│   │   ├── compliance_tools.py # Tool routing implementations
│   │   └── routing_decisions.txt # Routing decision documentation
│   └── utils/ # 🛠️ Utilities
│       ├── bedrock_invoker.py # AWS Bedrock API wrapper
│       ├── helpers.py # General helper functions
│       ├── inference_helpers.py # Inference utility functions
│       └── openai_clip.py #...

Excerpt shown — open the source for the full document.

Notability

notability 3.0/10

Low star count; routine repository from Amazon Science.