amazon-science/compagent
Description: CompAgent: An Agentic Framework for Visual Compliance Verification
Language: Jupyter Notebook
License: NOASSERTION
Stars: 1
Forks: 0
Open issues: 0
Created: 2026-05-05T23:09:22Z
Pushed: 2026-05-18T18:30:05Z
Default branch: main
Fork: no
Archived: no
README:
🛡️ CompAgent: An Agentic Framework for Visual Compliance Verification
Rahul Ghosh, Baishali Chaudhury, Hari Prasanna Das, Meghana Ashok, Ryan Razkenari, Long Chen, Sungmin Hong, Chun-Hao Liu
Accepted at CVPR 2026 GRAIL-V (Grounded Retrieval and Agentic Intelligence for Vision-Language) Workshop
> ⚠️ Notice: This code is being released solely for academic and scientific reproducibility purposes, in support of the methods and findings described in the associated publication. Pull requests are not being accepted in order to maintain the code exactly as it was used in the paper.
---
📖 Introduction
CompAgent is the first agentic framework for visual compliance verification — a critical yet underexplored problem in computer vision, especially in domains such as media, entertainment, and advertising where content must adhere to complex and evolving policy rules.
Existing methods often rely on task-specific deep learning models trained on manually labeled datasets, which are costly to build and limited in generalizability. While recent Multimodal Large Language Models (MLLMs) offer broad real-world knowledge and policy understanding, they struggle to reason over fine-grained visual details and apply structured compliance rules effectively on their own.
CompAgent addresses this by augmenting MLLMs with a suite of specialized visual tools — including Amazon Rekognition, Bedrock Data Automation (BDA), LlavaGuard, SafeCLIP, and ICM-Assistant — and introduces a planning agent built on LangGraph that dynamically selects appropriate tools based on the compliance policy. A compliance verification agent then integrates image content, tool outputs, and policy context to perform multimodal reasoning using a ReAct (Reasoning + Acting) paradigm.
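The two-stage flow described above (plan which tools to run, then hand their outputs to a verifier together with the policy) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the repository's implementation: the stub tools, the `POLICY_KEYWORDS` table, and the keyword-matching `plan_tools` function are hypothetical stand-ins for the LangGraph planning agent and the real tool calls.

```python
from typing import Callable, Dict, List

# Hypothetical stub tools: each takes an image path and returns a text finding.
# The real tools call models or AWS services; these only echo a placeholder.
def rekognition_stub(image_path: str) -> str:
    return f"moderation labels for {image_path}"

def llavaguard_stub(image_path: str) -> str:
    return f"safety rating for {image_path}"

TOOL_REGISTRY: Dict[str, Callable[[str], str]] = {
    "rekognition": rekognition_stub,
    "llavaguard": llavaguard_stub,
}

# Assumed keyword table; in CompAgent an MLLM planner makes this choice instead.
POLICY_KEYWORDS: Dict[str, List[str]] = {
    "rekognition": ["violence", "nudity"],
    "llavaguard": ["hate", "self-harm"],
}

def plan_tools(policy: str) -> List[str]:
    """Select the tools whose keywords appear in the policy text."""
    text = policy.lower()
    return [
        name
        for name, keywords in POLICY_KEYWORDS.items()
        if any(keyword in text for keyword in keywords)
    ]

def gather_evidence(image_path: str, policy: str) -> Dict[str, str]:
    """Run only the selected tools and collect their outputs by tool name."""
    return {name: TOOL_REGISTRY[name](image_path) for name in plan_tools(policy)}

def build_verifier_context(image_path: str, policy: str) -> str:
    """Combine policy and tool outputs into one text block for the verifier."""
    evidence = gather_evidence(image_path, policy)
    lines = [f"Policy: {policy}", f"Image: {image_path}"]
    lines += [f"[{name}] {finding}" for name, finding in evidence.items()]
    return "\n".join(lines)
```

Keeping tool selection separate from evidence gathering means a policy that mentions only one category triggers only the tools relevant to it, which is the behavior the planning agent is described as providing.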
Key Results:
- 🏆 Achieves up to 76% F1 score on the UnsafeBench dataset
- 📈 10% improvement over the state-of-the-art
- ✅ Outperforms specialized classifiers, direct MLLM prompting, and curated routing baselines
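For readers unfamiliar with the metric, F1 is the harmonic mean of precision and recall. The sketch below computes it from confusion counts, assuming a binary safe/unsafe labeling; the README does not say how the reported score is averaged across categories, and the counts in the usage note are illustrative, not taken from the paper.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return F1 = 2*TP / (2*TP + FP + FN), or 0.0 when the denominator is zero."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0
```

With 8 true positives, 2 false positives, and 2 false negatives, `f1_score(8, 2, 2)` evaluates to `0.8`.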
---
🏗️ Architecture
Figure: Overview of the CompAgent framework. A planning agent dynamically selects visual tools based on the compliance policy, and a compliance verification agent integrates multimodal evidence to produce structured safety assessments.
---
🎬 Demo
Demo: CompAgent in action — dynamically selecting tools and performing visual compliance verification.
---
🖥️ Streamlit Demo App
CompAgent includes an interactive Streamlit UI that provides a lightweight interface to run the LangGraph ReAct agent for image compliance checking with real-time execution trace visualization.
Features
- Model Selection — Choose from Claude 3.5 Sonnet, Claude 3.7 Sonnet, Claude Sonnet 4, Llama 3.3, and more
- Real-time Traces — Watch the agent's reasoning and tool calls as they execute
- Customizable Policies — Edit safety policies directly in the UI
- Final Assessment Display — View structured compliance results (rating, category, rationale)
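A structured result with the three fields named above (rating, category, rationale) could be represented and validated as follows. This is a sketch only: it assumes the agent's final message is a JSON object with exactly those keys, which the README does not confirm, and `Assessment` and `parse_assessment` are hypothetical names.

```python
import json
from dataclasses import dataclass

REQUIRED_FIELDS = ("rating", "category", "rationale")

@dataclass
class Assessment:
    rating: str
    category: str
    rationale: str

def parse_assessment(raw: str) -> Assessment:
    """Parse a JSON string into an Assessment, rejecting incomplete output."""
    data = json.loads(raw)
    missing = [name for name in REQUIRED_FIELDS if name not in data]
    if missing:
        raise ValueError(f"assessment is missing fields: {missing}")
    return Assessment(**{name: str(data[name]) for name in REQUIRED_FIELDS})
```

Failing loudly on a missing field keeps a malformed model response from being displayed as if it were a complete assessment.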
Running the Streamlit App
1. Ensure metadata is extracted (see Step 1 below)
2. Navigate to the agent directory:
cd src/agent
3. Run the Streamlit app:
streamlit run streamlit_app.py
4. Open your browser at http://localhost:8501
Configuration
The app automatically detects the dataset configuration from BUCKET_NAME in compliance_tools_standalone_langgraph.py:
- For LlavaGuard benchmark: Uses the llavaguard_agent.txt prompt
- For UnsafeBench benchmark: Uses the unsafebench_agent.txt prompt
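One way such detection could work is a substring check on the bucket name. The sketch below is an assumption for illustration: the README does not describe the actual detection logic, and `select_prompt_file` and the example bucket names are hypothetical. Only the two prompt filenames and the `prompt/` directory come from the repository layout.

```python
from pathlib import Path

# Relative to src/agent in the repository layout.
PROMPT_DIR = Path("prompt")

def select_prompt_file(bucket_name: str) -> Path:
    """Map a bucket name to the matching agent prompt file (assumed rule)."""
    name = bucket_name.lower()
    if "llavaguard" in name:
        return PROMPT_DIR / "llavaguard_agent.txt"
    if "unsafebench" in name:
        return PROMPT_DIR / "unsafebench_agent.txt"
    raise ValueError(f"cannot infer benchmark from bucket name: {bucket_name!r}")
```

For example, `select_prompt_file("my-unsafebench-images")` returns the path to `unsafebench_agent.txt`, while an unrecognized name raises an error instead of silently falling back to a default prompt.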
---
📁 Repository Structure
grip-compagent/
├── README.md  # This file
├── requirements.txt  # Python dependencies (pip)
├── pyproject.toml  # Project configuration (uv)
├── assets/
│   ├── main_architecture.png  # Architecture diagram
│   ├── demo.gif  # Demo GIF (for README)
│   └── demo_video.mp4  # Full demo video
├── src/
│   ├── agent/  # 🧠 Main CompAgent Framework
│   │   ├── langGraph_reAct_llavaguard.ipynb  # LangGraph ReAct agent for LlavaGuard benchmark
│   │   ├── langGraph_reAct_unsafebench.ipynb  # LangGraph ReAct agent for UnsafeBench benchmark
│   │   ├── compliance_tools_standalone_langgraph.py  # Tool definitions for the agent
│   │   ├── streamlit_app.py  # Interactive Streamlit demo UI
│   │   ├── run_notebook.sh  # Shell script to execute the notebook
│   │   └── prompt/  # Agent system prompts
│   │       ├── llavaguard_agent.txt  # LlavaGuard policy prompt for agent
│   │       ├── llavaguard.txt  # LlavaGuard standalone prompt
│   │       ├── unsafebench_agent.txt  # UnsafeBench policy prompt for agent
│   │       └── unsafebench.txt  # UnsafeBench standalone prompt
│   ├── tool_run/  # 🔧 Standalone Tool Runners
│   │   ├── tool_llavaguard.py  # Run LlavaGuard inference on images
│   │   ├── tool_BDA.py  # Run Bedrock Data Automation on images
│   │   ├── tool_rekognition.py  # Run Amazon Rekognition on images
│   │   ├── tool_safeclip.py  # Run SafeCLIP on images
│   │   ├── tool_icm.py  # Run ICM-Assistant on images
│   │   ├── config_icm.yaml  # ICM-Assistant model/runtime config
│   │   └── prompts_icm.yaml  # ICM-Assistant prompt templates
│   ├── baseline_run/  # 📊 Baseline Comparisons
│   │   ├── run_llama_baseline.ipynb  # Llama baseline
│   │   ├── run_mistral_baseline.ipynb  # Mistral/Pixtral baseline
│   │   ├── run_sonnet_baseline.ipynb  # Claude Sonnet baseline
│   │   └── run_notebooks.sh  # Run all baselines
│   ├── evaluation/  # 📈 Evaluation Notebooks
│   │   ├── evaluation.ipynb  # General evaluation
│   │   ├── evaluation-llavaguard.ipynb  # LlavaGuard benchmark evaluation
│   │   └── evaluation-unsafebench.ipynb  # UnsafeBench benchmark evaluation
│   ├── prompts/  # 📝 System Prompts
│   │   ├── prompt_agent_system_w_verification.txt
│   │   ├── prompt_cluster_routing.txt
│   │   └── prompt_llavaguard.txt
│   ├── routing/  # 🔀 Routing Logic
│   │   ├── compliance_state.py  # Compliance state definitions
│   │   ├── compliance_tools.py  # Tool routing implementations
│   │   └── routing_decisions.txt  # Routing decision documentation
│   └── utils/  # 🛠️ Utilities
│       ├── bedrock_invoker.py  # AWS Bedrock API wrapper
│       ├── helpers.py  # General helper functions
│       ├── inference_helpers.py  # Inference utility functions
│       └── openai_clip.py  #…