Repo · InclusionAI (Ant Group) · published Aug 15, 2025

inclusionAI/UI-Venus

Python

Description: UI-Venus is a native UI agent designed to perform precise GUI element grounding and effective navigation using only screenshots as input.

Language: Python

Stars: 1010

Forks: 85

Open issues: 12

Created: 2025-08-15T11:57:47Z

Pushed: 2026-05-11T06:54:30Z

Default branch: UI-Venus-1.5

Fork: no

Archived: no

README:

UI-Venus 1.5

UI-Venus 1.5 is a unified, end-to-end GUI Agent designed for robust real-world applications. The model family includes two dense variants (2B and 8B) and one MoE variant (30B-A3B) to cover different downstream scenarios.

Upgrades from UI-Venus 1.0:

  • 🔹 Mid-Training Stage: 10B tokens across 30+ datasets for foundational GUI semantics
  • 🔹 Online RL: Full-trajectory rollouts for long-horizon dynamic navigation
  • 🔹 Model Merging: Unified agent combining grounding, web, and mobile specialists

Results: SOTA on ScreenSpot-Pro (69.6%), VenusBench-GD (75.0%), AndroidWorld (77.6%), with robust navigation across 40+ Chinese mobile apps.

---

📈 UI-Venus Benchmark Performance

> Figure: Performance of UI-Venus 1.5 across multiple benchmarks. UI-Venus 1.5 achieves State-of-the-Art (SOTA) results on key grounding benchmarks (ScreenSpot-Pro, VenusBench-GD, OSWorld-G, UI-Vision) and agent benchmarks (AndroidWorld, AndroidLab, VenusBench-Mobile).

---

🚀 News

  • [2026/02] We release UI-Venus 1.5, an end-to-end GUI Agent designed for robust real-world applications.
  • [2026/02] We release VenusBench-Mobile, a challenging online benchmark for mobile GUI agents. See branch VenusBench-Mobile.
  • [2025/12] We release VenusBench-GD, a comprehensive multi-platform GUI grounding benchmark. See branch VenusBench-GD.
  • [2025/08] We release [UI-Venus 1.0](https://github.com/inclusionAI/UI-Venus/tree/UI-Venus-1.0), the first version of our UI agent model.

---

Overview

  • [Demo](#-demo)
  • [Venus Framework](#venus-framework)
  • [Quick Start](#-quick-start)
  • [Benchmark Results](#detailed-benchmark-results)
  • [Contact](#contact)
  • [Citation](#citation)

---

✨ Demo

Chinese App Demo Videos / 中文应用演示视频

Ximalaya - 喜马拉雅

Open Ximalaya FM, play "Zootopia 2," and set the playback mode to list loop. 打开喜马拉雅,帮我播放疯狂动物城2,设置列表循环播放

![Ximalaya Demo](assets/gif/xmly.gif)

---

Qimao Novel - 七猫免费小说

Open Qimao Free Webnovel and add the top 3 books from the "Creative/Mind-bending" chart to the bookshelf. 打开七猫免费小说,将小说脑洞榜前三名都加入书架

![Qimao Demo](assets/gif/7mao.gif)

---

Weibo - 微博

Open Weibo, search for "Hangzhou Weather," and post a comment based on the current weather. 打开微博,搜索杭州天气,并根据天气进行评论

![Weibo Demo](assets/gif/wb.gif)

---

Xiaohongshu - 小红书

Open Xiaohongshu, search for baking tutorials, and play a video with over 10,000 views. 打开小红书,搜索烘焙教程,找到播放量大于1w的视频进行播放

![Xiaohongshu Demo](assets/gif/xhs.gif)

---

Toutiao - 今日头条

Open Toutiao, click on the top trending story, and view the "Event Summary." 打开今日头条,点击进入热榜第1名,查看事件速览

![Toutiao Demo](assets/gif/jrtt.gif)

---

Venus Framework

We provide a complete Android automation framework for deploying UI-Venus 1.5 as an autonomous mobile agent.

Features:

  • 🎯 Single task execution with natural language
  • 🔄 Multi-device parallel batch processing
  • 📊 Trajectory recording and replay
  • 🔁 Intelligent loop detection

👉 [Documentation →](./Venus_framework/README.md) | [中文文档 →](./Venus_framework/README_CN.md)

Supported: 40+ mainstream Chinese apps including Weibo, Xiaohongshu, Taobao, Meituan, Bilibili, Alipay, and more.

---

🚀 Quick Start

Installation

pip install -r requirements.txt

Grounding Evaluation

Edit scripts/run_gd_auto.sh or scripts/run_gd_ddp.sh to configure:

# GPU Configuration
export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7

# Model Configuration
MODEL_PATH="/path/to/UI-Venus-1.5" # Your model checkpoint path

# Dataset Configuration (uncomment one)
# ScreenSpot-Pro
IMGS_PATH="/path/to/Screenspot-pro/images"
TEST_PATH="/path/to/Screenspot-pro/annotations"

Run evaluation:

# Single/Multi-GPU with device_map="auto"
bash scripts/run_gd_auto.sh

# Multi-GPU with DDP (faster for large datasets)
bash scripts/run_gd_ddp.sh
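
For readers unfamiliar with how grounding benchmarks are typically scored, the sketch below illustrates the point-in-box criterion commonly used by ScreenSpot-style benchmarks: a prediction counts as correct when the predicted click point falls inside the ground-truth bounding box. This is an assumption about the metric, not a copy of the repository's evaluation code; the function names `point_in_box` and `grounding_accuracy` and the sample data are hypothetical.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]               # (x, y) predicted click position
Box = Tuple[float, float, float, float]   # (x1, y1, x2, y2) ground-truth element


def point_in_box(point: Point, box: Box) -> bool:
    """Return True if the point lies inside the box (edges included)."""
    x, y = point
    x1, y1, x2, y2 = box
    return x1 <= x <= x2 and y1 <= y <= y2


def grounding_accuracy(predictions: Sequence[Point], boxes: Sequence[Box]) -> float:
    """Percentage of predictions that land inside their ground-truth box."""
    pairs = list(zip(predictions, boxes))
    if not pairs:
        return 0.0
    hits = sum(point_in_box(p, b) for p, b in pairs)
    return 100.0 * hits / len(pairs)


# Hypothetical sample: two of the three predicted points hit their target box.
preds = [(120.0, 48.0), (300.0, 500.0), (15.0, 15.0)]
gt_boxes = [(100, 40, 140, 60), (0, 0, 50, 50), (10, 10, 20, 20)]
accuracy = grounding_accuracy(preds, gt_boxes)
print(f"accuracy: {accuracy:.1f}%")
```

The actual scripts may normalize coordinates or handle refusals differently, so treat this only as a mental model for reading the benchmark tables.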

Navigation Evaluation

Edit scripts/run_navi.sh to configure:

# GPU Configuration
CUDA_DEVICES="0,1,2,3"

# Model Configuration
MODEL_PATH="/path/to/UI-Venus-1.5"

# Input/Output
INPUT_FILE="examples/trace/trace.json" # Navigation task file
OUTPUT_FILE="./results/navi/output.json"

# Prompt Type Configuration (important!)
PROMPT_TYPE="mobile" # Options: "web" for web tasks, "mobile" for mobile tasks (default: mobile)

# vLLM Configuration
TENSOR_PARALLEL_SIZE=4 # Should match GPU count
GPU_MEMORY_UTIL=0.8 # Reduce if OOM
MAX_MODEL_LEN=16192

Prompt Type Selection:

  • PROMPT_TYPE="mobile" - Use mobile-specific prompts for Android/iOS app navigation tasks
  • PROMPT_TYPE="web" - Use web-specific prompts for browser/web page navigation tasks
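
The settings above have a few constraints that are easy to get wrong, notably that `TENSOR_PARALLEL_SIZE` should match the number of GPUs and that `PROMPT_TYPE` accepts only `"mobile"` or `"web"`. The following is a minimal sketch of a pre-flight check, assuming the shell variables are mirrored into a Python object; the `NaviConfig` class and `validate` function are hypothetical helpers and are not part of the repository.

```python
from dataclasses import dataclass
from typing import List

VALID_PROMPT_TYPES = {"mobile", "web"}


@dataclass
class NaviConfig:
    """Hypothetical mirror of the variables in scripts/run_navi.sh."""
    cuda_devices: str = "0,1,2,3"
    tensor_parallel_size: int = 4
    gpu_memory_util: float = 0.8
    max_model_len: int = 16192
    prompt_type: str = "mobile"


def validate(cfg: NaviConfig) -> List[str]:
    """Return a list of human-readable problems; empty means the config looks sane."""
    problems = []
    gpu_ids = [d.strip() for d in cfg.cuda_devices.split(",") if d.strip()]
    if cfg.tensor_parallel_size != len(gpu_ids):
        problems.append(
            f"TENSOR_PARALLEL_SIZE={cfg.tensor_parallel_size} "
            f"does not match {len(gpu_ids)} visible GPU(s)"
        )
    if cfg.prompt_type not in VALID_PROMPT_TYPES:
        problems.append(f"PROMPT_TYPE must be one of {sorted(VALID_PROMPT_TYPES)}")
    if not 0.0 < cfg.gpu_memory_util <= 1.0:
        problems.append("GPU_MEMORY_UTIL must be in the range (0, 1]")
    return problems


default_problems = validate(NaviConfig())
bad_problems = validate(NaviConfig(tensor_parallel_size=8, prompt_type="desktop"))
for message in bad_problems:
    print("config problem:", message)
```

Running a check like this before launching the script catches a mismatched GPU count early, rather than after the model has started loading.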

Run evaluation:

# Default: Mobile prompt
bash scripts/run_navi.sh

---

Detailed Benchmark Results

Grounding Benchmarks

| Models | VenusBench-GD | ScreenSpot-Pro | OSWorld-G | UI-Vision |
|--------|:-------------:|:--------------:|:---------:|:---------:|
| Qwen3-VL-30B-A3B | 52.4 | 53.7 | 69.3 | 61.2 |
| Step-GUI-8B | - | 62.6 | - | - |
| MAI-UI-8B | 65.2 | 65.8 | 60.1 | 40.7 |
| MAI-UI-32B | - | 67.9 | 67.6 | 47.1 |
| UI-Venus-1.0-7B | 49.0 | 50.8 | 54.6 | 26.5 |
| UI-Venus-1.0-72B | 70.2 | 61.9 | 62.2 | 36.8 |
| UI-Venus-1.5-2B | 67.3 | 57.7 | 59.4 | 44.8 |
| + ZoomIn | 67.9 | 64.6 | 61.4 | 46.8 |
| UI-Venus-1.5-8B | 72.3 | 68.4 | 69.7 | 46.5 |
| + ZoomIn | 72.6 | 73.9 | 70.6 | 51.7 |
| UI-Venus-1.5-30B-A3B | 75.0 | 69.6 | 70.6 | 54.7 |
| + ZoomIn | 74.3 | 74.8 | 72.2 | 57.8 |

Navigation Benchmarks

| Models | Params | AndroidWorld | AndroidLab | VenusBench-Mobile | WebVoyager |
|--------|:------:|:------------:|:----------:|:-----------------:|:----------:|
| *General VLMs* | | | | | |
| GPT-4o | - | - | 31.2 | - | 55.5 |
| Claude-3.7 | - | - | - | - | 84.1 |
| Qwen3-VL-30B-A3B | 30B | 54.3 | 42.0* | 8.7 | 47.5* |
| GLM-4.6V…

Excerpt shown — open the source for the full document.
