What does this repo signal mean?

Qwen (Alibaba Cloud) published QwenLM/Qwen3-ASR-Toolkit (Python). This repository signal exposes tooling, eval, infrastructure, or model-adjacent work before it may appear in a launch post. High-signal details: repo QwenLM/Qwen3-ASR-Toolkit · language Python · Solid new repo with decent stars, but not a major launch. onlylabs links this event to 1 captured evidence page and 6 related repo signals. It also maps to Product and customer in the data-business radar.

Qwen (Alibaba Cloud) Repo: QwenLM/Qwen3-ASR-Toolkit

Captured source

GitHub · github.com/QwenLM/Qwen3-ASR-Toolkit

QwenLM/Qwen3-ASR-Toolkit repository metadata

published Sep 16, 2025 · seen Jun 5 · captured Jun 11 · http 200 · method plain

QwenLM/Qwen3-ASR-Toolkit

Description: Official Python toolkit for the Qwen3-ASR API. Parallel high‑throughput calls, robust long‑audio transcription, multi‑sample‑rate support.

Language: Python

License: MIT

Stars: 965

Forks: 92

Open issues: 12

Created: 2025-09-16T09:03:49Z

Pushed: 2026-02-05T05:35:03Z

Default branch: main

Fork: no

Archived: no

README:

Qwen3-ASR-Toolkit

😊 Important Notice

Qwen3-ASR is now open source 🎉🎉🎉. See the **GitHub** repository and the **blog** for more information. The open-source model offers functionality comparable to the API and supports free, fast local deployment. The release includes two all-in-one speech recognition models (0.6B and 1.7B) that support language identification and ASR for 52 languages and dialects, as well as a novel non-autoregressive speech forced-alignment model that can align text–speech pairs in 11 languages. Its performance is strong enough for demanding speech-to-text transcription. Give it a try!

Overview

An advanced, high-performance Python command-line toolkit for using the Qwen-ASR API (formerly Qwen3-ASR-Flash). This implementation overcomes the API's 3-minute audio length limitation by intelligently splitting long audio/video files and processing them in parallel, enabling rapid transcription of hours-long content.

🚀 Key Features

Break the 3-Minute Limit: Seamlessly transcribe audio and video files of any length by bypassing the official API's duration constraint.
Smart Audio Splitting: Utilizes Voice Activity Detection (VAD) to split audio into meaningful chunks at natural silent pauses. This ensures that words and sentences are not awkwardly cut off.
High-Speed Parallel Processing: Leverages multi-threading to send audio chunks to the Qwen-ASR API concurrently, dramatically reducing the total transcription time for long files.
Intelligent Post-Processing: Automatically detects and removes common ASR hallucinations and repetitive artifacts for cleaner, more accurate transcripts.
SRT Subtitle Generation: Automatically create timestamped `.srt` subtitle files based on VAD segments, perfect for adding captions to video content.
Automatic Audio Resampling: Automatically converts audio from any sample rate and channel count to the 16kHz mono format required by the Qwen-ASR API. You can use any audio file without worrying about pre-processing.
Universal Media Support: Supports virtually any audio and video format (e.g., .mp4, .mov, .mkv, .mp3, .wav, .m4a) thanks to its reliance on FFmpeg.
Simple & Easy to Use: A straightforward command-line interface allows you to get started with just a single command.
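
Two of the features above are easy to illustrate in isolation. The sketch below is not the toolkit's own code: `resample_command`, `srt_timestamp`, and `to_srt` are hypothetical helpers written for this example. The first builds an FFmpeg command line for the 16 kHz mono conversion using FFmpeg's standard `-ac` and `-ar` options, and the other two render timestamped segments in the usual SRT cue layout.

```python
from typing import List, Tuple


def resample_command(src: str, dst: str) -> List[str]:
    """Build an FFmpeg command that converts any input to 16 kHz mono audio."""
    # -y overwrites dst, -ac 1 downmixes to one channel, -ar 16000 sets the sample rate.
    return ["ffmpeg", "-y", "-i", src, "-ac", "1", "-ar", "16000", dst]


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as the HH:MM:SS,mmm form used in SRT files."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: List[Tuple[float, float, str]]) -> str:
    """Render (start_sec, end_sec, text) segments as SRT cues numbered from 1."""
    cues = []
    for index, (start, end, text) in enumerate(segments, start=1):
        cues.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text.strip()}\n"
        )
    # A blank line separates consecutive cues.
    return "\n".join(cues)
```

The command list can be passed to `subprocess.run` once FFmpeg is installed. The segment boundaries would come from the VAD step; here they are plain tuples supplied by the caller.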

⚙️ How It Works

This tool follows a robust pipeline to deliver fast and accurate transcriptions for long-form media:

1. Media Loading: The script first loads your media file, whether it's a local file or a remote URL.
2. VAD-based Chunking: It analyzes the audio stream using Voice Activity Detection (VAD) to identify silent segments.
3. Intelligent Splitting: The audio is then split into smaller chunks based on the detected silences. Each chunk's duration is managed to stay under the 3-minute API limit, with a user-configurable target length (defaulting to 120 seconds), preventing mid-sentence cuts.
4. Parallel API Calls: A thread pool is initiated to upload and process these chunks concurrently using the DashScope Qwen-ASR API.
5. Result Aggregation & Cleaning: The transcribed text segments from all chunks are collected, re-ordered, and then post-processed to remove detected repetitions and hallucinations.
6. Output Generation: The final, cleaned transcription is printed to the console and saved to a .txt file. Optionally, a timestamped `.srt` subtitle file can also be generated.
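
The splitting, parallel-call, and aggregation steps can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the toolkit's implementation: it assumes the VAD step has already produced a list of speech intervals, and `plan_chunks`, `transcribe_all`, and the `transcribe` callback are hypothetical names. The callback stands in for the real API call.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple

Segment = Tuple[float, float]  # (start_sec, end_sec) of one speech interval


def plan_chunks(speech: List[Segment], target_sec: float = 120.0) -> List[Segment]:
    """Merge consecutive speech intervals into chunks of at most target_sec.

    Chunk boundaries always coincide with interval boundaries, so cuts fall
    in silences. An interval longer than target_sec is kept whole in this
    sketch; a real tool would need a fallback for that case.
    """
    chunks: List[Segment] = []
    for start, end in speech:
        if chunks and end - chunks[-1][0] <= target_sec:
            # Extending the current chunk keeps it within the target length.
            chunks[-1] = (chunks[-1][0], end)
        else:
            chunks.append((start, end))
    return chunks


def transcribe_all(
    chunks: List[Segment],
    transcribe: Callable[[Segment], str],
    max_workers: int = 4,
) -> List[str]:
    """Run transcribe on every chunk concurrently and return texts in chunk order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in input order, whatever order the
        # calls finish in, so no explicit re-sorting is needed.
        return list(pool.map(transcribe, chunks))
```

Threads suit this workload because each call mostly waits on network I/O. Joining the returned list with spaces gives the raw transcript that a cleaning step would then post-process.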

🏁 Getting Started

Follow these steps to set up and run the project on your local machine.

Prerequisites

Python 3.8 or higher.
FFmpeg: The script requires FFmpeg to be installed on your system to handle media files.
Ubuntu/Debian: sudo apt update && sudo apt install ffmpeg
macOS: brew install ffmpeg
Windows: Download from the official FFmpeg website and add it to your system's PATH.
DashScope API Key: You need an API key from Alibaba Cloud's DashScope.
You can obtain one from the DashScope Console. If you are calling the API services of Tongyi Qwen for the first time, you can follow the tutorial on this website to create your own API Key.
For better security and convenience, it is highly recommended to set your API key as an environment variable named DASHSCOPE_API_KEY. The script will automatically use it, and you won't need to pass the --api-key argument in the command.

On Linux/macOS:

export DASHSCOPE_API_KEY="your_api_key_here"

*(To make this permanent, add the line to your ~/.bashrc, ~/.zshrc, or ~/.profile file.)*

On Windows (Command Prompt):

set DASHSCOPE_API_KEY=your_api_key_here

On Windows (PowerShell):

$env:DASHSCOPE_API_KEY="your_api_key_here"

*(For a permanent setting on Windows, search for "Edit the system environment variables" in the Start Menu and add DASHSCOPE_API_KEY to your user variables.)*
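
A short preflight check can confirm these prerequisites before a long transcription run starts. The sketch below is an assumption-laden illustration rather than the toolkit's own logic: `resolve_api_key` and `missing_prerequisites` are hypothetical helpers, and the only names taken from the text above are the `DASHSCOPE_API_KEY` variable and the `ffmpeg` executable.

```python
import os
import shutil
from typing import List, Mapping, Optional


def resolve_api_key(
    cli_value: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Return an explicitly passed key if given, else DASHSCOPE_API_KEY from env."""
    env = os.environ if env is None else env
    key = cli_value or env.get("DASHSCOPE_API_KEY")
    if not key:
        return None
    # Drop surrounding whitespace and literal double quotes, which some
    # shells store as part of the value.
    cleaned = key.strip().strip('"')
    return cleaned or None


def missing_prerequisites(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """List the prerequisites that appear to be missing on this machine."""
    problems: List[str] = []
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")
    if resolve_api_key(env=env) is None:
        problems.append("DASHSCOPE_API_KEY is not set")
    return problems
```

Calling `missing_prerequisites()` with no arguments inspects the real environment; passing a dictionary as `env` makes the check easy to test.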

Installation

We recommend installing the tool directly from PyPI for the simplest setup.

Option 1: Install from PyPI (Recommended)

Simply run the following command in your terminal. This will install the package and make the qwen3-asr command available system-wide.

pip install qwen3-asr-toolkit

Option 2: Install from Source

If you want to install the latest development version or contribute to the project, you can install from the source code.

1. Clone the repository:

git clone https://github.com/QwenLM/Qwen3-ASR-Toolkit.git
cd Qwen3-ASR-Toolkit

2. Install the package:

pip install .
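
After either install option, one way to confirm the result from Python is to look up the installed distribution and the console command. This is a hedged sketch: `installed_toolkit` is a hypothetical helper, and it assumes the distribution name matches the `pip install` name above and that the command is called `qwen3-asr`, as the README states.

```python
import shutil
from importlib import metadata
from typing import Optional, Tuple


def installed_toolkit(
    dist: str = "qwen3-asr-toolkit",
    command: str = "qwen3-asr",
) -> Tuple[Optional[str], Optional[str]]:
    """Return (installed version, path to the command), with None for anything missing."""
    try:
        version = metadata.version(dist)
    except metadata.PackageNotFoundError:
        version = None
    # shutil.which returns None when the command is not on PATH.
    return version, shutil.which(command)
```

A result of `(None, None)` suggests the package was installed into a different Python environment than the one running the check.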

📖 Usage

Once installed, you can use the qwen3-asr command directly from your terminal. By default, the tool will...

Excerpt shown — open the source for the full document.

Notability

notability 6.0/10

Solid new repo with decent stars, but not a major launch.