StreamLake (Kuaishou) analysis

Thesis

Kuaishou's StreamLake is executing a dual-pronged open-source strategy: a video-native multimodal foundation model line (Keye-VL-2.0) aimed at long-video understanding and agentic capabilities, and an AI coding product suite (KAT-Coder, CodeFlicker, Vanchin) targeting the software engineering tool market. Both tracks are anchored in Apache 2.0 releases, rigorous public benchmarking against frontier models (GPT-5, Gemini, Claude, Qwen), and a conservative scaling posture that prioritizes cost efficiency over infrastructure buildout [W5, W6, W1, W2].

Signal desks

Hiring

No cited evidence in this pack.

Forks

kwaipilot/experiments — fork of SWE-bench/experiments, the official SWE-bench leaderboard submission repository for predictions, execution logs, trajectories, and evaluation results. This fork, dated September 2025, indicates internal SWE-bench evaluation infrastructure work that directly precedes the KAT-Coder product launch in October 2025 [P4, E4, W5].

Releases

Keye-VL-2.0-30B-A3B (May 2026): 30B total parameters, MoE architecture with ~3B active parameters. First multimodal model to integrate DeepSeek Sparse Attention (DSA). 256K context window for hour-long video temporal reasoning. Debuts Code Interpreter, Tool Use, and Search agent capabilities in the Keye line. Apache 2.0, weights on Hugging Face and ModelScope, public GitHub repo, demo available [W1, W2, W3, W4].
SWE-Compass (December 2025): Apache 2.0 evaluation framework covering 8 software engineering task types, 8 programming scenarios, and 10 programming languages, with 2,000 instances sourced from real GitHub PRs. arXiv paper published. Hugging Face dataset available [P1, E1].
KAT-Coder-Agent (September 2025): Agent repo created with minimal public detail; 1 star, 0 forks [P2, E2].
KAT-Coder (September 2025): Repository created with HTML language tag; sparse public detail; 1 star, 0 forks [P3, E3].
AI Coding Product Matrix (October 2025): CodeFlicker (intelligent development tool), multiple self-developed KAT-Coder models, and Vanchin (万擎) large model platform. KAT-Coder-Air V1 offered free to all users [W5].

Talking

Keye-VL-2.0 launch narrative: Framed as the "first to bring DSA into multimodal understanding," delivering "near-lossless end-to-end temporal reasoning on hour-long video," explicitly benchmarked as beating Gemini-2.5-Pro and Gemini 3 Flash on TimeLens video metrics [W2, W3]. External coverage highlights beating Qwen3-VL-235B on LongVideoBench at 74.1 with one-eighth the active parameters, positioning the model as a "credible drop-in for long-video agent stacks built on closed APIs" [W1].
Coding product launch narrative: 36Kr coverage frames the AI coding release as a "工具+模型+平台" (tools + models + platform) three-in-one matrix, with the specific claim that KAT-Coder-Pro V1 surpasses GPT-5 and Claude Sonnet 4 on SWE-bench Verified at 73.4% [W5].
StreamLake video cloud positioning: Conservative scaling rhetoric. The unit "will not massively invest in infrastructure" or "burn money expanding teams," instead growing "like a snowball" with healthy margins. AI focus is explicitly on video AI across the creation-to-distribution pipeline, validated against real large-scale data [W6].

Shipping

Evidence of five distinct shipped artifacts across two product lines:

1. Keye-VL-2.0-30B-A3B: Model weights on Hugging Face (Kwai-Keye/Keye-VL-2.0-30B-A3B) and ModelScope, GitHub repository (Kwai-Keye/Keye), public demo, Apache 2.0 license. Shipped May 2026 with Code Interpreter, Tool Use, and Search agent features [W1, W2, W3, W4].
2. SWE-Compass: Evaluation framework on GitHub (kwaipilot/SWE-Compass), Hugging Face dataset, arXiv paper (2511.05459), Apache 2.0. Shipped December 2025 [P1, E1].
3. AI Coding product matrix: CodeFlicker, KAT-Coder models (Pro V1 at 73.4% SWE-bench Verified, Air V1 free), Vanchin platform. Launched October 2025 [W5].
4. KAT-Coder-Agent: Minimal public release, September 2025 [P2, E2].
5. KAT-Coder: Minimal public release, September 2025 [P3, E3].

Research themes

Multi-dimensional SWE evaluation: SWE-Compass extends beyond Python-centric SWE-bench with 10 languages, 8 task types, and 8 programming scenarios, sourced from real GitHub pull requests. Explicitly targets gaps in "narrow task categories, Python-centric bias, and insufficient alignment with real-world development workflows" [P1].
Sparse attention for multimodal models: Keye-VL-2.0 is claimed as the first multimodal model to integrate DeepSeek Sparse Attention (DSA), enabling a 256K context window for "almost lossless reasoning" on long video [W2, W3, W4].
Long-video temporal reasoning: Benchmark results on LongVideoBench (74.1) and QVHighlights-TimeLens (70.1 mIoU) demonstrate competitive or superior performance against larger models (Qwen3-VL-235B, Gemini-2.5-Pro) for hour-long video understanding [W1, W2].
Agent-augmented multimodal models: Keye-VL-2.0 introduces Code Interpreter, Tool Use, and Search capabilities, evolving the model "from passive observer to active agent" [W2, W3].
AI coding agents and evaluation: SWE-bench experiments fork for evaluation pipeline management, KAT-Coder agent development, and SWE-Compass benchmark construction form a coherent coding-agent research thread [P1, P2, P3, P4, W5].
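
The mIoU figure cited above for QVHighlights-TimeLens is a temporal overlap score. As a minimal sketch, assuming the standard temporal intersection-over-union definition and a single ground-truth span per query (the cited sources do not describe the exact TimeLens protocol), it could be computed as follows. The function names are illustrative and are not taken from any Keye or TimeLens code.

```python
from typing import List, Tuple

# A span is (start_seconds, end_seconds) within a video.
Span = Tuple[float, float]


def temporal_iou(pred: Span, gold: Span) -> float:
    """Intersection-over-union of two time spans; 0.0 when they do not overlap."""
    pred_len = max(0.0, pred[1] - pred[0])
    gold_len = max(0.0, gold[1] - gold[0])
    # Overlap is the gap between the later start and the earlier end.
    inter = max(0.0, min(pred[1], gold[1]) - max(pred[0], gold[0]))
    union = pred_len + gold_len - inter
    return inter / union if union > 0 else 0.0


def mean_iou(preds: List[Span], golds: List[Span]) -> float:
    """Average temporal IoU over paired predictions and ground-truth spans."""
    if len(preds) != len(golds):
        raise ValueError("preds and golds must have the same length")
    if not preds:
        return 0.0
    return sum(temporal_iou(p, g) for p, g in zip(preds, golds)) / len(preds)


if __name__ == "__main__":
    # Hypothetical predictions and labels, for illustration only.
    preds = [(10.0, 20.0), (0.0, 10.0)]
    golds = [(15.0, 25.0), (0.0, 10.0)]
    print(f"mIoU = {mean_iou(preds, golds):.3f}")
```

A reported score such as 70.1 would correspond to this average expressed as a percentage. Benchmarks that allow several ground-truth windows per query need an additional matching step that this sketch omits.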

Hiring & scaling

No direct hiring evidence in this pack: the supplied sources contain no specific job listings, team composition data, or location-based hiring signals. The only adjacent signal is public positioning from StreamLake leadership, which states the video cloud unit "will not massively invest in infrastructure" or "burn money expanding teams," preferring a "snowball" growth model [W6].

Category implications

Video AI open-weight competition: Keye-VL-2.0's Apache 2.0 release, combined with reported benchmark wins against larger closed and open models (Gemini-2.5-Pro, Qwen3-VL-235B), positions Kuaishou as a serious open-weight entrant in long-video understanding. External coverage calls it a "credible drop-in for long-video agent stacks built on closed APIs" [W1], and the permissive license could pressure commercial video understanding API pricing [W1, W2, W4].
AI coding tools market entry: The claim that KAT-Coder-Pro V1 beats GPT-5 and Claude Sonnet 4 on SWE-bench Verified (73.4%), combined with a free tier (Air V1), signals an aggressive product entry against incumbent coding assistants. The "工具+模型+平台" (tools + models + platform) three-in-one approach suggests a platform ecosystem play rather than a point product [W5].
Evaluation infrastructure as moat: SWE-Compass and the SWE-bench experiments fork indicate significant investment in in-house evaluation methodology that extends beyond community benchmarks. Building a benchmark spanning 10 languages and 8 task types from real GitHub PRs creates evaluation capabilities that feed directly into model development and product benchmarking [P1, P4].
Platform GTM strategy: The pairing of open releases (Keye-VL-2.0 weights, the SWE-Compass benchmark) with commercial platform products (Vanchin, CodeFlicker) mirrors the cloud-to-enterprise funnel seen in other neocloud plays. However, StreamLake's stated conservative scaling posture, that it "will not massively invest in infrastructure" or "burn money expanding teams," suggests a capital-efficient approach that may limit near-term capacity growth [W5, W6].
Research-to-product pipeline: The September 2025 SWE-bench experiments fork [E4], followed by the KAT-Coder repos in mid-September [E2, E3] and the full product matrix launch in October 2025 [W5], points to a compressed research-to-product cycle of roughly one month. SWE-Compass in December 2025 [E1] and Keye-VL-2.0 in May 2026 [W1, W2] show a sustained release cadence across both tracks.
Thin evidence areas: No hiring data, no infrastructure or compute procurement signals, no revenue or customer metrics, and no evidence of safety, alignment, or policy work. The agent capabilities claimed for Keye-VL-2.0 (Code Interpreter, Tool Use, Search) are described but not independently benchmarked in the cited sources [W2, W3]. KAT-Coder-Agent and KAT-Coder repos remain at 1 star each with minimal documentation, suggesting pre-release or limited public launch status [P2, P3].

Traction highlights

Keye-VL-2.0: 74.1 on LongVideoBench, beating Qwen3-VL-235B with one-eighth the active parameters; 70.1 mIoU on QVHighlights-TimeLens, beating Gemini-2.5-Pro and Gemini 3 Flash [W1, W2]
KAT-Coder-Pro V1: 73.4% on SWE-bench Verified, surpassing GPT-5 and Claude Sonnet 4 [W5]
SWE-Compass: 18 GitHub stars, 2 forks, Hugging Face dataset published [P1]
KAT-Coder-Agent: 1 star, 0 forks (pre-traction) [P2]
KAT-Coder: 1 star, 0 forks (pre-traction) [P3]
Community attention: Keye-VL-2.0 covered by AI/TLDR, AI Tech Deep Dives, CSDN, and AI铺子 across English and Chinese tech media [W1, W2, W3, W4]; AI coding launch covered by 36Kr W5
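
The headline ratios above reduce to simple arithmetic. The sketch below uses only figures quoted in this brief plus one labeled assumption, that SWE-bench Verified contains 500 instances (a size not stated in the cited sources). It shows the active-parameter fraction, the comparator size implied by the "one-eighth" claim, and the approximate resolved-task count behind 73.4%. All function names are illustrative.

```python
# Figures quoted in this brief.
TOTAL_PARAMS_B = 30.0     # Keye-VL-2.0 total parameters, in billions
ACTIVE_PARAMS_B = 3.0     # approximate active parameters, in billions
CLAIMED_RATIO = 1 / 8     # "one-eighth the active parameters" claim
KAT_CODER_PRO_SCORE = 0.734  # reported SWE-bench Verified score

# Assumption, not taken from the cited sources: benchmark size.
SWE_BENCH_VERIFIED_SIZE = 500


def active_fraction(total_b: float, active_b: float) -> float:
    """Share of total parameters that are active in the MoE forward pass."""
    return active_b / total_b


def implied_comparator_active(active_b: float, ratio: float) -> float:
    """Active-parameter count a comparator must have for the ratio claim to hold."""
    return active_b / ratio


def resolved_count(score: float, size: int) -> int:
    """Approximate number of resolved tasks behind a percentage score."""
    return round(score * size)


if __name__ == "__main__":
    print(f"active fraction: {active_fraction(TOTAL_PARAMS_B, ACTIVE_PARAMS_B):.0%}")
    print(f"implied comparator active params: "
          f"{implied_comparator_active(ACTIVE_PARAMS_B, CLAIMED_RATIO):.1f}B")
    print(f"resolved tasks: "
          f"{resolved_count(KAT_CODER_PRO_SCORE, SWE_BENCH_VERIFIED_SIZE)}")
```

Under these inputs the active fraction is 10%, the "one-eighth" claim implies a comparator with about 24B active parameters, and 73.4% corresponds to about 367 of 500 tasks. Because the 3B figure is approximate, the implied comparator size is an estimate rather than a specification.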