cohere-ai/vllm-skills
License: Apache-2.0
Stars: 2
Forks: 0
Open issues: 0
Created: 2026-04-22T14:52:55Z
Pushed: 2026-06-24T02:41:45Z
Default branch: main
Fork: no
Archived: no
README:
vllm-skills
> AI agent skills for keeping a long-lived vLLM fork in sync with upstream — automated rebase, conflict resolution, and test-driven verification.
Five composable skills that an AI coding agent reads and executes interactively to:
- detect when a new upstream vLLM release is available,
- rebase the fork's custom commits onto it,
- resolve conflicts using upstream diff context,
- and iterate on user-defined checks (tests, benchmarks, evals) until the fork is back to a healthy state.
For the design rationale and a worked case study (Cohere's transcription model on v0.19.1), see [docs/auto-fork-maintenance.md](docs/auto-fork-maintenance.md). To reproduce that example end-to-end, follow [docs/reproduce-cohere-transcribe-v0.19.1.md](docs/reproduce-cohere-transcribe-v0.19.1.md).
Compatibility
Each skill is a SKILL.md markdown file with YAML frontmatter (name, description), following the Agent Skills format used by Cursor. The skills only assume access to a shell, the file system, and git, so they should also work with other coding agents that can read and execute the same Markdown-based instructions.
The rules/skill-edit-checklist.mdc file is a Cursor Rule and is optional.
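To make the SKILL.md layout described above concrete, here is a minimal sketch of how a tool might read the `name` and `description` fields from the frontmatter. It is not taken from the repository: `parse_frontmatter` is a hypothetical helper, the example description text is invented, and only flat `key: value` pairs are handled (a real YAML parser would be needed for anything richer).

```python
def parse_frontmatter(text: str) -> tuple[dict[str, str], str]:
    """Split a SKILL.md-style document into (frontmatter fields, body).

    Hypothetical helper: handles only flat `key: value` pairs between
    two `---` delimiter lines. It is not a general YAML parser.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no frontmatter block at all
    fields: dict[str, str] = {}
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            # Closing delimiter found: everything after it is the body.
            return fields, "\n".join(lines[i + 1:])
        if ":" in line and not line.lstrip().startswith("#"):
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip().strip("\"'")
    # No closing delimiter: treat the whole text as body.
    return {}, text


# Invented example content; the real skill files may word this differently.
example = """---
name: detect-upstream-base
description: Find the upstream tag the fork is based on
---
Instructions for the agent go here.
"""

fields, body = parse_frontmatter(example)
print(fields["name"])
print(body)
```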
Skills
| Skill | Role in the loop | What it does |
|-------|------------------|--------------|
| [install-vllm](skills/install-vllm/SKILL.md) | Environment setup | Creates a uv virtualenv, installs vLLM in editable mode with the correct precompiled CUDA wheel |
| [local-test-runner](skills/local-test-runner/SKILL.md) | Measurement | Runs Buildkite CI-equivalent tests locally on NVIDIA GPUs; parses .buildkite/test_areas/*.yaml, manages Hugging Face tokens, captures logs |
| [detect-upstream-base](skills/detect-upstream-base/SKILL.md) | Disturbance detection | Finds the upstream tag (v1) the fork is currently based on via git merge-base + git describe |
| [rebase-assistant](skills/rebase-assistant/SKILL.md) | Controller | Rebases custom commits from v1 onto v2, resolves conflicts using upstream diffs, verifies with test-runner |
| [auto-rebase](skills/auto-rebase/SKILL.md) | Orchestrator | Checks for new upstream releases via gh, invokes detect-upstream-base and rebase-assistant end-to-end |
See [skills/README.md](skills/README.md) for the dependency graph, shared notation (v1/v2/b1/b2), and the change-impact table contributors should follow when editing skills.
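The detect-upstream-base row mentions `git merge-base` plus `git describe`. The sketch below shows one plausible way to combine them. The exact flags, the `upstream/main` ref, and the helper names `run_git` and `detect_upstream_base` are assumptions made for illustration, and the real skill may handle cases such as release branches differently.

```python
import subprocess
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], str]


def run_git(args: Sequence[str]) -> str:
    """Run a git command in the current directory and return trimmed stdout."""
    result = subprocess.run(
        ["git", *args], check=True, capture_output=True, text=True
    )
    return result.stdout.strip()


def detect_upstream_base(
    upstream_ref: str = "upstream/main",  # assumed ref name
    run: Runner = run_git,
) -> str:
    """Return the nearest upstream tag at or before the fork point (v1)."""
    # Best common ancestor of the fork branch and the upstream branch.
    fork_point = run(["merge-base", "HEAD", upstream_ref])
    # Closest tag reachable from that commit; --abbrev=0 prints only the tag.
    return run(["describe", "--tags", "--abbrev=0", fork_point])
```

Passing the runner as a parameter keeps the function easy to exercise without a real repository; inside a fork checkout with an `upstream` remote already fetched, `detect_upstream_base()` can be called with the defaults.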
Quick start
In an agent session inside your vLLM fork checkout:
/auto-rebase sync the current branch with the latest upstream release and make sure tests/entrypoints/openai/correctness/test_transcription_api_correctness.py passes
The agent will:
1. detect the current upstream base tag (v1) and the latest release (v2),
2. confirm with you before rebasing,
3. verify the checks pass on the pre-rebase branch as a baseline,
4. rebase the custom commits onto v2 and resolve conflicts,
5. iterate (fix, re-run checks, repeat) until everything passes,
6. summarize what changed and offer to push.
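The steps above can be read as a small control loop. The sketch below models that flow with plain callables standing in for the agent's actions. Everything here is illustrative: `sync_fork`, `SyncReport`, the `max_attempts` cap, and the choice to stop when the baseline fails are assumptions, not behavior documented by the skills.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class SyncReport:
    v1: str
    v2: str
    baseline_ok: bool = False
    rebased: bool = False
    attempts: int = 0
    passed: bool = False


def sync_fork(
    detect: Callable[[], Tuple[str, str]],
    confirm: Callable[[str, str], bool],
    run_checks: Callable[[], bool],
    rebase: Callable[[str, str], None],
    fix: Callable[[], None],
    max_attempts: int = 5,  # assumed cap; the README only says "until everything passes"
) -> SyncReport:
    v1, v2 = detect()                       # step 1: current base and latest release
    report = SyncReport(v1=v1, v2=v2)
    if v1 == v2 or not confirm(v1, v2):     # step 2: nothing to do, or user declined
        return report
    report.baseline_ok = run_checks()       # step 3: baseline on the pre-rebase branch
    if not report.baseline_ok:
        # Assumption: with a failing baseline, later failures could not be
        # attributed to the rebase, so stop here.
        return report
    rebase(v1, v2)                          # step 4: replay custom commits onto v2
    report.rebased = True
    for attempt in range(1, max_attempts + 1):  # step 5: fix, re-run, repeat
        report.attempts = attempt
        if run_checks():
            report.passed = True
            break
        if attempt < max_attempts:
            fix()
    return report                           # step 6: the caller summarizes this
```

Running the baseline before the rebase is what makes the later check results meaningful: a check that already fails on the old base says nothing about the rebase itself.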
Prerequisites
Each skill checks its own prereqs at runtime, but at a minimum you'll need:
| Tool | Used by | Install |
|------|---------|---------|
| uv | install-vllm | curl -LsSf https://astral.sh/uv/install.sh \| sh |
| gh (authenticated) | auto-rebase | gh auth login |
| Hugging Face token | local-test-runner (for tests that pull weights) | hf auth login |
| upstream git remote | detect-upstream-base, rebase-assistant | git remote add upstream git@github.com:vllm-project/vllm.git |
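As a rough pre-flight check for the table above, the following sketch reports which prerequisites appear to be missing. It is a hypothetical helper rather than part of the skills: it only checks that the `uv`, `gh`, and `hf` executables are on `PATH` and that a remote named `upstream` exists, and it does not verify that `gh` or Hugging Face is actually authenticated.

```python
import shutil
import subprocess
from typing import Callable, List, Optional


def list_git_remotes() -> List[str]:
    """Return the remote names configured for the repository in the current directory."""
    result = subprocess.run(
        ["git", "remote"], check=True, capture_output=True, text=True
    )
    return result.stdout.split()


def missing_prereqs(
    which: Callable[[str], Optional[str]] = shutil.which,
    list_remotes: Callable[[], List[str]] = list_git_remotes,
) -> List[str]:
    """List prerequisites that could not be found (presence only, not auth state)."""
    missing = [tool for tool in ("uv", "gh", "hf") if which(tool) is None]
    if "upstream" not in list_remotes():
        missing.append("upstream git remote")
    return missing
```

Inside a fork checkout, `missing_prereqs()` with the defaults returns an empty list when everything is in place; both lookups are injectable so the logic can be tried without touching the real environment.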
Beyond vLLM
The skills are vLLM-specific, but the underlying pattern — *detect the disturbance, measure the gap, iterate until error → 0* — generalizes to any long-lived fork with a measurable definition of "working" (a test, a benchmark, an eval). The same loop has been applied to other long-lived forks at Cohere, including a Hugging Face transformers fork. For the full framing, see [docs/auto-fork-maintenance.md](docs/auto-fork-maintenance.md).
License
Apache 2.0 — see [LICENSE](LICENSE).
Notability
Notability: 3.0/10. New repo by a notable lab, but with very low traction.