Fork: Upstage (Solar), published Dec 29, 2025

UpstageAI/vllm

forked from vllm-project/vllm

Description: A high-throughput and memory-efficient inference and serving engine for LLMs

Language: Python

License: Apache-2.0

Stars: 1

Forks: 0

Open issues: 19

Created: 2025-12-29T06:50:24Z

Pushed: 2026-03-26T13:43:46Z

Default branch: v0.12.0-solar-open

Fork: yes

Parent repository: vllm-project/vllm

Archived: no

README:

Easy, fast, and cheap LLM serving for everyone

| Documentation | Blog | Paper | Twitter/X | User Forum | Developer Slack |

---

Join us at the PyTorch Conference, October 22-23 and Ray Summit, November 3-5 in San Francisco for our latest updates on vLLM and to meet the vLLM team! Register now for the largest vLLM community events of the year!

---

*Latest News* 🔥

  • [2025/11] We hosted vLLM Bangkok Meetup. We explored vLLM and LMCache inference and low-resource language adaptation with speakers from Embedded LLM, AMD, and Red Hat. Please find the meetup slides here.
  • [2025/11] We hosted the first vLLM Europe Meetup in Zurich focused on quantization, distributed inference, and reinforcement learning at scale with speakers from Mistral, IBM, and Red Hat. Please find the meetup slides here and recording here.
  • [2025/11] We hosted vLLM Beijing Meetup focusing on distributed inference and diverse accelerator support with vLLM! Please find the meetup slides here.
  • [2025/10] We hosted vLLM Shanghai Meetup focused on hands-on vLLM inference optimization! Please find the meetup slides here.
  • [2025/09] We hosted vLLM Toronto Meetup focused on tackling inference at scale and speculative decoding with speakers from NVIDIA and Red Hat! Please find the meetup slides here.
  • [2025/08] We hosted vLLM Shenzhen Meetup focusing on the ecosystem around vLLM! Please find the meetup slides here.
  • [2025/08] We hosted vLLM Singapore Meetup. We shared V1 updates, disaggregated serving and MLLM speedups with speakers from Embedded LLM, AMD, WekaIO, and A*STAR. Please find the meetup slides here.
  • [2025/08] We hosted vLLM Shanghai Meetup focusing on building, developing, and integrating with vLLM! Please find the meetup slides here.
  • [2025/05] vLLM is now a hosted project under PyTorch Foundation! Please find the announcement here.
  • [2025/01] We are excited to announce the alpha release of vLLM V1: A major architectural upgrade with 1.7x speedup! Clean code, optimized execution loop, zero-overhead prefix caching, enhanced multimodal support, and more. Please check out our blog post here.

Previous News

Notability

notability 1.0/10

Routine fork, no notable traction