Unlocking AI Inference at Scale: CoreWeave Joins Red Hat Open Source Project llm-d as Founding Member
At CoreWeave, we believe open source software (OSS) is essential for driving innovation in AI and ML and for offering flexibility to developers. Our purpose-built AI cloud platform has been developed from the ground up on Kubernetes. We’ve made open source contributions such as CoreWeave Tensorizer to deliver the scale, speed, and performance our customers need when running AI workloads.

Today, we are thrilled to announce that CoreWeave is a founding member of Red Hat's new llm-d OSS project, alongside IBM Research, Google, and NVIDIA. In a fast-evolving AI landscape, where future growth will be fueled by inference, the engine that transforms AI models into actionable results, it is critical to tear down infrastructure silos. We are proud to deepen our long-standing commitment to OSS AI and contribute our expertise in productizing AI inference workloads at scale while advancing the Kubernetes ecosystem and fostering interoperability for AI. This groundbreaking initiative has already garnered the support of leading gen AI model providers, AI accelerator pioneers, premier AI cloud platforms, and the developer community.

About CoreWeave

CoreWeave delivers the leading AI Cloud Platform, purpose-built for the speed, performance, and expertise needed to unleash AI’s full potential. Our customers train and deploy their innovative foundation models on CoreWeave and get cutting-edge performance, reliability, scale, and infrastructure efficiency for their AI workloads. Our leadership was recognized by SemiAnalysis’s ClusterMAX™ Rating System, in which CoreWeave was the only cloud provider to earn the top Platinum tier rating.

Current challenges with LLM inference

vLLM has quickly become the de facto open source inference server, providing day 0 model support for emerging frontier models and support for a broad list of GPUs and accelerators.
However, as foundation models grow in size, evolve in their capabilities, and increasingly support agentic applications, developers face new challenges in deploying these models at scale while managing infrastructure, costs, and latencies to fit a wide range of use cases and applications. This drives the need for open standards and broader industry collaboration that help developers navigate rapidly advancing technologies by making it easy to develop, test, and scale inference workloads, and by increasing the interoperability of these workloads across different platforms.

How llm-d is groundbreaking for AI inference

llm-d is a visionary project that amplifies the power of vLLM to transcend single-server limitations and unlock production-scale AI inference. Using the proven orchestration prowess of Kubernetes, llm-d integrates advanced inference capabilities to deliver greater performance and lower latency for inference workloads. llm-d delivers prefill and decode disaggregation, which enables the prefill and decode phases to scale independently; LMCache-based KV cache offloading to optimize memory use; an AI Inference Gateway; and AI-aware network routing for more efficient data transfers using the NVIDIA Inference Xfer Library (NIXL). In addition, Kubernetes-powered clusters and controllers enable efficient scheduling of compute and storage resources as workload demands fluctuate, and enable interoperability across cloud platforms.

CoreWeave’s continued commitment to open source

CoreWeave is proud to be a founding member of the project alongside Google, IBM Research, and NVIDIA. We are committed to our deep collaboration with Red Hat on architecting the future of large-scale LLM serving, and are excited to collaborate with an incredible group of partners and the broader developer community to build a flexible, high-performance inference engine that accelerates innovation and lays the groundwork for open, interoperable AI.
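
To make the idea of prefill and decode disaggregation a little more concrete, here is a minimal Python sketch of why separating the two phases lets each pool scale independently. It is an illustration only, not llm-d's scheduler or API: the `Workload` type, the `plan_pools` function, and the per-replica throughput figures are all hypothetical, chosen just to show that a prompt-heavy workload and a generation-heavy workload call for different replica mixes.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """Hypothetical traffic profile (all values are illustrative)."""
    requests_per_second: float
    prompt_tokens: int   # average tokens processed in the prefill phase
    output_tokens: int   # average tokens generated in the decode phase


def replicas_needed(tokens_per_second: float, capacity_per_replica: float) -> int:
    """Smallest replica count whose combined capacity covers the load."""
    return max(1, math.ceil(tokens_per_second / capacity_per_replica))


def plan_pools(w: Workload, prefill_capacity: float, decode_capacity: float) -> dict:
    """Size the prefill and decode pools separately.

    Capacities are assumed tokens/second per replica; real numbers depend
    on the model, the hardware, and the serving configuration.
    """
    prefill_load = w.requests_per_second * w.prompt_tokens
    decode_load = w.requests_per_second * w.output_tokens
    return {
        "prefill": replicas_needed(prefill_load, prefill_capacity),
        "decode": replicas_needed(decode_load, decode_capacity),
    }


# Short prompts, longer answers: the decode pool dominates.
chat = plan_pools(Workload(10, 500, 200), 20_000, 1_000)
# chat == {"prefill": 1, "decode": 2}

# Long documents, short summaries: the prefill pool dominates.
summarize = plan_pools(Workload(10, 8_000, 100), 20_000, 1_000)
# summarize == {"prefill": 4, "decode": 1}
```

In a single-pool deployment, both workloads would have to be sized for whichever phase is the bottleneck. With the phases split, each pool can grow or shrink on its own as the traffic mix changes.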
We look forward to applying our learnings and best practices from managing large-scale Kubernetes and AI inference deployments to llm-d, reducing the heavy lifting required of everyday developers. Our approach to open source focuses on promoting open and flexible interfaces. For example, our contributions will enable Kubernetes-native operators for deployment and management, comprehensive testing and benchmarking harnesses for production workloads, and effective deployment and monitoring of the inference stack at scale across multiple GPUs and clusters.

Additionally, we are focused on reducing latency, optimizing costs, and pushing the boundaries of scale for AI workloads. We contributed CoreWeave Tensorizer to vLLM with expanded support for llm-d, enabling more than 5X faster model loading compared to Hugging Face when scaling from zero through an innovative “zero-copy” approach. Lastly, as the leading AI cloud platform and the first to deploy the latest hardware, including GB200, CoreWeave will lead the charge in unlocking the full potential of these hardware innovations. These capabilities will boost developers’ productivity and make it easy for them to build once and run across different cloud platforms that support Kubernetes deployments.

Get involved: Join the llm-d project

Open source AI initiatives become more powerful with more contributions from across the industry. Whether you’re part of a long-standing enterprise or a budding startup ready to accelerate, llm-d offers a flexible, powerful platform to build upon. Get started with llm-d today. Contact us to get connected with our team of experts and experience our industry-leading cloud platform.