CoreWeave, published May 20, 2026

CoreWeave Adds SkyPilot Support for Effortless Multi-Cloud AI Orchestration


We’re excited to announce official support for SkyPilot, unlocking seamless, cloud-agnostic, open-source AI orchestration on CoreWeave’s highly performant and scalable GPU infrastructure. SkyPilot joins SUNK (Slurm on Kubernetes) and Kueue as another orchestration option that CoreWeave supports.

SkyPilot, an open-source framework that originated at the Sky Computing Lab at UC Berkeley, was designed to abstract cloud complexity and automate the selection, provisioning, and management of compute and storage on any infrastructure. CoreWeave, the Essential Cloud for AI, offers cutting-edge performance, reliability, and observability, making it the ultimate infrastructure backend for SkyPilot users who prioritize speed, efficiency, and flexibility at scale.

Whether you’re a fast-moving AI-native startup or an enterprise scaling to thousands of GPUs, SkyPilot on CoreWeave is ready for production workloads, and it’s already trusted by some of the most ambitious companies in the space.

What’s new and why it matters:

- Get AI running on CoreWeave at lightning speed. Deploy training, development, and serving workloads on CoreWeave GPUs in minutes with ready-to-use examples and recipes, such as Llama 4 fine-tuning, vLLM, and more on the examples page.
- One-line InfiniBand enablement on CoreWeave. Add one line to your SkyPilot resources to automatically configure InfiniBand, RDMA, and the related environment variables, with no manual tuning. The official NCCL test example shows the end-to-end setup and benchmark results demonstrating 3.6 Tb/s interconnect bandwidth on CoreWeave.
- Native CoreWeave AI Object Storage integration in SkyPilot. SkyPilot now recognizes CoreWeave buckets, and its storage docs include install/config steps and a cw:// scheme. This makes it straightforward to FUSE-mount data for easy access and to achieve throughput of up to 7 GB/s per GPU, far beyond traditional object storage.
- Autoscaling support to maximize cluster utilization. SkyPilot automatically scales your CoreWeave clusters up or down based on demand, so you pay only for what you use while maintaining peak GPU utilization across distributed clusters.
- First-class visibility in SkyPilot install docs. CoreWeave is now included in the SkyPilot installation guide, with steps to connect your CoreWeave Kubernetes Service (CKS) cluster and optionally set up CoreWeave AI Object Storage.
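As a rough illustration of how the InfiniBand and object storage items might come together in one SkyPilot task definition, the sketch below builds the task as a Python dictionary and renders it as YAML text using only the standard library. Several details are assumptions rather than statements from this post: the `network_tier: best` setting is our recollection of the one-line switch described in the SkyPilot docs, and the bucket name, accelerator string, node count, and `train.py` command are hypothetical placeholders. Check the SkyPilot installation guide and the NCCL example before relying on any of them.

```python
import json


def to_yaml(node: dict, indent: int = 0) -> str:
    """Render a nested dict of strings/ints as block-style YAML text."""
    pad = "  " * indent
    lines = []
    for key, value in node.items():
        if isinstance(value, dict):
            lines.append(f"{pad}{key}:")
            lines.append(to_yaml(value, indent + 1))
        else:
            # json.dumps emits a double-quoted scalar, which is also valid YAML.
            lines.append(f"{pad}{key}: {json.dumps(value)}")
    return "\n".join(lines)


task = {
    "name": "cw-train-sketch",           # hypothetical task name
    "num_nodes": 2,                      # placeholder node count
    "resources": {
        "accelerators": "H100:8",        # placeholder GPU request
        "network_tier": "best",          # ASSUMPTION: the one-line InfiniBand switch
    },
    "file_mounts": {
        # cw:// is the scheme named in this post; the bucket name is made up.
        "/data": {"source": "cw://my-dataset-bucket", "mode": "MOUNT"},
    },
    "run": "python train.py --data-dir /data",  # hypothetical entry point
}

task_yaml = to_yaml(task)
print(task_yaml)
```

Saving the printed text to a file would give a task definition in the shape SkyPilot expects, with the mounted bucket visible to the job under `/data`.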

Why run SkyPilot on CoreWeave?

By combining SkyPilot’s intelligent orchestration with CoreWeave’s purpose-built AI infrastructure, teams gain a streamlined way to launch, scale, and optimize workloads across clusters. Together, they remove the complexity of managing compute at scale so that AI engineers can focus on building, not babysitting infrastructure.

Running SkyPilot on CoreWeave means AI teams can:

- Get Slurm-like convenience with the reliability and flexibility of Kubernetes
- Easily manage multiple Kubernetes clusters through a unified control plane
- Instantly provision hundreds (or thousands) of GPUs with a single command
- Optimize for price/performance thanks to CoreWeave’s Kubernetes-on-bare-metal architecture
- Reduce time-to-market by leveraging the industry’s fastest networking and storage for distributed workloads

How it works: From multi-cloud MLOps to scalable reality

SkyPilot’s open-source orchestration abstracts cluster management, auto-selects the best region and GPU (including spot/preemptible instances), and manages job submission for your ML training or inference workloads. With CoreWeave now supported as a backend, you can launch SkyPilot jobs on your CoreWeave cluster directly from your familiar SkyPilot YAML or CLI.

This unlocks greater efficiency for ML researchers, including:

- Dynamic resource matching: SkyPilot intelligently allocates the optimal GPU resources from CoreWeave based on your compute and storage needs and pricing targets.
- Built-in failure handling, smart retries, and log streaming, which are essential for production MLOps and ease of debugging.
- Fully automated configuration of 3.2 Tb/s InfiniBand interconnects.
- Autoscaling support to maximize cluster utilization.
- Tight integration with CoreWeave’s LOTA (Local Object Transfer Accelerator) storage layer, which makes large model checkpoints, datasets, and artifacts blazingly fast to distribute and access (up to 7 GB/s throughput per GPU).
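To make the idea of dynamic resource matching concrete, here is a minimal sketch of one way such a selection could work: filter candidate node types by GPU requirements and a price ceiling, then take the cheapest. This is not SkyPilot’s actual optimizer. The `Offering` type, the `pick_offering` helper, the pool names, and the prices are all invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Offering:
    """One hypothetical node type; every field is an illustrative placeholder."""
    name: str
    gpu: str
    gpus_per_node: int
    hourly_usd_per_node: float


def pick_offering(
    offerings: Sequence[Offering],
    gpu: str,
    min_gpus_per_node: int,
    max_hourly_usd_per_node: float,
) -> Optional[Offering]:
    """Return the cheapest offering that meets the GPU and price constraints."""
    eligible = [
        o for o in offerings
        if o.gpu == gpu
        and o.gpus_per_node >= min_gpus_per_node
        and o.hourly_usd_per_node <= max_hourly_usd_per_node
    ]
    # min() with default=None returns None when nothing qualifies.
    return min(eligible, key=lambda o: o.hourly_usd_per_node, default=None)


# Placeholder catalog: names and prices are made up for the example.
catalog = [
    Offering("pool-a", "H100", 8, 50.0),
    Offering("pool-b", "H100", 8, 40.0),
    Offering("pool-c", "A100", 8, 20.0),
]

choice = pick_offering(
    catalog, gpu="H100", min_gpus_per_node=8, max_hourly_usd_per_node=45.0
)
print(choice.name if choice else "no match")
```

A real scheduler would also weigh availability, region, and storage locality; the sketch only shows the basic shape of constraint filtering followed by cost minimization.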

Explore SkyPilot on CoreWeave, trusted by AI-native pioneers

CoreWeave and SkyPilot already power production workloads for a growing roster of customers, including multiple leading AI-native pioneers and startups running jobs across hundreds of GPUs. By combining SkyPilot’s multi-cloud orchestration with CoreWeave’s purpose-built AI infrastructure, teams can scale elastically and meet demand from thousands of end users, serving critical business applications.

Get started in minutes

1. Install SkyPilot and follow the CoreWeave setup steps in the installation guide.
2. Validate high-performance networking using the InfiniBand/NCCL example.
3. Mount CoreWeave Object Storage using the cw:// scheme (example PR).
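The getting-started flow can be sketched as a short sequence of shell commands. The helper below only assembles and prints them; it does not run anything. It assumes that the `skypilot[kubernetes]` extra is the right install target for a CKS cluster, which this post does not confirm, so defer to the installation guide. The cluster name and task file are hypothetical.

```python
import shlex


def launch_commands(cluster: str, task_file: str) -> list[str]:
    """Build shell-safe command strings for a basic SkyPilot session."""
    steps = [
        # ASSUMPTION: CKS is reached through SkyPilot's Kubernetes support.
        ["pip", "install", "skypilot[kubernetes]"],
        ["sky", "check", "kubernetes"],               # confirm cluster access
        ["sky", "launch", "-c", cluster, task_file],  # provision and run the task
        ["sky", "down", cluster],                     # tear down when finished
    ]
    return [shlex.join(step) for step in steps]


# "cw-demo" and "task.yaml" are placeholder names.
for command in launch_commands("cw-demo", "task.yaml"):
    print(command)
```

Printing the commands first makes it easy to review them before pasting into a terminal or wiring them into a CI job.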

For a targeted, hands-on walkthrough covering dev pods, serving, and multi-node training on CoreWeave infrastructure, see our internal guide, SkyPilot on CoreWeave.

Launch any AI job, anywhere, with minimal friction. CoreWeave adds SkyPilot support for multi-cloud orchestration, unlocking greater ease, speed, and visibility for ML teams.
