CoreWeave, published May 20, 2026

CoreWeave Launches the First Generally Available NVIDIA RTX PRO 6000 Blackwell Server Instances



Photo credit: NVIDIA

Today, we’re proud to announce the general availability of NVIDIA RTX PRO™ 6000 Blackwell Server Edition-based instances, making CoreWeave the first cloud provider to deliver this GPU architecture for AI, graphics, and high-performance computing workloads. Designed for enterprises and startups working on generative AI, real-time rendering, and LLM development, these instances deliver higher performance and efficiency than the previous generation. CoreWeave’s purpose-built cloud infrastructure helps ensure customers harness the full potential of the NVIDIA Blackwell architecture, combining cutting-edge compute with AI-optimized infrastructure to deliver industry-leading performance, reliability, and efficiency.

Redefining AI and graphics performance with NVIDIA RTX PRO Server

The NVIDIA RTX PRO 6000 Blackwell Server Edition introduces a substantial performance leap over the previous NVIDIA L40S generation, achieving up to 5.6x faster LLM inference and 3.5x faster text-to-video generation. These advances are made possible by 96GB of ultra-fast GDDR7 memory at 1.6TB/s bandwidth, fifth-generation Tensor Cores for FP4 precision, and fourth-generation RT Cores. The RTX PRO 6000 offers breakthrough technology for a broad range of use cases, from agentic AI, physical AI, and scientific computing to rendering, 3D graphics, and video. Designed for mission-critical workloads, it achieves:

3.8 PFLOPS of FP4 AI performance for agentic AI, LLM inference, and generative workflows

Over 5x LLM inference throughput (vs. L40S)

Over 2x faster fine-tuning (vs. L40S)
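As a rough illustration of what 96GB of memory and 1.6TB/s of bandwidth mean in practice, the sketch below estimates the weight footprint of a model at different precisions and a crude bandwidth-bound ceiling for single-stream decoding. It is a back-of-the-envelope calculation, not a benchmark or an NVIDIA projection: the function names are made up for this example, and it assumes that every weight is read once per generated token while ignoring KV cache, activations, and runtime overhead.

```python
# Back-of-the-envelope sizing. All results are rough upper bounds, not measurements.
GPU_MEMORY_GB = 96          # per-GPU GDDR7 capacity quoted in the article
GPU_BANDWIDTH_GBPS = 1600   # 1.6 TB/s quoted in the article, expressed in GB/s

# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"FP16": 2.0, "FP8": 1.0, "FP4": 0.5}


def weight_footprint_gb(params_billion: float, precision: str) -> float:
    """Memory needed for the model weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * BYTES_PER_PARAM[precision]


def fits_on_one_gpu(params_billion: float, precision: str) -> bool:
    """True if the weights alone fit in a single GPU's memory."""
    return weight_footprint_gb(params_billion, precision) <= GPU_MEMORY_GB


def decode_ceiling_tokens_per_s(params_billion: float, precision: str) -> float:
    """Bandwidth-bound ceiling for single-stream decoding on one GPU.

    Assumption: each generated token requires reading every weight once.
    """
    return GPU_BANDWIDTH_GBPS / weight_footprint_gb(params_billion, precision)


if __name__ == "__main__":
    for precision in ("FP16", "FP8", "FP4"):
        size = weight_footprint_gb(70, precision)
        fits = fits_on_one_gpu(70, precision)
        print(f"70B @ {precision}: {size:.0f} GB of weights, fits on one GPU: {fits}")
```

Under these assumptions, a 70B-parameter model needs 140 GB of weights at FP16 (more than one GPU holds), 70 GB at FP8, and 35 GB at FP4, which is one reason FP4 support matters for inference on a 96GB card.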

Purpose-built with a legacy for speed

As the first cloud provider to make RTX PRO 6000 Blackwell Server instances generally available, CoreWeave honors its legacy of delivering cutting-edge compute for AI pioneers, as demonstrated by its early leadership with NVIDIA HGX™ H200 and NVIDIA GB200 NVL72 deployments. Beyond CoreWeave’s ability to bring up the latest compute at record speeds, CoreWeave’s AI-optimized infrastructure is engineered to extract peak performance from NVIDIA Blackwell GPUs, including RTX PRO 6000.

Each CoreWeave RTX PRO 6000 instance supports configurations of up to 8 GPUs, paired with dual Intel Emerald Rapids CPUs and NVIDIA® BlueField®-3 DPUs to deliver more secure VPC isolation and low-latency networking in multi-tenant environments. NVIDIA BlueField offloads critical network tasks from the CPU, freeing compute resources exclusively for AI and GPU-demanding workloads while maintaining enterprise-grade security at scale. The instances include over 7TB of high-speed local NVMe storage, enabling rapid access to large models, datasets, and assets, and significantly accelerating AI inference and graphics-intensive workloads by reducing data retrieval latency. These instances will also be supported by CoreWeave’s Observability services, which offer granular monitoring of GPU utilization, system errors, temperatures, and other logs to help customers quickly detect and resolve issues and minimize workflow disruptions.
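To make the kind of per-GPU signals mentioned above concrete (utilization, temperature, memory), here is a minimal sketch that reads them with the standard `nvidia-smi` command-line tool. It does not use or represent CoreWeave’s Observability services, whose interface is not described here. The `GpuSample` class, the helper names, and the 85 °C threshold are hypothetical choices for this example, and the parser assumes every queried field comes back as a plain integer.

```python
import csv
import io
import subprocess
from dataclasses import dataclass

# Fields requested from nvidia-smi, in the order GpuSample expects them.
QUERY_FIELDS = "index,utilization.gpu,temperature.gpu,memory.used,memory.total"


@dataclass
class GpuSample:
    index: int
    utilization_pct: int
    temperature_c: int
    memory_used_mib: int
    memory_total_mib: int


def parse_nvidia_smi_csv(text: str) -> list[GpuSample]:
    """Parse `--format=csv,noheader,nounits` output, one GPU per line."""
    samples = []
    for row in csv.reader(io.StringIO(text), skipinitialspace=True):
        if not row:
            continue  # skip blank lines
        samples.append(GpuSample(*(int(value) for value in row)))
    return samples


def read_gpus() -> list[GpuSample]:
    """Query the local GPUs; requires nvidia-smi on the PATH."""
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY_FIELDS}", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_nvidia_smi_csv(result.stdout)


def hot_gpus(samples: list[GpuSample], limit_c: int = 85) -> list[int]:
    """Return indices of GPUs at or above a (hypothetical) temperature limit."""
    return [s.index for s in samples if s.temperature_c >= limit_c]
```

On a GPU node, `hot_gpus(read_gpus())` would list the indices of any GPUs at or above the chosen threshold. A production setup would more likely export these metrics to a monitoring system than poll them ad hoc.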

RTX PRO 6000-based instances are available through both CoreWeave Kubernetes Service (CKS) and Slurm on Kubernetes (SUNK), which work in tandem to simplify orchestration for containerized applications. CoreWeave AI Object Storage (CAIOS) and Local Object Transport Accelerator (LOTA) integrate with RTX PRO 6000 Blackwell-based instances to help ensure high-throughput data access and intelligent caching for large-scale training and inference pipelines.
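For readers unfamiliar with how a containerized workload asks Kubernetes for GPUs, the sketch below builds a generic Pod manifest that requests them through the `nvidia.com/gpu` extended resource exposed by the NVIDIA device plugin. This is plain Kubernetes, not a CKS-specific recipe: the function name, pod name, and image are hypothetical, and the node labels needed to target RTX PRO 6000 nodes specifically are not given in this article, so no node selector is included.

```python
import json


def gpu_pod_manifest(name: str, image: str, gpus: int) -> dict:
    """Build a minimal Pod manifest requesting `gpus` NVIDIA GPUs.

    The upper bound of 8 reflects the per-instance maximum stated in the article.
    """
    if not 1 <= gpus <= 8:
        raise ValueError("expected between 1 and 8 GPUs per instance")
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": "main",
                    "image": image,
                    # Extended resources such as GPUs are requested under limits.
                    "resources": {"limits": {"nvidia.com/gpu": gpus}},
                }
            ],
        },
    }


if __name__ == "__main__":
    # Hypothetical pod name and image, for illustration only.
    manifest = gpu_pod_manifest("inference-demo", "registry.example.com/inference:latest", 8)
    print(json.dumps(manifest, indent=2))
```

Because `kubectl` accepts JSON as well as YAML, the printed manifest could be piped to `kubectl apply -f -` against a cluster where the GPU resource is available.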

Every layer of the CoreWeave platform, from hardware to software, is fine-tuned to maximize GPU efficiency, allowing researchers and engineers to focus on innovation rather than infrastructure. This focus on AI efficiency has earned CoreWeave the #1 AI Cloud ranking by SemiAnalysis, including the exclusive Platinum ClusterMAX rating, validating our leadership in large-scale GPU cluster performance and reliability. By handling the complexities of optimization, CoreWeave enables customers to focus on breakthroughs, not bottlenecks.

Transform your AI innovations with Blackwell on CoreWeave

With the RTX PRO™ 6000 Blackwell Server Edition now available on CoreWeave in the US-EAST-04 region, enterprises can accelerate AI training, reduce latency, and scale deployments for next-gen AI and graphics workloads. By combining NVIDIA’s most advanced compute for AI and graphics with CoreWeave’s purpose-built cloud, customers gain the power and flexibility to scale their AI ambitions seamlessly.

Contact CoreWeave today to deploy your RTX PRO 6000 Blackwell instances and experience the #1 AI Cloud for yourself.

1 Llama3 70B Inference, NVIDIA preliminary performance projections, April 2025. 8K/256, 20 t/s/usr, 2s FTL, 8 GPU; RTX PRO 6000 (FP4) vs. L40S (FP8)

CoreWeave launches RTX PRO 6000 Blackwell instances—delivering unmatched AI, graphics, and LLM performance on the industry’s most advanced, secure, and scalable cloud infrastructure.

