Slash Storage Costs up to 75% With Automated Usage-Based Billing Levels
Source: Lower AI Storage Costs by 75% | CoreWeave
Your ML data isn’t just priceless. It’s also pricey. Each fine-tune spits out gigabytes of “just-in-case” artifacts that pile up alongside checkpoints, model weights, and datasets. Teams default to hot storage for these artifacts to avoid rehydration delays, but the costs pile up fast. That data sprawl also clogs your storage containers, slows down directory listings, and risks capacity alarms in the middle of a breakthrough.

Before you hit delete or archive your data, consider this: What if your object storage was built for AI and smart enough to automatically distinguish and lower the price for inactive data, without slowing reads or changing your pipeline?

In this post, we’ll explain how CoreWeave AI Object Storage’s automated, usage-based billing levels for AI checkpoints and other cold data can help curb storage costs and keep experiments moving. We’ll also walk through how CoreWeave AI Object Storage classifies inactivity and applies lower pricing in-place, delivering savings of up to, and in many cases exceeding, 75%.

Want the deep dive now? Download our ebook Beyond the Hot Tier: Cut AI Storage Costs, Accelerate Breakthroughs.

When hot storage becomes a budget fire

In large-scale AI pipelines, data volumes grow faster than teams anticipate. Every hyperparameter sweep, branch fork, or nightly experiment adds dozens of AI artifacts to your bucket.

In real AI workloads, the split skews heavily toward inactive data. No public benchmarks quantify inactive data in AI storage at scale today, but some industry trends suggest around 80% of data is inactive at any given time. Those numbers add up quickly in any environment. The table below (based on real-world numbers) shows inactive data ranging from 75% to 93% of the total footprint across three anonymized CoreWeave customer scenarios.
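
| Customer   | Total capacity | Active data   | Inactive data |
|------------|----------------|---------------|---------------|
| Customer A | 65 PB          | 5 PB (7%)     | 60 PB (93%)   |
| Customer B | 8 PB           | 2 PB (25%)    | 6 PB (75%)    |
| Customer C | 17.5 PB        | 3.5 PB (20%)  | 14 PB (80%)   |
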
Table: representative customer data mix (anonymized).

The challenge: Keeping everything “hot” explodes costs, computational overhead, and operational drag. Left unchecked, this sprawl not only inflates your object storage costs but also degrades performance, complicates audits, and eats into headroom for new data. The typical solutions, deleting or archiving, leave much to be desired.

Why deleting data isn’t the answer

It might seem intuitive to simply delete outdated AI checkpoints to rein in your storage costs. But then you lose that data for future AI training. In practice, manual deletion creates even more complexity:

- Loss of audit trails: Compliance or rollback requirements often demand complete histories of model snapshots.
- Inconsistent retention: Manual cleanup logic can leave gaps or remove critical checkpoints by mistake.
- Increased on-call load: Writing and maintaining deletion scripts adds operational debt and risk of downtime.

Rather than a one-off fix, deletion becomes an ongoing burden and offers zero guarantees that your pipeline remains reliable or auditable.

Where traditional cloud archiving falls short for AI workloads

Most cloud storage tiering features were designed for general-purpose object storage, not iterative ML. Teams end up wiring policies into code, absorbing extra line items on the bill, and accepting slower reads from archive tiers. These tradeoffs often push practitioners to keep more data hot.

While legacy clouds offer lifecycle-based “intelligent tiering” storage optimizations in an attempt to resolve this, these solutions tend to break under real AI workloads. Challenges with intelligent tiering include:

- The process is not fully automatic. It requires opt-in tagging and policies: developers must tag or classify data in code and maintain bucket policies to mark which checkpoints should be archived.
- You pay fees for the feature and for access. Hyperscalers often charge management fees to track access, plus request, retrieval, and egress fees when data moves between tiers.
- Performance takes a hit. Moving objects between tiers creates inconsistent read times; loading checkpoints from an archive or “inactive” tier can be slower, which nudges teams to keep more data in hot storage.

So what’s the best solution? It’s not deleting, not generic auto-archiving, and not even a separate storage tier. You need object storage for AI that does not charge you a premium for inactive data and does not degrade performance. Basically, you’re looking for a unicorn. Luckily, CoreWeave has the right solution.

CoreWeave AI Object Storage’s automated usage-based billing: Save up to 75% on storage costs

CoreWeave AI Object Storage uses automated usage-based billing to recognize inactive data based on real usage: no tags, no policies, no code changes. It simply bills that data at a lower rate, with the same high performance across every object.

How it works: CoreWeave AI Object Storage uses real-time access tracking to adjust billing across three transparent pricing levels, Hot ($0.06/GB/mo), Warm ($0.03/GB/mo), and Cold ($0.015/GB/mo), based on how frequently data is accessed. An object’s billing level automatically drops from Hot to Warm after seven days of inactivity, and from Warm to Cold after more than 30 days, all without any manual management or rehydration delay.
Every object remains fully accessible at full line-rate performance, with no retrieval, egress, or request fees.
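
To make the pricing levels concrete, here is a minimal Python sketch of the arithmetic described above. It is not CoreWeave's implementation: the function names are hypothetical, the boundary handling (Warm at exactly 7 idle days, Cold only past 30) is one reading of the wording, and it assumes inactivity is measured from the last access. The savings example further assumes decimal units (1 PB = 1,000,000 GB) and that all inactive data is billed at the Cold rate, which the customer table does not actually state.

```python
from datetime import datetime, timedelta, timezone

# Per-GB monthly rates quoted in the post (USD).
RATES_PER_GB_MONTH = {"hot": 0.06, "warm": 0.03, "cold": 0.015}

# Inactivity thresholds quoted in the post; boundary handling is an assumption.
WARM_AFTER_DAYS = 7
COLD_AFTER_DAYS = 30


def billing_level(last_access: datetime, now: datetime) -> str:
    """Return the assumed billing level for an object given its last access time."""
    idle_days = (now - last_access) / timedelta(days=1)
    if idle_days > COLD_AFTER_DAYS:
        return "cold"
    if idle_days >= WARM_AFTER_DAYS:
        return "warm"
    return "hot"


def monthly_cost(gb_by_level: dict[str, float]) -> float:
    """Sum the monthly storage cost for a mapping of level -> stored GB."""
    return sum(RATES_PER_GB_MONTH[level] * gb for level, gb in gb_by_level.items())


def savings_vs_all_hot(gb_by_level: dict[str, float]) -> float:
    """Fraction saved compared with billing every GB at the Hot rate."""
    total_gb = sum(gb_by_level.values())
    all_hot_cost = total_gb * RATES_PER_GB_MONTH["hot"]
    return 1 - monthly_cost(gb_by_level) / all_hot_cost


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(billing_level(now - timedelta(days=12), now))

    # Customer B from the table: 2 PB active, 6 PB inactive (assumed all Cold).
    customer_b = {"hot": 2 * 1_000_000, "cold": 6 * 1_000_000}
    print(f"{savings_vs_all_hot(customer_b):.2%}")
```

Under these assumptions, the 75% figure is the ceiling reached when every byte is billed at the Cold rate ($0.015 versus $0.06); a mix that still holds some Hot data lands below that against an all-Hot baseline on the same price list.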

How is this possible? Unlike legacy hyperscalers, CoreWeave AI Object Storage was built for AI workloads from the ground up. Technically, CoreWeave AI Object Storage doesn’t archive data or move it at all; it works out which data is inactive for you, and we don’t charge you a premium for that data.

The result is straightforward billing and predictable, low-latency access, whether a checkpoint was touched yesterday or last month. No rehydration. No code changes. No wasted spend.

Cost comparison of CoreWeave AI Object Storage vs. traditional object storage

Hidden storage…