NVIDIA/aicr v0.15.0
Repository: NVIDIA/aicr
Tag: v0.15.0
Published: 2026-06-15T19:54:42Z
Prerelease: no
Release notes: This release focuses on recipe health scoring, improved deployment validation, improved snapshot/discovery, and extended software supply chain capabilities for enterprise users.
Highlights
Recipe Structural Health - New pkg/health engine computes per-recipe health signals (chart_pinned, constraints_wellformed, declared_coverage) and rolls them up into a recipe-health matrix. aicr recipe list surfaces structural-health columns (with a --no-health opt-out), a tools/health generator and weekly recipe-health-refresh workflow keep the matrix current, and a lint guard now requires healthCheck.assertFile.
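To illustrate the idea of rolling per-recipe signals up into a matrix row, here is a minimal Go sketch. It is not the pkg/health API: the Status type, the rollup function, the recipe names, and the worst-of rule are all assumptions made for illustration. Only the signal names chart_pinned and constraints_wellformed come from the release notes.

```go
package main

import "fmt"

// Status is a hypothetical three-level result for one health signal.
type Status int

const (
	Pass Status = iota
	Warn
	Fail
)

// String renders the status as a lowercase word for table output.
func (s Status) String() string {
	return [...]string{"pass", "warn", "fail"}[s]
}

// rollup returns the worst status among the given signals.
// The worst-of rule is an assumption, not the documented behavior.
func rollup(signals map[string]Status) Status {
	worst := Pass
	for _, s := range signals {
		if s > worst {
			worst = s
		}
	}
	return worst
}

func main() {
	// Hypothetical recipes; signal names are taken from the release notes.
	matrix := map[string]map[string]Status{
		"recipe-a": {"chart_pinned": Pass, "constraints_wellformed": Pass},
		"recipe-b": {"chart_pinned": Fail, "constraints_wellformed": Pass},
	}
	for _, name := range []string{"recipe-a", "recipe-b"} {
		fmt.Printf("%s\t%s\n", name, rollup(matrix[name]))
	}
}
```

Under these assumptions, recipe-a rolls up to pass and recipe-b to fail, because a single failing signal dominates the row.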
Improved Deployment Validation - The chainsaw deployment-phase runner is now an in-process executor rather than a shelled-out binary. aicr validate runs all phases by default with a --fail-fast opt-in, fails closed on evaluator errors, and is nil-safe across health checks.
Snapshot/Discovery - The collector now discovers GPU SKUs without nvidia-smi, removing the CUDA base image dependency and matching SKUs on token boundaries instead of substrings.
Closed Supply Chain - Signing and verification now work end-to-end in air-gapped and enterprise environments. aicr bundle supports KMS-backed signing (--signing-key) and private Sigstore deployments (--fulcio-url, --rekor-url); aicr verify --key validates bundles against a KMS or public key; and aicr evidence publish signs recipe evidence off-network. The recipe catalog itself now ships signed provenance for the V1 closed supply chain, and keyless signing warns before publishing identity to the public transparency log.
New Recipes & Overlays
- A100 training Kubeflow overlay chains for EKS, AKS, GKE COS, and OKE
- GB300 concrete EKS service-bound overlays
- OKE GB200 and AKS H100 Dynamo performance checks
CLI & Bundling
- aicr recipe list subcommand for catalog enumeration
- Gatekeeper added as an optional component
Inference Performance & Validation
- Inference-performance validation enhanced and tuned; gated on all worker services Ready
- nccl-all-reduce-bw gates wired for EKS + H200; GKE NCCL node selector made dynamic
- Bounded absent-resource retries in deployment-phase health checks
*Thanks to* @atif1996, @cdesiniotis, @dims, @haarchri, @JaydipGabani, @lalitadithya, @lockwobr, @njhensley, @pdmack, @pedjak, @rsd-darshan, @sttts, @xdu31, @yuanchen8911, and @mchmarny.
Changelog
New Features
- cfb0cb06fbc3156c1b5cd9681f7d0826de0899df: feat(agentgateway): scope inference-gateway LB to allowed source ranges (#1138) (@yuanchen8911)
- 3b8b1f31fb75a25a7c62d3b0e814d5d4e0da5d95: feat(bundle): KMS-backed signing via --signing-key (#407) (#1205) (@lockwobr)
- 5316d3c08d9274caadb265c52293fd861378d74a: feat(bundle): private Sigstore via --fulcio-url and --rekor-url (#1158) (@lockwobr)
- b847f968144f3e0290d67cf1a3828337095319c5: feat(bundler): retry sign.Bundle on transient Sigstore failures (#1251) (@mchmarny)
- 3c1f525d76741dae40d9996aafd6b02b3224d8e9: feat(bundler): warn on open agentgateway inference-gateway exposure (#1163) (@yuanchen8911)
- cd30fe5630d06ef8c091fa9586f9fa589aa0bc70: feat(ci): add weekly recipe-health-refresh workflow (#1320) (@njhensley)
- 5f8647dd3ae68ddb1f5e25eed5d63d302060333f: feat(cli): add --no-health opt-out to recipe list (#1314) (@njhensley)
- f0490fbf1e575098da54fd0d962768173708f6b4: feat(cli): add --set-json/--set-file for list and object bundle overrides (#1162) (@yuanchen8911)
- b401339ca6531e3d2d1ac4159cadba6de4097d23: feat(cli): add structural-health columns to recipe list (#1302) (@njhensley)
- a47a53fa469747da3735137711d2dd3cc6eb9838: feat(cli): warn before keyless signing publishes identity to public log (#1300) (@njhensley)
- 97934ad1773b203e6f2be25eac91f26b5ca1180e: feat(collector): driver-free GPU SKU discovery; remove nvidia-smi + CUDA base (#1352) (@mchmarny)
- eb8728d3029112c5dc8f4b3d5f63cf8b333ef0f2: feat(coverage): generated CUJ/CLI coverage matrix (RQ3) (#1316) (@mchmarny)
- 1674ea4bd49c4bfd6e9551444c57c57d4efed352: feat(evidence): add aicr evidence publish for off-network signing (#1140) (@njhensley)
- e1b01600fd687ed391d288cea165fe5b52f6d863: feat(health): add tools/health generator and recipe-health matrix (#1304) (@njhensley)
- 8d0da78521ce2da7e2a628ab1f77665630e8dd44: feat(health): chart_pinned signal + declared_coverage descriptor (#1293) (@mchmarny)
- fa92b4375bbaf006216f5fe3b33a9b2464149a7b: feat(health): constraints_wellformed signal (parse-only, hermetic) (#1301) (@njhensley)
- 10b08f1c6c72029071a39d8ae187f97320ee9cfa: feat(health): pkg/health core Compute loop, resolves signal, rollup (#1291) (@mchmarny)
- 3e2f8230e18cbfbf997a7d3a93edf386b167aaf8: feat(recipe): add AKS H100 Dynamo perf check (#1232) (@yuanchen8911)
- 8f8bc56a00f9ec00f3c355efdb1e6b598c4dfa1d: feat(recipe): add OKE GB200 perf check (#1233) (@yuanchen8911)
- ec95e2001498624aa990d908533f1d17e8c72118: feat(recipe): add aicr recipe list subcommand for catalog enumeration (#1208) (@rsd-darshan)
- 57fbed0f8e55daf5199e429a28309c1765d4c078: feat(recipe): hydrate healthCheck.assertFile + suppression sentinel (#1231) (@mchmarny)
- ae6819c6c71b14c91ac26b7cdf44eae86f9e6c2b: feat(recipe): lint guard requiring healthCheck.assertFile + allowlist (#1244) (@mchmarny)
- 0bf2267c1f74cee213decd884f45cec2a5f0bf93: feat(recipe): signed catalog provenance for V1 closed supply chain (#1216) (@mchmarny)
- 463d6a1997ab40222975d243d2f6fcc16a0c1de8: feat(recipes): add A100 AKS training Kubeflow overlay chain (#1295) (@yuanchen8911)
- d8d3070292ca6662d6a31a9993531cb5b257fc04: feat(recipes): add A100 EKS training Kubeflow overlay chain (#1305) (@yuanchen8911)
- fd64dd7a55aae45862198aabcc180d3be25bbbba: feat(recipes): add A100 GKE COS training Kubeflow overlay chain (#1306) (@yuanchen8911)
- 6eb85ac43ae6030d223ec5a36617c34fd05368c2: feat(recipes): add A100 OKE training Kubeflow overlay chain (#1294) (@yuanchen8911)
- 4b817ce56c550115a1d319676f8ea3df6ea33721: feat(recipes): add concrete GB300 EKS service-bound overlays (#1319) (@yuanchen8911)
- cad014233eb63d9c033c4cc1d304239ce282d82e: feat(recipes): backfill chainsaw health checks for 5 missing components (#1243) (@mchmarny)
- 81daab3b13e607ed98dc9a496b63ad11b0590d26: feat(recipes): deepen 21 chainsaw health checks; close epic #660 (#1245) (@mchmarny)
*...