NVIDIA/NVSentinel v1.11.0
NVIDIA/NVSentinel
Captured source
source ↗Release v1.11.0
Repository: NVIDIA/NVSentinel
Tag: v1.11.0
Published: 2026-06-22T12:58:45Z
Prerelease: no
Release notes:
Release v1.11.0
This release improves the accuracy of Mean Time To Repair (MTTR) reporting by letting dashboards distinguish automated remediations from events that require manual intervention, and fixes a checkpoint-advancement bug in the fault-remediation reconciler that could silently drop live health events on cold start.
Major New Features
Recommended-action label on MTTR metrics (#1406)
The fault_quarantine_node_remediation_duration_excluding_drain_seconds MTTR histogram now carries a recommended_action label. Previously, nodes that required manual handling (e.g. a CONTACT_SUPPORT recommended action) could sit cordoned for hours before an operator acted, and that long idle time was bucketed alongside genuine automated remediations, inflating MTTR on Grafana dashboards. With the new label, dashboards can filter out CONTACT_SUPPORT and other manual events so MTTR reflects only automated remediations.
Bug Fixes & Reliability
- Fixed cold-start checkpoint advancement on document ID errors (#1411): Cold-start events are enqueued without resume tokens, but the document-ID error path in the fault-remediation reconciler called the watcher directly, where an empty token could resolve to the current MongoDB or PostgreSQL stream position and advance the checkpoint past events that had not yet been handled. Document-ID extraction failures are now routed through
safeMarkProcessed, so cold-start events are no longer incorrectly marked processed and remediation events are no longer silently lost.
Acknowledgments
This release includes contributions from:
- @fallintoplace
- @nitz2407
- @XRFXLP
- @lalitadithya
Thank you to everyone who contributed code, testing, documentation, design reviews, and feedback! Special thanks to first-time contributor @fallintoplace.
Container Images
See versions.txt for the full list of container images and versions.
Helm Chart
Install with:
helm install nvsentinel oci://ghcr.io/nvidia/nvsentinel \ --version v1.11.0 \ --namespace nvsentinel \ --create-namespace
To upgrade from v1.10.0:
helm upgrade nvsentinel oci://ghcr.io/nvidia/nvsentinel \ --version v1.11.0 \ --namespace nvsentinel \ --reuse-values
Notability
notability 4.0/10Routine release of NVIDIA's video analytics platform.