What does this release signal mean?

NVIDIA published DALI v2.0.0 in the NVIDIA/DALI repository on 2026-03-03T16:32:22Z as a full (non-prerelease) release. This release signal is evidence of what shipped, changed, or was packaged for users. High-signal detail: a major release of a popular data-loading library for deep learning. onlylabs links this event to 1 captured evidence page and 6 related release signals.

NVIDIA Release: NVIDIA/DALI v2.0.0

Captured source

GitHub/github.com/NVIDIA/DALI

NVIDIA/DALI v2.0.0

Published Mar 3, 2026 · seen Jun 9 · captured Jun 11 · HTTP 200 · method: plain

DALI v2.0.0

Repository: NVIDIA/DALI

Tag: v2.0.0

Published: 2026-03-03T16:32:22Z

Prerelease: no

Release notes:

Key Features and Enhancements ---

This DALI release includes the following key features and enhancements:

Improved DALI dynamic mode:
Added asynchronous and deferred execution (#6210, #6204, #6124, #6216, #6152)
Improved multithreading, supporting no-gil Python 3.13t and Python 3.14. (#6200, #6174, #6136, #5884, #6201, #6202, #6164, #6142)
Added TorchData integration (#6198)
Improved usability and interoperability with other libraries (#6131, #6182, #6188, #6172, #6179, #6143)
Improved execution device specification and handling (#6194, #6165)
Improved examples and documentation (#6140, #6189, #6170)
Added contrast-limited adaptive histogram equalization (CLAHE) operator (#6069)
Thank you @tonyreina for your contribution!
Added support for CUDA 13.1U1 (#6163)
Improved slice, full, zeros, ones operators (#6159, #6109, #6169)
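
The notes mention support for free-threaded ("no-gil") Python 3.13t. As a minimal sketch of how a script might check which kind of interpreter it is running on before relying on multithreaded data loading, the snippet below uses only the standard library. The helper names `is_free_threaded_build` and `gil_enabled` are illustrative and are not part of DALI.

```python
import sys
import sysconfig

def is_free_threaded_build() -> bool:
    """True if this interpreter was compiled as a free-threaded build."""
    # Py_GIL_DISABLED is set to 1 for free-threaded builds (e.g. python3.13t);
    # on other builds it is 0 or missing.
    return bool(sysconfig.get_config_var("Py_GIL_DISABLED"))

def gil_enabled() -> bool:
    """True if the GIL is active right now in this process."""
    # sys._is_gil_enabled() exists only on Python 3.13+; older versions
    # always run with the GIL, so fall back to True.
    check = getattr(sys, "_is_gil_enabled", None)
    return True if check is None else bool(check())

print("free-threaded build:", is_free_threaded_build())
print("GIL enabled:", gil_enabled())
```

A free-threaded build can still run with the GIL re-enabled at runtime, which is why the two checks are kept separate.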

Fixed Issues ---

Added DALI_MAX_IMAGE_SIZE env var to limit decoded image size in CPU and GPU decoders. (#6208)
Fixed out-of-bounds reads in image format detection. (#6207)
Fixed audio decoder handling of files over 2GB. (#6199)
Fixed random crop operators conforming to new random state passing. (#6190)
Fixed displacement filter occasionally returning corrupted data due to missing synchronization. (#6168)
Replaced pickle with JSON in DALI checkpoints format. (#6154)
Fixed slicing with negative stride. (#6161)
Fixed memory leak (#6153) in fixed-size pool allocator. (#6158)
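
The notes list the replacement of pickle with JSON in the checkpoint format (#6154) without stating a motivation. One commonly cited reason for such a change is that unpickling untrusted data can execute arbitrary code, whereas JSON parsing yields only plain data. The sketch below illustrates that general idea with the standard library. It is not DALI's checkpoint schema: the field names `epoch`, `seed`, and `reader_position` and both helper functions are hypothetical.

```python
import json

# Hypothetical checkpoint helpers; the field names used below are
# illustrative only and do not reflect DALI's actual checkpoint format.
def save_checkpoint(state: dict) -> str:
    """Serialize checkpoint state to a JSON string (plain data only)."""
    return json.dumps(state, sort_keys=True)

def load_checkpoint(text: str) -> dict:
    """Parse a checkpoint string back into a dict."""
    # json.loads builds only dicts, lists, strings, numbers, booleans and
    # None, so loading a checkpoint cannot trigger code execution.
    state = json.loads(text)
    if not isinstance(state, dict):
        raise ValueError("checkpoint must be a JSON object")
    return state

state = {"epoch": 3, "seed": 1234, "reader_position": [0, 512]}
restored = load_checkpoint(save_checkpoint(state))
```

The trade-off is that JSON cannot represent arbitrary Python objects, so any state stored this way has to be reduced to plain values first.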

Improvements ---

Add a function that purges operator instance cache for an EvalContext. (#6216)
Add TorchData integration in dynamic mode and create examples (#6198)
Add exception propagation for deferred and async execution (#6210)
Update VERSION to 2.0.0
Add ndd.Stream.synchronize method and implement EvalMode.sync_full (#6204)
ndd vs fn tests part 1: utils and automated tests (#6191)
Add multithreading guide for dynamic mode (#6200)
Limit thread count to 32 in ndd multithreading tests. (#6201)
Fix the conda tests in free threaded env (#6202)
Improved device handling. Remove mixed device. Make DALI work without GPU (#6194)
Replace deprecated pkg_resources.require with packaging/importlib-based alternative (#6196)
Add first class batch to tensor conversion with optional padding (#6182)
Make DALI Dynamic and Pipeline APIs two separate sections (#6189)
Documentation for ndd.DType (#6170)
Add multithreaded tests for dynamic mode (#6164)
Exclude ndd readers from operator docs (#6173)
Update DALI_DEPS: libsound, openssl (#6185)
Broadcast lists of scalars into any shape in ArgValue. (#6188)
Add per-thread stream. Rework stream semantics. Add a real Python stream class. (#6174)
Hide deprecated operators from documentation (#6180)
Fix jupyter tests (#6184)
Move to CUDA 13.1U1 (#6163)
Improve the interoperability of dynamic mode with PyTorch (#6172)
Remove debug mode references from documentation (#6175)
Create examples showing ndd usage (#6140)
Add __str__ and __repr__ generic formatting utilities (#6167)
Add layout handling to full, zeros, ones operator family (#6159)
Make EvalMode.eager the default (#6152)
Default num_threads and stream for dynamic API (#6165)
Dependency update 2026-02 (#6155)
Unexperimentalize operators (#6134)
Adjust performance threshold for dynamic mode in TL1_decoder_perf (#6160)
Update PyTorch Lightning example notebook (#6145)
Fix O_DIRECT expected to read number of bytes numpy reader (#6148)
Add pkg_resources compatibility fallback using importlib.metadata (#6144)
Relax numpy version constraints (#6137)
Move inflate from experimental to decoders, fix doc hiding for ndd, bump deprecation cut-off for ndd to 2.0 (#6141)
Support asynchronous execution in dynamic mode. (#6124)
Fix conda free-threaded Python build (#6142)
Add experimental Python 3.14 support and remove Python 3.9 (#6136)
Add dynamic mode RN50 pipeline to hw decoder bench (#6115)
Add --no-build-isolation flag to cocoapi pip install (#6132)
Improve interoperability of ndd tensors with third party libraries (#6131)
Fix cuFFT linking to respect BUILD_FFTS option (#6135)
Enable cross-device copy with cudaMemcpyPeerAsync. (#6130)
Add support for Python 3.13t (#5884)
Upgrade GitHub Actions for Node 24 compatibility (#6133)
Add PyTorch DataLoader Evaluator plugin (#6112)
Hide ops API (#6123)
Add the information of deprecation version origin (#6127)
Change the defaults for build options in docker/build_helper.sh (#6129)
Allow non-copying TensorList construction from a list of tensors. (#6128)
Move all internal ndd API class/object public members to private (#6120)
Support more border modes in Slice (#6109)
Contrast-limited adaptive histogram equalization (CLAHE) to DALI image operators (#6069)
Add USE_PREBUILD_PYBIND11 option to use system pybind11 (#6117)
Drop Python 3.9 support (#6119)
Move to CUDA 13.1 (#6116)
Remove old eager mode. (#6113)
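
Two entries above (#6196, #6144) move away from the deprecated `pkg_resources` API toward `importlib`-based lookups. The notes do not show the actual replacement code, so the following is only a generic sketch of that pattern using the standard-library `importlib.metadata` module. The helper names `installed_version` and `require_installed` are hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version

def installed_version(dist_name: str):
    """Return the installed version string of a distribution, or None."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None

def require_installed(dist_name: str) -> str:
    """Rough stand-in for an existence check done via pkg_resources.require."""
    found = installed_version(dist_name)
    if found is None:
        raise RuntimeError(f"required distribution {dist_name!r} is not installed")
    return found

# A name that is very unlikely to be installed, used only for demonstration.
print(installed_version("definitely-not-a-real-dist-xyz-123"))
```

Unlike `pkg_resources.require`, this sketch does not evaluate version specifiers; comparing versions would need extra logic, for example the third-party `packaging` library that the entry for #6196 mentions.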

Bug Fixes ---

Allocate CPU outputs in host order. Reset workspace order to host whe… (#6217)
Fix workspace stream handling in CPU imgcodec decoders. (#6215)
Add missing pillow installation in TL0_self_test_Ampere test (#6213)
Add DALI_MAX_IMAGE_SIZE env var to limit decoded image size in CPU and GPU decoders (#6208)
Accept more types in BBoxRotate input_shape argument. (#6212)
Fix out-of-bounds reads in image format detection (#6207)
Rework instance cache. (#6206)
Use notify_all instead of notify for EvalMode.async (#6205)
Fix dynamic mode pyi files (#6187)
Add sharding support to dynamic mode Reader (#6197)
Fix audio decoder to support files over 2GB (#6199)
Improve type hints in dynamic mode (#6183)
Safely calling Operator._init_spec in invocation.py (#6193)
Rework random crop operators (#6190)
Fix batch creation from unevaluated tensors (#6178)
Forbid passing axes to expand_dims as an input. (#6181)
Fix stream handling in tensor join when called from Dynamic mode. (#6171)
Fix batch construction from a tensor and layout. Add ability to change batch layout in batch and as_batch. (#6179)
Add handling of default layouts in standalone operator calls. (#6176)
Prevent deadlocks with asynchronous execution (#6177)
Set the device of ndd tensor slices (#6169)
Add missing __syncthreads in displacement filter. (#6168)
Use JSON in pipeline checkpointing (#6154)
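
The entry "Use notify_all instead of notify for EvalMode.async" (#6205) refers to a general condition-variable pattern: when several threads wait on the same condition, `notify` wakes only some of them by default, while `notify_all` wakes every waiter. The sketch below shows that pattern with Python's `threading.Condition`. It is a generic illustration and not DALI's internal code; all names are hypothetical.

```python
import threading

cond = threading.Condition()
state = {"ready": False}
woken = []

def waiter(idx: int) -> None:
    with cond:
        # wait_for re-checks the predicate, so a notification sent before
        # this thread starts waiting is not lost.
        cond.wait_for(lambda: state["ready"])
        woken.append(idx)

threads = [threading.Thread(target=waiter, args=(i,)) for i in range(4)]
for t in threads:
    t.start()

with cond:
    state["ready"] = True
    # notify_all wakes every thread blocked on the condition; a bare
    # cond.notify() would by default wake a single waiter and could leave
    # the remaining ones blocked.
    cond.notify_all()

for t in threads:
    t.join(timeout=5)

print(sorted(woken))
```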

*...

Excerpt shown — open the source for the full document.

Notability

notability 7.0/10

Major release of popular data-loading library for DL