Release: NVIDIA, published Oct 27, 2025

NVIDIA/DALI v1.52.0

DALI v1.52.0

Repository: NVIDIA/DALI

Tag: v1.52.0

Published: 2025-10-27T14:10:55Z

Prerelease: no

Release notes:

Key Features and Enhancements ---

This DALI release includes the following key features and enhancements:

  • Introduced experimental Dynamic Mode: imperative execution model with lazy evaluation for easier integration into Python workflows. (#6066, #6064, #6060, #6056, #6042, #6039, #6037, #6036, #5954)
  • Dynamic mode: add augmentation gallery (#6057)
  • DALI Dynamic docs main page (#6052)
  • Added pipeline ZOO - snippets and examples for common image and video processing use cases. (#5922)
  • Added support for CUDA 13.0 U2 (#6063)
  • Added fn.decoders.numpy (#5953) and CPU fn.paste operators (#5968).

Thank you @5had3z for your contributions.

  • Exposed knobs for the pipeline dynamic executor:
      • Exposed the executor's stream_policy and concurrency options (#5983)
      • Added an environment variable to control the number of executor threads. (#5949)

Fixed Issues ---

  • Fixed stream ordering in Tensor::Copy and Tensor(List)GPU.as_cpu (#6070)
  • Fixed conversion of pinned tensors to DLPack. (#6061)
  • Fixed the DLPack stride check so that it is skipped when the stride pointer is NULL
  • Fixed handling of videos without keyframes and reuse of old indices (#6058)
  • Fixed resize_crop_mirror video output shape (#5957)

Improvements ---

  • Update to FFmpeg 8.0
  • Dynamic mode: add augmentation gallery (#6057)
  • Add dynamic API for math functions + tests. (#6066)
  • Rename DALI2 to dynamic (#6064)
  • Move to CUDA 13.0 U2 (#6063)
  • Dynamic mode: operator base classes and operator call generator (#6060)
  • Update VERSION to 1.52.0
  • Update deps 25/10 (#6053)
  • Dynamic Mode: Tensor and Batch Types (#6056)
  • Remove CMake from acknowledgements. (#6020)
  • DALI Dynamic docs main page (#6052)
  • Reduce minimum throughput for experimental decoder in TL1_decoder_perf (#6050)
  • Fix TL0_video_plugin to run with sanitizer (#6040)
  • Imperative mode: Invocation (#6042)
  • Update LD_PRELOAD in sanitizer configuration, exclude more numba tests (#6041)
  • Imperative mode: EvalContext, EvalMode, Type and Device (#6039)
  • Update the test environment to Ubuntu 24.04 (#6033)
  • Update curl 3.15 -> 3.16 (#6038)
  • Add TensorList broadcasting constructor. (#6037)
  • Backend changes for imperative mode (#6036)
  • Add nvcc/nvjitlink version compatibility check to numba CUDA test (#6035)
  • Unify minimum required CMake version. (#6022)
  • Fix installation of Horovod in TL1_tensorflow-dali_test (#6024)
  • Remove confusing warning on host decoder fallback (#6029)
  • Add stream argument to TensorGPU DLPack constructor. (#6015)
  • Cumulative dependency update for September 2025. (#6017)
  • Silence false warnings in sanitized build (#6018)
  • Lower the 5% threshold in image decoder perf test to 15% to account for off iterations (#6021)
  • Bump CMake to 3.25.2 (#6019)
  • Move to CUDA 13.0 U1 (#6016)
  • Move to the gcc-toolset-14 (#6014)
  • Update test packages (#6010)
  • Correct support matrix entry for Orin (#6008)
  • Silence a false positive warning triggered by GCC 12.2.1 (#6002)
  • Fix CVE-2024-13978 and CVE-2025-8534 in libtiff (#6007)
  • Bump up OpenCV version to 4.12 in conda (#6005)
  • Move to the latest nvJPEG2k (#6000)
  • Enable more aggressive binary compression (#6001)
  • Use subprocess.run in get_tf_compiler_version to avoid CalledProcessError on grep (#5991)
  • Add functions that change the type of the tensor or tensor list to a different type of the same size. (#5995)
  • Update OpenCV version in tests (#5987)
  • Improve performance of experimental.resize (#5662)
  • Expose executor policy flags (#5983)
  • Pin CMake to max 4.0.3 in jupyter_conda tests. (#5985)
  • Add driver version check to the usage of numba_cuda (#5982)
  • Fix nvComp installation in tests (#5984)
  • Update DALI_DEPS_VERSION to use patched libtiff (#5981)
  • Improve creating image batches in CV-CUDA ops (#5966)
  • Dependency update 07-2025 (#5978)
  • Make the numba operator compatible with the numba-cuda package (#5975)
  • Adjust TF plugin build dependencies (#5976)
  • fn.paste CPU impl (#5968)
  • Make sure that protobuf always uses own absl version instead of system one (#5974)
  • Thread pool with semaphore and spinlock (#5970)
  • Extend GetInputDevice in OpSchema python bindings. (#5972)
  • Remove data preparation instructions from the video superres use case (#5965)
  • Added fn.decoders.numpy (#5953)
  • Pipeline zoo - initial commit (#5922)
  • Expose Stream, Operator and Workspace in Python (#5954)
  • Fix nvcc not working with sanitizer (#5959)
  • Make the number of dynamic executor threads configurable via environment variables. (#5949)

Bug Fixes ---

  • Fix stream ordering in Tensor::Copy and Tensor(List)GPU.as_cpu
  • Fix conversion of pinned tensors to DLPack. (#6061)
  • Fix DLPack tests to use HWC layout instead of NHWC (#6062)
  • Fix handling of videos without keyframes and reuse of old indices (#6058)
  • Refactor layout handling in Python backend + add layout dimensionality checks in Tensor and TensorList python bindings (#6054)
  • Fix standalone op output streams. (#6055)
  • Remove EvalContext destructor. (#6043)
  • Fix static analysis issues (#6032)
  • Install newer CMake in TL0_jupyter (#6034)
  • Disable PYBIND11_FINDPYTHON in CMakeLists.txt (#6031)
  • Remove a custom patch for PyCuda, add numba_cuda version constraint (#6023)
  • Bugfix: Skip DLPack stride check if stride pointer is NULL
  • Improve error handling in ThreadPool (#6011)
  • Fix test_backend_impl launch command (#6003)
  • Remove unnecessary default values from optional arguments. (#5992)
  • Add missing backslash in test scripts. (#5986)
  • Fix outdated DALI manylinux tag (#5980)
  • resize_crop_mirror - invalid video output shape fix (#5957)

Breaking API changes ---

There are no breaking changes in this DALI release.

Deprecated features ---

No features were deprecated in this release.

Known issues ---

  • In some cases, pass-through outputs of a parallel external source (ES) may be corrupted when used with the pipelined dynamic executor. The issue occurs only when all four of the following conditions are met:
      1. the pipeline uses the dynamic executor, exec_dynamic=True (the default),
      2. the external_source runs in parallel mode (parallel=True),
      3. the ES output is returned directly from the pipeline,
      4. the ES output is a single contiguous chunk of memory (either batch=True or batch_size=1).
    As a workaround, specify exec_dynamic=False when instantiating the pipeline, or add an extra fn.copy so that the ES output is not returned directly from the pipeline.
  • A problem with insufficient static TLS…
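
The two workarounds for the parallel external source issue listed above can be sketched roughly as follows. This is a minimal illustration, not code from the release. The helper names (passthrough_es_at_risk, sample_source, build_safe_pipeline) are hypothetical, and the pipeline parameters (num_threads, py_num_workers, a CPU-only device_id=None) are assumptions chosen only to keep the example small.

```python
def passthrough_es_at_risk(exec_dynamic=True, parallel=False,
                           returned_directly=False, single_chunk=False):
    """Hypothetical helper: True only if all four conditions of the known issue hold."""
    return bool(exec_dynamic and parallel and returned_directly and single_chunk)


def sample_source(sample_info):
    # Per-sample callable, the form used with parallel=True and batch=False.
    import numpy as np  # local import: only needed when the pipeline actually runs
    return np.full((2, 2), sample_info.idx_in_epoch, dtype=np.int32)


def build_safe_pipeline(batch_size=1, use_legacy_executor=False):
    # DALI is imported lazily so the predicate above is usable without it installed.
    from nvidia.dali import fn, pipeline_def

    @pipeline_def(batch_size=batch_size, num_threads=2, device_id=None,
                  py_num_workers=1, py_start_method="spawn",
                  # Workaround 1: optionally fall back to exec_dynamic=False.
                  exec_dynamic=not use_legacy_executor)
    def pipe():
        data = fn.external_source(source=sample_source, parallel=True, batch=False)
        # Workaround 2: an extra copy, so the ES output is not returned directly.
        return fn.copy(data)

    return pipe()
```

Either workaround alone should be enough according to the note above. The sketch applies fn.copy unconditionally and leaves the executor switch as an option, since the copy keeps the default executor in place.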