Release: NVIDIA, published May 14, 2025

NVIDIA/MatX v0.9.1

v0.9.1

Repository: NVIDIA/MatX

Tag: v0.9.1

Published: 2025-05-14T15:43:42Z

Prerelease: no

Release notes:

Sparse support + bugfixes

  • New operators: argminmax, dense2sparse, sparse2dense, interp1, normalize, argsort
  • Removed requirement for --relaxed-constexpr
  • Added MatX NVTX domain
  • Significantly improved speed of svd and inv
  • Python integration sample
  • Experimental sparse tensor support (SpMM and solver routines supported)
  • Significantly reduced FFT memory usage
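To illustrate what a dense2sparse / sparse2dense round trip does conceptually, here is a minimal sketch in plain standard C++ using the coordinate (COO) format. It is not MatX code: the `CooMatrix` struct and the `dense_to_coo` / `coo_to_dense` helpers are hypothetical names made up for this example, and the release notes do not say which storage formats or conversion strategy MatX uses internally.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical COO container for illustration only; not MatX's sparse tensor type.
struct CooMatrix {
  std::size_t rows = 0;
  std::size_t cols = 0;
  std::vector<std::size_t> row_idx;  // row index of each stored value
  std::vector<std::size_t> col_idx;  // column index of each stored value
  std::vector<float> values;         // the stored nonzero values
};

// Scan a row-major dense matrix and keep only its nonzero entries.
CooMatrix dense_to_coo(const std::vector<float>& dense,
                       std::size_t rows, std::size_t cols) {
  assert(dense.size() == rows * cols);
  CooMatrix m;
  m.rows = rows;
  m.cols = cols;
  for (std::size_t r = 0; r < rows; ++r) {
    for (std::size_t c = 0; c < cols; ++c) {
      const float v = dense[r * cols + c];
      if (v != 0.0f) {
        m.row_idx.push_back(r);
        m.col_idx.push_back(c);
        m.values.push_back(v);
      }
    }
  }
  return m;
}

// Expand a COO matrix back into a zero-filled row-major dense buffer.
std::vector<float> coo_to_dense(const CooMatrix& m) {
  std::vector<float> dense(m.rows * m.cols, 0.0f);
  for (std::size_t k = 0; k < m.values.size(); ++k) {
    dense[m.row_idx[k] * m.cols + m.col_idx[k]] = m.values[k];
  }
  return dense;
}
```

Calling `dense_to_coo` on a 2x3 matrix with two nonzero entries stores two (row, column, value) triples, and `coo_to_dense` reproduces the original buffer. The storage saving only pays off when most entries are zero.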

What's Changed

  • Moving definition of CUB cache up by @cliffburdick in https://github.com/NVIDIA/MatX/pull/771
  • Added documentation of memory types by @cliffburdick in https://github.com/NVIDIA/MatX/pull/770
  • Cleaning up non-const operator() to avoid code duplication by @cliffburdick in https://github.com/NVIDIA/MatX/pull/769
  • Switch to CUB/Thrust backend for cuda executor argmax by @tmartin-gh in https://github.com/NVIDIA/MatX/pull/772
  • Refactor cub argmax to generic cub reduce, use for argmin. Fixes #774. by @tmartin-gh in https://github.com/NVIDIA/MatX/pull/776
  • Change any() and all() to use CUB's reduce by @tmartin-gh in https://github.com/NVIDIA/MatX/pull/777
  • Add argminmax operator by @tmartin-gh in https://github.com/NVIDIA/MatX/pull/778
  • Fix matx::HostExecutor segfault with argmin/argmax by @tmartin-gh in https://github.com/NVIDIA/MatX/pull/780
  • Added new cusolverDnXsyevBatched API for batched eigen calls for CTK 12.6.2 and up by @cliffburdick in https://github.com/NVIDIA/MatX/pull/781
  • cub.h CUDACC guards for custom ops by @nvjonwong in https://github.com/NVIDIA/MatX/pull/782
  • Add example compiled with host compiler to catch regressions. by @tmartin-gh in https://github.com/NVIDIA/MatX/pull/783
  • Remove relaxed constexpr by @cliffburdick in https://github.com/NVIDIA/MatX/pull/775
  • Cleanup versions.json so jq can parse it. by @alliepiper in https://github.com/NVIDIA/MatX/pull/785
  • Allow rapids-cmake's version file to be overridden. by @alliepiper in https://github.com/NVIDIA/MatX/pull/786
  • Update rapids-cmake (branch-24.12@03ec7ef) by @alliepiper in https://github.com/NVIDIA/MatX/pull/787
  • Created MatX NVTX domain by @cliffburdick in https://github.com/NVIDIA/MatX/pull/784
  • Update docs github action by @tmartin-gh in https://github.com/NVIDIA/MatX/pull/789
  • Update docs github action by @tmartin-gh in https://github.com/NVIDIA/MatX/pull/790
  • Work around compiler parser bug by @cliffburdick in https://github.com/NVIDIA/MatX/pull/791
  • Updating developer documentation by @cliffburdick in https://github.com/NVIDIA/MatX/pull/793
  • Modify concat op to enable concatenating float3. by @nvjonwong in https://github.com/NVIDIA/MatX/pull/792
  • Fix rapids cmake by @alliepiper in https://github.com/NVIDIA/MatX/pull/799
  • Switched to getRs instead of getRi for faster inverse by @cliffburdick in https://github.com/NVIDIA/MatX/pull/797
  • Update CMakeLists.txt by @cliffburdick in https://github.com/NVIDIA/MatX/pull/801
  • Support half precision R2C transforms by @cliffburdick in https://github.com/NVIDIA/MatX/pull/796
  • Fix gcc13 erroneous warning by @cliffburdick in https://github.com/NVIDIA/MatX/pull/802
  • fixed missing forwarding code for allocate by @aartbik in https://github.com/NVIDIA/MatX/pull/804
  • Fix bug with eye, and also zero workspace before LU factorization by @cliffburdick in https://github.com/NVIDIA/MatX/pull/807
  • Change shape_type for the remap op by @nvjonwong in https://github.com/NVIDIA/MatX/pull/806
  • Faster batched SVD for small sizes by @cliffburdick in https://github.com/NVIDIA/MatX/pull/805
  • Fixing broadcasting in all operator() by @cliffburdick in https://github.com/NVIDIA/MatX/pull/795
  • Add a better error on memory allocation failure by @cliffburdick in https://github.com/NVIDIA/MatX/pull/808
  • Fix solver interfaces to use executor in cache by @cliffburdick in https://github.com/NVIDIA/MatX/pull/809
  • Python integration sample by @tmartin-gh in https://github.com/NVIDIA/MatX/pull/812
  • Fixes for clang17 errors/warnings by @cliffburdick in https://github.com/NVIDIA/MatX/pull/815
  • Misc Cleanup by @tmartin-gh in https://github.com/NVIDIA/MatX/pull/814
  • frexp_fix by @cliffburdick in https://github.com/NVIDIA/MatX/pull/817
  • Adding structures needed for sparse support by @cliffburdick in https://github.com/NVIDIA/MatX/pull/819
  • fix missing newline at EOF (to avoid future diff issues) by @aartbik in https://github.com/NVIDIA/MatX/pull/822
  • add size() to container storage by @aartbik in https://github.com/NVIDIA/MatX/pull/824
  • minor edit for sparse (layout and proper swap def) by @aartbik in https://github.com/NVIDIA/MatX/pull/820
  • add a to-string method for memory space by @aartbik in https://github.com/NVIDIA/MatX/pull/823
  • Cleanup cmake usage when MatX is a dependent project by @tmartin-gh in https://github.com/NVIDIA/MatX/pull/827
  • Fixing warnings issues by clang-19, both host and device by @cliffburdick in https://github.com/NVIDIA/MatX/pull/825
  • Update build_docs actions to newest. Add CI_RUN_DATETIME in version.rst by @tmartin-gh in https://github.com/NVIDIA/MatX/pull/829
  • introduce a versatile sparse tensor type to MatX (experimental) by @aartbik in https://github.com/NVIDIA/MatX/pull/821
  • Add initial tiff support by @tmartin-gh in https://github.com/NVIDIA/MatX/pull/831
  • Make dim2lvl translation for printing more in the style of MatX by @aartbik in https://github.com/NVIDIA/MatX/pull/832
  • Expose tensor format (and lvl specs) to sparse tensor data by @aartbik in https://github.com/NVIDIA/MatX/pull/833
  • Add cross product operator by @mfzmullen in https://github.com/NVIDIA/MatX/pull/818
  • remove LVL depth restriction with constexpr templating by @aartbik in https://github.com/NVIDIA/MatX/pull/834
  • Guard all DIM/LVL recursion against completely empty format by @aartbik in https://github.com/NVIDIA/MatX/pull/835
  • Adjust half-type threshold for cross product unit tests by @mfzmullen in https://github.com/NVIDIA/MatX/pull/838
  • Added fp32 version of normcdf by @cliffburdick in https://github.com/NVIDIA/MatX/pull/839
  • Changing black scholes to float and improving performance by @cliffburdick in https://github.com/NVIDIA/MatX/pull/840
  • Implement the () operator on sparse tensors by @aartbik in https://github.com/NVIDIA/MatX/pull/837
  • Support operators into einsum interface by @cliffburdick in https://github.com/NVIDIA/MatX/pull/845
  • Add print function with nonzero dim args by @tbensonatl in https://github.com/NVIDIA/MatX/pull/844

*…
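The entry "Switched to getRs instead of getRi for faster inverse" (pull/797) presumably refers to the LAPACK-style routine names: getri computes an inverse directly from an LU factorization, while getrs solves a linear system using one. Assuming that reading, the idea is to obtain the inverse by solving A * X = I. The sketch below shows that technique in plain standard C++; `invert_via_lu_solve` is a hypothetical helper written for this example and is not the MatX or cuSOLVER implementation.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Invert a square matrix by LU-factorizing it with partial pivoting and then
// solving A * X = I one column at a time ("factor, then solve").
Matrix invert_via_lu_solve(Matrix a) {
  const std::size_t n = a.size();
  for (const auto& row : a) assert(row.size() == n);

  // perm[i] records which original row ended up in position i.
  std::vector<std::size_t> perm(n);
  for (std::size_t i = 0; i < n; ++i) perm[i] = i;

  // In-place LU: multipliers of L below the diagonal, U on and above it.
  for (std::size_t k = 0; k < n; ++k) {
    std::size_t pivot = k;
    for (std::size_t i = k + 1; i < n; ++i) {
      if (std::fabs(a[i][k]) > std::fabs(a[pivot][k])) pivot = i;
    }
    if (a[pivot][k] == 0.0) throw std::runtime_error("matrix is singular");
    std::swap(a[k], a[pivot]);
    std::swap(perm[k], perm[pivot]);
    for (std::size_t i = k + 1; i < n; ++i) {
      a[i][k] /= a[k][k];
      for (std::size_t j = k + 1; j < n; ++j) a[i][j] -= a[i][k] * a[k][j];
    }
  }

  Matrix inv(n, std::vector<double>(n, 0.0));
  std::vector<double> y(n);
  for (std::size_t col = 0; col < n; ++col) {
    // Forward substitution: L * y = P * e_col.
    for (std::size_t i = 0; i < n; ++i) {
      y[i] = (perm[i] == col) ? 1.0 : 0.0;
      for (std::size_t j = 0; j < i; ++j) y[i] -= a[i][j] * y[j];
    }
    // Back substitution: U * x = y, where x is column `col` of the inverse.
    for (std::size_t i = n; i-- > 0;) {
      double s = y[i];
      for (std::size_t j = i + 1; j < n; ++j) s -= a[i][j] * inv[j][col];
      inv[i][col] = s / a[i][i];
    }
  }
  return inv;
}
```

For the 2x2 matrix {{4, 7}, {2, 6}} this yields approximately {{0.6, -0.7}, {-0.2, 0.4}}. Whether the solve-based route is faster than a dedicated inverse routine depends on the library and hardware; the release notes report a speedup for MatX but give no figures.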
