NVIDIA/MatX v0.9.1
Repository: NVIDIA/MatX
Tag: v0.9.1
Published: 2025-05-14T15:43:42Z
Prerelease: no
Release notes:
Sparse support + bugfixes
- New operators: argminmax, dense2sparse, sparse2dense, interp1, normalize, argsort
- Removed requirement for --relaxed-constexpr
- Added MatX NVTX domain
- Significantly improved speed of svd and inv
- Python integration sample
- Experimental sparse tensor support (SpMM and solver routines supported)
- Significantly reduced FFT memory usage
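
For readers unfamiliar with the new argminmax operator, the sketch below shows the kind of result such a reduction produces: the minimum and maximum values together with their positions, found in a single pass. This is plain standard C++ written for illustration only. It does not use or reproduce the MatX API, and the names ArgMinMax and arg_min_max are hypothetical. The tie-breaking rule (first occurrence wins) is also an assumption of this sketch, not a statement about MatX behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical result type for illustration; not a MatX type.
struct ArgMinMax {
    float min_value;
    std::size_t min_index;
    float max_value;
    std::size_t max_index;
};

// Single pass over the input, tracking both extremes and the index of
// their first occurrence. The input must be non-empty.
ArgMinMax arg_min_max(const std::vector<float>& data) {
    assert(!data.empty());
    ArgMinMax r{data[0], 0, data[0], 0};
    for (std::size_t i = 1; i < data.size(); ++i) {
        // Strict comparisons keep the earliest index when values tie.
        if (data[i] < r.min_value) {
            r.min_value = data[i];
            r.min_index = i;
        }
        if (data[i] > r.max_value) {
            r.max_value = data[i];
            r.max_index = i;
        }
    }
    return r;
}
```

Calling arg_min_max on {3, -1, 7, -1, 7} yields a minimum of -1 at index 1 and a maximum of 7 at index 2. Computing both extremes in one traversal avoids reading the data twice, which is the usual motivation for a fused reduction of this kind.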
What's Changed
- Moving definition of CUB cache up by @cliffburdick in https://github.com/NVIDIA/MatX/pull/771
- Added documentation of memory types by @cliffburdick in https://github.com/NVIDIA/MatX/pull/770
- Cleaning up non-const operator() to avoid code duplication by @cliffburdick in https://github.com/NVIDIA/MatX/pull/769
- Switch to CUB/Thrust backend for cuda executor argmax by @tmartin-gh in https://github.com/NVIDIA/MatX/pull/772
- Refactor cub argmax to generic cub reduce, use for argmin. Fixes #774. by @tmartin-gh in https://github.com/NVIDIA/MatX/pull/776
- Change any() and all() to use CUB's reduce by @tmartin-gh in https://github.com/NVIDIA/MatX/pull/777
- Add argminmax operator by @tmartin-gh in https://github.com/NVIDIA/MatX/pull/778
- Fix matx::HostExecutor segfault with argmin/argmax by @tmartin-gh in https://github.com/NVIDIA/MatX/pull/780
- Added new cusolverDnXsyevBatched API for batched eigen calls for CTK 12.6.2 and up by @cliffburdick in https://github.com/NVIDIA/MatX/pull/781
- cub.h CUDACC guards for custom ops by @nvjonwong in https://github.com/NVIDIA/MatX/pull/782
- Add example compiled with host compiler to catch regressions. by @tmartin-gh in https://github.com/NVIDIA/MatX/pull/783
- Remove relaxed constexpr by @cliffburdick in https://github.com/NVIDIA/MatX/pull/775
- Cleanup versions.json so jq can parse it. by @alliepiper in https://github.com/NVIDIA/MatX/pull/785
- Allow rapids-cmake's version file to be overridden. by @alliepiper in https://github.com/NVIDIA/MatX/pull/786
- Update rapids-cmake (branch-24.12@03ec7ef) by @alliepiper in https://github.com/NVIDIA/MatX/pull/787
- Created MatX NVTX domain by @cliffburdick in https://github.com/NVIDIA/MatX/pull/784
- Update docs github action by @tmartin-gh in https://github.com/NVIDIA/MatX/pull/789
- Update docs github action by @tmartin-gh in https://github.com/NVIDIA/MatX/pull/790
- Work around compiler parser bug by @cliffburdick in https://github.com/NVIDIA/MatX/pull/791
- Updating developer documentation by @cliffburdick in https://github.com/NVIDIA/MatX/pull/793
- Modify concat op to enable concatenating float3. by @nvjonwong in https://github.com/NVIDIA/MatX/pull/792
- Fix rapids cmake by @alliepiper in https://github.com/NVIDIA/MatX/pull/799
- Switched to getRs instead of getRi for faster inverse by @cliffburdick in https://github.com/NVIDIA/MatX/pull/797
- Update CMakeLists.txt by @cliffburdick in https://github.com/NVIDIA/MatX/pull/801
- Support half precision R2C transforms by @cliffburdick in https://github.com/NVIDIA/MatX/pull/796
- Fix gcc13 erroneous warning by @cliffburdick in https://github.com/NVIDIA/MatX/pull/802
- fixed missing forwarding code for allocate by @aartbik in https://github.com/NVIDIA/MatX/pull/804
- Fix bug with eye, and also zero workspace before LU factorization by @cliffburdick in https://github.com/NVIDIA/MatX/pull/807
- Change shape_type for the remap op by @nvjonwong in https://github.com/NVIDIA/MatX/pull/806
- Faster batched SVD for small sizes by @cliffburdick in https://github.com/NVIDIA/MatX/pull/805
- Fixing broadcasting in all operator() by @cliffburdick in https://github.com/NVIDIA/MatX/pull/795
- Add a better error on memory allocation failure by @cliffburdick in https://github.com/NVIDIA/MatX/pull/808
- Fix solver interfaces to use executor in cache by @cliffburdick in https://github.com/NVIDIA/MatX/pull/809
- Python integration sample by @tmartin-gh in https://github.com/NVIDIA/MatX/pull/812
- Fixes for clang17 errors/warnings by @cliffburdick in https://github.com/NVIDIA/MatX/pull/815
- Misc Cleanup by @tmartin-gh in https://github.com/NVIDIA/MatX/pull/814
- frexp_fix by @cliffburdick in https://github.com/NVIDIA/MatX/pull/817
- Adding structures needed for sparse support by @cliffburdick in https://github.com/NVIDIA/MatX/pull/819
- fix missing newline at EOF (to avoid future diff issues) by @aartbik in https://github.com/NVIDIA/MatX/pull/822
- add size() to container storage by @aartbik in https://github.com/NVIDIA/MatX/pull/824
- minor edit for sparse (layout and proper swap def) by @aartbik in https://github.com/NVIDIA/MatX/pull/820
- add a to-string method for memory space by @aartbik in https://github.com/NVIDIA/MatX/pull/823
- Cleanup cmake usage when MatX is a dependent project by @tmartin-gh in https://github.com/NVIDIA/MatX/pull/827
- Fixing warnings issues by clang-19, both host and device by @cliffburdick in https://github.com/NVIDIA/MatX/pull/825
- Update build_docs actions to newest. Add CI_RUN_DATETIME in version.rst by @tmartin-gh in https://github.com/NVIDIA/MatX/pull/829
- introduce a versatile sparse tensor type to MatX (experimental) by @aartbik in https://github.com/NVIDIA/MatX/pull/821
- Add initial tiff support by @tmartin-gh in https://github.com/NVIDIA/MatX/pull/831
- Make dim2lvl translation for printing more in the style of MatX by @aartbik in https://github.com/NVIDIA/MatX/pull/832
- Expose tensor format (and lvl specs) to sparse tensor data by @aartbik in https://github.com/NVIDIA/MatX/pull/833
- Add cross product operator by @mfzmullen in https://github.com/NVIDIA/MatX/pull/818
- remove LVL depth restriction with constexpr templating by @aartbik in https://github.com/NVIDIA/MatX/pull/834
- Guard all DIM/LVL recursion against completely empty format by @aartbik in https://github.com/NVIDIA/MatX/pull/835
- Adjust half-type threshold for cross product unit tests by @mfzmullen in https://github.com/NVIDIA/MatX/pull/838
- Added fp32 version of normcdf by @cliffburdick in https://github.com/NVIDIA/MatX/pull/839
- Changing black scholes to float and improving performance by @cliffburdick in https://github.com/NVIDIA/MatX/pull/840
- Implement the () operator on sparse tensors by @aartbik in https://github.com/NVIDIA/MatX/pull/837
- Support operators into einsum interface by @cliffburdick in https://github.com/NVIDIA/MatX/pull/845
- Add print function with nonzero dim args by @tbensonatl in https://github.com/NVIDIA/MatX/pull/844
*…