MSCCL++ v0.8.0

Repository: microsoft/mscclpp

Tag: v0.8.0

Published: 2025-10-10T18:32:33Z

Prerelease: no

Release notes:

What's Changed

  • Fix #458 by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/568
  • Fix multinode test failure by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/574
  • Separate linters from cmake by @chhwang in https://github.com/microsoft/mscclpp/pull/587
  • Fix relaxedWait() by @chhwang in https://github.com/microsoft/mscclpp/pull/594
  • NCCL fixes by @chhwang in https://github.com/microsoft/mscclpp/pull/592
  • Updated Dev Container by @chhwang in https://github.com/microsoft/mscclpp/pull/591
  • Support CudaIpc connection within a single process by @chhwang in https://github.com/microsoft/mscclpp/pull/593
  • Fix GpuStreamPool to be aware of the device ID of streams by @chhwang in https://github.com/microsoft/mscclpp/pull/590
  • update pytest and python API to fix ut failure by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/598
  • Fixed the local channel test by @chhwang in https://github.com/microsoft/mscclpp/pull/597
  • Use smart pointer for IB structure by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/585
  • Update documentation by @chhwang in https://github.com/microsoft/mscclpp/pull/576
  • Support CUDA 12.9 by @chhwang in https://github.com/microsoft/mscclpp/pull/600
  • Merge ChannelTrigger with ProxyTrigger by @chhwang in https://github.com/microsoft/mscclpp/pull/601
  • MNNVL fix by @chhwang in https://github.com/microsoft/mscclpp/pull/604
  • New DSL implementation by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/579
  • python doc auto generation by @chhwang in https://github.com/microsoft/mscclpp/pull/605
  • all2all implementation by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/609
  • Fix ut by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/613
  • Create ib mr for per ib transport by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/611
  • Fix for multi-nodes test by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/614
  • add torch test by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/612
  • AlltoAll Test Support by @caiomcbr in https://github.com/microsoft/mscclpp/pull/606
  • Adding Channel Id Field DSL Port Channel Operations by @caiomcbr in https://github.com/microsoft/mscclpp/pull/615
  • Fix deadlock in Executor channel setup by @caiomcbr in https://github.com/microsoft/mscclpp/pull/616
  • Fix NVLS correctness issue by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/618
  • Fixed cpp linter by @chhwang in https://github.com/microsoft/mscclpp/pull/619
  • Thread Block Group DSL by @caiomcbr in https://github.com/microsoft/mscclpp/pull/621
  • Fix memory exchange within a single process by @chhwang in https://github.com/microsoft/mscclpp/pull/624
  • Fix hang issue in logging submodule by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/625
  • Integrate MSCCL++ with torch workload by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/626
  • Add FifoDeviceHandle::poll() by @chhwang in https://github.com/microsoft/mscclpp/pull/630
  • Fix Illegal Memory Access in nvls_test for CUDA12.9 by @abhijangda in https://github.com/microsoft/mscclpp/pull/631
  • Adapt with torch 2.6 by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/632
  • Fix for safe process teardown by @chhwang in https://github.com/microsoft/mscclpp/pull/633
  • use unix socket to share fd by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/634
  • Address teardown issue by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/638
  • Revise NCCL API implementation by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/617
  • Support detailed version tracking that captures git repository information by @seagater in https://github.com/microsoft/mscclpp/pull/639
  • Fix Rocm build issue by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/642
  • Add 2 Node AllReduce DSL Algorithm by @caiomcbr in https://github.com/microsoft/mscclpp/pull/636
  • Make ncclReduce/ncclSend/ncclRecv work by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/643
  • Reduce memory footprint for allreduce8 and allgather6 by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/644
  • Add MSCCLPP_GIT_COMMIT macro by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/640
  • Address corner case when generating version file by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/641
  • Pipeline fix by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/645

New Contributors

  • @abhijangda made their first contribution in https://github.com/microsoft/mscclpp/pull/631

Full Changelog: https://github.com/microsoft/mscclpp/compare/v0.7.0...v0.8.0
