Release · Microsoft · published Apr 9, 2026

microsoft/mscclpp v0.9.0


MSCCL++ v0.9.0

Repository: microsoft/mscclpp

Tag: v0.9.0

Published: 2026-04-09T01:15:40Z

Prerelease: no

Release notes:

What's Changed

  • Fix lint.sh by @chhwang in https://github.com/microsoft/mscclpp/pull/652
  • Update the port channel tutorial doc by @chhwang in https://github.com/microsoft/mscclpp/pull/653
  • Fix test script by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/655
  • Update EndpointConfig interfaces by @chhwang in https://github.com/microsoft/mscclpp/pull/651
  • New allreduce algo for small message size by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/647
  • Fix docs by @chhwang in https://github.com/microsoft/mscclpp/pull/656
  • Fix docs version by @chhwang in https://github.com/microsoft/mscclpp/pull/659
  • Exclude irrelevant files from workflow triggers by @chhwang in https://github.com/microsoft/mscclpp/pull/663
  • Improving DSL documentation by @caiomcbr in https://github.com/microsoft/mscclpp/pull/650
  • Add exclude paths under pipeline triggers by @chhwang in https://github.com/microsoft/mscclpp/pull/664
  • Test peer accessibility after deployment by @chhwang in https://github.com/microsoft/mscclpp/pull/661
  • Rename nvls* files by @chhwang in https://github.com/microsoft/mscclpp/pull/660
  • Fix #651 by @chhwang in https://github.com/microsoft/mscclpp/pull/662
  • Add token pool for cuCreate API by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/628
  • Auto-detect CUDA arch in CMake GPU check by @chhwang in https://github.com/microsoft/mscclpp/pull/666
  • FP8 support for Allreduce by @seagater in https://github.com/microsoft/mscclpp/pull/646
  • Resolve IBVerbs Loading Issues by @caiomcbr in https://github.com/microsoft/mscclpp/pull/648
  • Fixes for no-IB systems by @chhwang in https://github.com/microsoft/mscclpp/pull/667
  • Integrate MSCCL++ DSL to torch workload by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/620
  • Add a new logger by @chhwang in https://github.com/microsoft/mscclpp/pull/668
  • upgrade codeql to v3 by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/676
  • IB stack enhancements & bug fixes by @chhwang in https://github.com/microsoft/mscclpp/pull/673
  • Support Synchronous Initialization for Proxy Service by @caiomcbr in https://github.com/microsoft/mscclpp/pull/679
  • Supporting New Packet Kernel Operation at Executor by @caiomcbr in https://github.com/microsoft/mscclpp/pull/677
  • connect() APIs changed to return an instance instead of a shared_ptr by @chhwang in https://github.com/microsoft/mscclpp/pull/680
  • Fix Minor Issue Proxy Python Interface by @caiomcbr in https://github.com/microsoft/mscclpp/pull/685
  • Revise the mscclpp datatype by @seagater in https://github.com/microsoft/mscclpp/pull/671
  • Fix Error in Non IB Env at Executor by @caiomcbr in https://github.com/microsoft/mscclpp/pull/686
  • No IB Env CI Test by @caiomcbr in https://github.com/microsoft/mscclpp/pull/687
  • Fix Python bindings and tests by @chhwang in https://github.com/microsoft/mscclpp/pull/690
  • DSL Quick Start by @caiomcbr in https://github.com/microsoft/mscclpp/pull/689
  • Add CudaDeviceGuard by @chhwang in https://github.com/microsoft/mscclpp/pull/691
  • Optimized logger by @chhwang in https://github.com/microsoft/mscclpp/pull/693
  • Build fixes by @chhwang in https://github.com/microsoft/mscclpp/pull/696
  • Add an IB multi-node tutorial by @chhwang in https://github.com/microsoft/mscclpp/pull/702
  • Creating Documentation Section for MSCCL++ DSL by @caiomcbr in https://github.com/microsoft/mscclpp/pull/706
  • Make IB more configurable by @chhwang in https://github.com/microsoft/mscclpp/pull/703
  • Improve DSL Documentation by @caiomcbr in https://github.com/microsoft/mscclpp/pull/707
  • Add handle cache for AMD platform by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/698
  • Add copilot-instructions.md by @chhwang in https://github.com/microsoft/mscclpp/pull/602
  • Use uncached memory on Rocm platform to avoid hang by @qishilu in https://github.com/microsoft/mscclpp/pull/711
  • Replace __HIP_PLATFORM_AMD__ to use internal macro by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/712
  • Rename P2P log subsys into GPU by @chhwang in https://github.com/microsoft/mscclpp/pull/716
  • Minor fixes by @chhwang in https://github.com/microsoft/mscclpp/pull/715
  • Remove UB std:: declarations by @chhwang in https://github.com/microsoft/mscclpp/pull/709
  • Tune the nThreadsPerBlock for FP8 and Half datatype on MI300 by @seagater in https://github.com/microsoft/mscclpp/pull/694
  • Update container images for pipeline by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/717
  • Add CUDA 13.0 Docker images by @chhwang in https://github.com/microsoft/mscclpp/pull/720
  • Bypassing SSCA alerts by @chhwang in https://github.com/microsoft/mscclpp/pull/721
  • Add GpuIpcMemHandle by @chhwang in https://github.com/microsoft/mscclpp/pull/704
  • Reduce CI build time by @chhwang in https://github.com/microsoft/mscclpp/pull/723
  • Use GpuIpcMem for NVLS connections by @chhwang in https://github.com/microsoft/mscclpp/pull/719
  • Fix ci issue by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/727
  • Fix ci pipeline failure by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/729
  • Torch integration by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/692
  • fp8 nvls support (e5m2 and e4m3) by @mahdiehghazim in https://github.com/microsoft/mscclpp/pull/730
  • Support versioning for mscclpp document by @seagater in https://github.com/microsoft/mscclpp/pull/724
  • Revert "Support versioning for mscclpp document (#724)" by @seagater in https://github.com/microsoft/mscclpp/pull/734
  • Use native GPU architecture when NVIDIA GPU is detected; otherwise fall back to multi-arch build. by @mahdiehghazim in https://github.com/microsoft/mscclpp/pull/732
  • Update document versioning for PR #724 by @seagater in https://github.com/microsoft/mscclpp/pull/735
  • Fix the relative path extraction on github page by @seagater in https://github.com/microsoft/mscclpp/pull/739
  • Support multi-node in MemoryChannel tutorial by @chhwang in https://github.com/microsoft/mscclpp/pull/726
  • Address comments for PR #692 by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/733
  • Refactor reduce kernel by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/738
  • Fix cpplint error in main branch by @seagater in https://github.com/microsoft/mscclpp/pull/740
  • Update copilot-instructions.md by @chhwang in…

Excerpt shown — open the source for the full document.
