microsoft/mscclpp v0.9.0
MSCCL++ v0.9.0
Repository: microsoft/mscclpp
Tag: v0.9.0
Published: 2026-04-09T01:15:40Z
Prerelease: no
Release notes:
What's Changed
- Fix lint.sh by @chhwang in https://github.com/microsoft/mscclpp/pull/652
- Update the port channel tutorial doc by @chhwang in https://github.com/microsoft/mscclpp/pull/653
- Fix test script by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/655
- Update `EndpointConfig` interfaces by @chhwang in https://github.com/microsoft/mscclpp/pull/651
- New allreduce algo for small message size by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/647
- Fix docs by @chhwang in https://github.com/microsoft/mscclpp/pull/656
- Fix docs version by @chhwang in https://github.com/microsoft/mscclpp/pull/659
- Exclude irrelevant files from workflow triggers by @chhwang in https://github.com/microsoft/mscclpp/pull/663
- Improving DSL documentation by @caiomcbr in https://github.com/microsoft/mscclpp/pull/650
- Add exclude paths under pipeline triggers by @chhwang in https://github.com/microsoft/mscclpp/pull/664
- Test peer accessibility after deployment by @chhwang in https://github.com/microsoft/mscclpp/pull/661
- Rename nvls* files by @chhwang in https://github.com/microsoft/mscclpp/pull/660
- Fix #651 by @chhwang in https://github.com/microsoft/mscclpp/pull/662
- Add token pool for cuCreate API by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/628
- Auto-detect CUDA arch in CMake GPU check by @chhwang in https://github.com/microsoft/mscclpp/pull/666
- FP8 support for Allreduce by @seagater in https://github.com/microsoft/mscclpp/pull/646
- Resolve IBVerbs Loading Issues by @caiomcbr in https://github.com/microsoft/mscclpp/pull/648
- Fixes for no-IB systems by @chhwang in https://github.com/microsoft/mscclpp/pull/667
- Integrate MSCCL++ DSL to torch workload by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/620
- Add a new logger by @chhwang in https://github.com/microsoft/mscclpp/pull/668
- upgrade codeql to v3 by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/676
- IB stack enhancements & bug fixes by @chhwang in https://github.com/microsoft/mscclpp/pull/673
- Support Synchronous Initialization for Proxy Service by @caiomcbr in https://github.com/microsoft/mscclpp/pull/679
- Supporting New Packet Kernel Operation at Executor by @caiomcbr in https://github.com/microsoft/mscclpp/pull/677
- `connect()` APIs changed to return an instance instead of a shared_ptr by @chhwang in https://github.com/microsoft/mscclpp/pull/680
- Fix Minor Issue Proxy Python Interface by @caiomcbr in https://github.com/microsoft/mscclpp/pull/685
- Revise the mscclpp datatype by @seagater in https://github.com/microsoft/mscclpp/pull/671
- Fix Error in Non IB Env at Executor by @caiomcbr in https://github.com/microsoft/mscclpp/pull/686
- No IB Env CI Test by @caiomcbr in https://github.com/microsoft/mscclpp/pull/687
- Fix Python bindings and tests by @chhwang in https://github.com/microsoft/mscclpp/pull/690
- DSL Quick Start by @caiomcbr in https://github.com/microsoft/mscclpp/pull/689
- Add `CudaDeviceGuard` by @chhwang in https://github.com/microsoft/mscclpp/pull/691
- Optimized logger by @chhwang in https://github.com/microsoft/mscclpp/pull/693
- Build fixes by @chhwang in https://github.com/microsoft/mscclpp/pull/696
- Add an IB multi-node tutorial by @chhwang in https://github.com/microsoft/mscclpp/pull/702
- Creating Documentation Section for MSCCL++ DSL by @caiomcbr in https://github.com/microsoft/mscclpp/pull/706
- Make IB more configurable by @chhwang in https://github.com/microsoft/mscclpp/pull/703
- Improve DSL Documentation by @caiomcbr in https://github.com/microsoft/mscclpp/pull/707
- Add handle cache for AMD platform by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/698
- Add copilot-instructions.md by @chhwang in https://github.com/microsoft/mscclpp/pull/602
- Use uncached memory on Rocm platform to avoid hang by @qishilu in https://github.com/microsoft/mscclpp/pull/711
- Replace `__HIP_PLATFORM_AMD__` to use internal macro by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/712
- Rename `P2P` log subsys into `GPU` by @chhwang in https://github.com/microsoft/mscclpp/pull/716
- Minor fixes by @chhwang in https://github.com/microsoft/mscclpp/pull/715
- Remove UB `std::` declarations by @chhwang in https://github.com/microsoft/mscclpp/pull/709
- Tune the nThreadsPerBlock for FP8 and Half datatype on MI300 by @seagater in https://github.com/microsoft/mscclpp/pull/694
- Update container images for pipeline by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/717
- Add CUDA 13.0 Docker images by @chhwang in https://github.com/microsoft/mscclpp/pull/720
- Bypassing SSCA alerts by @chhwang in https://github.com/microsoft/mscclpp/pull/721
- Add `GpuIpcMemHandle` by @chhwang in https://github.com/microsoft/mscclpp/pull/704
- Reduce CI build time by @chhwang in https://github.com/microsoft/mscclpp/pull/723
- Use `GpuIpcMem` for NVLS connections by @chhwang in https://github.com/microsoft/mscclpp/pull/719
- Fix ci issue by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/727
- Fix ci pipeline failure by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/729
- Torch integration by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/692
- fp8 nvls support (e5m2 and e4m3) by @mahdiehghazim in https://github.com/microsoft/mscclpp/pull/730
- Support versioning for mscclpp document by @seagater in https://github.com/microsoft/mscclpp/pull/724
- Revert "Support versioning for mscclpp document (#724)" by @seagater in https://github.com/microsoft/mscclpp/pull/734
- Use native GPU architecture when NVIDIA GPU is detected; otherwise fall back to multi-arch build. by @mahdiehghazim in https://github.com/microsoft/mscclpp/pull/732
- Update document versioning for PR #724 by @seagater in https://github.com/microsoft/mscclpp/pull/735
- Fix the relative path extraction on github page by @seagater in https://github.com/microsoft/mscclpp/pull/739
- Support multi-node in `MemoryChannel` tutorial by @chhwang in https://github.com/microsoft/mscclpp/pull/726
- Address comments for PR #692 by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/733
- Refactor reduce kernel by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/738
- Fix cpplint error in main branch by @seagater in https://github.com/microsoft/mscclpp/pull/740
- Update `copilot-instructions.md` by @chhwang in…