microsoft/mscclpp v0.7.0
MSCCL++ v0.7.0
Repository: microsoft/mscclpp
Tag: v0.7.0
Published: 2025-07-12T08:10:51Z
Prerelease: no
Release notes:
What's Changed
- Move pipeline to official org by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/406
- Disable CuMemMap check for ROCm by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/411
- NVLS support for NCCL API by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/410
- Supporting multi-node executors in NCCL API by @caiomcbr in https://github.com/microsoft/mscclpp/pull/412
- Fix synchronization in allreduce8 kernel by @dsidler in https://github.com/microsoft/mscclpp/pull/407
- Add ncclBcast / ncclBroadcast support by @SreevatsaAnantharamu in https://github.com/microsoft/mscclpp/pull/419
- Update README by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/414
- Fix nccl-test failure issue by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/421
- Tackle build warnings by @chhwang in https://github.com/microsoft/mscclpp/pull/422
- trigger ci for release branches by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/426
- Fix CI trigger issue by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/428
- Fix typos in the pipeline by @chhwang in https://github.com/microsoft/mscclpp/pull/420
- Update version number by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/433
- Enhance the nccl error message handling by @seagater in https://github.com/microsoft/mscclpp/pull/434
- [NPKIT] Adding the NPKIT support for kernel allreduce7 in mscclpp-nccl by @PedramAlizadeh in https://github.com/microsoft/mscclpp/pull/399
- Fix azure pipeline by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/437
- Add GpuBuffer class by @chhwang in https://github.com/microsoft/mscclpp/pull/423
- Fix CMake build messages by @chhwang in https://github.com/microsoft/mscclpp/pull/443
- Flushing Proxy Channels at CPU side upon reaching the Inflight Request Limit by @caiomcbr in https://github.com/microsoft/mscclpp/pull/415
- Fix Python binding of exceptions by @chhwang in https://github.com/microsoft/mscclpp/pull/444
- Auto-update version numbers in CMakeLists.txt by @chhwang in https://github.com/microsoft/mscclpp/pull/450
- Resolve cuMemMap error by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/451
- Manage runtime environments by @chhwang in https://github.com/microsoft/mscclpp/pull/452
- Lazily create streams for CudaIpcConnection by @chhwang in https://github.com/microsoft/mscclpp/pull/449
- Fix PR #449 by @chhwang in https://github.com/microsoft/mscclpp/pull/453
- Merge mscclpp-lang to mscclpp project by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/442
- Renaming channels by @chhwang in https://github.com/microsoft/mscclpp/pull/436
- Add multi-nodes example & update doc by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/455
- Adjusting BFS to seek circular dependencies in the msccl-tools DAG by @caiomcbr in https://github.com/microsoft/mscclpp/pull/459
- remove unnecessary sync by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/461
- Support ReduceScatter in the NCCL interface by @caiomcbr in https://github.com/microsoft/mscclpp/pull/460
- Updating MSCCLLang Examples by @caiomcbr in https://github.com/microsoft/mscclpp/pull/462
- Disable channel cache by @seagater in https://github.com/microsoft/mscclpp/pull/463
- Adjusting AllGather Collective in MSCCLLang by @caiomcbr in https://github.com/microsoft/mscclpp/pull/466
- Adding Read Put Packet operation at Executor by @caiomcbr in https://github.com/microsoft/mscclpp/pull/441
- NPKit Support to Read Put Packet Operation by @caiomcbr in https://github.com/microsoft/mscclpp/pull/471
- Adjust NPKit IB Event by @caiomcbr in https://github.com/microsoft/mscclpp/pull/472
- Fix minor typos and errors in documentation by @RyoYang in https://github.com/microsoft/mscclpp/pull/474
- Improving Get Operation at MSCCLLang by @caiomcbr in https://github.com/microsoft/mscclpp/pull/475
- Fix memory OOM issue by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/479
- Mark mscclpp-test as deprecated in the doc by @chhwang in https://github.com/microsoft/mscclpp/pull/478
- Update allgather fallback algo by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/476
- Add min operation for allreduce by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/481
- NCCL API CI Test for ReduceScatter by @caiomcbr in https://github.com/microsoft/mscclpp/pull/465
- Fix correctness issue when mscclppDisableChannelCache set to true by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/483
- nccl/rccl integration by @seagater in https://github.com/microsoft/mscclpp/pull/469
- Fix reduceMin failure issue by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/486
- Reduce Operation Support to the Executor by @caiomcbr in https://github.com/microsoft/mscclpp/pull/484
- Add CI test for fallback allgather, allreduce, broadcast and reducescatter to NCCL operations by @seagater in https://github.com/microsoft/mscclpp/pull/485
- Remove the requirement for CU_DEVICE_ATTRIBUTE_HANDLE_TYPE_FABRIC_SUPPORTED for NVLS support by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/489
- Add CUDA 12.8 images by @chhwang in https://github.com/microsoft/mscclpp/pull/488
- Add a devcontainer configuration by @chhwang in https://github.com/microsoft/mscclpp/pull/490
- Fix CMake installation in Dockerfile for arm64 by @chhwang in https://github.com/microsoft/mscclpp/pull/491
- Export mscclpp GpuBuffer to dlpack format by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/492
- Fix the virtual address mapping issue of cuMemMap in fallback code by @seagater in https://github.com/microsoft/mscclpp/pull/501
- Improve signal/wait performance and fix barrier issue by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/499
- Fix performance issue introduced in PR: 499 by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/505
- Add flag to disable nvls by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/500
- Optimized allreduce fallback for ~10KB sizes by @chhwang in https://github.com/microsoft/mscclpp/pull/506
- Automatic creation of Scratch Buffer at MSCCLLang by @caiomcbr in https://github.com/microsoft/mscclpp/pull/510
- Use implicit ctors for default device ctors by @chhwang in https://github.com/microsoft/mscclpp/pull/512
- apps/nccl: fix a bug in allreduce kernels for graph mode by @nusislam…