MSCCL++ v0.7.0

Repository: microsoft/mscclpp

Tag: v0.7.0

Published: 2025-07-12T08:10:51Z

Prerelease: no

Release notes:

What's Changed

  • Move pipeline to official org by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/406
  • Disable CuMemMap check for ROCm by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/411
  • NVLS support for NCCL API by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/410
  • Supporting multi-node executors in NCCL API by @caiomcbr in https://github.com/microsoft/mscclpp/pull/412
  • Fix synchronization in allreduce8 kernel by @dsidler in https://github.com/microsoft/mscclpp/pull/407
  • Add ncclBcast / ncclBroadcast support by @SreevatsaAnantharamu in https://github.com/microsoft/mscclpp/pull/419
  • Update README by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/414
  • Fix nccl-test failure issue by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/421
  • Tackle build warnings by @chhwang in https://github.com/microsoft/mscclpp/pull/422
  • trigger ci for release branches by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/426
  • Fix CI trigger issue by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/428
  • Fix typos in the pipeline by @chhwang in https://github.com/microsoft/mscclpp/pull/420
  • Update version number by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/433
  • Enhance the nccl error message handling by @seagater in https://github.com/microsoft/mscclpp/pull/434
  • [NPKIT] Adding the NPKIT support for kernel allreduce7 in mscclpp-nccl by @PedramAlizadeh in https://github.com/microsoft/mscclpp/pull/399
  • Fix azure pipeline by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/437
  • Add GpuBuffer class by @chhwang in https://github.com/microsoft/mscclpp/pull/423
  • Fix CMake build messages by @chhwang in https://github.com/microsoft/mscclpp/pull/443
  • Flushing Proxy Channels at CPU side upon reaching the Inflight Request Limit by @caiomcbr in https://github.com/microsoft/mscclpp/pull/415
  • Fix Python binding of exceptions by @chhwang in https://github.com/microsoft/mscclpp/pull/444
  • Auto-update version numbers in CMakeLists.txt by @chhwang in https://github.com/microsoft/mscclpp/pull/450
  • Resolve cuMemMap error by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/451
  • Manage runtime environments by @chhwang in https://github.com/microsoft/mscclpp/pull/452
  • Lazily create streams for CudaIpcConnection by @chhwang in https://github.com/microsoft/mscclpp/pull/449
  • Fix PR #449 by @chhwang in https://github.com/microsoft/mscclpp/pull/453
  • Merge mscclpp-lang to mscclpp project by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/442
  • Renaming channels by @chhwang in https://github.com/microsoft/mscclpp/pull/436
  • Add multi-nodes example & update doc by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/455
  • Adjusting BFS to seek circular dependencies in the msccl-tools DAG by @caiomcbr in https://github.com/microsoft/mscclpp/pull/459
  • remove unnecessary sync by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/461
  • Support ReduceScatter in the NCCL interface by @caiomcbr in https://github.com/microsoft/mscclpp/pull/460
  • Updating MSCCLLang Examples by @caiomcbr in https://github.com/microsoft/mscclpp/pull/462
  • Disable channel cache by @seagater in https://github.com/microsoft/mscclpp/pull/463
  • Adjusting AllGather Collective in MSCCLLang by @caiomcbr in https://github.com/microsoft/mscclpp/pull/466
  • Adding Read Put Packet operation at Executor by @caiomcbr in https://github.com/microsoft/mscclpp/pull/441
  • NPKit Support to Read Put Packet Operation by @caiomcbr in https://github.com/microsoft/mscclpp/pull/471
  • Adjust NPKit IB Event by @caiomcbr in https://github.com/microsoft/mscclpp/pull/472
  • Fix minor typos and errors in documentation by @RyoYang in https://github.com/microsoft/mscclpp/pull/474
  • Improving Get Operation at MSCCLLang by @caiomcbr in https://github.com/microsoft/mscclpp/pull/475
  • Fix memory OOM issue by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/479
  • Mark mscclpp-test as deprecated in the doc by @chhwang in https://github.com/microsoft/mscclpp/pull/478
  • Update allgather fallback algo by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/476
  • Add min operation for allreduce by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/481
  • NCCL API CI Test for ReduceScatter by @caiomcbr in https://github.com/microsoft/mscclpp/pull/465
  • Fix correctness issue when mscclppDisableChannelCache set to true by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/483
  • nccl/rccl integration by @seagater in https://github.com/microsoft/mscclpp/pull/469
  • Fix reduceMin failure issue by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/486
  • Reduce Operation Support to the Executor by @caiomcbr in https://github.com/microsoft/mscclpp/pull/484
  • Add CI test for fallback allgather, allreduce, broadcast, and reducescatter to NCCL operations by @seagater in https://github.com/microsoft/mscclpp/pull/485
  • Remove the requirement for CU_DEVICE_ATTRIBUTE_HANDLE_TYPE_FABRIC_SUPPORTED for NVLS support by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/489
  • Add CUDA 12.8 images by @chhwang in https://github.com/microsoft/mscclpp/pull/488
  • Add a devcontainer configuration by @chhwang in https://github.com/microsoft/mscclpp/pull/490
  • Fix CMake installation in Dockerfile for arm64 by @chhwang in https://github.com/microsoft/mscclpp/pull/491
  • Export mscclpp GpuBuffer to dlpack format by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/492
  • Fix the virtual address mapping issue of cuMemMap in fallback code by @seagater in https://github.com/microsoft/mscclpp/pull/501
  • Improve signal/wait performance and fix barrier issue by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/499
  • Fix performance issue introduced in PR #499 by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/505
  • Add flag to disable nvls by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/500
  • Optimized allreduce fallback for ~10KB sizes by @chhwang in https://github.com/microsoft/mscclpp/pull/506
  • Automatic creation of Scratch Buffer at MSCCLLang by @caiomcbr in https://github.com/microsoft/mscclpp/pull/510
  • Use implicit ctors for default device ctors by @chhwang in https://github.com/microsoft/mscclpp/pull/512
  • apps/nccl: fix a bug in allreduce kernels for graph mode by @nusislam…
