Release from Microsoft, published Jan 3, 2024

microsoft/superbenchmark v0.10.0

Release SuperBench v0.10.0

Repository: microsoft/superbenchmark

Tag: v0.10.0

Published: 2024-01-03T00:10:48Z

Prerelease: no

Release notes:

SuperBench 0.10.0 Release Notes

SuperBench Improvements

Support monitoring for AMD GPUs.
Add Dockerfiles for ROCm 5.7 and ROCm 6.0.
Add MSCCL support for NVIDIA GPUs.
Fix NUMA domains swap issue in NDv4 topology file.
Add NDv5 topology file.
Pin NCCL and NCCL-test to 2.18.3 to work around a hang issue in CUDA 12.2.

Micro-benchmark Improvements

Add HPL random generator to gemm-flops with ROCm.
Add DirectXGPURenderFPS benchmark to measure the FPS of rendering simple frames.
Add HWDecoderFPS benchmark to measure the FPS of hardware decoder performance.
Update Docker image for H100 support.
Update MLC version to 3.10 in the CUDA/ROCm Dockerfiles.
Fix a bug in the GPU Burn test.
Support INT8 in cuBLASLt function benchmark.
Add hipBLASLt function benchmark.
Support cpu-gpu and gpu-cpu in ib-validation.
Support graph mode in NCCL/RCCL benchmarks for latency metrics.
Support C++ implementation in distributed inference benchmark.
Add O2 option for GPU-copy ROCm build.
Support different hipBLASLt data types in distributed inference benchmark.
Support in-place operation in NCCL/RCCL benchmarks.
Support data type option in NCCL/RCCL benchmarks.
Improve P2P performance with fine-grained GPU memory in GPU-copy test for AMD GPUs.
Update hipBLASLt GEMM metric unit to TFLOPS.
Support FP8 in hipBLASLt benchmark.
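
For readers interpreting the GEMM metric unit change, the conversion from a timed GEMM to TFLOPS can be sketched as follows. This is a minimal illustration using the conventional 2·m·n·k operation count; the function name `gemm_tflops` and the sample timing are hypothetical and are not taken from SuperBench's source.

```python
def gemm_tflops(m: int, n: int, k: int, seconds: float) -> float:
    """Convert one timed (m x k) @ (k x n) GEMM into TFLOPS (1e12 FLOP/s)."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    # Conventional count: one multiply and one add per inner-product term.
    flops = 2.0 * m * n * k
    return flops / seconds / 1e12


if __name__ == "__main__":
    # Hypothetical measurement: a 4096 x 4096 x 4096 GEMM finishing in 1 ms.
    print(f"{gemm_tflops(4096, 4096, 4096, 1e-3):.2f} TFLOPS")
```

Results recorded before this release in a different unit would need rescaling before being compared against values reported in TFLOPS.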

Model Benchmark Improvements

Change torch.distributed.launch to torchrun.
Support Megatron-LM/Megatron-DeepSpeed GPT pretraining benchmark.
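
One practical effect of moving from torch.distributed.launch to torchrun is that torchrun passes the local rank through the `LOCAL_RANK` environment variable, whereas the legacy launcher passed a `--local_rank` argument by default. The sketch below shows one way a training script might accept both; the helper name `resolve_local_rank` is hypothetical and this is not SuperBench's own code.

```python
import argparse
import os


def resolve_local_rank(argv=None, environ=None) -> int:
    """Return the local rank from LOCAL_RANK (torchrun) or --local_rank (legacy launcher)."""
    environ = os.environ if environ is None else environ
    parser = argparse.ArgumentParser(add_help=False)
    # Accept both spellings of the legacy launcher flag; default to rank 0.
    parser.add_argument("--local_rank", "--local-rank", type=int, default=0)
    args, _unknown = parser.parse_known_args(argv)
    # Prefer the environment variable, which is how torchrun reports the rank.
    if "LOCAL_RANK" in environ:
        return int(environ["LOCAL_RANK"])
    return args.local_rank
```

A script using such a helper could be started with, for example, `torchrun --nproc_per_node=8 train.py`, where `train.py` stands in for the actual entry point.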

Result Analysis

Support baseline generation from multiple nodes.
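
The release notes do not describe how the multi-node baseline is computed. As a rough illustration of the idea only, the sketch below takes the per-metric median across nodes; the choice of median, the function name `generate_baseline`, and the metric names are all assumptions, not SuperBench's actual algorithm or API.

```python
from statistics import median


def generate_baseline(node_results):
    """Build a per-metric baseline as the median of that metric across nodes.

    node_results maps a node name to a {metric_name: value} dictionary.
    Nodes that did not report a metric are skipped for that metric.
    """
    per_metric = {}
    for metrics in node_results.values():
        for name, value in metrics.items():
            per_metric.setdefault(name, []).append(value)
    return {name: median(values) for name, values in per_metric.items()}


if __name__ == "__main__":
    # Hypothetical results from three nodes; metric names are illustrative.
    results = {
        "node0": {"gemm_tflops": 100.0, "allreduce_busbw": 180.0},
        "node1": {"gemm_tflops": 104.0, "allreduce_busbw": 176.0},
        "node2": {"gemm_tflops": 98.0},
    }
    print(generate_baseline(results))
```

A median is used here because it is less sensitive to a single faulty node than a mean, which matters when the baseline is later used to flag outliers.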