microsoft/superbenchmark v0.12.0
Release SuperBench v0.12.0
Repository: microsoft/superbenchmark
Tag: v0.12.0
Published: 2025-08-11T21:58:22Z
Prerelease: no
Release notes:
SuperBench 0.12.0 Release Notes
SuperBench Improvements
- Optimize cutlass build process for faster builds and smaller binaries.
- Improve image build pipeline.
- Add support for arm64 builds.
- Upgrade pipeline dependencies.
- Fix SuperBench installation and code lint issues.
- Update Flake8 repository.
- Add support for the latest Python versions.
- Enhance error handling for pkg_resources imports.
- Update ROCm image build labels.
- Add CUDA 12.8 and CUDA 12.9 support.
- Consolidate multi-architecture Docker images.
- Upgrade runner OS to latest version.
- Fix typos in documentation and code.
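The pkg_resources item above can be pictured with a guarded import. The following is a minimal sketch of that general pattern, not the actual SuperBench change: the helper name get_package_version and its default argument are hypothetical, and the fallback to importlib.metadata is an assumption about one reasonable way to cope when pkg_resources is missing from the environment.

```python
import importlib.metadata


def get_package_version(name, default=None):
    """Return the installed version of `name`, or `default` if it cannot be found.

    Hypothetical helper for illustration only; not part of SuperBench.
    """
    try:
        # pkg_resources ships with setuptools and may be absent in some environments.
        import pkg_resources
    except ImportError:
        pkg_resources = None

    if pkg_resources is not None:
        try:
            return pkg_resources.get_distribution(name).version
        except Exception:
            # Distribution not found or metadata unreadable: try the stdlib path next.
            pass

    try:
        return importlib.metadata.version(name)
    except importlib.metadata.PackageNotFoundError:
        return default
```

Calling get_package_version('some-package', 'unknown') then returns a version string when the package is installed and 'unknown' otherwise, without raising if pkg_resources itself cannot be imported.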
Micro-benchmark Improvements
- Add general CPU bandwidth and latency benchmarks.
- Add nvbandwidth build process and benchmarks.
- Add architecture support for 10.0 in gemm-flops.
- Add GPU Stream micro benchmark.
- Add FP4 GEMM FLOPS support in cublaslt_gemm benchmark.
- Add Grace CPU support for CPU Stream benchmark.
- Revise CPU Stream benchmark.
- Fix NUMA error on Grace CPU in gpu-copy benchmark.
- Bump onnxruntime-gpu dependency from 1.10.0 to 1.12.0.
- Fix stderr message in gpu-copy benchmark.
- Fix TensorRT inference parsing.
- Handle N/A values in nvbandwidth benchmark.
- Avoid unintended nvbandwidth function calls in all benchmarks.
- Support CUDA arch flag and autotuning in cublaslt GEMM.
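As an illustration of the N/A handling item above, a result parser can map placeholder tokens to a missing value instead of failing on float conversion. This is a sketch under stated assumptions, not the SuperBench implementation: the function name parse_metric and the set of placeholder spellings are hypothetical, and the real nvbandwidth output format is not reproduced here.

```python
# Placeholder spellings treated as "no measurement" (an assumption for this sketch).
MISSING_TOKENS = {'N/A', 'NA', ''}


def parse_metric(token):
    """Convert one textual metric to float, or None when it is a placeholder.

    Hypothetical helper for illustration only; not part of SuperBench.
    """
    cleaned = token.strip()
    if cleaned.upper() in MISSING_TOKENS:
        return None
    return float(cleaned)


def parse_row(row):
    """Parse a whitespace-separated row of metrics, keeping None for placeholders."""
    return [parse_metric(tok) for tok in row.split()]
```

Keeping None in the parsed row, rather than dropping the entry, preserves column positions so that later steps can decide whether to skip or report the missing value.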
Model-benchmark Improvements
- Add LLaMA-2 model benchmarks.
- Add Mixture of Experts model benchmarks.
- Add DeepSeek inference benchmark (AMD GPU).
Result Analysis
- Enhance logging for diagnosis rule baseline errors.
Documentation Updates
- Update CODEOWNERS file.