NVIDIA/srt-slurm v1.0.0
v1.0.0
Repository: NVIDIA/srt-slurm
Tag: v1.0.0
Published: 2026-06-05T23:10:11Z
Prerelease: no
Release notes:
What's Changed
- Update README to include TensorRT LLM and vLLM in description by @nlevin-ui in https://github.com/NVIDIA/srt-slurm/pull/1
- [MISC] Add License / headers, and a small check to prepare for release by @xinli-sw in https://github.com/NVIDIA/srt-slurm/pull/4
- feat: enable runtime container detection for portable dynamo source builds by @qiching in https://github.com/NVIDIA/srt-slurm/pull/3
- Sync ishandhanani/srt-slurm history into NVIDIA/srt-slurm by @csahithi in https://github.com/NVIDIA/srt-slurm/pull/14
- Add trace-replay benchmark type by @alec-flowers in https://github.com/NVIDIA/srt-slurm/pull/16
- fix: use custom_tokenizer to workaround the trtllm + glm5 tokenizer loading issue by @richardhuo-nv in https://github.com/NVIDIA/srt-slurm/pull/20
- fix: add nvidia pypi as an extra index to be able to pip install the prerelease dynamo wheels by @richardhuo-nv in https://github.com/NVIDIA/srt-slurm/pull/22
- fix: support cross-arch clusters (x86_64 login, aarch64 compute) by @alec-flowers in https://github.com/NVIDIA/srt-slurm/pull/17
- feat: trace-replay benchmark with aiperf_args passthrough by @alec-flowers in https://github.com/NVIDIA/srt-slurm/pull/18
- feat: add mocker backend for pipeline smoke tests by @alec-flowers in https://github.com/NVIDIA/srt-slurm/pull/25
- feat: separate login-node and compute-node venvs by @alec-flowers in https://github.com/NVIDIA/srt-slurm/pull/29
- feat: runtime fingerprinting, identity verification, and lockfile by @alec-flowers in https://github.com/NVIDIA/srt-slurm/pull/19
- feat: configurable NATS max_payload for disagg serving by @alec-flowers in https://github.com/NVIDIA/srt-slurm/pull/31
- Copy {job_id}.json into log directory for S3 upload by @KaunilD in https://github.com/NVIDIA/srt-slurm/pull/15
- TRTLLM nsys profiling harness + Dynamo OTEL tracing automation by @karen-sy in https://github.com/NVIDIA/srt-slurm/pull/27
- Add CODEOWNERS file by @xinli-sw in https://github.com/NVIDIA/srt-slurm/pull/37
- Add CSV export for sa-bench rollup by @weireweire in https://github.com/NVIDIA/srt-slurm/pull/26
- Sanitize srun output in node IP resolution by @weireweire in https://github.com/NVIDIA/srt-slurm/pull/38
- feat: lockfile v2 — shareable recipe + lock section by @alec-flowers in https://github.com/NVIDIA/srt-slurm/pull/32
- fix: Install maturin if not present by @trevor-m in https://github.com/NVIDIA/srt-slurm/pull/45
- [codex] Add generic telemetry and custom benchmark support by @ishandhanani in https://github.com/NVIDIA/srt-slurm/pull/43
- [codex] Port HF cache cleanup by @ishandhanani in https://github.com/NVIDIA/srt-slurm/pull/49
- Add srt-slurm MCP spec server and preflight validation by @ishandhanani in https://github.com/NVIDIA/srt-slurm/pull/53
- Push logs_url to status API eagerly and via final PUT by @ishandhanani in https://github.com/NVIDIA/srt-slurm/pull/54
- [codex] narrow srtctl mcp to authoring and validation by @ishandhanani in https://github.com/NVIDIA/srt-slurm/pull/55
- [codex] Keep MCP validation off host cluster config by @ishandhanani in https://github.com/NVIDIA/srt-slurm/pull/56
- fix: emit aggregated resources and harden sa-bench rollup by @ishandhanani in https://github.com/NVIDIA/srt-slurm/pull/58
- feat: use pre-generated custom dataset for benchmarking MTP with chat template by @richardhuo-nv in https://github.com/NVIDIA/srt-slurm/pull/64
- docs: loud warnings on custom benchmark templating and nginx-off mode by @ishandhanani in https://github.com/NVIDIA/srt-slurm/pull/66
- feat(sa-bench): add sglang DeepSeek-V4 tokenizer by @YAMY1234 in https://github.com/NVIDIA/srt-slurm/pull/73
- feat: DeepSeek-V4-Pro perf recipes for GB300 / GB200 (1k/1k agg) by @elvischenv in https://github.com/NVIDIA/srt-slurm/pull/70
- fix(orchestrate): robust container bootstrap (maturin/protoc/venv-race) by @ishandhanani in https://github.com/NVIDIA/srt-slurm/pull/81
- fix(sa-bench): actionable error + warmup parity for use_chat_template by @YAMY1234 in https://github.com/NVIDIA/srt-slurm/pull/76
- feat(schema): make gsm8k a first-class BenchmarkType by @ishandhanani in https://github.com/NVIDIA/srt-slurm/pull/82
- [codex] add AIME benchmark by @ishandhanani in https://github.com/NVIDIA/srt-slurm/pull/83
- feat(aime): rework around `ns eval` for reasoning-model parity by @ishandhanani in https://github.com/NVIDIA/srt-slurm/pull/87
- Add scripts for wideEP; Note we can reach a PD balance with dep8, cc=2048 by @samuellees in https://github.com/NVIDIA/srt-slurm/pull/52
- Revert "Add scripts for wideEP; Note we can reach a PD balance with dep8, cc=2048" by @ishandhanani in https://github.com/NVIDIA/srt-slurm/pull/89
- refactor(aime): drop structured runner, ship configs/aime/{run.sh,rescore.py} by @ishandhanani in https://github.com/NVIDIA/srt-slurm/pull/91
- Add the chat template to the glm5 tokenizer and apply that when sampling the requests by @richardhuo-nv in https://github.com/NVIDIA/srt-slurm/pull/65
- feat(config): resolve container aliases for telemetry + preflight by @ishandhanani in https://github.com/NVIDIA/srt-slurm/pull/101
- [codex] Add Dynamo nightly wheel install support by @alec-flowers in https://github.com/NVIDIA/srt-slurm/pull/99
- feat(dynamo): cache hash-pinned source builds on /configs by @ishandhanani in https://github.com/NVIDIA/srt-slurm/pull/88
- Add DeepSeek V4 Pro vLLM GB200 recipes by @alec-flowers in https://github.com/NVIDIA/srt-slurm/pull/102
- feat(config): cluster-wide default_bash_preamble for ulimits and the like by @ishandhanani in https://github.com/NVIDIA/srt-slurm/pull/104
- fix(nginx): raise file descriptor limit for nginx workers by @ishandhanani in https://github.com/NVIDIA/srt-slurm/pull/108
- log: always set dyn skip log fmt by @ishandhanani in https://github.com/NVIDIA/srt-slurm/pull/109
- [NOT FINAL] add wip DSv4 aggregate and disaggregate recipes by @ishandhanani in https://github.com/NVIDIA/srt-slurm/pull/85
- nginx: rework to make ulimit optional by @ishandhanani in https://github.com/NVIDIA/srt-slurm/pull/110
- log: demote per-srun command line to DEBUG by @cquil11 in https://github.com/NVIDIA/srt-slurm/pull/111
- fix: using a setup script to install pip in trtllm venv by @richardhuo-nv in https://github.com/NVIDIA/srt-slurm/pull/116
- default dyn log by @ishandhanani in https://github.com/NVIDIA/srt-slurm/pull/118
- feat: Add live monitor to…