Release · NVIDIA · published Jun 5, 2026

NVIDIA/srt-slurm v1.0.0

v1.0.0

Repository: NVIDIA/srt-slurm

Tag: v1.0.0

Published: 2026-06-05T23:10:11Z

Prerelease: no

Release notes:

What's Changed

  • Update README to include TensorRT LLM and vLLM in description by @nlevin-ui in https://github.com/NVIDIA/srt-slurm/pull/1
  • [MISC] Add License / headers, and a small check to prepare for release by @xinli-sw in https://github.com/NVIDIA/srt-slurm/pull/4
  • feat: enable runtime container detection for portable dynamo source builds by @qiching in https://github.com/NVIDIA/srt-slurm/pull/3
  • Sync ishandhanani/srt-slurm history into NVIDIA/srt-slurm by @csahithi in https://github.com/NVIDIA/srt-slurm/pull/14
  • Add trace-replay benchmark type by @alec-flowers in https://github.com/NVIDIA/srt-slurm/pull/16
  • fix: use custom_tokenizer to workaround the trtllm + glm5 tokenizer loading issue by @richardhuo-nv in https://github.com/NVIDIA/srt-slurm/pull/20
  • fix: add nvidia pypi as an extra index to be able to pip install the prerelease dynamo wheels by @richardhuo-nv in https://github.com/NVIDIA/srt-slurm/pull/22
  • fix: support cross-arch clusters (x86_64 login, aarch64 compute) by @alec-flowers in https://github.com/NVIDIA/srt-slurm/pull/17
  • feat: trace-replay benchmark with aiperf_args passthrough by @alec-flowers in https://github.com/NVIDIA/srt-slurm/pull/18
  • feat: add mocker backend for pipeline smoke tests by @alec-flowers in https://github.com/NVIDIA/srt-slurm/pull/25
  • feat: separate login-node and compute-node venvs by @alec-flowers in https://github.com/NVIDIA/srt-slurm/pull/29
  • feat: runtime fingerprinting, identity verification, and lockfile by @alec-flowers in https://github.com/NVIDIA/srt-slurm/pull/19
  • feat: configurable NATS max_payload for disagg serving by @alec-flowers in https://github.com/NVIDIA/srt-slurm/pull/31
  • Copy {job_id}.json into log directory for S3 upload by @KaunilD in https://github.com/NVIDIA/srt-slurm/pull/15
  • TRTLLM nsys profiling harness + Dynamo OTEL tracing automation by @karen-sy in https://github.com/NVIDIA/srt-slurm/pull/27
  • Add CODEOWNERS file by @xinli-sw in https://github.com/NVIDIA/srt-slurm/pull/37
  • Add CSV export for sa-bench rollup by @weireweire in https://github.com/NVIDIA/srt-slurm/pull/26
  • Sanitize srun output in node IP resolution by @weireweire in https://github.com/NVIDIA/srt-slurm/pull/38
  • feat: lockfile v2 — shareable recipe + lock section by @alec-flowers in https://github.com/NVIDIA/srt-slurm/pull/32
  • fix: Install maturin if not present by @trevor-m in https://github.com/NVIDIA/srt-slurm/pull/45
  • [codex] Add generic telemetry and custom benchmark support by @ishandhanani in https://github.com/NVIDIA/srt-slurm/pull/43
  • [codex] Port HF cache cleanup by @ishandhanani in https://github.com/NVIDIA/srt-slurm/pull/49
  • Add srt-slurm MCP spec server and preflight validation by @ishandhanani in https://github.com/NVIDIA/srt-slurm/pull/53
  • Push logs_url to status API eagerly and via final PUT by @ishandhanani in https://github.com/NVIDIA/srt-slurm/pull/54
  • [codex] narrow srtctl mcp to authoring and validation by @ishandhanani in https://github.com/NVIDIA/srt-slurm/pull/55
  • [codex] Keep MCP validation off host cluster config by @ishandhanani in https://github.com/NVIDIA/srt-slurm/pull/56
  • fix: emit aggregated resources and harden sa-bench rollup by @ishandhanani in https://github.com/NVIDIA/srt-slurm/pull/58
  • feat: use pre-generated custom dataset for benchmarking MTP with chat template by @richardhuo-nv in https://github.com/NVIDIA/srt-slurm/pull/64
  • docs: loud warnings on custom benchmark templating and nginx-off mode by @ishandhanani in https://github.com/NVIDIA/srt-slurm/pull/66
  • feat(sa-bench): add sglang DeepSeek-V4 tokenizer by @YAMY1234 in https://github.com/NVIDIA/srt-slurm/pull/73
  • feat: DeepSeek-V4-Pro perf recipes for GB300 / GB200 (1k/1k agg) by @elvischenv in https://github.com/NVIDIA/srt-slurm/pull/70
  • fix(orchestrate): robust container bootstrap (maturin/protoc/venv-race) by @ishandhanani in https://github.com/NVIDIA/srt-slurm/pull/81
  • fix(sa-bench): actionable error + warmup parity for use_chat_template by @YAMY1234 in https://github.com/NVIDIA/srt-slurm/pull/76
  • feat(schema): make gsm8k a first-class BenchmarkType by @ishandhanani in https://github.com/NVIDIA/srt-slurm/pull/82
  • [codex] add AIME benchmark by @ishandhanani in https://github.com/NVIDIA/srt-slurm/pull/83
  • feat(aime): rework around ns eval for reasoning-model parity by @ishandhanani in https://github.com/NVIDIA/srt-slurm/pull/87
  • Add scripts for wideEP; Note we can reach a PD balance with dep8, cc=2048 by @samuellees in https://github.com/NVIDIA/srt-slurm/pull/52
  • Revert "Add scripts for wideEP; Note we can reach a PD balance with dep8, cc=2048" by @ishandhanani in https://github.com/NVIDIA/srt-slurm/pull/89
  • refactor(aime): drop structured runner, ship configs/aime/{run.sh,rescore.py} by @ishandhanani in https://github.com/NVIDIA/srt-slurm/pull/91
  • Add the chat template to the glm5 tokenizer and apply that when sampling the requests by @richardhuo-nv in https://github.com/NVIDIA/srt-slurm/pull/65
  • feat(config): resolve container aliases for telemetry + preflight by @ishandhanani in https://github.com/NVIDIA/srt-slurm/pull/101
  • [codex] Add Dynamo nightly wheel install support by @alec-flowers in https://github.com/NVIDIA/srt-slurm/pull/99
  • feat(dynamo): cache hash-pinned source builds on /configs by @ishandhanani in https://github.com/NVIDIA/srt-slurm/pull/88
  • Add DeepSeek V4 Pro vLLM GB200 recipes by @alec-flowers in https://github.com/NVIDIA/srt-slurm/pull/102
  • feat(config): cluster-wide default_bash_preamble for ulimits and the like by @ishandhanani in https://github.com/NVIDIA/srt-slurm/pull/104
  • fix(nginx): raise file descriptor limit for nginx workers by @ishandhanani in https://github.com/NVIDIA/srt-slurm/pull/108
  • log: always set dyn skip log fmt by @ishandhanani in https://github.com/NVIDIA/srt-slurm/pull/109
  • [NOT FINAL] add wip DSv4 aggregate and disaggregate recipes by @ishandhanani in https://github.com/NVIDIA/srt-slurm/pull/85
  • nginx: rework to make ulimit optional by @ishandhanani in https://github.com/NVIDIA/srt-slurm/pull/110
  • log: demote per-srun command line to DEBUG by @cquil11 in https://github.com/NVIDIA/srt-slurm/pull/111
  • fix: using a setup script to install pip in trtllm venv by @richardhuo-nv in https://github.com/NVIDIA/srt-slurm/pull/116
  • default dyn log by @ishandhanani in https://github.com/NVIDIA/srt-slurm/pull/118
  • feat: Add live monitor to…

NVIDIA release of the srt-slurm integration tool.