Release by Novita AI, published Jun 22, 2026

novitalabs/pegaflow v0.22.10

Repository: novitalabs/pegaflow

Tag: v0.22.10

Published: 2026-06-22T06:54:16Z

Prerelease: no

Release notes: Release of the pegaflow workspace / pegaflow-llm 0.22.10 — 12 commits since v0.22.9, centered on MLA KV-cache storage efficiency, model-aware transfer-backend selection, and cross-node redundancy observability.

English

✨ Features

  • MLA KV page-first storage (#360) — store MLA KV cache page-first so per-block metadata collapses, cutting metadata overhead for MLA models.
  • Per-layer MLA TP save distribution (#359) — spread MLA tensor-parallel save work across ranks by layer to balance save load.
  • Model-aware KV transfer backend (#357) — the connector auto-selects the KV transfer backend per model; the server no longer needs a static backend setting.
  • Metaserver block-redundancy metrics (#361): a new pegaflow_metaserver_block_redundancy{owners="1|2|3|>=4"} distribution plus a pegaflow_metaserver_block_redundancy_avg gauge. Together they surface the cross-node KV replication factor, that is, how much effective cache capacity shrinks because the same block is held by more than one node.
  • P/D handshake wire schema (#345) — seal the prefill/decode handshake wire schema in pegaflow-pd-wire.
  • Transfer benchmarks (#349, eb69309) — p2p RDMA fetch example plus native D2H/H2D transfer-path measurement.
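
As a rough illustration of how the redundancy distribution from #361 might be read, the sketch below turns per-bucket block counts into an average owner count and an effective-capacity fraction. The bucket counts are made-up sample values and the helper names are hypothetical. Treating the `>=4` bucket as exactly 4 owners is an assumption, so the average is a lower bound. The release notes do not say how pegaflow itself computes pegaflow_metaserver_block_redundancy_avg.

```python
# Hypothetical sample: number of distinct blocks in each owner-count bucket.
# The keys mirror the owners label values quoted in the release notes.
SAMPLE_BUCKETS = {"1": 600, "2": 300, "3": 80, ">=4": 20}


def owners_lower_bound(label: str) -> int:
    """Map an owners label to a number; ">=4" is floored to 4 (assumption)."""
    return 4 if label == ">=4" else int(label)


def average_redundancy(buckets: dict) -> float:
    """Average number of owners per distinct block (a lower bound)."""
    total_blocks = sum(buckets.values())
    if total_blocks == 0:
        return 0.0
    total_copies = sum(owners_lower_bound(k) * n for k, n in buckets.items())
    return total_copies / total_blocks


def effective_capacity_fraction(buckets: dict) -> float:
    """Share of stored copies that are distinct blocks: 1 / average redundancy."""
    avg = average_redundancy(buckets)
    return 1.0 / avg if avg > 0 else 1.0


if __name__ == "__main__":
    print(f"avg owners per block: {average_redundancy(SAMPLE_BUCKETS):.2f}")
    print(f"effective capacity:   {effective_capacity_fraction(SAMPLE_BUCKETS):.1%}")
```

With these sample counts, 1,000 distinct blocks occupy 1,520 slots across nodes, so roughly a third of the aggregate cache holds copies rather than new data.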

🐛 Fixes

  • Drop late duplicate saves (#358) — skip late duplicate saves of already-resident blocks, avoiding redundant work.
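
The idea behind #358 can be pictured as a residency check before each save. The following is a minimal sketch under assumed names (`BlockStore`, `save`, and the string block hashes are all hypothetical). It is not pegaflow's implementation, which the release notes do not describe.

```python
class BlockStore:
    """Toy in-memory store that ignores saves for blocks it already holds."""

    def __init__(self) -> None:
        self._resident = {}  # block hash -> payload
        self.skipped = 0     # number of duplicate saves dropped

    def save(self, block_hash: str, payload: bytes) -> bool:
        """Store the block; return False if it was already resident."""
        if block_hash in self._resident:
            # Late duplicate: the block is already cached, so skip the copy.
            self.skipped += 1
            return False
        self._resident[block_hash] = payload
        return True


if __name__ == "__main__":
    store = BlockStore()
    store.save("block-a", b"kv-data")
    store.save("block-a", b"kv-data")  # arrives late and is dropped
    print(store.skipped)
```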

♻️ Refactors

  • Restructure pd_connector for maintainability (#355).
  • SealedBlock owns its RawBlock slots (#352).
  • Use usize for block_ids, validated at the RPC boundary (#351).
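
Validating at the RPC boundary (#351) means untrusted wire values are checked once on entry, so internal code can assume well-formed ids. The sketch below shows that pattern in Python for illustration only. The function name, the error type, and the 64-bit `USIZE_MAX` bound are assumptions, and pegaflow's actual Rust code is not shown in the release notes.

```python
USIZE_MAX = 2**64 - 1  # assumes a 64-bit target


def validate_block_ids(raw_ids):
    """Check wire-level block ids once, at the boundary, and return them."""
    validated = []
    for position, raw in enumerate(raw_ids):
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(raw, bool) or not isinstance(raw, int):
            raise ValueError(f"block_ids[{position}] is not an integer: {raw!r}")
        if raw < 0 or raw > USIZE_MAX:
            raise ValueError(f"block_ids[{position}] is out of range: {raw}")
        validated.append(raw)
    return validated
```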

🔧 Chore

  • Bump version 0.22.9 → 0.22.10 (#362).

> ⚠️ Strict version handshake: client and server must match on CARGO_PKG_VERSION at registration — upgrade both sides together.
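
To make the consequence of the strict handshake concrete, here is a minimal sketch of an exact-match version check at registration. The function and exception names are hypothetical, and the real check compares the Rust `CARGO_PKG_VERSION` values on each side. Only the exact-equality rule is taken from the note above.

```python
class VersionMismatch(Exception):
    """Raised when client and server report different package versions."""


def check_registration(client_version: str, server_version: str) -> None:
    """Accept the registration only if both versions are exactly equal."""
    if client_version != server_version:
        raise VersionMismatch(
            f"client {client_version} != server {server_version}; "
            "upgrade both sides together"
        )


if __name__ == "__main__":
    check_registration("0.22.10", "0.22.10")  # accepted
    try:
        check_registration("0.22.9", "0.22.10")  # rejected: patch versions differ
    except VersionMismatch as err:
        print(err)
```

Because the comparison is exact rather than semver-compatible, even a patch-level difference such as 0.22.9 against 0.22.10 is rejected.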

Full Changelog: https://github.com/novitalabs/pegaflow/compare/v0.22.9...v0.22.10
