Release: Databricks (DBRX), published May 15, 2025

databricks/compose-rl v0.5.0

v0.5.0

Repository: databricks/compose-rl

Tag: v0.5.0

Published: 2025-05-15T04:43:50Z

Prerelease: no

Release notes:

What's new

  • Online RL Algorithms: We now support PPO and GRPO for online RL training
  • RL with Verifiable Rewards: We've added support for verifiable rewards with online RL algorithms, along with evaluations during training.
  • Registries for extensible and composable design
  • Robust vLLM support for efficient inference during online RL training
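To make the first two items concrete, here is a minimal sketch of how a verifiable reward can feed a GRPO-style update. It is not compose-rl's API: the names `verifiable_reward` and `group_relative_advantages` are hypothetical, the answer-matching rule is a simplifying assumption, and the normalization shown (reward minus group mean, divided by group standard deviation) is one common formulation of GRPO advantages rather than a confirmed detail of this release.

```python
import re
from statistics import mean, pstdev


def verifiable_reward(completion: str, reference: str) -> float:
    """Hypothetical checker: 1.0 if the last number in the completion
    matches the reference answer string exactly, else 0.0."""
    numbers = re.findall(r"-?\d+(?:\.\d+)?", completion)
    if not numbers:
        return 0.0
    return 1.0 if numbers[-1] == reference else 0.0


def group_relative_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    """Normalize rewards within one prompt's group of sampled completions.

    Assumed formulation: (reward - group mean) / (group std + eps).
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Several completions sampled for the same prompt ("What is 2 + 2?").
completions = ["2 + 2 = 4", "The answer is 5", "I think it is 4", "No idea"]
rewards = [verifiable_reward(c, "4") for c in completions]
advantages = group_relative_advantages(rewards)
```

Completions that pass the check receive a positive advantage and the rest a negative one, so no learned reward model or value function is needed for this step.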

What's Changed

  • Update version to match latest release by @dakinggg in https://github.com/databricks/compose-rl/pull/25
  • attach vllm engines to state by @vchiley in https://github.com/databricks/compose-rl/pull/20
  • Adding warning for truncating preferences by @bcui-db in https://github.com/databricks/compose-rl/pull/27
  • Add load planner for PPO by @bcui-db in https://github.com/databricks/compose-rl/pull/18
  • Auto set TP size by @vchiley in https://github.com/databricks/compose-rl/pull/29
  • Enable Masking of EOS tokens list by @bcui-db in https://github.com/databricks/compose-rl/pull/31
  • Accommodate typing changes for transformers 4.51 by @dakinggg in https://github.com/databricks/compose-rl/pull/33
  • Dataloader changes for RLVR by @gupta-abhay in https://github.com/databricks/compose-rl/pull/21
  • Moved the long seq fix on top of main by @abaheti95 in https://github.com/databricks/compose-rl/pull/34
  • Changes for better reward validation by @gupta-abhay in https://github.com/databricks/compose-rl/pull/35
  • Inheritance fix by @gupta-abhay in https://github.com/databricks/compose-rl/pull/37
  • Simple change by @gupta-abhay in https://github.com/databricks/compose-rl/pull/40
  • K generation per prompt by @abaheti95 in https://github.com/databricks/compose-rl/pull/36
  • Merge READMEs for easier parsing by @gupta-abhay in https://github.com/databricks/compose-rl/pull/41
  • Enable hf token for restricted data access by @gupta-abhay in https://github.com/databricks/compose-rl/pull/42
  • Enable different KL estimators for training by @gupta-abhay in https://github.com/databricks/compose-rl/pull/44
  • update readme by @bcui-db in https://github.com/databricks/compose-rl/pull/45
  • Upgrade yapf version by @gupta-abhay in https://github.com/databricks/compose-rl/pull/46
  • Fast inference w/ single vllm generate call per PPO iter by @abaheti95 in https://github.com/databricks/compose-rl/pull/43
  • Addressing cleanup comments on fast vLLM PR by @abaheti95 in https://github.com/databricks/compose-rl/pull/49
  • Improving online RL logging by @abaheti95 in https://github.com/databricks/compose-rl/pull/50
  • Update vLLM, enables single node Tensor parallel sizes (1, 2, 4, 8) by @bcui-db in https://github.com/databricks/compose-rl/pull/48
  • Unified kl estimators by @gupta-abhay in https://github.com/databricks/compose-rl/pull/53
  • Add codeowners by @gupta-abhay in https://github.com/databricks/compose-rl/pull/54
  • Add chat functionality to vLLM actor by @bcui-db in https://github.com/databricks/compose-rl/pull/55
  • Exposing average log prob flag by @abaheti95 in https://github.com/databricks/compose-rl/pull/56
  • Modifying codeowners by @gupta-abhay in https://github.com/databricks/compose-rl/pull/57
  • GRPO implementation by @abaheti95 in https://github.com/databricks/compose-rl/pull/51
  • Registries for extending compose-rl by @gupta-abhay in https://github.com/databricks/compose-rl/pull/47
  • Simple tests for new registries by @gupta-abhay in https://github.com/databricks/compose-rl/pull/58
  • Timeout change by @gupta-abhay in https://github.com/databricks/compose-rl/pull/59
  • Fix label generation for MATH to match verification by @gupta-abhay in https://github.com/databricks/compose-rl/pull/60
  • Changes for optional tokens list by @gupta-abhay in https://github.com/databricks/compose-rl/pull/61
  • Minor changes for dtype and docstrings by @gupta-abhay in https://github.com/databricks/compose-rl/pull/62
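Several entries above mention KL estimators without saying which ones are offered. As an illustrative sketch only, the three per-sample estimators commonly used in RL fine-tuning (often called k1, k2, and k3) can be written as follows. The function name `kl_estimates` and its arguments are hypothetical and are not taken from compose-rl.

```python
import math


def kl_estimates(policy_logp: float, ref_logp: float) -> dict[str, float]:
    """Per-token estimates of KL(policy || reference).

    Assumes the token was sampled from the policy; both arguments are
    log-probabilities of that same token under each model.
    """
    log_ratio = ref_logp - policy_logp  # log(pi_ref / pi)
    return {
        # k1: unbiased, but individual samples can be negative.
        "k1": -log_ratio,
        # k2: always non-negative, biased, typically lower variance.
        "k2": 0.5 * log_ratio ** 2,
        # k3: unbiased and always non-negative.
        "k3": math.exp(log_ratio) - 1.0 - log_ratio,
    }


same = kl_estimates(policy_logp=-1.0, ref_logp=-1.0)
apart = kl_estimates(policy_logp=-1.0, ref_logp=-1.5)
```

Averaging any of these over tokens sampled from the policy gives an estimate of the KL penalty. The choice trades bias against variance.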

New Contributors

  • @vchiley made their first contribution in https://github.com/databricks/compose-rl/pull/20
  • @gupta-abhay made their first contribution in https://github.com/databricks/compose-rl/pull/21

Full Changelog: https://github.com/databricks/compose-rl/compare/v0.4.0...v0.5.0
