databricks/compose-rl v0.5.0
v0.5.0
Repository: databricks/compose-rl
Tag: v0.5.0
Published: 2025-05-15T04:43:50Z
Prerelease: no
Release notes:
What's new
- Online RL Algorithms: We now support PPO and GRPO for online RL training
- RL with Verifiable Rewards: We've added support for verifiable rewards with online RL algorithms, along with evaluations during training.
- Registries for extensible and composable design
- Robust vLLM support for efficient inference during online RL training
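To illustrate how verifiable rewards and GRPO-style training fit together, here is a minimal sketch in plain Python. It is not compose-rl's implementation: the function names `exact_match_reward` and `group_relative_advantages` are hypothetical, the exact-match check stands in for whatever verifier a task actually uses, and using the population standard deviation is an assumption. The idea is that several completions are sampled for one prompt, each is scored by a programmatic check, and the rewards are normalized within that group.

```python
from statistics import mean, pstdev


def exact_match_reward(completion: str, answer: str) -> float:
    """Hypothetical verifiable reward: 1.0 on an exact (whitespace-stripped) match, else 0.0."""
    return 1.0 if completion.strip() == answer.strip() else 0.0


def group_relative_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    """Normalize rewards within one prompt's group of sampled completions.

    Each reward is centered on the group mean and scaled by the group
    standard deviation (population std here, which is an assumption).
    eps avoids division by zero when every sample gets the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled completions for a single prompt whose reference answer is "42".
completions = ["42", "41", " 42 ", "7"]
rewards = [exact_match_reward(c, "42") for c in completions]
advantages = group_relative_advantages(rewards)
```

Correct completions end up with positive advantages and incorrect ones with negative advantages; if every completion in the group receives the same reward, all advantages are zero and the group contributes no learning signal.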
What's Changed
- Update version to match latest release by @dakinggg in https://github.com/databricks/compose-rl/pull/25
- attach vllm engines to state by @vchiley in https://github.com/databricks/compose-rl/pull/20
- Adding warning for truncating preferences by @bcui-db in https://github.com/databricks/compose-rl/pull/27
- Add load planner for PPO by @bcui-db in https://github.com/databricks/compose-rl/pull/18
- Auto set TP size by @vchiley in https://github.com/databricks/compose-rl/pull/29
- Enable Masking of EOS tokens list by @bcui-db in https://github.com/databricks/compose-rl/pull/31
- Accommodate typing changes for transformers 4.51 by @dakinggg in https://github.com/databricks/compose-rl/pull/33
- Dataloader changes for RLVR by @gupta-abhay in https://github.com/databricks/compose-rl/pull/21
- Moved the long seq fix on top of main by @abaheti95 in https://github.com/databricks/compose-rl/pull/34
- Changes for better reward validation by @gupta-abhay in https://github.com/databricks/compose-rl/pull/35
- Inheritance fix by @gupta-abhay in https://github.com/databricks/compose-rl/pull/37
- Simple change by @gupta-abhay in https://github.com/databricks/compose-rl/pull/40
- K generation per prompt by @abaheti95 in https://github.com/databricks/compose-rl/pull/36
- Merge ReadMEs for easier parsing by @gupta-abhay in https://github.com/databricks/compose-rl/pull/41
- Enable hf token for restricted data access by @gupta-abhay in https://github.com/databricks/compose-rl/pull/42
- Enable different KL estimators for training by @gupta-abhay in https://github.com/databricks/compose-rl/pull/44
- update readme by @bcui-db in https://github.com/databricks/compose-rl/pull/45
- Upgrade yapf version by @gupta-abhay in https://github.com/databricks/compose-rl/pull/46
- Fast inference w/ single vllm generate call per PPO iter by @abaheti95 in https://github.com/databricks/compose-rl/pull/43
- Addressing cleanup comments on fast vLLM PR by @abaheti95 in https://github.com/databricks/compose-rl/pull/49
- Improving online RL logging by @abaheti95 in https://github.com/databricks/compose-rl/pull/50
- Update vLLM, enables single node Tensor parallel sizes (1, 2, 4, 8) by @bcui-db in https://github.com/databricks/compose-rl/pull/48
- Unified kl estimators by @gupta-abhay in https://github.com/databricks/compose-rl/pull/53
- Add codeowners by @gupta-abhay in https://github.com/databricks/compose-rl/pull/54
- Add chat functionality to vLLM actor by @bcui-db in https://github.com/databricks/compose-rl/pull/55
- Exposing average log prob flag by @abaheti95 in https://github.com/databricks/compose-rl/pull/56
- Modifying codeowners by @gupta-abhay in https://github.com/databricks/compose-rl/pull/57
- GRPO implementation by @abaheti95 in https://github.com/databricks/compose-rl/pull/51
- Registries for extending compose-rl by @gupta-abhay in https://github.com/databricks/compose-rl/pull/47
- Simple tests for new registries by @gupta-abhay in https://github.com/databricks/compose-rl/pull/58
- Timeout change by @gupta-abhay in https://github.com/databricks/compose-rl/pull/59
- Fix label generation for MATH to match verification by @gupta-abhay in https://github.com/databricks/compose-rl/pull/60
- Changes for optional tokens list by @gupta-abhay in https://github.com/databricks/compose-rl/pull/61
- Minor changes for dtype and docstrings by @gupta-abhay in https://github.com/databricks/compose-rl/pull/62
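The KL estimator entries above do not say which estimators are supported. As a hedged illustration only, the sketch below shows three commonly used per-sample estimators of the KL divergence between a policy and a reference model, often labeled k1, k2, and k3. The function name `kl_estimates` and the dictionary keys are hypothetical and may not match the names compose-rl uses.

```python
import math


def kl_estimates(logp_policy: float, logp_ref: float) -> dict[str, float]:
    """Per-sample estimates of KL(policy || ref) for a sample drawn from the policy.

    With r = p_ref(x) / p_policy(x):
      k1 = -log r            unbiased, can be negative, high variance
      k2 = 0.5 * (log r)^2   biased, always non-negative
      k3 = (r - 1) - log r   unbiased, always non-negative
    """
    log_ratio = logp_ref - logp_policy  # log r
    return {
        "k1": -log_ratio,
        "k2": 0.5 * log_ratio**2,
        "k3": math.expm1(log_ratio) - log_ratio,  # expm1(log r) == r - 1
    }


# Log-probabilities of one sampled token under the policy and the reference model.
estimates = kl_estimates(logp_policy=-1.0, logp_ref=-1.5)
```

In practice these per-token values would be averaged over sampled tokens; the choice among them trades bias against variance.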
New Contributors
- @vchiley made their first contribution in https://github.com/databricks/compose-rl/pull/20
- @gupta-abhay made their first contribution in https://github.com/databricks/compose-rl/pull/21
Full Changelog: https://github.com/databricks/compose-rl/compare/v0.4.0...v0.5.0