PaddlePaddle/PaddleNLP v3.0.0-beta3
v3.0.0-beta3
Repository: PaddlePaddle/PaddleNLP
Tag: v3.0.0-beta3
Published: 2024-12-16T09:35:00Z
Prerelease: no
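Release notes: This release improves the core PaddleNLP experience: it adds the Llama-3.2 and DeepSeekV2 models, upgrades TokenizerFast, and refactors SFTTrainer.
In addition, PaddleNLP now supports offloading and reloading optimizer states and implements fine-grained recomputation, improving training performance by 7%. For Unified Checkpoint, the asynchronous save logic has been further optimized, and a new checkpoint compression feature saves up to 78.5% of storage space. Finally, we made in-depth optimizations to LLM inference, auto parallel, multi-hardware support, and the documentation.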
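As a quick worked example of the storage figure: if compression removes up to 78.5% of a checkpoint's size, roughly 21.5% remains in the best case. The helper name `compressed_size_gb` and the 100 GB checkpoint below are illustrative assumptions, not values or APIs taken from PaddleNLP.

```python
def compressed_size_gb(original_gb: float, saving_ratio: float = 0.785) -> float:
    """Best-case size after compression, given the fraction of storage saved."""
    if not 0.0 <= saving_ratio <= 1.0:
        raise ValueError("saving_ratio must be between 0 and 1")
    return original_gb * (1.0 - saving_ratio)


# Hypothetical 100 GB checkpoint at the maximum saving quoted in the notes.
print(f"{compressed_size_gb(100.0):.1f} GB")
```

Because 78.5% is stated as a maximum, actual savings for a given model and configuration may be smaller.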
Major updates and enhancements
1. New models:
- Added the Llama-3.2 (#9199) and DeepSeekV2 (#9250) models, further broadening the choice of large models.
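2. Infrastructure improvements:
- Refactored SFTTrainer and SFTConfig to improve code maintainability. (#9318)
- Added support for offloading and reloading optimizer states (#9467), effectively reducing memory usage.
- Implemented fine-grained recomputation support via hooks; on the llama model, for example, training performance can improve by 7%. (#9396)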
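- Unified Checkpoint optimizations:
  - Updated the asynchronous save logic (#9173, #9274, #9321), significantly improving checkpoint save and load efficiency.
  - Added support for expert parallel (#9055), making model training more flexible.
  - Unified Checkpoint can now be used with sharding_comm_overlap enabled. (#9392)
  - Added checkpoint compression, saving up to 78.5% of storage space. (#9183)
  - Reduced checkpoint loading time through multithreading. (#9034)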
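- Tokenizer enhancements:
  - The padding_side parameter can now be specified when calling a tokenizer (#9258), improving usability.
  - The Qwen tokenizer now supports adding special tokens (#9344), making it more flexible.
  - Fixed the missing clean_up_tokenization_spaces in TokenizerFast (#9304), improving text-processing accuracy.
  - Unified the tokenizers' _pad function into the base class. (#9280)
  - Added support for BertTokenizerFast and allowed tokenizers to be registered at call time. (#9353)
  - Improved special-input handling in the chat templates of the Qwen, Gemma, and Yuan models. (#9462)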
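To make the padding_side option concrete, here is a minimal pure-Python sketch of what left versus right padding does to a batch of token ids. It does not call PaddleNLP: `pad_batch` and `pad_id=0` are hypothetical names and values chosen for illustration, and the library's own _pad logic may differ in detail.

```python
from typing import List, Tuple


def pad_batch(
    batch: List[List[int]], padding_side: str = "right", pad_id: int = 0
) -> Tuple[List[List[int]], List[List[int]]]:
    """Pad every sequence to the longest one; return (input_ids, attention_mask)."""
    if padding_side not in ("left", "right"):
        raise ValueError("padding_side must be 'left' or 'right'")
    max_len = max((len(seq) for seq in batch), default=0)
    input_ids, attention_mask = [], []
    for seq in batch:
        pad = [pad_id] * (max_len - len(seq))  # filler tokens
        mask_pad = [0] * len(pad)              # padding positions are masked out
        real = [1] * len(seq)                  # real tokens are attended to
        if padding_side == "left":
            input_ids.append(pad + seq)
            attention_mask.append(mask_pad + real)
        else:
            input_ids.append(seq + pad)
            attention_mask.append(real + mask_pad)
    return input_ids, attention_mask


batch = [[5, 6, 7], [8, 9]]
right_ids, right_mask = pad_batch(batch, padding_side="right")
left_ids, left_mask = pad_batch(batch, padding_side="left")
print(right_ids)  # [[5, 6, 7], [8, 9, 0]]
print(left_ids)   # [[5, 6, 7], [0, 8, 9]]
```

Left padding is commonly preferred for batched generation with decoder-only models, because every prompt then ends at the position where new tokens are appended.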
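3. Inference performance improvements:
- LLM inference now directly supports the built-in quantized models from bos. (#9197)
- Strengthened FP8 quantization support in LLM inference (e.g. #9328, #9423), meeting a wider range of precision requirements.
- Enhanced support for speculative decoding and Append Attention. (#9180) (#9244)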
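For readers unfamiliar with speculative decoding, the idea can be shown with a toy greedy version: a cheap draft model proposes several tokens, and the target model keeps them only as long as they match what it would have produced itself. This is a generic illustration of the technique, not PaddleNLP's implementation; `target_model`, `draft_model`, `greedy_decode`, and `speculative_decode` are hypothetical stand-ins, and the "models" are simple arithmetic functions.

```python
from typing import Callable, List

NextToken = Callable[[List[int]], int]


def greedy_decode(next_token: NextToken, prompt: List[int], num_new: int) -> List[int]:
    """Plain greedy decoding: one model call per generated token."""
    tokens = list(prompt)
    for _ in range(num_new):
        tokens.append(next_token(tokens))
    return tokens


def speculative_decode(
    target: NextToken, draft: NextToken, prompt: List[int], num_new: int, k: int = 4
) -> List[int]:
    """Greedy speculative decoding: the draft proposes k tokens, the target verifies them."""
    tokens = list(prompt)
    goal = len(prompt) + num_new
    while len(tokens) < goal:
        # 1) The cheap draft model proposes k tokens autoregressively.
        proposal = greedy_decode(draft, tokens, k)[len(tokens):]
        # 2) The target checks the proposal in order. A real system scores all k
        #    positions in one batched forward pass; this toy calls it per token.
        for proposed in proposal:
            expected = target(tokens)
            tokens.append(expected)  # the target's token is always the one kept
            if proposed != expected or len(tokens) == goal:
                break  # first mismatch or length reached: drop the rest of the draft
    return tokens


def target_model(tokens: List[int]) -> int:
    """Toy 'large' model: a deterministic function of the last token."""
    return (tokens[-1] * 3 + 1) % 11


def draft_model(tokens: List[int]) -> int:
    """Toy 'small' model: agrees with the target unless the last token is a multiple of 4."""
    guess = (tokens[-1] * 3 + 1) % 11
    return guess if tokens[-1] % 4 else (guess + 1) % 11


result = speculative_decode(target_model, draft_model, [2], num_new=10)
print(result == greedy_decode(target_model, [2], 10))
```

In this greedy setting the output is identical to decoding with the target model alone; the potential speedup comes from verifying several draft tokens per target pass, which the toy loop does not model.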
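4. Hardware compatibility expansion:
5. Auto parallel optimizations:
- Fixed multiple issues in auto parallel (e.g. #9217, #9355), ensuring stable parallel training.
- Updated the auto parallel configuration and checkpoint converter (e.g. #9136, #9432), improving training flexibility and stability.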
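6. Documentation and test updates:
- Updated multiple documents, including the LLM model docs (e.g. #9314) and the quantization docs (e.g. #9330), keeping the information current and accurate.
- Added multiple test cases, such as a distributed data-loading test (#9438), increasing test coverage.
- Fixed broken links and formatting issues in the documentation (e.g. #9127, #9515), improving the user experience.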
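This update marks continued progress for PaddleNLP, giving users a more comprehensive, efficient, and stable NLP solution. We look forward to bringing users more innovation and value in future releases.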
What's Changed
- [Unified Checkpoint] update async_save_info in develop by @DesmonDay in https://github.com/PaddlePaddle/PaddleNLP/pull/9173
- add flashmask rm by @lugimzzz in https://github.com/PaddlePaddle/PaddleNLP/pull/9154
- [LLM_INFER] Support quantized model from bos and fix docs by @yuanlehome in https://github.com/PaddlePaddle/PaddleNLP/pull/9197
- fix ci not set no_proxy and modify tests in pir mode by @fightfat in https://github.com/PaddlePaddle/PaddleNLP/pull/9205
- [Models] Add Llama-3.2 by @DrownFish19 in https://github.com/PaddlePaddle/PaddleNLP/pull/9199
- move some auto_parallel args into class AutoTrainingArguments by @Wennie396 in https://github.com/PaddlePaddle/PaddleNLP/pull/9155
- [Performance] Compatible with flashmask API rename upgrade by @GuoxiaWang in https://github.com/PaddlePaddle/PaddleNLP/pull/9019
- [AutoParallel] add vpp align and pp amp test by @AndSonder in https://github.com/PaddlePaddle/PaddleNLP/pull/9176
- fix auto ci return bug when run in v100 by @fightfat in https://github.com/PaddlePaddle/PaddleNLP/pull/9216
- fix auto ci return bug when run in v100 by @AndSonder in https://github.com/PaddlePaddle/PaddleNLP/pull/9228
- [LLM] Add tools for parameters by @Hanyonggong in https://github.com/PaddlePaddle/PaddleNLP/pull/9137
- [AutoParallel] Add test for fuse_ffn and fuse_attention_qkv pass by @zhangbo9674 in https://github.com/PaddlePaddle/PaddleNLP/pull/9203
- [CI] Fix ci import. by @ZHUI in https://github.com/PaddlePaddle/PaddleNLP/pull/9239
- [Version] Update version info by @DrownFish19 in https://github.com/PaddlePaddle/PaddleNLP/pull/9241
- [Auto Parallel] Adding align mode support by @zhangyuqin1998 in https://github.com/PaddlePaddle/PaddleNLP/pull/9150
- [LLM INFER] top_p_sampling_reject support top_p=0 and custom seed by @gzy19990617 in https://github.com/PaddlePaddle/PaddleNLP/pull/9202
- [INFER] update tune_cublaslt_gemm op and fix some bugs by @yuanlehome in https://github.com/PaddlePaddle/PaddleNLP/pull/9222
- Reduce the time spent on git downloading third-party libraries by @vivienfanghuagood in https://github.com/PaddlePaddle/PaddleNLP/pull/9246
- [PIR] fix pir open bugs by @yuanlehome in https://github.com/PaddlePaddle/PaddleNLP/pull/9248
- Cherry-pick some PRs from incubate/paddlenlp-fleety by @sneaxiy in https://github.com/PaddlePaddle/PaddleNLP/pull/9245
- [Unified Checkpoint] Support expert parallel by @DesmonDay in https://github.com/PaddlePaddle/PaddleNLP/pull/9055
- [PIR] fix pir dt2st for chatglm_v2 by @yuanlehome in https://github.com/PaddlePaddle/PaddleNLP/pull/9251
- Cherry-pick some PRs from incubate/paddlenlp-fleety by @LiYuRio in https://github.com/PaddlePaddle/PaddleNLP/pull/9253
- [Unified Checkpoint] Fix generation config save by @DrownFish19 in https://github.com/PaddlePaddle/PaddleNLP/pull/9223
- [AutoParallel] Fix tests for pass paddle AutoParallel CI by @liym27 in https://github.com/PaddlePaddle/PaddleNLP/pull/9267
- change dataset by @lugimzzz in https://github.com/PaddlePaddle/PaddleNLP/pull/9266
- [Unified Checkpoint] update async save logic by @DesmonDay in https://github.com/PaddlePaddle/PaddleNLP/pull/9274
- add config file for model chatglm2,gemma,yuan by @Mangodadada in https://github.com/PaddlePaddle/PaddleNLP/pull/9139
- Fix async hang by @DesmonDay in https://github.com/PaddlePaddle/PaddleNLP/pull/9276
- [AutoParallel] Change llama test from sharding stage2 to stage1 by @zhangbo9674 in https://github.com/PaddlePaddle/PaddleNLP/pull/9281
- [Tokenizer] Enable padding_side as call time kwargs by @DrownFish19 in https://github.com/PaddlePaddle/PaddleNLP/pull/9258
- [Trainer] fix save_model by @DesmonDay in https://github.com/PaddlePaddle/PaddleNLP/pull/9286
- [CI] Skip inference test cases by @DrownFish19 in https://github.com/PaddlePaddle/PaddleNLP/pull/9270
- [LLM] Add deepseekv2 by @DrownFish19 in https://github.com/PaddlePaddle/PaddleNLP/pull/9250
- [Tokenizer] Unify tokenizer _pad by @DrownFish19 in https://github.com/PaddlePaddle/PaddleNLP/pull/9280
- [CI] Fix llm/alignment/rm/flashmask path by @DrownFish19 in https://github.com/PaddlePaddle/PaddleNLP/pull/9289
- support attention mask using causal=True by @GuoxiaWang in https://github.com/PaddlePaddle/PaddleNLP/pull/9268
- [FlashMask] Add FlashMask for Qwen2 by @DrownFish19 in https://github.com/PaddlePaddle/PaddleNLP/pull/9264
- bug fix…