Release: Baidu (ERNIE), published Oct 8, 2024

PaddlePaddle/PaddleNLP v3.0.0-beta2

PaddlePaddle/PaddleNLP


v3.0.0-beta2

Repository: PaddlePaddle/PaddleNLP

Tag: v3.0.0-beta2

Published: 2024-10-08T08:52:31Z

Prerelease: yes

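Release notes: This update strengthens PaddleNLP's infrastructure, adds the Qwen2.5 and Mixtral 8*22B models, upgrades the Tokenizer, and renames the data indexing tool.

It also fixes issues such as saving and loading MoE model parameters, improves text-processing accuracy, and updates documentation and test cases. Inference performance, hardware support, and automatic parallelism are improved as well, including support for more models and parameter configurations, multi-GPU inference, enhanced support for domestic (Chinese-made) hardware, and an optimized distributed training workflow.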

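Core Changes and Enhancements

1. Infrastructure Enhancements

  • Added the Qwen2.5 model (#9157) and Mixtral 8*22B, further expanding the model library.
  • Upgraded the Tokenizer, which can now load additional decoding tokens via added_tokens_decoder (#8997), for greater flexibility.
  • Renamed the data indexing tool tool_helpers to fast_dataindex (#9134) so the name reflects what it does.
  • Implemented skipping of data at intervals during training (#8989), improving data-processing efficiency.
  • Unified Checkpoint optimization
      • Updated the optimizer asynchronous-save signal (#8975) to keep saving stable.
      • Fixed several issues in Unified Checkpoint (#9082) to ensure correct behavior.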
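
Because the data indexing package is renamed in this release, code that must run against both older and newer installations may want an import fallback. The following is a minimal sketch, not PaddleNLP's own code: `import_first_available` is a hypothetical helper, and the assumption that the package is importable under the top-level names `fast_dataindex` and `tool_helpers` rests only on the names given in the notes above.

```python
import importlib
from types import ModuleType
from typing import Optional, Sequence


def import_first_available(names: Sequence[str]) -> Optional[ModuleType]:
    """Return the first module in `names` that imports, or None if none do."""
    for name in names:
        try:
            return importlib.import_module(name)
        except ImportError:
            # ModuleNotFoundError is a subclass of ImportError, so a
            # missing package simply moves on to the next candidate.
            continue
    return None


# Assumed top-level module names: new name first, old name as fallback.
dataindex = import_first_available(["fast_dataindex", "tool_helpers"])
if dataindex is None:
    print("no data-index helper installed")
else:
    print(f"using {dataindex.__name__}")
```

Trying the new name first means an environment with both packages installed picks up the renamed one.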

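3. Bug Fixes

  • Resolved issues with saving and loading MoE model parameters (#9045).
  • Fixed shortcomings in the Tokenizer's handling of spaces and special symbols (#9010, #9144), improving text-processing accuracy.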

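4. Documentation and Test Updates

  • Updated several documents, including the LLM model docs (e.g. #8990, #8999) and the quantization docs (#9057), to keep the information current and accurate.
  • Added test cases, such as a test for sequence parallelism in PIR mode (#9015), to strengthen test coverage.
  • Fixed broken links in the documentation (e.g. #9127), improving the user experience.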

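5. Other Key Changes

  • Inference performance optimization
      • Optimized the LLM inference code to support more models and parameter configurations (e.g. #8986, #8995), broadening the range of use cases.
      • Implemented multi-GPU inference (#9121) and wint4 quantization (#9129) for Qwen2_Moe, improving inference efficiency.
      • Strengthened FP8 and INT8 support in LLM inference (e.g. #9032, #9151) to meet varied precision requirements.
  • Expanded hardware support
      • Enhanced support for domestic (Chinese-made) hardware such as DCU, XPU, and MLU (e.g. #8983, #8504, #9075), facilitating domestic substitution.
      • Optimized model training and inference performance on the hardware above, improving overall compute efficiency.
  • Automatic parallelism optimization
      • Fixed the issue of data being skipped twice during training (#8980), ensuring correct data handling.
      • Updated the automatic-parallel configuration and the checkpoint converter (e.g. #8847, #9136), improving the flexibility and stability of parallel training.
      • Added a loss NaN/Inf checker (#8943) to detect and handle potential numerical problems promptly.
      • Optimized data loading and gradient merging in distributed training (e.g. #9120, #9179), improving training speed and stability.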
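
The loss NaN/Inf checker listed above guards against a non-finite loss silently corrupting a training run. A minimal sketch of the general idea in plain Python follows; it is not the implementation from #8943, and `check_loss_finite` and its `step` parameter are hypothetical names chosen for illustration.

```python
import math


def check_loss_finite(loss: float, step: int) -> None:
    """Raise if the loss is NaN or +/-Inf so the run stops at the bad step."""
    if not math.isfinite(loss):
        raise FloatingPointError(f"non-finite loss {loss!r} at step {step}")


# Illustrative loss values only; the third one is deliberately NaN.
losses = [2.31, 1.87, float("nan"), 1.52]
for step, loss in enumerate(losses):
    try:
        check_loss_finite(loss, step)
    except FloatingPointError as err:
        print(f"stopping: {err}")
        break
```

Failing at the first non-finite value keeps the bad step identifiable, rather than letting NaN propagate into later optimizer updates.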

What's Changed

  • [Unified checkpoint] update optimizer async save signal by @DesmonDay in https://github.com/PaddlePaddle/PaddleNLP/pull/8975
  • Correct the run_dpo.py file path by @Mangodadada in https://github.com/PaddlePaddle/PaddleNLP/pull/8952
  • fix the loss base in llama_align_dygraph_dy2st_auto_bs2_bf16_DP2-MP1-… by @winter-wang in https://github.com/PaddlePaddle/PaddleNLP/pull/8986
  • [Bug fix] fix skip consumed_samples twice bug by @zhangyuqin1998 in https://github.com/PaddlePaddle/PaddleNLP/pull/8980
  • fix pip error in legacy benchmarks by @fightfat in https://github.com/PaddlePaddle/PaddleNLP/pull/8978
  • 【auto_parallel】Add checkpoint convertor by @xingmingyyj in https://github.com/PaddlePaddle/PaddleNLP/pull/8847
  • [llm]update finetune.md by @lugimzzz in https://github.com/PaddlePaddle/PaddleNLP/pull/8990
  • tool_helpers can support 32766 datasets after the upgrade by @JunnYu in https://github.com/PaddlePaddle/PaddleNLP/pull/8994
  • add DCU inference docs by @YanhuiDua in https://github.com/PaddlePaddle/PaddleNLP/pull/8983
  • [Distributed]Add loss nan/inf checker by @ForFishes in https://github.com/PaddlePaddle/PaddleNLP/pull/8943
  • 【llm】update docs by @lugimzzz in https://github.com/PaddlePaddle/PaddleNLP/pull/8999
  • [Feature] Fused Mixtral support by @penPenf28 in https://github.com/PaddlePaddle/PaddleNLP/pull/8901
  • [XPU] Add README.md for llama2-7b by @xiguapipi in https://github.com/PaddlePaddle/PaddleNLP/pull/8979
  • Add gcu llama readme by @EnflameGCU in https://github.com/PaddlePaddle/PaddleNLP/pull/8950
  • fix qwen model use_casual_mask by @deepllz in https://github.com/PaddlePaddle/PaddleNLP/pull/9009
  • [ZeroPadding] revert zero_padding #8973 by @DrownFish19 in https://github.com/PaddlePaddle/PaddleNLP/pull/9003
  • [LLM Inference] Fix step.cu bug by @yuanlehome in https://github.com/PaddlePaddle/PaddleNLP/pull/8995
  • Refine checkpoint converter by @zhangbo9674 in https://github.com/PaddlePaddle/PaddleNLP/pull/9001
  • [Feature] fused mixtral wint4 by @penPenf28 in https://github.com/PaddlePaddle/PaddleNLP/pull/9013
  • llm inference docs by @Sunny-bot1 in https://github.com/PaddlePaddle/PaddleNLP/pull/8976
  • [LLM Inference] Support Qwen2_Moe Inference Model by @CJ77Qi in https://github.com/PaddlePaddle/PaddleNLP/pull/8892
  • fix llama3 static run by @yuanlehome in https://github.com/PaddlePaddle/PaddleNLP/pull/8849
  • [paddle inference cpu]update cpu inference by @bukejiyu in https://github.com/PaddlePaddle/PaddleNLP/pull/8984
  • fix the tipc ce case by @wawltor in https://github.com/PaddlePaddle/PaddleNLP/pull/8748
  • [Cherry-pick] Add is_distributed field in sharding reshard param_meta by @sneaxiy in https://github.com/PaddlePaddle/PaddleNLP/pull/9028
  • [Tokenizer] Support for loading added_tokens_decoder by @DrownFish19 in https://github.com/PaddlePaddle/PaddleNLP/pull/8997
  • [Inference] Add a8w8(fp8) a8w8c8(int8) quant_type support by @lixcli in https://github.com/PaddlePaddle/PaddleNLP/pull/9032
  • Fix checker of nan/inf by @ForFishes in https://github.com/PaddlePaddle/PaddleNLP/pull/9029
  • [Cherry-pick] add comm buffer size (#8963) by @ForFishes in https://github.com/PaddlePaddle/PaddleNLP/pull/9031
  • [Unified Checkpoint] Update async save info by @DesmonDay in https://github.com/PaddlePaddle/PaddleNLP/pull/8982
  • [llm]support pad to max_length & fix sp bug by @lugimzzz in https://github.com/PaddlePaddle/PaddleNLP/pull/9040
  • [Bugfix] fix bias optional by @penPenf28 in https://github.com/PaddlePaddle/PaddleNLP/pull/9037
  • fix setup.py for llm inference by @yuanlehome in https://github.com/PaddlePaddle/PaddleNLP/pull/9041
  • [Inference] Add cutlass gemm dequant op by @gzy19990617 in https://github.com/PaddlePaddle/PaddleNLP/pull/8909
  • [Inference] update fakequant support by @lixcli in https://github.com/PaddlePaddle/PaddleNLP/pull/9047
  • add test for pir sequence parallel on llama model by @liym27 in https://github.com/PaddlePaddle/PaddleNLP/pull/9015
  • Fix moe save load by @Meiyim in https://github.com/PaddlePaddle/PaddleNLP/pull/9045
  • Update quantization.md by @ZHUI in https://github.com/PaddlePaddle/PaddleNLP/pull/9057
  • 【Fix】Initialize dp degree in single GPU by @greycooker in https://github.com/PaddlePaddle/PaddleNLP/pull/9056
  • fix bos download by @westfish in https://github.com/PaddlePaddle/PaddleNLP/pull/9023
  • [Inference] Update fakequant script by @lixcli in https://github.com/PaddlePaddle/PaddleNLP/pull/9054
  • [AutoParallel][PIR] Fit pir grad merge by @AndSonder in https://github.com/PaddlePaddle/PaddleNLP/pull/8985
  • [MLU] Support rms_norm_mlu by @PeiyuLau in https://github.com/PaddlePaddle/PaddleNLP/pull/8504
  • [Inference] support llama3 a8w8c8_fp8 inference and cutlass_fp8_gemm by @ckl117 in https://github.com/PaddlePaddle/PaddleNLP/pull/8953
  • [Inference] Qwen2 support fp8 inference by @ckl117 in https://github.com/PaddlePaddle/PaddleNLP/pull/8954
  • [Version] update version info by @DrownFish19 in https://github.com/PaddlePaddle/PaddleNLP/pull/9060
  • [NPU] Fix baichuan2-13b-chat infer by @ronny1996 in https://github.com/PaddlePaddle/PaddleNLP/pull/9070
  • [MLU] Fix Llama attrntion_mask in npu and mlu by @DrownFish19 in…


Notability

notability 6.0/10

Major NLP library release, beta of v3.