PaddlePaddle/PaddleSpeech r1.4.2

PaddleSpeech r1.4.2

Repository: PaddlePaddle/PaddleSpeech

Tag: r1.4.2

Published: 2024-06-27T03:45:31Z

Prerelease: no

Release notes:

S2T

  • Add WavLM ASR-en, WavLM fine-tuning for ASR on LibriSpeech. #3242 by @jiamingkong
  • Add HuBERT ASR-en, HuBERT fine-tuning for ASR on LibriSpeech. #3088 by @Zth9730
  • Add Squeezeformer model. #2755 by @yeyupiaoling
  • Add AMP for U2 conformer. #3167 by @zxcd
  • Move dataset into paddlespeech.dataset. #3183 #3189 by @zh794390558
  • Fix example/aishell local/train.sh if condition bug. #3146 by @lemondy
  • Fix cli args to config. #3194 by @zh794390558
  • Fix scaler save, load, unscale_ blow, grad_clip. by @zxcd
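The scaler entry above mentions unscale_ and grad_clip together but does not spell out the fix. As general background only (this is not the code from that change, and it uses no Paddle API), the following pure-Python sketch shows why loss-scaled gradients are normally unscaled before clipping by global norm. All function names and values here are hypothetical.

```python
import math
from typing import List


def global_norm(grads: List[float]) -> float:
    """L2 norm over all gradient values."""
    return math.sqrt(sum(g * g for g in grads))


def clip_by_global_norm(grads: List[float], max_norm: float) -> List[float]:
    """Rescale gradients so their global norm does not exceed max_norm."""
    norm = global_norm(grads)
    if norm <= max_norm:
        return list(grads)
    factor = max_norm / norm
    return [g * factor for g in grads]


def unscale(grads: List[float], loss_scale: float) -> List[float]:
    """Undo the loss scaling that mixed-precision training applied."""
    return [g / loss_scale for g in grads]


# Hypothetical numbers: true gradients with global norm 0.5.
true_grads = [0.3, 0.4]
loss_scale = 1024.0
max_norm = 1.0
scaled = [g * loss_scale for g in true_grads]

# Unscale first, then clip: the norm 0.5 is under the threshold,
# so the gradients pass through unchanged.
correct = clip_by_global_norm(unscale(scaled, loss_scale), max_norm)

# Clip first, then unscale: the threshold is compared against the
# scaled norm (512), so the gradients shrink far more than intended.
wrong = unscale(clip_by_global_norm(scaled, max_norm), loss_scale)

print(correct, global_norm(wrong))
```

With clipping applied to the still-scaled gradients, the effective threshold becomes max_norm / loss_scale, which is why the order of the two steps matters.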

T2S

  • Add SVS (Singing Voice Synthesis) examples with the Opencpop dataset, including DiffSinger, PWGAN (#3031 by @lym0302) and HiFiGAN (#3038 by @lym0302); output quality is still being improved.
  • Add SVS frontend. #3062 by @lym0302
  • Add TTS iSTFTNet (#3006 by @longRookie), TTS JETS (#3109 by @ljhzxc)
  • StarGANv2 VC, by @yt605155624:
      • Clean StarGANv2 VC model code and add docstrings. #2987
      • Add StarGANv2 VC trainer. #3143 #3182
      • Add StarGANv2 VC preprocessing. #3163
      • Fix losses of StarGANv2 VC. #3184
  • Support for LITE, by @yt605155624:
      • Fix elementwise_floordiv's fill_constant. #3075
      • Fix VITS lite infer. #3098
      • Fix VITS reduce_sum's input/output dtype. #3028
      • Fix dtype diff of last expand_v2 op of VITS. #3041
      • Fix input dtype of elementwise_mul op from bool to int64. #3054
  • Add XPU support for SpeedySpeech and FastSpeech2. #3502 #3514 by @USTCKAY
  • Fix some preprocess bugs. #3155 by @yt605155624
  • Fix bug of merge_yi function. #3786 by @mattheliu

Server

  • Add code-switch conformer_talcs support. #3230 by @Gsonovb
  • Add subtitle file (.srt format) generation example. #3123 by @twoDogy
  • Fix: add file read encoding. #3606 by @Coloryr
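To make the .srt output mentioned above concrete, here is a minimal sketch of writing recognized segments in the SubRip layout: a running index, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` line, the text, then a blank line. The `Segment` class and both function names are hypothetical and are not taken from the PaddleSpeech example. The segment times are assumed to come from an ASR or VAD step.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Segment:
    """One recognized utterance (hypothetical type, not a PaddleSpeech class)."""
    start: float  # seconds from the start of the audio
    end: float    # seconds from the start of the audio
    text: str


def format_timestamp(seconds: float) -> str:
    """Render seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments: Iterable[Segment]) -> str:
    """Build the SRT text: index line, time range line, text line per cue."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{format_timestamp(seg.start)} --> {format_timestamp(seg.end)}\n"
            f"{seg.text}\n"
        )
    # Joining with a newline leaves one blank line between cues.
    return "\n".join(blocks)
```

When saving the result, passing an explicit encoding (for example `open(path, "w", encoding="utf-8")`) avoids depending on the platform default, which matters for non-ASCII transcripts.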

Install & Benchmark

  • Update paddle2onnx to newest install version. #3084 by @yt605155624
  • Update to Python 3.8; pin librosa==0.8.1 and numpy==1.23.5 for paddleaudio. by @zh794390558
  • Fix transformation import error. #3779 by @kk-2000
  • Adapt view behavior change, fix KeyError. #3794 by @zxcd
  • Fix profiler, fix gpu_mem unit, add max_mem_reserved for benchmark. #3323 #3634 #3604 by @mmglove
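Because this release ties paddleaudio to specific librosa and numpy versions, it can help to check an environment against those versions before debugging import errors. The sketch below uses only the standard library (`importlib.metadata`, available from Python 3.8). The helper names are hypothetical, and the version table simply restates the versions quoted in the notes above.

```python
from importlib import metadata
from typing import Dict, Iterable, List, Optional

# Versions quoted in the release notes for paddleaudio.
PINS: Dict[str, str] = {"librosa": "0.8.1", "numpy": "1.23.5"}


def installed_versions(names: Iterable[str]) -> Dict[str, Optional[str]]:
    """Look up the installed version of each package, or None if missing."""
    found: Dict[str, Optional[str]] = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def find_mismatches(pins: Dict[str, str],
                    installed: Dict[str, Optional[str]]) -> List[str]:
    """Return one message per package that is missing or at another version."""
    problems = []
    for name, wanted in pins.items():
        actual = installed.get(name)
        if actual is None:
            problems.append(f"{name}: not installed (expected {wanted})")
        elif actual != wanted:
            problems.append(f"{name}: found {actual} (expected {wanted})")
    return problems


if __name__ == "__main__":
    for line in find_mismatches(PINS, installed_versions(PINS)):
        print(line)
```

An empty result means the environment matches the quoted versions. Any printed line names the package to reinstall.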

Docs

  • Fix some typos. #3178 by @Yulv-git
  • Update svs_music_score.md. #3085 #3070 by @lym0302
  • Update quick_start.md. #3175 #3176 by @46319943
  • Add cli test readme. #3784 by @zxcd
  • Update bug-report-tts.md. #3120 by @yt605155624

Others

  • Fix 0-d tensor handling; with the upgrade to paddlepaddle==2.5, the problem of modifying 0-d tensors is resolved. #3214 by @zxcd, #3334 by @zh794390558
  • Add dtype param for arange API. #3302 by @zxcd
  • Fix bug on develop: change view to reshape. #3633 by @luyao-cv
  • Fix progress bar unit. #3177 by @46319943
  • Remove unused dependency. #3097 by @lym0302

Acknowledgements

Special thanks to @jiamingkong @Zth9730 @yeyupiaoling @zxcd @zh794390558 @lemondy @lym0302 @longRookie @ljhzxc @yt605155624 @USTCKAY @mattheliu @Gsonovb @twoDogy @Coloryr @kk-2000 @mmglove @Yulv-git @46319943 @luyao-cv

New Contributors

  • @jiamingkong made their first contribution in #3242
  • @yeyupiaoling made their first contribution in #2755
  • @lemondy made their first contribution in #3146
  • @longRookie made their first contribution in #3006
  • @ljhzxc made their first contribution in #3109
  • @USTCKAY made their first contribution in #3502
  • @mattheliu made their first contribution in #3786
  • @Gsonovb made their first contribution in #3230
  • @twoDogy made their first contribution in #3123
  • @Coloryr made their first contribution in #3606
  • @kk-2000 made their first contribution in #3779
  • @Yulv-git made their first contribution in #3178
  • @46319943 made their first contribution in #3175
  • @luyao-cv made their first contribution in #3633

Full Changelog: https://github.com/PaddlePaddle/PaddleSpeech/compare/r1.4.1...r1.4.2