github huggingface/transformers v4.15.0

2 years ago

New Model additions

WavLM

WavLM was proposed in WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing by Sanyuan Chen, Chengyi Wang, Zhengyang Chen, Yu Wu, Shujie Liu, Zhuo Chen, Jinyu Li, Naoyuki Kanda, Takuya Yoshioka, Xiong Xiao, Jian Wu, Long Zhou, Shuo Ren, Yanmin Qian, Yao Qian, Jian Wu, Michael Zeng, Furu Wei.

WavLM sets a new SOTA on the SUPERB benchmark.

Compatible checkpoints can be found on the hub: https://huggingface.co/models?other=wavlm
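
As a rough illustration of how the new WavLM classes can be used, the sketch below builds a WavLMModel from a deliberately tiny WavLMConfig and runs a random waveform through it. The small layer sizes and the random one-second input are assumptions chosen so the example runs without downloading weights; they do not match any released checkpoint.

```python
import torch
from transformers import WavLMConfig, WavLMModel

# Tiny, hypothetical configuration (not a released checkpoint) so that the
# sketch runs quickly with randomly initialised weights.
config = WavLMConfig(
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=64,
    conv_dim=(32, 32),
    conv_stride=(5, 2),
    conv_kernel=(10, 3),
    num_conv_pos_embeddings=16,
    num_conv_pos_embedding_groups=2,
)
model = WavLMModel(config).eval()

# One second of random "audio" at 16 kHz, shaped (batch, samples).
waveform = torch.randn(1, 16000)

with torch.no_grad():
    outputs = model(input_values=waveform)

# Frame-level representations, shaped (batch, frames, hidden_size).
hidden_states = outputs.last_hidden_state
print(hidden_states.shape)
```

In practice the model would more likely be loaded with WavLMModel.from_pretrained and one of the checkpoints listed on the hub page above, with the raw audio prepared by the matching feature extractor.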

Wav2Vec2Phoneme

Wav2Vec2Phoneme was proposed in Simple and Effective Zero-shot Cross-lingual Phoneme Recognition by Qiantong Xu, Alexei Baevski, Michael Auli.
Wav2Vec2Phoneme performs phoneme classification as part of automatic speech recognition.

Compatible checkpoints can be found on the hub: https://huggingface.co/models?other=phoneme-recognition
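
To make the idea of phoneme classification more concrete, here is a minimal sketch of greedy CTC decoding, the kind of post-processing a frame-level phoneme classifier typically needs. It assumes the model emits one class id per audio frame and reserves one id as the CTC blank. The function name ctc_greedy_decode, the toy vocabulary, and the frame sequence are all hypothetical and are not taken from the library.

```python
from itertools import groupby


def ctc_greedy_decode(frame_ids, id_to_phoneme, blank_id=0):
    """Collapse repeated frame predictions, then drop the blank symbol."""
    # Step 1: merge consecutive duplicates (one phoneme spans many frames).
    collapsed = [token_id for token_id, _ in groupby(frame_ids)]
    # Step 2: remove blanks, which only separate repeated phonemes.
    return [id_to_phoneme[t] for t in collapsed if t != blank_id]


# Hypothetical toy vocabulary; id 0 plays the role of the CTC blank.
vocab = {0: "<blank>", 1: "HH", 2: "AH", 3: "L", 4: "OW"}

# Hypothetical per-frame argmax predictions for a short utterance.
frames = [0, 1, 1, 0, 2, 2, 2, 3, 0, 3, 4, 4, 0]

phonemes = ctc_greedy_decode(frames, vocab)
print(" ".join(phonemes))
```

The blank between the two runs of id 3 is what keeps the doubled "L" from being merged into one phoneme.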

UniSpeech-SAT

UniSpeech-SAT was proposed in UniSpeech-SAT: Universal Speech Representation Learning with Speaker Aware Pre-Training by Sanyuan Chen, Yu Wu, Chengyi Wang, Zhengyang Chen, Zhuo Chen, Shujie Liu, Jian Wu, Yao Qian, Furu Wei, Jinyu Li, Xiangzhan Yu.

UniSpeech-SAT is especially good at speaker-related tasks.

Compatible checkpoints can be found on the hub: https://huggingface.co/models?other=unispeech-sat

UniSpeech

UniSpeech was proposed in UniSpeech: Unified Speech Representation Learning with Labeled and Unlabeled Data by Chengyi Wang, Yu Wu, Yao Qian, Kenichi Kumatani, Shujie Liu, Furu Wei, Michael Zeng, Xuedong Huang.

Compatible checkpoints can be found on the hub: https://huggingface.co/models?other=unispeech

New Tasks

Speaker Diarization and Verification

Wav2Vec2-like architectures now have speaker diarization and speaker verification heads.
You can try out the new task here: https://huggingface.co/spaces/microsoft/wavlm-speaker-verification

  • Add Speaker Diarization and Verification heads by @anton-l in #14723
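
Speaker verification is commonly framed as comparing two fixed-size utterance embeddings. The sketch below shows that comparison step only, assuming a verification head has already produced one embedding per utterance. The helper names, the three-dimensional toy embeddings, and the 0.8 threshold are hypothetical; a real threshold depends on the model and dataset.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def same_speaker(emb_a, emb_b, threshold=0.8):
    """Accept the pair when similarity reaches the (hypothetical) threshold."""
    return cosine_similarity(emb_a, emb_b) >= threshold


# Hypothetical embeddings; real ones have hundreds of dimensions.
enrolled = [0.9, 0.1, 0.4]
probe_same = [0.8, 0.2, 0.5]
probe_other = [-0.3, 0.9, 0.1]

print(same_speaker(enrolled, probe_same))
print(same_speaker(enrolled, probe_other))
```

Speaker diarization works differently: it assigns speaker labels to individual frames rather than scoring a pair of utterances, so this comparison does not apply to it directly.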

What's Changed

New Contributors

Full Changelog: v4.14.0...v4.15.0
