v4.7.0: DETR, RoFormer, ByT5, HuBERT, support for torch 1.9.0

DETR (@NielsRogge)

Three new models are released as part of the DETR implementation: DetrModel, DetrForObjectDetection and DetrForSegmentation, in PyTorch.

DETR consists of a convolutional backbone followed by an encoder-decoder Transformer that can be trained end-to-end for object detection. It removes much of the complexity of models like Faster R-CNN and Mask R-CNN, which rely on hand-designed components such as region proposals, non-maximum suppression, and anchor generation. Moreover, DETR extends naturally to panoptic segmentation by simply adding a mask head on top of the decoder outputs.

DETR supports any convolutional backbone from the timm library.

The DETR model was proposed in End-to-End Object Detection with Transformers by Nicolas Carion, Francisco Massa, Gabriel Synnaeve, Nicolas Usunier, Alexander Kirillov, and Sergey Zagoruyko.

Compatible checkpoints can be found on the Hub: https://huggingface.co/models?filter=detr
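As a quick illustration, here is a minimal inference sketch using the facebook/detr-resnet-50 checkpoint from the Hub; the sample image URL and the exact checkpoint name are illustrative choices, not part of this release:

```python
import requests
from PIL import Image
from transformers import DetrFeatureExtractor, DetrForObjectDetection

# Load a sample image (any RGB image works; this COCO URL is illustrative)
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

feature_extractor = DetrFeatureExtractor.from_pretrained("facebook/detr-resnet-50")
model = DetrForObjectDetection.from_pretrained("facebook/detr-resnet-50")

# Resize and normalize the image, then run detection
inputs = feature_extractor(images=image, return_tensors="pt")
outputs = model(**inputs)

# Per object query: class logits and normalized (cx, cy, w, h) boxes
logits = outputs.logits      # (batch_size, num_queries, num_labels + 1)
boxes = outputs.pred_boxes   # (batch_size, num_queries, 4)
```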

ByT5 (@patrickvonplaten)

A new tokenizer is released as part of the ByT5 implementation: ByT5Tokenizer. It can be used with the T5 family of models.

The ByT5 model was presented in ByT5: Towards a token-free future with pre-trained byte-to-byte models by Linting Xue, Aditya Barua, Noah Constant, Rami Al-Rfou, Sharan Narang, Mihir Kale, Adam Roberts, Colin Raffel.

Compatible checkpoints can be found on the Hub: https://huggingface.co/models?search=byt5
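A minimal sketch of how the byte-level tokenizer pairs with the existing T5 architecture, assuming the google/byt5-small checkpoint from the Hub search above:

```python
from transformers import ByT5Tokenizer, T5ForConditionalGeneration

tokenizer = ByT5Tokenizer.from_pretrained("google/byt5-small")
model = T5ForConditionalGeneration.from_pretrained("google/byt5-small")

# ByT5 tokenizes raw UTF-8 bytes, so no subword vocabulary is involved
inputs = tokenizer("Life is like a box of chocolates.", return_tensors="pt")
labels = tokenizer("La vie est comme une boîte de chocolat.", return_tensors="pt").input_ids

# Standard seq2seq forward pass, e.g. for fine-tuning
loss = model(**inputs, labels=labels).loss
```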

RoFormer (@JunnYu)

14 new models are released as part of the RoFormer implementation: seven in PyTorch (RoFormerModel, RoFormerForCausalLM, RoFormerForMaskedLM, RoFormerForSequenceClassification, RoFormerForTokenClassification, RoFormerForQuestionAnswering, and RoFormerForMultipleChoice) and seven in TensorFlow (TFRoFormerModel, TFRoFormerForCausalLM, TFRoFormerForMaskedLM, TFRoFormerForSequenceClassification, TFRoFormerForTokenClassification, TFRoFormerForQuestionAnswering, and TFRoFormerForMultipleChoice).

RoFormer is a BERT-like autoencoding model with rotary position embeddings. Rotary position embeddings have shown improved performance on classification tasks with long texts. The RoFormer model was proposed in RoFormer: Enhanced Transformer with Rotary Position Embedding by Jianlin Su, Yu Lu, Shengfeng Pan, Bo Wen, and Yunfeng Liu.

  • Add new model RoFormer (use rotary position embedding) #11684 (@JunnYu)

Compatible checkpoints can be found on the Hub: https://huggingface.co/models?filter=roformer
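For reference, a minimal PyTorch sketch; the junnyu/roformer_chinese_base checkpoint name is an assumption based on the Hub filter above, and the tokenizer depends on the rjieba package for Chinese pre-tokenization:

```python
import torch
from transformers import RoFormerModel, RoFormerTokenizer

# The tokenizer additionally requires `pip install rjieba`
tokenizer = RoFormerTokenizer.from_pretrained("junnyu/roformer_chinese_base")
model = RoFormerModel.from_pretrained("junnyu/roformer_chinese_base")

inputs = tokenizer("今天天气非常好。", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

last_hidden_state = outputs.last_hidden_state  # (batch, seq_len, hidden_size)
```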

HuBERT (@patrickvonplaten)

HuBERT is a speech model that accepts a float array corresponding to the raw waveform of the speech signal.

HuBERT was proposed in HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units by Wei-Ning Hsu, Benjamin Bolte, Yao-Hung Hubert Tsai, Kushal Lakhotia, Ruslan Salakhutdinov, Abdelrahman Mohamed.

Two new models are released as part of the HuBERT implementation: HubertModel and HubertForCTC, in PyTorch.

Compatible checkpoints can be found on the Hub: https://huggingface.co/models?filter=hubert
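A minimal CTC inference sketch; the facebook/hubert-large-ls960-ft checkpoint name and the dummy waveform are illustrative assumptions:

```python
import numpy as np
import torch
from transformers import Wav2Vec2Processor, HubertForCTC

processor = Wav2Vec2Processor.from_pretrained("facebook/hubert-large-ls960-ft")
model = HubertForCTC.from_pretrained("facebook/hubert-large-ls960-ft")

# One second of dummy 16 kHz mono audio; replace with a real waveform
waveform = np.random.randn(16_000).astype(np.float32)
inputs = processor(waveform, sampling_rate=16_000, return_tensors="pt")

with torch.no_grad():
    logits = model(inputs.input_values).logits

# Greedy CTC decoding
predicted_ids = torch.argmax(logits, dim=-1)
transcription = processor.batch_decode(predicted_ids)[0]
```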

Hugging Face Course - Part 1

On Monday, June 14th, 2021, we released the first part of the Hugging Face Course. The course focuses on the Hugging Face ecosystem, including transformers. Most of the course material is now linked from the transformers documentation, which also includes videos explaining individual concepts.

TensorFlow additions

The Wav2Vec2 model can now be used in TensorFlow, via the new TFWav2Vec2Model class (see the sketch below).
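A minimal sketch, assuming the facebook/wav2vec2-base-960h checkpoint; if a given checkpoint only ships PyTorch weights, pass from_pt=True when loading:

```python
import numpy as np
from transformers import Wav2Vec2FeatureExtractor, TFWav2Vec2Model

feature_extractor = Wav2Vec2FeatureExtractor.from_pretrained("facebook/wav2vec2-base-960h")
model = TFWav2Vec2Model.from_pretrained("facebook/wav2vec2-base-960h")

# One second of dummy 16 kHz mono audio for illustration
waveform = np.random.randn(16_000).astype(np.float32)
inputs = feature_extractor(waveform, sampling_rate=16_000, return_tensors="tf")

outputs = model(inputs.input_values)
hidden_states = outputs.last_hidden_state  # (batch, frames, hidden_size)
```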

PyTorch 1.9 support

Notebooks

General improvements and bugfixes
