github huggingface/transformers v3.2.0
Bert Seq2Seq models, FSMT, LayoutLM, Funnel Transformer, LXMERT



BERT Seq2seq models

The BertGeneration model is a BERT model that can be leveraged for sequence-to-sequence tasks using EncoderDecoderModel as proposed in Leveraging Pre-trained Checkpoints for Sequence Generation Tasks by Sascha Rothe, Shashi Narayan, Aliaksei Severyn.

It was added to the library in PyTorch with the following checkpoints:

  • google/roberta2roberta_L-24_bbc
  • google/roberta2roberta_L-24_gigaword
  • google/roberta2roberta_L-24_cnn_daily_mail
  • google/roberta2roberta_L-24_discofuse
  • google/roberta2roberta_L-24_wikisplit
  • google/bert2bert_L-24_wmt_de_en
  • google/bert2bert_L-24_wmt_en_de
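
A minimal usage sketch, loading the CNN/DailyMail summarization checkpoint from the list above through EncoderDecoderModel (the article text is an invented placeholder):

```python
from transformers import AutoTokenizer, EncoderDecoderModel

# Summarization checkpoint from the list above, used here as an example.
tokenizer = AutoTokenizer.from_pretrained("google/roberta2roberta_L-24_cnn_daily_mail")
model = EncoderDecoderModel.from_pretrained("google/roberta2roberta_L-24_cnn_daily_mail")

article = "The local council met on Tuesday to discuss the new transport plan ..."  # placeholder text
input_ids = tokenizer(article, return_tensors="pt").input_ids

# Standard encoder-decoder generation; decoding settings are left at their defaults.
output_ids = model.generate(input_ids)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```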

Contributions:

FSMT (FairSeq MachineTranslation)

FSMT (FairSeq MachineTranslation) models were introduced in Facebook FAIR’s WMT19 News Translation Task Submission by Nathan Ng, Kyra Yee, Alexei Baevski, Myle Ott, Michael Auli, Sergey Edunov.

It was added to the library in PyTorch, with the following checkpoints:

  • facebook/wmt19-en-ru
  • facebook/wmt19-en-de
  • facebook/wmt19-ru-en
  • facebook/wmt19-de-en
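
A similar sketch for translation with one of the checkpoints above (the English sentence is invented):

```python
from transformers import FSMTForConditionalGeneration, FSMTTokenizer

mname = "facebook/wmt19-en-de"
tokenizer = FSMTTokenizer.from_pretrained(mname)
model = FSMTForConditionalGeneration.from_pretrained(mname)

# Translate an invented English sentence into German.
input_ids = tokenizer("Machine learning is great, isn't it?", return_tensors="pt").input_ids
output_ids = model.generate(input_ids)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```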

Contributions:

  • [ported model] FSMT (FairSeq MachineTranslation) #6940 (@stas00)
  • build/eval/gen-card scripts for fsmt #7155 (@stas00)
  • skip failing FSMT CUDA tests until investigated #7220 (@stas00)
  • [fsmt] rewrite SinusoidalPositionalEmbedding + USE_CUDA test fixes + new TranslationPipeline test #7224 (@stas00)
  • [s2s] adjust finetune + test to work with fsmt #7263 (@stas00)
  • [fsmt] SinusoidalPositionalEmbedding no need to pass device #7292 (@stas00)
  • Adds FSMT to LM head AutoModel #7312 (@LysandreJik)

LayoutLM

The LayoutLM model was proposed in LayoutLM: Pre-training of Text and Layout for Document Image Understanding by Yiheng Xu, Minghao Li, Lei Cui, Shaohan Huang, Furu Wei, and Ming Zhou. It’s a simple but effective pre-training method of text and layout for document image understanding and information extraction tasks, such as form understanding and receipt understanding.

It was added to the library in PyTorch with the following checkpoints:

  • layoutlm-base-uncased
  • layoutlm-large-uncased
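
A rough sketch of how LayoutLM consumes text together with token bounding boxes. The words and boxes below are invented stand-ins for OCR output (boxes normalized to a 0-1000 grid), and the checkpoint identifier is taken from the list above, assuming it resolves on the model hub:

```python
import torch
from transformers import LayoutLMTokenizer, LayoutLMModel

tokenizer = LayoutLMTokenizer.from_pretrained("layoutlm-base-uncased")
model = LayoutLMModel.from_pretrained("layoutlm-base-uncased")

# Invented OCR output: words with bounding boxes normalized to a 0-1000 grid.
words = ["Invoice", "Total:", "$120.00"]
boxes = [[48, 84, 156, 98], [48, 120, 110, 134], [120, 120, 180, 134]]

# Tokenize word by word so every word piece inherits its word's box.
token_ids, token_boxes = [], []
for word, box in zip(words, boxes):
    word_ids = tokenizer.encode(word, add_special_tokens=False)
    token_ids.extend(word_ids)
    token_boxes.extend([box] * len(word_ids))

# Add the special tokens with the conventional dummy boxes.
input_ids = [tokenizer.cls_token_id] + token_ids + [tokenizer.sep_token_id]
token_boxes = [[0, 0, 0, 0]] + token_boxes + [[1000, 1000, 1000, 1000]]

outputs = model(
    input_ids=torch.tensor([input_ids]),
    bbox=torch.tensor([token_boxes]),
)
print(outputs[0].shape)  # (1, sequence_length, hidden_size)
```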

Contributions:

Funnel Transformer

The Funnel Transformer model was proposed in the paper Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing. It is a bidirectional transformer model, like BERT, but with a pooling operation after each block of layers, a bit like in traditional convolutional neural networks (CNN) in computer vision.

It was added to the library in both PyTorch and TensorFlow, with the following checkpoints:

  • funnel-transformer/small
  • funnel-transformer/small-base
  • funnel-transformer/medium
  • funnel-transformer/medium-base
  • funnel-transformer/intermediate
  • funnel-transformer/intermediate-base
  • funnel-transformer/large
  • funnel-transformer/large-base
  • funnel-transformer/xlarge
  • funnel-transformer/xlarge-base
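
A minimal sketch with the small checkpoint (the input sentence is invented). The checkpoints without the -base suffix include the upsampling decoder, so the model still returns one hidden state per input token despite the pooling between blocks:

```python
from transformers import FunnelTokenizer, FunnelModel

tokenizer = FunnelTokenizer.from_pretrained("funnel-transformer/small")
model = FunnelModel.from_pretrained("funnel-transformer/small")

inputs = tokenizer("Funnel pools the sequence after each block of layers.", return_tensors="pt")
outputs = model(**inputs)

# The decoder upsamples the pooled sequence back to full length, so the last
# hidden state keeps one vector per input token.
print(outputs[0].shape)  # (1, sequence_length, hidden_size)
```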

Contributions:

LXMERT

The LXMERT model was proposed in LXMERT: Learning Cross-Modality Encoder Representations from Transformers by Hao Tan & Mohit Bansal. It is a series of bidirectional transformer encoders (one for the vision modality, one for the language modality, and then one to fuse both modalities) pre-trained using a combination of masked language modeling, visual-language text alignment, ROI-feature regression, masked visual-attribute modeling, masked visual-object modeling, and visual-question answering objectives. Pretraining is performed on multiple multi-modal datasets: MSCOCO, Visual-Genome + Visual-Genome Question Answering, VQA 2.0, and GQA.

It was added to the library in TensorFlow with the following checkpoints:

  • unc-nlp/lxmert-base-uncased
  • unc-nlp/lxmert-vqa-uncased
  • unc-nlp/lxmert-gqa-uncased
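
A rough sketch of the expected inputs using TFLxmertModel, assuming TensorFlow weights are available for the checkpoint (otherwise from_pretrained(..., from_pt=True) can convert). The question is invented, and the random tensors only stand in for region-of-interest features that would normally come from an object detector such as Faster R-CNN:

```python
import tensorflow as tf
from transformers import LxmertTokenizer, TFLxmertModel

tokenizer = LxmertTokenizer.from_pretrained("unc-nlp/lxmert-base-uncased")
model = TFLxmertModel.from_pretrained("unc-nlp/lxmert-base-uncased")

inputs = tokenizer("What color is the cat?", return_tensors="tf")

# Region-of-interest features normally come from an external detector such as
# Faster R-CNN; random tensors here only illustrate the expected shapes.
num_boxes, feat_dim = 36, 2048
visual_feats = tf.random.uniform((1, num_boxes, feat_dim))
visual_pos = tf.random.uniform((1, num_boxes, 4))  # normalized box coordinates

outputs = model(
    inputs["input_ids"],
    visual_feats=visual_feats,
    visual_pos=visual_pos,
    attention_mask=inputs["attention_mask"],
)
print(outputs[0].shape)  # language-branch hidden states
```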

Contributions:

New pipelines

The following pipeline was added to the library:

Notebooks

The following community notebooks were contributed to the library:

  • Demoing LXMERT with raw images by incorporating the FRCNN model for RoI-pooled extraction and bounding-box prediction on the GQA answer set. #6986 (@eltoto1219)
  • [Community notebooks] Add notebook on fine-tuning GPT-2 Model with Trainer Class #7005 (@philschmid)
  • Add "Fine-tune ALBERT for sentence-pair classification" notebook to the community notebooks #7255 (@NadirEM)
  • added multilabel text classification notebook using distilbert to community notebooks #7201 (@DhavalTaunk08)

Encoder-decoder architectures

An additional encoder-decoder architecture was added:

Bug fixes and improvements
