v4.1.1: TAPAS, MPNet, model parallelization, Sharded DDP, conda, multi-part downloads.

TAPAS (@NielsRogge)

Four new models are released as part of the TAPAS implementation: TapasModel, TapasForQuestionAnswering, TapasForMaskedLM and TapasForSequenceClassification, in PyTorch.

TAPAS is a question answering model used to answer queries over a table. It is a multi-modal model, jointly encoding the text of the query and the tabular data.

The TAPAS model was proposed in TAPAS: Weakly Supervised Table Parsing via Pre-training by Jonathan Herzig, Paweł Krzysztof Nowak, Thomas Müller, Francesco Piccinno and Julian Martin Eisenschlos.
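A core idea in TAPAS is that the table is flattened into a token sequence in which every cell token carries its row and column index, so the model can attend over the table's structure. A minimal sketch of that flattening, using a hypothetical `flatten_table` helper (an illustration of the idea, not the library's implementation):

```python
# Hypothetical sketch: flatten a table into (token, row, column) triples,
# with the query tokens tagged (0, 0), mirroring TAPAS's use of row/column
# index embeddings. Not the actual TapasTokenizer implementation.

def flatten_table(query, table):
    """Join query tokens and cell tokens; tag each cell token with its
    (row, column) position so a model can attend over the structure."""
    tokens = [(tok, 0, 0) for tok in query.split()]  # row 0 / col 0 = query
    for r, row in enumerate(table, start=1):
        for c, cell in enumerate(row, start=1):
            for tok in str(cell).split():
                tokens.append((tok, r, c))
    return tokens

table = [["Player", "Goals"], ["Messi", "30"], ["Ronaldo", "28"]]
seq = flatten_table("who scored most goals", table)
```

In the library itself, this preprocessing is handled by the TAPAS tokenizer, which accepts the query and the table together.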

MPNet (@StillKeepTry)

Six new models are released as part of the MPNet implementation: MPNetModel, MPNetForMaskedLM, MPNetForSequenceClassification, MPNetForMultipleChoice, MPNetForTokenClassification, MPNetForQuestionAnswering, in both PyTorch and TensorFlow.

MPNet introduces a novel self-supervised objective named masked and permuted language modeling for language understanding. It inherits the advantages of both masked language modeling (MLM) and permuted language modeling (PLM), addresses their respective limitations, and further reduces the inconsistency between the pre-training and fine-tuning paradigms.

The MPNet model was proposed in MPNet: Masked and Permuted Pre-training for Language Understanding by Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, Tie-Yan Liu.

  • MPNet: Masked and Permuted Pre-training for Language Understanding #8971 (@StillKeepTry)
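To make the objective concrete: the sequence is permuted, the leading part of the permutation serves as context, and the tail tokens are predicted from mask placeholders while the position index of every token remains visible (unlike MLM, the order is permuted; unlike PLM, full position information is available). A minimal sketch with a hypothetical `mpnet_example` helper, for illustration only and not MPNet's actual preprocessing:

```python
import random

# Hypothetical sketch of masked-and-permuted language modeling:
# permute positions, keep the head of the permutation as context, and
# replace the tail tokens with [MASK] while keeping their position ids.
def mpnet_example(tokens, num_predicted, seed=0):
    rng = random.Random(seed)
    order = list(range(len(tokens)))
    rng.shuffle(order)
    context, predicted = order[:-num_predicted], order[-num_predicted:]
    # inputs carry (token, original_position); masked slots keep positions
    inputs = [(tokens[i], i) for i in context] + [("[MASK]", i) for i in predicted]
    targets = [(tokens[i], i) for i in predicted]
    return inputs, targets

inputs, targets = mpnet_example(["the", "cat", "sat", "down"], num_predicted=2)
```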

Model parallel (@alexorona)

Model parallelism is introduced, allowing users to load very large models on two or more GPUs by spreading the model layers across them, making GPU training possible even for models too large to fit on a single device.
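The layer-spreading can be described by a device map that assigns contiguous blocks of layers to devices, so each GPU holds only part of the model. A minimal sketch with a hypothetical `make_device_map` helper (an illustration of the partitioning idea, not the library's code):

```python
# Hypothetical sketch: split `num_layers` transformer layers into
# contiguous chunks, one chunk per device, so each GPU only holds
# (and runs) its own slice of the model.
def make_device_map(num_layers, devices):
    per_device = -(-num_layers // len(devices))  # ceiling division
    return {
        dev: list(range(i * per_device, min((i + 1) * per_device, num_layers)))
        for i, dev in enumerate(devices)
    }

device_map = make_device_map(12, [0, 1])  # 12 layers over 2 GPUs
```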

Conda release (@LysandreJik)

Transformers welcomes its first conda releases, with v4.0.0, v4.0.1 and v4.1.0. The conda packages are now officially maintained on the huggingface channel.

Multi-part uploads (@julien-c)

For the first time, very large models can be uploaded to the model hub, by using multi-part uploads.
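The idea behind multi-part transfer is to split a large payload into fixed-size chunks that can be sent (and retried) independently, then reassembled on the other side. A minimal sketch of the chunking step, with a hypothetical `split_parts` helper (illustration only, not the hub client's implementation):

```python
# Hypothetical sketch: split a payload into fixed-size parts that can be
# transferred and retried independently, then joined back together.
def split_parts(data: bytes, part_size: int):
    return [data[i:i + part_size] for i in range(0, len(data), part_size)]

parts = split_parts(b"x" * 10, part_size=4)  # three parts: 4 + 4 + 2 bytes
```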

New examples and reorganization (@sgugger)

We introduced a refactored SQuAD example and notebook, which are faster and simpler than the previous scripts.

The example directory has been reorganized, introducing a separation between "examples", which are maintained examples showcasing how to do one specific task, and "research projects", which are bigger projects maintained by the community.

Introduction of fairscale with Sharded DDP (@sgugger)

We introduce support for fairscale's ShardedDDP in the Trainer, allowing reduced memory usage when training models in a distributed fashion.
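The memory saving comes from sharding: instead of every rank keeping optimizer state for all parameters, each rank owns the state for its own shard only. A minimal sketch of the partitioning, with a hypothetical `shard_params` helper (an illustration of the idea, not fairscale's implementation):

```python
# Hypothetical sketch: round-robin assignment of parameters to ranks,
# so each rank stores optimizer state for only ~1/world_size of them.
def shard_params(param_ids, world_size):
    shards = [[] for _ in range(world_size)]
    for i, p in enumerate(param_ids):
        shards[i % world_size].append(p)
    return shards

shards = shard_params(list(range(8)), world_size=4)  # 2 params per rank
```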

Barthez (@moussaKam)

The BARThez model is a French variant of the BART model. We welcome its specific tokenizer to the library and multiple checkpoints to the model hub.

General improvements and bugfixes
