T5 Model (@patrickvonplaten, @thomwolf)
T5 is a powerful encoder-decoder model that casts every NLP problem into a text-to-text format. It achieves state-of-the-art results on a variety of NLP tasks (summarization, question answering, ...).
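As a rough illustration of the text-to-text idea, the sketch below turns different tasks into plain input strings by prepending a task prefix. The prefixes follow the convention described in the T5 paper; the helper `to_text_to_text` and the `TASK_PREFIXES` table are hypothetical names for this example and are not part of the library.

```python
# Hypothetical table: maps a task name to the text prefix used for T5 inputs.
TASK_PREFIXES = {
    "summarization": "summarize: ",
    "translation_en_to_de": "translate English to German: ",
    "translation_en_to_fr": "translate English to French: ",
}


def to_text_to_text(task: str, text: str) -> str:
    """Return the input string for `task` in text-to-text form."""
    try:
        prefix = TASK_PREFIXES[task]
    except KeyError:
        raise ValueError(f"Unknown task: {task!r}") from None
    # The model only ever sees text, so the task is encoded in the string itself.
    return prefix + text.strip()
```

Because both the input and the target are strings, the same model and loss can serve every task in the table.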
Five sets of pre-trained weights (pre-trained on a multi-task mixture of unsupervised and supervised tasks) are released. In ascending order from 60 million parameters to 11 billion parameters: t5-small, t5-base, t5-large, t5-3b, t5-11b.
T5 can now be used with the translation and summarization pipelines.
Related:
- paper
- official code
- model available in Hugging Face's community models
- docs
Big thanks to the original authors, especially @craffel who helped answer our questions, reviewed PRs and tested T5 extensively.
New BART checkpoint: bart-large-xsum (@sshleifer)
These weights are from BART fine-tuned on the XSum abstractive summarization challenge, which encourages shorter (more abstractive) summaries. The resulting model achieves state-of-the-art results on this task.
BART summarization example with pytorch-lightning (@acarrera94)
New example: BART for summarization, using pytorch-lightning. It trains on CNN/DM and evaluates the result.
Translation pipeline (@patrickvonplaten)
A new pipeline is available, leveraging the T5 model. The T5 model was added to the summarization pipeline as well.
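A minimal usage sketch, assuming an installed `transformers` version that includes these pipelines and network access to download the weights. The task names `translation_en_to_de` and `summarization` and the result keys are the ones these pipelines use to the best of our knowledge; the wrapper `run_t5_pipeline`, the `RESULT_KEYS` table, and the choice of `t5-small` are illustrative assumptions.

```python
# Result key returned by each pipeline task used in this sketch.
RESULT_KEYS = {
    "translation_en_to_de": "translation_text",
    "summarization": "summary_text",
}


def run_t5_pipeline(task: str, text: str, model: str = "t5-small") -> str:
    """Run `text` through a T5-backed pipeline and return the generated string."""
    if task not in RESULT_KEYS:
        raise ValueError(f"Unsupported task: {task!r}")
    # Imported lazily so this module loads even without transformers installed.
    from transformers import pipeline

    nlp = pipeline(task, model=model, tokenizer=model)
    # A pipeline call returns a list with one dict per input text.
    return nlp(text)[0][RESULT_KEYS[task]]
```

For example, `run_t5_pipeline("translation_en_to_de", "The house is wonderful.")` would download the checkpoint on first use and return the German translation as a string.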
Memory improvements with BART (@sshleifer)
To reduce the memory footprint and the compute needed to run inference on BART, several improvements have been made to the model (approximate memory savings in parentheses):
- Remove the LM head and use the embedding matrix instead (~200MB)
- Call encoder before expanding input_ids (~1GB)
- SelfAttention only returns weights if config.output_attentions (~500MB)
- Two separate, smaller decoder attention masks (~500MB)
- Drop columns that are exclusively pad_token_id from input_ids in the evaluate_cnn example.
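The last item above can be sketched as follows. This is an illustration on plain Python lists rather than tensors, not the code used in the example script; the function name `drop_pad_columns` and the value `pad_token_id=1` are assumptions made for the demonstration.

```python
from typing import List


def drop_pad_columns(input_ids: List[List[int]], pad_token_id: int) -> List[List[int]]:
    """Remove every column in which all rows hold `pad_token_id`."""
    if not input_ids:
        return []
    n_cols = len(input_ids[0])
    # Keep a column if at least one row has a real token at that position.
    keep = [
        col for col in range(n_cols)
        if any(row[col] != pad_token_id for row in input_ids)
    ]
    return [[row[col] for col in keep] for row in input_ids]


# Two padded sequences; the last two columns contain only padding.
batch = [
    [0, 31, 42, 2, 1, 1],
    [0, 57, 2, 1, 1, 1],
]
trimmed = drop_pad_columns(batch, pad_token_id=1)
```

Here the batch shrinks from six columns to four, so the model no longer spends memory and compute on positions that are padding in every row.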
TensorFlow models may now be serialized (@gthb)
Keras layers now support JSON serialization by overriding get_config, so that they can be sent to TensorBoard to display a conceptual graph of the model. TensorFlow models may now be saved using model.save, like other Keras models.
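To show why overriding get_config makes a layer JSON-serializable, here is a plain-Python stand-in for the pattern. It does not use Keras and is not the library's implementation; `ToyLayer` and its parameters are hypothetical. The idea is that get_config returns the constructor arguments as a JSON-friendly dict and from_config rebuilds the object from that dict.

```python
import json


class ToyLayer:
    """Plain-Python stand-in for a Keras layer; not a real Keras class."""

    def __init__(self, hidden_size: int, dropout: float = 0.1):
        self.hidden_size = hidden_size
        self.dropout = dropout

    def get_config(self) -> dict:
        # Only JSON-serializable constructor arguments belong here.
        return {"hidden_size": self.hidden_size, "dropout": self.dropout}

    @classmethod
    def from_config(cls, config: dict) -> "ToyLayer":
        # Rebuild the layer from the dict produced by get_config.
        return cls(**config)


# Round trip: object -> JSON string -> equivalent object.
serialized = json.dumps(ToyLayer(hidden_size=768).get_config())
restored = ToyLayer.from_config(json.loads(serialized))
```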
New model: XLMForTokenClassification (@sakares)
A new head was added to XLM: XLMForTokenClassification.