pypi transformers 2.1.1
CTRL, DistilGPT-2, PyTorch TPU, tokenizer enhancements, guideline requirements

New model architectures: CTRL, DistilGPT-2

Two new model architectures have been added since release 2.0: CTRL, Salesforce's conditional transformer language model for controllable generation, and DistilGPT-2, a distilled version of GPT-2.
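
Both can be loaded through the usual from_pretrained API. A minimal sketch, assuming transformers==2.1.1, where CTRL ships with its own classes and DistilGPT-2 reuses the GPT-2 classes with a smaller checkpoint:

```python
from transformers import (CTRLTokenizer, CTRLLMHeadModel,
                          GPT2Tokenizer, GPT2LMHeadModel)

# CTRL has dedicated classes and the "ctrl" checkpoint.
ctrl_tokenizer = CTRLTokenizer.from_pretrained("ctrl")
ctrl_model = CTRLLMHeadModel.from_pretrained("ctrl")

# DistilGPT-2 reuses the GPT-2 classes with the "distilgpt2" checkpoint.
distil_tokenizer = GPT2Tokenizer.from_pretrained("distilgpt2")
distil_model = GPT2LMHeadModel.from_pretrained("distilgpt2")
```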

Distillation

Several updates have been made to the distillation script, including the ability to distill GPT-2 and to distill on the SQuAD task. By @VictorSanh.

PyTorch TPU support

The run_glue.py example script can now run on a TPU with PyTorch, through the PyTorch/XLA integration.
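
Under the hood this relies on the torch_xla package, which exposes each TPU core as a device. As a rough illustration of the pattern (not the script's actual code), assuming torch_xla is installed on a TPU host:

```python
import torch
import torch_xla.core.xla_model as xm

# A TPU core is exposed to PyTorch as an XLA device.
device = xm.xla_device()

model = torch.nn.Linear(10, 2).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

inputs = torch.randn(4, 10, device=device)
loss = model(inputs).sum()
loss.backward()

# xm.optimizer_step applies the update and triggers XLA graph execution.
xm.optimizer_step(optimizer)
```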

Updates to example scripts

Several example scripts have been improved and refactored to take full advantage of the new tokenizer functions.

QOL enhancements on the tokenizer

Enhancements have been made to the tokenizers. Two new methods have been added: get_special_tokens_mask and truncate_sequences.

The former returns a mask indicating which tokens in a token list are special tokens and which come from the initial sequences. The latter truncates sequences according to a given strategy.

Both of these methods are called by the encode_plus method, which is itself called by the encode method. encode_plus now returns a larger dictionary holding information about the special tokens, as well as any overflowing tokens.
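
A minimal sketch of the new methods, assuming transformers==2.1.1 and a BERT tokenizer (exact dictionary keys may differ slightly):

```python
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

# encode_plus builds the final inputs and reports extra information.
encoded = tokenizer.encode_plus(
    "HuggingFace is based in NYC",
    "Where is HuggingFace based?",
    add_special_tokens=True,
    max_length=12,
    truncation_strategy="longest_first",
)
print(encoded["input_ids"])

# 1 marks a special token ([CLS]/[SEP]), 0 a token from the original sequences.
mask = tokenizer.get_special_tokens_mask(
    encoded["input_ids"], already_has_special_tokens=True
)
print(mask)
```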

Thanks to @julien-c, @thomwolf, and @LysandreJik for these additions.

New German BERT models

Breaking changes

  • The two methods add_special_tokens_single_sequence and add_special_tokens_sequence_pair have been removed. They are replaced by the single method build_inputs_with_special_tokens, which has a clearer name and handles both single sequences and sequence pairs (see the sketch after this list).

  • The boolean parameter truncate_first_sequence has been removed from the tokenizers' encode and encode_plus methods. It is replaced by a strategy in the form of a string: 'longest_first', 'only_second', 'only_first', or 'do_not_truncate' are the accepted strategies.

  • When the encode or encode_plus methods are called with a specified max_length, overflowing sequences are now always truncated to that length, or an error is raised if truncation is not possible.
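
A minimal sketch of the replacement APIs, assuming transformers==2.1.1 and a BERT tokenizer (parameter defaults may differ in later versions):

```python
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

ids_a = tokenizer.encode("First sequence", add_special_tokens=False)
ids_b = tokenizer.encode("Second sequence", add_special_tokens=False)

# build_inputs_with_special_tokens replaces both removed methods:
single = tokenizer.build_inputs_with_special_tokens(ids_a)       # [CLS] A [SEP]
pair = tokenizer.build_inputs_with_special_tokens(ids_a, ids_b)  # [CLS] A [SEP] B [SEP]

# truncate_first_sequence=True becomes an explicit strategy string:
encoded = tokenizer.encode(
    "First sequence",
    "Second sequence",
    add_special_tokens=True,
    max_length=10,
    truncation_strategy="only_first",  # or 'longest_first', 'only_second', 'do_not_truncate'
)
```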

Guidelines and requirements

New contributing guidelines have been added, alongside library development requirements by @rlouf, the newest member of the HuggingFace team.

Community additions/bug-fixes/improvements

  • The GLUE processors have been refactored to handle inputs for all tasks coming from tensorflow_datasets (see the sketch after this list). This work was done by @agrinh and @philipp-eisen.
  • The padding_idx is now correctly initialized to 1 in randomly initialized RoBERTa models. By @ikuyamada.
  • The documentation CSS has been adapted to work on older browsers. By @TimYagan.
  • A note concerning the management of hidden states has been added to the README. By @BramVanroy.
  • TF 2.0 models can now be integrated with other Keras modules. By @thomwolf.
  • Returning past values can now be opted out of. By @thomwolf.
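
For the tensorflow_datasets pathway mentioned above, the example scripts from this era followed roughly this pattern. A sketch, assuming tensorflow_datasets is installed; argument names may vary:

```python
import tensorflow_datasets as tfds
from transformers import BertTokenizer, glue_convert_examples_to_features

tokenizer = BertTokenizer.from_pretrained("bert-base-cased")

# Load MRPC from tensorflow_datasets and convert it with the GLUE processors.
data = tfds.load("glue/mrpc")
train_dataset = glue_convert_examples_to_features(
    data["train"], tokenizer, max_length=128, task="mrpc"
)
```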
