pytorch/text 0.3.1: Quality-of-life improvements and bugfixes

5 years ago

Major changes:

  • Added bAbI dataset (#286)
  • Added MultiNLI dataset (#326)
  • PyTorch 0.4 compatibility + bugfixes (#299, #302)
  • Batch iteration now returns an (inputs, outputs) tuple by default, without having to index attributes from Batch (#288)
  • [BREAKING] Iterator no longer repeats infinitely by default (now stops after epoch has completed) (#417)
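
The breaking change in #417 matters most for training loops that relied on the old infinite iteration. The following is a minimal pure-Python sketch of the two behaviours, not torchtext code: `ToyIterator` and its `repeat` flag are hypothetical stand-ins used only to illustrate the difference.

```python
from itertools import islice


class ToyIterator:
    """Hypothetical stand-in for a batch iterator (not the torchtext class)."""

    def __init__(self, batches, repeat=False):
        self.batches = list(batches)
        self.repeat = repeat

    def __iter__(self):
        while True:
            for batch in self.batches:
                yield batch
            if not self.repeat:
                # New default: stop once one epoch has completed.
                return


batches = [("x1", "y1"), ("x2", "y2")]

# New default: one pass over the data, then iteration ends.
one_epoch = list(ToyIterator(batches))

# Old behaviour: cycles forever, so the caller has to cut it off.
five_batches = list(islice(ToyIterator(batches, repeat=True), 5))
```

Under the new default, code that trains for several epochs would wrap the iterator in an outer epoch loop rather than counting batches and breaking out manually.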

Minor changes:

  • Handle Moses tokenizer being migrated out of nltk (#361)
  • Vector loading made more efficient and flexible (#353)
  • Allow special tokens to be added to the end of the vocabulary (#400)
  • Allow filtering unknown words from examples (#413)
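
As a rough illustration of what placing special tokens at the end of the vocabulary (#400) means for index assignment, here is a small self-contained sketch. The helper `build_itos` and its `specials_first` flag are hypothetical names chosen for this example, and the frequency-then-alphabetical ordering is an assumption, not a description of the library's implementation.

```python
from collections import Counter


def build_itos(counter, specials, specials_first=True):
    """Hypothetical helper: return an index-to-string list for a vocabulary."""
    # Assumed ordering: most frequent first, ties broken alphabetically.
    words = sorted(counter, key=lambda w: (-counter[w], w))
    words = [w for w in words if w not in specials]
    if specials_first:
        return list(specials) + words
    return words + list(specials)


counts = Counter("a b b c c c".split())
specials = ["<unk>", "<pad>"]

front = build_itos(counts, specials)                       # specials get indices 0 and 1
back = build_itos(counts, specials, specials_first=False)  # specials get the last indices
```

Putting the specials last keeps the indices of ordinary words starting at 0, which can be convenient when the word order has to line up with an external resource.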

Bugfixes:

  • Documentation (#382, #383, #393, #395, #410)
  • Create cache dir for pretrained embeddings if it doesn't exist (#301)
  • Various typos (#293, #369, #373, #344, #401, #404, #405, #418)
  • Dataset.split() not copying sort_key fixed (#279)
  • Various Python 2.* vs Python 3.* issues (#280)
  • Fix OOV token vector dimensionality (#308)
  • Lowercased type of TabularDataset (#315)
  • Fix splits method in various translation datasets (#377, #385, #392, #429)
  • Fix ParseTextField postprocessing (#386)
  • Fix SubwordVocab (#399)
  • Make NestedField GPU compatible and fix frequency saving (#409, #403)
  • Allow CSV reader params to be modified by user (#432)
  • Use tqdm progressbar in downloads (#425)
