github allenai/allennlp v0.4.0

6 years ago

New features:

  • A major feature in the 0.4 release is the inclusion of ELMo, which produces contextualized word embeddings that greatly improve model performance. You can read more in our ELMo HowTo.
  • Support for lazy datasets, so you can stream data through the trainer with a lower memory footprint. This is a breaking change for some parts of the API; if you've written a DatasetReader, you will probably need to change a little bit of code. The Dataset class is now gone.
  • First-class support for models that operate on spans instead of on tokens.
  • Support for programmatically importing additional dependencies so you don't need to write your own run.py script.
  • A simple server to create a stand-alone web demo for your model.
  • Added constrained decoding to the ConditionalRandomField module (and to the corresponding NER tagger model).
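
To illustrate what constrained decoding means for a tagger, here is a minimal, self-contained sketch. It is not AllenNLP's implementation: it assumes a BIO tagging scheme (the notes above do not say which schemes are supported), scores paths with per-token emission scores only (a real CRF also has learned transition scores), and the function names are hypothetical.

```python
import math
from typing import List, Set, Tuple


def is_valid_bio_step(prev: str, nxt: str) -> bool:
    """Return True if tag `nxt` may directly follow tag `prev` under BIO."""
    # "O" and any "B-X" may follow any tag.
    if nxt == "O" or nxt.startswith("B-"):
        return True
    # "I-X" must continue a span of the same type X.
    if nxt.startswith("I-"):
        entity = nxt[2:]
        return prev in ("B-" + entity, "I-" + entity)
    return False


def allowed_bio_transitions(labels: List[str]) -> Set[Tuple[int, int]]:
    """Collect the (from_index, to_index) pairs that BIO permits."""
    return {
        (i, j)
        for i, prev in enumerate(labels)
        for j, nxt in enumerate(labels)
        if is_valid_bio_step(prev, nxt)
    }


def constrained_viterbi(emissions: List[List[float]], labels: List[str]) -> List[str]:
    """Best tag sequence that never uses a disallowed transition.

    emissions[t][j] is the score of label j at token t (assumed input format).
    """
    n = len(labels)
    allowed = allowed_bio_transitions(labels)
    # A sequence may not start inside a span, so "I-X" gets -inf at position 0.
    score = [
        -math.inf if labels[j].startswith("I-") else emissions[0][j]
        for j in range(n)
    ]
    backpointers = []
    for t in range(1, len(emissions)):
        new_score, pointers = [], []
        for j in range(n):
            # Disallowed predecessors are treated as having score -inf.
            candidates = [
                score[i] if (i, j) in allowed else -math.inf for i in range(n)
            ]
            best_i = max(range(n), key=lambda i: candidates[i])
            new_score.append(candidates[best_i] + emissions[t][j])
            pointers.append(best_i)
        score = new_score
        backpointers.append(pointers)
    # Trace the best path back from the highest-scoring final label.
    path = [max(range(n), key=lambda j: score[j])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return [labels[j] for j in path]


labels = ["O", "B-PER", "I-PER"]
# Unconstrained argmax would pick "O" then "I-PER", which is not valid BIO.
emissions = [[2.0, 0.0, 1.0], [0.0, 0.0, 3.0]]
print(constrained_viterbi(emissions, labels))
```

With these toy scores, picking the best label per token independently would give an invalid sequence, whereas the constrained search only considers sequences that respect the BIO rules.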

Additional tutorials:

Additional models / dataset readers:

Minor bugfixes and features:

  • This release is compatible with PyTorch 0.3.1 (and still compatible with PyTorch 0.3.0).
  • Made it possible to do batch tokenization with spaCy inside a DatasetReader.
  • Added a make-vocab command to precompute a vocabulary for a dataset.
  • Added a fine-tune command to fine-tune a trained model on a new dataset.
  • Unified handling of OntoNotes-based datasets, so it is now easier to write new DatasetReaders that use OntoNotes.
  • Predictors now support non-JSON formats for bulk prediction.
  • More flexible batching code, and bug fixes when batching / padding using ListFields.
  • Added ability to handle different data reading configurations at train and test time.
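
As a rough picture of what precomputing a vocabulary for a dataset involves (the make-vocab item above), the following sketch counts tokens once and assigns each a stable index. It is a generic illustration, not the command's actual behavior: the function name, the `min_count` parameter, and the `<pad>` / `<unk>` placeholder tokens are all assumptions.

```python
from collections import Counter
from typing import Dict, Iterable, List


def build_vocabulary(
    sentences: Iterable[List[str]], min_count: int = 1
) -> Dict[str, int]:
    """Map each sufficiently frequent token to an integer index."""
    counts = Counter(token for sentence in sentences for token in sentence)
    # Reserve the first two indices for padding and out-of-vocabulary tokens
    # (hypothetical placeholder names).
    vocab = {"<pad>": 0, "<unk>": 1}
    # Sort by descending frequency, then alphabetically, so that indices are
    # deterministic across runs.
    for token, count in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if count >= min_count:
            vocab[token] = len(vocab)
    return vocab


sentences = [["the", "cat"], ["the", "dog"]]
print(build_vocabulary(sentences))
print(build_vocabulary(sentences, min_count=2))
```

Doing this once ahead of time means later training runs can load the saved mapping instead of scanning the whole dataset again.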

Breaking Changes

This release contains several breaking changes. Please see the migration guide if you have pre-0.4.0 code you need to update.
