New features:
- A major feature in the 0.4 release is the inclusion of ELMo, which produces contextualized word embeddings that greatly improve model performance. You can read more in our ELMo HowTo.
- Support for lazy datasets, so you can stream data through the trainer with a lower memory footprint. This is a breaking change for some parts of the API; if you've written a `DatasetReader`, you will probably need to change a little bit of code. The `Dataset` class is now gone.
- First-class support for models that operate on spans instead of on tokens.
- Support for programmatically importing additional dependencies, so you don't need to write your own `run.py` script.
- A simple server to create a stand-alone web demo for your model.
- Added constrained decoding to the `ConditionalRandomField` module (and to the corresponding NER tagger model).
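
To illustrate what constrained decoding means for a BIO-tagged NER model, here is a minimal sketch in plain Python. It does not use the AllenNLP API: the helper names `is_transition_allowed` and `allowed_transitions_bio` are hypothetical, and the BIO tagging scheme is an assumption made for the example. The idea is that the decoder only considers tag sequences whose consecutive tags are valid, so `I-PER` can never directly follow `O`.

```python
from typing import List, Tuple


def is_transition_allowed(from_tag: str, to_tag: str) -> bool:
    """Return True if `to_tag` may directly follow `from_tag` under BIO."""
    if to_tag == "O" or to_tag.startswith("B-"):
        # "O" and any "B-x" tag can follow any other tag.
        return True
    # to_tag is "I-x": it must continue an entity of the same type.
    entity = to_tag[2:]
    return from_tag in ("B-" + entity, "I-" + entity)


def allowed_transitions_bio(labels: List[str]) -> List[Tuple[int, int]]:
    """List the (from_index, to_index) pairs a constrained decoder may use."""
    return [
        (i, j)
        for i, from_tag in enumerate(labels)
        for j, to_tag in enumerate(labels)
        if is_transition_allowed(from_tag, to_tag)
    ]


# Hypothetical label set for illustration.
labels = ["O", "B-PER", "I-PER", "B-LOC", "I-LOC"]
allowed = allowed_transitions_bio(labels)
```

One common way to use such a list is to give every disallowed pair a very low transition score, so that Viterbi decoding never selects an invalid tag sequence.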
Additional tutorials:
- A tutorial on writing `Predictors` for using `python -m allennlp.run predict` and for creating demos.
- Instructions for how to visualize model internals in a live demo of your model, for gaining better insights about what your model is doing.
- A tutorial on how laziness works in AllenNLP.
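
As a rough illustration of the idea behind lazy datasets, the sketch below contrasts building a full list of instances with yielding them one at a time. It is plain Python rather than AllenNLP code: `parse_line`, `read_eager`, `read_lazy`, and the one-instance-per-line format are all assumptions made for the example.

```python
import io
from typing import Dict, Iterator, List, TextIO


def parse_line(line: str) -> Dict[str, List[str]]:
    """Turn one line of text into a toy 'instance' (hypothetical format)."""
    return {"tokens": line.split()}


def read_eager(handle: TextIO) -> List[Dict[str, List[str]]]:
    """Materialize every instance in memory before training starts."""
    return [parse_line(line) for line in handle if line.strip()]


def read_lazy(handle: TextIO) -> Iterator[Dict[str, List[str]]]:
    """Yield instances one at a time instead of building a list."""
    for line in handle:
        if line.strip():
            yield parse_line(line)


data = "the cat sat\n\na dog barked\n"
eager_instances = read_eager(io.StringIO(data))
lazy_instances = read_lazy(io.StringIO(data))
first = next(lazy_instances)  # only the first line has been parsed so far
```

The lazy version trades random access and a known dataset length for a lower memory footprint, which is why code that assumed an in-memory collection of instances may need adjusting.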
Additional models / dataset readers:
- A span-based constituency parser that independently predicts a non-terminal label for each span in an input sentence, along with a Penn Treebank dataset reader that reads data for this model. Trained model and demo coming soon.
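
A span-based model scores candidate spans rather than individual tokens. The sketch below enumerates every contiguous span of a sentence up to a maximum width, using inclusive `(start, end)` indices. The function name `enumerate_candidate_spans` and the `max_width` parameter are hypothetical and not taken from this release; the example is only meant to show what "each span in an input sentence" covers.

```python
from typing import List, Optional, Tuple


def enumerate_candidate_spans(
    tokens: List[str], max_width: Optional[int] = None
) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) index pairs for all contiguous spans."""
    n = len(tokens)
    width = n if max_width is None else min(max_width, n)
    spans = []
    for start in range(n):
        # A span starting here ends at most `width - 1` tokens later.
        for end in range(start, min(start + width, n)):
            spans.append((start, end))
    return spans


tokens = ["The", "cat", "sat"]
all_spans = enumerate_candidate_spans(tokens)
short_spans = enumerate_candidate_spans(tokens, max_width=2)
```

A sentence of n tokens has n(n+1)/2 spans when no width limit is applied, so limiting the width is one way to keep the number of candidates manageable for long inputs.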
Minor bugfixes and features:
- This release is compatible with PyTorch 0.3.1 (and still compatible with PyTorch 0.3.0).
- Made it possible to do batch tokenization with spaCy inside a `DatasetReader`.
- Added a `make-vocab` command to precompute a vocabulary for a dataset.
- Added a `fine-tune` command to fine-tune a trained model on a new dataset.
- Unified handling of Ontonotes-based datasets, so it is now easier to write new `DatasetReaders` that use Ontonotes.
- `Predictors` now support non-JSON formats for bulk prediction.
- More flexible batching code, and bug fixes when batching / padding using `ListFields`.
- Added ability to handle different data reading configurations at train and test time.
Breaking changes:
This release contains several breaking changes. Please see the migration guide if you have pre-0.4.0 code you need to update.