This is the first pre-release candidate for version 1.1. There will probably be at least one more candidate before the final 1.1 release.
## What's new since v1.0.0
### Fixed

- Reduced the amount of log messages produced by `allennlp.common.file_utils`.
- Fixed a bug where `PretrainedTransformerEmbedder` parameters appeared to be trainable in the log output even when `train_parameters` was set to `False`.
- Fixed a bug with the sharded dataset reader where it would only read a fraction of the instances in distributed training.
- Fixed checking equality of `ArrayField`s.
- Fixed a bug where `NamespaceSwappingField` did not work correctly with `.empty_field()`.
- Put more sensible defaults on the `huggingface_adamw` optimizer.
- Simplified logging so that all logging output always goes to one file.
- Fixed interaction with the Python command line debugger.
- Log the grad norm properly even when we're not clipping it.
- Fixed a bug where `PretrainedModelInitializer` fails to initialize a model with a 0-dim tensor.
- Fixed a bug with the layer unfreezing schedule of the `SlantedTriangular` learning rate scheduler.
- Fixed a regression with logging in the distributed setting: only the main worker should write log output to the terminal.
- Pinned the version of `boto3` for package managers (e.g. Poetry).
- Fixed issue #4330 by updating the `tokenizers` dependency.
- Fixed a bug in `TextClassificationPredictor` so that it passes tokenized inputs to the `DatasetReader` in case it does not have a tokenizer.
- `reg_loss` is now only returned for models that have some regularization penalty configured.
- Fixed a bug that prevented `cached_path` from downloading assets from GitHub releases.
- Fixed a bug that erroneously increased the last label's false positive count when calculating fbeta metrics.
- `Tqdm` output now looks much better when the output is being piped or redirected.
- Small improvements to how the API documentation is rendered.
### Added

- A method to `ModelTestCase` for running basic model tests when you aren't using config files.
- Added some convenience methods for reading files.
- Added an option to `file_utils.cached_path` to automatically extract archives.
- Added the ability to pass an archive file instead of a local directory to `Vocab.from_files`.
- Added the ability to pass an archive file instead of a glob to `ShardedDatasetReader`.
- Added a new `"linear_with_warmup"` learning rate scheduler.
- Added a check in `ShardedDatasetReader` that ensures the base reader doesn't implement manual distributed sharding itself.
- Added an option to `PretrainedTransformerEmbedder` and `PretrainedTransformerMismatchedEmbedder` to use a scalar mix of all hidden layers from the transformer model instead of just the last layer. To utilize this, just set `last_layer_only` to `False`.
- `cached_path()` can now read files inside of archives.
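As a sketch of how the new scalar-mix option might be turned on in an experiment config (the registered embedder name `"pretrained_transformer"` and the model name below are assumptions for illustration; only `last_layer_only` is named in these notes):

```json
{
  "text_field_embedder": {
    "token_embedders": {
      "tokens": {
        "type": "pretrained_transformer",
        "model_name": "bert-base-uncased",
        "last_layer_only": false
      }
    }
  }
}
```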
### Changed

- Not specifying a `cuda_device` now automatically determines whether to use a GPU or not.
- Discovered plugins are logged so you can see what was loaded.
- `allennlp.data.DataLoader` is now an abstract registrable class. The default implementation remains the same, but was renamed to `allennlp.data.PyTorchDataLoader`.
- `BertPooler` can now unwrap and re-wrap extra dimensions if necessary.
- New `transformers` dependency. Only version >=3.0 is now supported.
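The `cuda_device` auto-detection presumably follows AllenNLP's convention of `-1` for CPU and a non-negative index for a GPU. A minimal illustrative sketch of that fallback logic (the function name is hypothetical, and `cuda_available` stands in for a real check like `torch.cuda.is_available()`):

```python
def resolve_cuda_device(cuda_device=None, cuda_available=False):
    """Pick a device when none was specified in the config.

    An explicit setting always wins; otherwise fall back to GPU 0
    when CUDA is available, and CPU (-1) when it is not.
    """
    if cuda_device is not None:
        return cuda_device
    return 0 if cuda_available else -1
```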
### Commits
4eb9795 Prepare for release v1.1.0rc1
f195440 update 'Models' links in README (#4475)
9c801a3 add CHANGELOG to API docs, point to license on GitHub, improve API doc formatting (#4472)
69d2f03 Clean up Tqdm bars when output is being piped or redirected (#4470)
7b188c9 fixed bug that erronously increased last label's false positive count (#4473)
64db027 Skip ETag check if OSError (#4469)
b9d011e More BART changes (#4468)
7a563a8 add option to use scalar mix of all transformer layers (#4460)
d00ad66 Minor tqdm and logging clean up (#4448)
6acf205 Fix regloss logging (#4449)
8c32ddf Fixing bug in TextClassificationPredictor so that it passes tokenized inputs to the DatasetReader (#4456)
b9a9164 Update transformers requirement from <2.12,>=2.10 to >=2.10,<3.1 (#4446)
181ef5d pin boto3 to resolve some dependency issues (#4453)
c75a1eb ensure base reader of ShardedDatasetReader doesn't implement sharding itself (#4454)
8a05ad4 Update CONTRIBUTING.md (#4447)
5b988d6 ensure only rank 0 worker writes to terminal (#4445)
8482f02 fix bug with SlantedTriangular LR scheduler (#4443)
e46a578 Update transformers requirement from <2.11,>=2.10 to >=2.10,<2.12 (#4411)
8229aca Fix pretrained model initialization (#4439)
60deece Fix type hint in text_field.py (#4434)
23e549e More multiple-choice changes (#4415)
6d0a4fd generalize DataLoader (#4416)
acd9995 Automatic file-friendly logging (#4383)
637dbb1 fix README, pin mkdocs, update mkdocs-material (#4412)
9c4dfa5 small fix to pretrained transformer tokenizer (#4417)
84988b8 Log plugins discovered and filter out transformers "PyTorch version ... available" log message (#4414)
54c41fc Adds the ability to automatically detect whether we have a GPU (#4400)
96ff585 Changes from my multiple-choice work (#4368)
eee15ca Assign an empty mapping array to empty fields of NamespaceSwappingField (#4403)
aa2943e Bump mkdocs-material from 5.3.2 to 5.3.3 (#4398)
7fa7531 fix eq method of ArrayField (#4401)
e104e44 Add test to ensure data loader yields all instances when batches_per_epoch is set (#4394)
b6fd697 fix sharded dataset reader (#4396)
30e5dbf Bump mypy from 0.781 to 0.782 (#4395)
b0ba2d4 update version
1d07cc7 Bump mkdocs-material from 5.3.0 to 5.3.2 (#4389)
ffc5184 ensure Vocab.from_files and ShardedDatasetReader can handle archives (#4371)
20afe6c Add Optuna integrated badge to README.md (#4361)
ba79f14 Bump mypy from 0.780 to 0.781 (#4390)
85e531c Update README.md (#4385)
c2ecb7a Add a method to ModelTestCase for use without config files (#4381)
6852def pin some doc building requirements (#4386)
bf422d5 Add github template for using your own python run script (#4380)
ebde6e8 Bump overrides from 3.0.0 to 3.1.0 (#4375)
e52b751 ensure transformer params are frozen at initialization when train_parameters is false (#4377)
3e8a9ef Add link to new template repo for config file development (#4372)
4f70bc9 tick version for nightly releases
63a5e15 Update spacy requirement from <2.3,>=2.1.0 to >=2.1.0,<2.4 (#4370)
ef7c75b reduce amount of log messages produced by file_utils (#4366)