This is the first pre-release candidate for version 1.1. There will probably be at least one more candidate before the final 1.1 release.
## What's new since v1.0.0
### Fixed

- Reduced the amount of log messages produced by `allennlp.common.file_utils`.
- Fixed a bug where `PretrainedTransformerEmbedder` parameters appeared to be trainable in the log output even when `train_parameters` was set to `False`.
- Fixed a bug with the sharded dataset reader where it would only read a fraction of the instances in distributed training.
- Fixed checking equality of `ArrayField`s.
- Fixed a bug where `NamespaceSwappingField` did not work correctly with `.empty_field()`.
- Put more sensible defaults on the `huggingface_adamw` optimizer.
- Simplified logging so that all logging output always goes to one file.
- Fixed interaction with the Python command line debugger.
- Log the grad norm properly even when we're not clipping it.
- Fixed a bug where `PretrainedModelInitializer` fails to initialize a model with a 0-dim tensor.
- Fixed a bug with the layer unfreezing schedule of the `SlantedTriangular` learning rate scheduler.
- Fixed a regression with logging in the distributed setting: only the main worker should write log output to the terminal.
- Pinned the version of `boto3` for package managers (e.g. Poetry).
- Fixed issue #4330 by updating the `tokenizers` dependency.
- Fixed a bug in `TextClassificationPredictor` so that it passes tokenized inputs to the `DatasetReader` in case it does not have a tokenizer.
- `reg_loss` is now only returned for models that have some regularization penalty configured.
- Fixed a bug that prevented `cached_path` from downloading assets from GitHub releases.
- Fixed a bug that erroneously increased the last label's false positive count when calculating fbeta metrics.
- `Tqdm` output now looks much better when the output is being piped or redirected.
- Small improvements to how the API documentation is rendered.
### Added

- A method to `ModelTestCase` for running basic model tests when you aren't using config files.
- Added some convenience methods for reading files.
- Added an option to `file_utils.cached_path` to automatically extract archives.
- Added the ability to pass an archive file instead of a local directory to `Vocab.from_files`.
- Added the ability to pass an archive file instead of a glob to `ShardedDatasetReader`.
- Added a new `"linear_with_warmup"` learning rate scheduler.
- Added a check in `ShardedDatasetReader` that ensures the base reader doesn't implement manual distributed sharding itself.
- Added an option to `PretrainedTransformerEmbedder` and `PretrainedTransformerMismatchedEmbedder` to use a scalar mix of all hidden layers from the transformer model instead of just the last layer. To utilize this, just set `last_layer_only` to `False`.
- `cached_path()` can now read files inside of archives.
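As a sketch of how the new scalar-mix option might be turned on in an experiment config (the registered embedder name `"pretrained_transformer"` and the model name below are assumptions for illustration; only `last_layer_only` is named in these notes):

```json
{
  "text_field_embedder": {
    "token_embedders": {
      "tokens": {
        "type": "pretrained_transformer",
        "model_name": "bert-base-uncased",
        "last_layer_only": false
      }
    }
  }
}
```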
### Changed

- Not specifying a `cuda_device` now automatically determines whether to use a GPU or not.
- Discovered plugins are logged so you can see what was loaded.
- `allennlp.data.DataLoader` is now an abstract registrable class. The default implementation remains the same, but was renamed to `allennlp.data.PyTorchDataLoader`.
- `BertPooler` can now unwrap and re-wrap extra dimensions if necessary.
- New `transformers` dependency. Only version >=3.0 is now supported.
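The `cuda_device` auto-detection presumably follows AllenNLP's convention of `-1` for CPU and a non-negative index for a GPU. A minimal illustrative sketch of that fallback logic (the function name is hypothetical, and `cuda_available` stands in for a real check like `torch.cuda.is_available()`):

```python
def resolve_cuda_device(cuda_device=None, cuda_available=False):
    """Pick a device when none was specified in the config.

    An explicit setting always wins; otherwise fall back to GPU 0
    when CUDA is available, and CPU (-1) when it is not.
    """
    if cuda_device is not None:
        return cuda_device
    return 0 if cuda_available else -1
```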
### Commits
4eb9795 Prepare for release v1.1.0rc1
f195440 update 'Models' links in README (#4475)
9c801a3 add CHANGELOG to API docs, point to license on GitHub, improve API doc formatting (#4472)
69d2f03 Clean up Tqdm bars when output is being piped or redirected (#4470)
7b188c9 fixed bug that erronously increased last label's false positive count (#4473)
64db027 Skip ETag check if OSError (#4469)
b9d011e More BART changes (#4468)
7a563a8 add option to use scalar mix of all transformer layers (#4460)
d00ad66 Minor tqdm and logging clean up (#4448)
6acf205 Fix regloss logging (#4449)
8c32ddf Fixing bug in TextClassificationPredictor so that it passes tokenized inputs to the DatasetReader (#4456)
b9a9164 Update transformers requirement from <2.12,>=2.10 to >=2.10,<3.1 (#4446)
181ef5d pin boto3 to resolve some dependency issues (#4453)
c75a1eb ensure base reader of ShardedDatasetReader doesn't implement sharding itself (#4454)
8a05ad4 Update CONTRIBUTING.md (#4447)
5b988d6 ensure only rank 0 worker writes to terminal (#4445)
8482f02 fix bug with SlantedTriangular LR scheduler (#4443)
e46a578 Update transformers requirement from <2.11,>=2.10 to >=2.10,<2.12 (#4411)
8229aca Fix pretrained model initialization (#4439)
60deece Fix type hint in text_field.py (#4434)
23e549e More multiple-choice changes (#4415)
6d0a4fd generalize DataLoader (#4416)
acd9995 Automatic file-friendly logging (#4383)
637dbb1 fix README, pin mkdocs, update mkdocs-material (#4412)
9c4dfa5 small fix to pretrained transformer tokenizer (#4417)
84988b8 Log plugins discovered and filter out transformers "PyTorch version ... available" log message (#4414)
54c41fc Adds the ability to automatically detect whether we have a GPU (#4400)
96ff585 Changes from my multiple-choice work (#4368)
eee15ca Assign an empty mapping array to empty fields of NamespaceSwappingField (#4403)
aa2943e Bump mkdocs-material from 5.3.2 to 5.3.3 (#4398)
7fa7531 fix eq method of ArrayField (#4401)
e104e44 Add test to ensure data loader yields all instances when batches_per_epoch is set (#4394)
b6fd697 fix sharded dataset reader (#4396)
30e5dbf Bump mypy from 0.781 to 0.782 (#4395)
b0ba2d4 update version
1d07cc7 Bump mkdocs-material from 5.3.0 to 5.3.2 (#4389)
ffc5184 ensure Vocab.from_files and ShardedDatasetReader can handle archives (#4371)
20afe6c Add Optuna integrated badge to README.md (#4361)
ba79f14 Bump mypy from 0.780 to 0.781 (#4390)
85e531c Update README.md (#4385)
c2ecb7a Add a method to ModelTestCase for use without config files (#4381)
6852def pin some doc building requirements (#4386)
bf422d5 Add github template for using your own python run script (#4380)
ebde6e8 Bump overrides from 3.0.0 to 3.1.0 (#4375)
e52b751 ensure transformer params are frozen at initialization when train_parameters is false (#4377)
3e8a9ef Add link to new template repo for config file development (#4372)
4f70bc9 tick version for nightly releases
63a5e15 Update spacy requirement from <2.3,>=2.1.0 to >=2.1.0,<2.4 (#4370)
ef7c75b reduce amount of log messages produced by file_utils (#4366)