This release adds multi-GPU support and an improved documentation page with API docs, and finally drops support for Python 3.8!
Improved Documentation and API Docs
Thanks to @konstantin-lukas we have a completely new design of our documentation page, which now includes API docs.
You can check it out here!
- Check out our tutorials
- Check out the new Python API docs
Future releases will improve docstring coverage and further improve upon the documentation!
PRs:
- Fix doc build by @helpmefindaname in #3528
- Rework Doc page by @konstantin-lukas in #3563
- Test new docstrings and apidocs deployment by @alanakbik in #3573
Multi-GPU Support
Flair now offers support for training models on multiple GPUs! Big thanks to @jeffpicard!
PRs:
- Add multi-GPU support by @jeffpicard in #3548
- Fix gradient accumulation and learning rate aggregation by @jeffpicard in #3583
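For readers unfamiliar with gradient accumulation, the idea behind the second PR's subject can be sketched in plain Python. This is only an illustration with a toy one-parameter model, not Flair's trainer code; the function names and data are made up for the example. It shows that averaging the gradients of equal-sized micro-batches gives the same result as one gradient over the combined batch. Data-parallel training across GPUs is commonly described with the same averaging, though how Flair aggregates across devices is an assumption here, not something these notes state.

```python
# Illustrative only: gradient accumulation for a toy model y ≈ w * x.
# Names and data are hypothetical; this is not Flair's implementation.

def grad_mse(w, batch):
    """Gradient of the mean squared error w.r.t. w over one batch of (x, y) pairs."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)

def accumulated_grad(w, micro_batches):
    """Average the per-micro-batch gradients (valid for equal-sized micro-batches)."""
    total = 0.0
    for batch in micro_batches:
        total += grad_mse(w, batch)
    return total / len(micro_batches)

data = [(1.0, 2.0), (2.0, 4.5), (3.0, 5.5), (4.0, 8.0)]
micro_batches = [data[:2], data[2:]]  # two micro-batches of equal size

full = grad_mse(0.5, data)                       # one pass over all four examples
accumulated = accumulated_grad(0.5, micro_batches)  # two smaller passes, averaged
```

If the micro-batches differ in size, a plain average no longer matches the full-batch gradient and the contributions would need to be weighted by batch size.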
Deprecations
Since Python 3.8 is no longer supported upstream, we are dropping support for it as well, in favor of features added in Python 3.9.
To address CVE-2024-10073, we removed the flair.models.clustering module. Since we aren't aware of any usage of it, we decided to do a hard drop instead of a deprecation.
PRs:
- Drop python 3.8 by @helpmefindaname in #3560
- Remove clustering support by @helpmefindaname in #3567
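As a rough illustration of what a Python 3.9 minimum makes available, here are three language and standard-library features introduced in 3.9. Which of these Flair actually relies on is not stated in these notes; the function and variable names below are hypothetical.

```python
# Features available from Python 3.9 onward. Names are illustrative only.

def count_tags(tags: list[str]) -> dict[str, int]:
    """Built-in collection types can be used as generics (PEP 585)."""
    counts: dict[str, int] = {}
    for tag in tags:
        counts[tag] = counts.get(tag, 0) + 1
    return counts

defaults = {"mini_batch_size": 32, "max_epochs": 10}
overrides = {"max_epochs": 2}
config = defaults | overrides  # dict union (PEP 584); the right operand wins

label = "B-PER".removeprefix("B-")  # str.removeprefix (PEP 616)
```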
Other Improvements
New Datasets
- Add CleanCoNLL object by @susannaruecker in #3557
- Add NoiseBench object by @elenamer in #3512
Performance Improvements
- perf: optimize dictionary items check by @MattGPT-ai in #3569
- Refactor fill_mean_token_embeddings for performance optimization on GPU by @sheldon-roberts in #3525
New Features and Improvements
- Add proxies information to requests.head by @diego-morientez in #3535
- Allow specifying proxy information in TransformerEmbeddings by @diego-morientez in #3539
- Add use_tokenizer to JsonlDataset by @david-waterworth in #3486
- Use built-in version parsing from packaging by @adrianeboyd in #3502
Bugfixes
- TransformerDocumentEmbeddings: Fix error when cls_pooling="mean" or cls_pooling="max" by @fkdosilovic in #3558
- SequenceTagger: Fix the incorrect token prediction distribution from _all_scores_for_token() by @mdmotaharmahtab in #3449
- TransformerEmbeddings: Fix T5 tokenizer loading by @helpmefindaname in #3544
- TextPairRegressor: Fix: use proper eval default main eval metrics by @MattGPT-ai in #3538
- TextPairRegressor: Fix state dict key mismatch for embeddings by @MattGPT-ai in #3537
- Make onnx export work again by @helpmefindaname in #3530
- Fix support metric by @MattGPT-ai in #3510
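For context on the cls_pooling fix listed above, the following is a minimal sketch of what "mean" and "max" pooling over token vectors compute in general. It uses plain Python lists and hypothetical helper names; it is not the code in TransformerDocumentEmbeddings, which operates on tensors.

```python
# Illustrative only: element-wise mean and max pooling over token vectors.
# Helper names and data are hypothetical, not Flair's API.

def mean_pool(vectors: list[list[float]]) -> list[float]:
    """Average each dimension across all token vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def max_pool(vectors: list[list[float]]) -> list[float]:
    """Take the maximum of each dimension across all token vectors."""
    return [max(column) for column in zip(*vectors)]

# Three tokens, each with a 2-dimensional embedding.
token_vectors = [[1.0, 4.0], [3.0, 0.0], [2.0, 2.0]]

mean_vector = mean_pool(token_vectors)  # one vector for the whole document
max_vector = max_pool(token_vectors)
```

Either way, a variable number of token vectors is reduced to a single fixed-size document vector.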
Operations/Development
- Invalidate tars classifier and tars ner tests to save disk space by @helpmefindaname in #3527
- Ignore FutureWarning by @alanakbik in #3526
- Update SECURITY.md with current contact by @alanakbik in #3568
New Contributors
- @adrianeboyd made their first contribution in #3502
- @david-waterworth made their first contribution in #3486
- @diego-morientez made their first contribution in #3535
- @jeffpicard made their first contribution in #3548
- @mdmotaharmahtab made their first contribution in #3449
- @fkdosilovic made their first contribution in #3558
Full Changelog: v0.14.0...v0.15.0