ESM-2/ESMFold
ESM-2 and ESMFold are new state-of-the-art Transformer protein language and folding models from Meta AI's Fundamental AI Research Team (FAIR). ESM-2 is trained with a masked language modeling objective, and it can be easily transferred to sequence and token classification tasks for proteins. Checkpoints exist in various sizes, from 8 million parameters up to a huge 15 billion parameter model.
ESMFold is a state-of-the-art single-sequence protein folding model that produces high-accuracy predictions significantly faster than alignment-based tools. Unlike previous protein folding tools like AlphaFold2 and openfold, ESMFold uses a pretrained protein language model to generate token embeddings that are used as input to the folding model, and so does not require a multiple sequence alignment (MSA) of related proteins as input. As a result, proteins can be folded in a single forward pass of the model without requiring any external databases or search/alignment tools to be present at inference time. This hugely reduces the time and compute requirements for folding.
Transformer protein language models were introduced in the paper Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences by Alexander Rives, Joshua Meier, Tom Sercu, Siddharth Goyal, Zeming Lin, Jason Liu, Demi Guo, Myle Ott, C. Lawrence Zitnick, Jerry Ma, and Rob Fergus.
ESMFold was introduced in the paper Language models of protein sequences at the scale of evolution enable accurate structure prediction by Zeming Lin, Halil Akin, Roshan Rao, Brian Hie, Zhongkai Zhu, Wenting Lu, Allan dos Santos Costa, Maryam Fazel-Zarandi, Tom Sercu, Sal Candido, and Alexander Rives.
- Add ESMFold by @Rocketknight1 in #19977
- TF port of ESM by @Rocketknight1 in #19587
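To make the masked language modeling objective concrete, here is a minimal, library-free sketch of how one training example could be built from a protein sequence. The helper name `mask_protein_sequence`, the `<mask>` symbol, the 15% masking rate, and the example sequence are assumptions chosen for illustration. This is not the ESM-2 training code, and real pipelines typically work on token IDs rather than letters.

```python
import random

MASK_TOKEN = "<mask>"  # assumed mask symbol for this sketch


def mask_protein_sequence(sequence, mask_rate=0.15, seed=0):
    """Return (masked_tokens, labels) for one masked-language-modeling example.

    labels[i] holds the original residue at masked positions and None elsewhere,
    so a loss would only be computed where the model has to fill in a residue.
    """
    tokens = list(sequence)  # one token per amino-acid letter
    if not tokens:
        return [], []

    rng = random.Random(seed)  # seeded so the sketch is reproducible
    # Mask at least one position, and never more positions than exist.
    num_to_mask = min(len(tokens), max(1, round(len(tokens) * mask_rate)))
    masked_positions = set(rng.sample(range(len(tokens)), num_to_mask))

    masked_tokens, labels = [], []
    for i, residue in enumerate(tokens):
        if i in masked_positions:
            masked_tokens.append(MASK_TOKEN)
            labels.append(residue)
        else:
            masked_tokens.append(residue)
            labels.append(None)
    return masked_tokens, labels


# Hypothetical example sequence, used only to exercise the helper.
sequence = "MKTVRQERLKSIVRILERSKEPVSGAQLAEELSVSRQVIVQDIAYLRSLGYNIVATPRGYVLAGG"
masked_tokens, labels = mask_protein_sequence(sequence)
print(" ".join(masked_tokens))
```

The model is then trained to predict the hidden residues from the surrounding context, which is what lets the same pretrained weights be reused for sequence and token classification.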
LiLT
LiLT allows combining any pre-trained RoBERTa text encoder with a lightweight Layout Transformer, to enable LayoutLM-like document understanding for many languages.
It was proposed in LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding by Jiapeng Wang, Lianwen Jin, Kai Ding.
- Add LiLT by @NielsRogge in #19450
Flan-T5
FLAN-T5 is an enhanced version of T5 that has been finetuned on a mixture of tasks.
It was released in the paper Scaling Instruction-Finetuned Language Models by Hyung Won Chung, Le Hou, Shayne Longpre, Barret Zoph, Yi Tay, William Fedus, Eric Li, Xuezhi Wang, Mostafa Dehghani, Siddhartha Brahma, Albert Webson, Shixiang Shane Gu, Zhuyun Dai, Mirac Suzgun, Xinyun Chen, Aakanksha Chowdhery, Sharan Narang, Gaurav Mishra, Adams Yu, Vincent Zhao, Yanping Huang, Andrew Dai, Hongkun Yu, Slav Petrov, Ed H. Chi, Jeff Dean, Jacob Devlin, Adam Roberts, Denny Zhou, Quoc V. Le, and Jason Wei.
- Add flan-t5 documentation page by @younesbelkada in #19892
Table Transformer
Table Transformer is a DETR-based model that can perform table extraction and table structure recognition from unstructured documents.
It was proposed in PubTables-1M: Towards comprehensive table extraction from unstructured documents by Brandon Smock, Rohith Pesala, Robin Abraham.
- Add table transformer [v2] by @NielsRogge in #19614
Contrastive search decoding
Contrastive search decoding is a new state-of-the-art generation method that aims to reduce the repetitive patterns into which generation models often fall.
It was introduced in A Contrastive Framework for Neural Text Generation by Yixuan Su, Tian Lan, Yan Wang, Dani Yogatama, Lingpeng Kong, Nigel Collier.
- Adding the state-of-the-art contrastive search decoding methods for the codebase of generation_utils.py by @gmftbyGMFTBY in #19477
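As a rough illustration of how contrastive search discourages repetition, the sketch below scores each of the top-k candidate tokens by its model probability minus a penalty equal to its maximum cosine similarity with the hidden states of the tokens generated so far, following the objective described in the paper. The function names, the toy two-dimensional hidden states, and the value `alpha=0.6` are assumptions for illustration; this is not the code added to generation_utils.py.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two non-zero vectors given as lists of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def contrastive_search_step(candidates, context_hidden_states, alpha=0.6):
    """Pick the next token among the top-k candidates.

    candidates: list of (token, probability, hidden_state) for the k most likely tokens.
    context_hidden_states: hidden states of the tokens generated so far.
    alpha: weight of the degeneration penalty (alpha=0 reduces to greedy search).
    """
    best_token, best_score = None, -math.inf
    for token, prob, hidden in candidates:
        # Penalize candidates whose representation is close to any earlier token.
        penalty = max(cosine_similarity(hidden, h) for h in context_hidden_states)
        score = (1 - alpha) * prob - alpha * penalty
        if score > best_score:
            best_token, best_score = token, score
    return best_token, best_score


# Toy example: "the" is most likely but its hidden state duplicates the context.
context = [[1.0, 0.0]]
top_k = [
    ("the", 0.5, [1.0, 0.0]),
    ("cat", 0.3, [0.0, 1.0]),
    ("a", 0.2, [1.0, 1.0]),
]
chosen, score = contrastive_search_step(top_k, context)
greedy, _ = contrastive_search_step(top_k, context, alpha=0.0)
print(chosen, greedy)
```

In this toy case the penalized score prefers "cat" over the more probable "the", because the hidden state of "the" is identical to one already in the context. In the library itself, contrastive search is selected by passing `penalty_alpha` and `top_k` to `generate`.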
Safety and security
We continue to explore the new Pickle-free serialization format provided by the safetensors library, this time by adding support for TensorFlow models. More checkpoints have been converted to this format. Support is still experimental.
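The motivation for avoiding Pickle is that unpickling a file can execute arbitrary code, whereas a format made of a JSON header followed by raw tensor bytes only needs to be parsed and copied. The sketch below imitates that general layout (an 8-byte little-endian header length, a JSON header with dtype, shape and byte offsets, then the data) using only the standard library. The function names and the toy weights are hypothetical, only float32 is handled, and the output is not guaranteed to interoperate with files written by the safetensors library.

```python
import json
import os
import struct
import tempfile


def write_tensors(path, tensors):
    """Write {name: (shape, flat_values)} as a JSON header plus raw float32 bytes."""
    header = {}
    buffer = bytearray()
    for name, (shape, values) in tensors.items():
        start = len(buffer)
        buffer += struct.pack(f"<{len(values)}f", *values)  # little-endian float32
        header[name] = {
            "dtype": "F32",
            "shape": list(shape),
            "data_offsets": [start, len(buffer)],  # relative to the data section
        }
    header_bytes = json.dumps(header).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(header_bytes)))  # 8-byte header length
        f.write(header_bytes)
        f.write(buffer)


def read_tensors(path):
    """Read the file back; nothing is executed, only JSON parsing and byte slicing."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
        data = f.read()
    tensors = {}
    for name, info in header.items():
        start, end = info["data_offsets"]
        count = (end - start) // 4  # 4 bytes per float32
        values = list(struct.unpack(f"<{count}f", data[start:end]))
        tensors[name] = (tuple(info["shape"]), values)
    return tensors


# Hypothetical toy weights; the values are exactly representable in float32.
weights = {
    "dense.bias": ((2,), [0.5, -3.25]),
    "dense.weight": ((2, 2), [1.0, 2.0, 3.0, 4.0]),
}
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "toy_weights.bin")
    write_tensors(path, weights)
    loaded = read_tensors(path)
print(loaded["dense.weight"])
```

For real checkpoints, use the safetensors library itself rather than a hand-rolled reader like this one.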
🚨 Breaking changes
The following changes are bugfixes that we have chosen to make even though they change the resulting behavior. We mark them as breaking changes, so if you are using this part of the codebase, we recommend you take a look at the PRs to understand exactly what was changed.
- 🚨🚨🚨 TF: Remove TFWrappedEmbeddings (breaking: TF embedding initialization updated for encoder-decoder models) by @gante in #19263
- 🚨🚨🚨 [Breaking change] Deformable DETR intermediate representations by @Narsil in #19678
Bugfixes and improvements
- Enabling custom TF signature draft by @dimitreOliveira in #19249
- Fix whisper for pipeline by @ArthurZucker in #19482
- Extend nested_XXX functions to mappings/dicts. by @Guillem96 in #19455
- Syntax issues (lines 126, 203) by @kant in #19444
- CLI: add import protection to datasets by @gante in #19470
- Fix TFGroupViT CI by @ydshieh in #19461
- Fix doctests for DeiT and TFGroupViT by @ydshieh in #19466
- Update WhisperModelIntegrationTests.test_large_batched_generation by @ydshieh in #19472
- [Swin] Replace hard-coded batch size to enable dynamic ONNX export by @lewtun in #19475
- TF: TFBart embedding initialization by @gante in #19460
- Make LayoutLM tokenizers independent from BertTokenizer by @arnaudstiegler in #19351
- Make XLMRoberta model and config independent from Roberta by @asofiaoliveira in #19359
- Fix get_embedding dtype at init. time by @ydshieh in #19473
- Decouples XLMProphet model from Prophet by @srhrshr in #19406
- Implement multiple span support for DocumentQuestionAnswering by @ankrgyl in #19204
- Add warning in generate & device_map=auto & half precision models by @younesbelkada in #19468
- Update TF whisper doc tests by @amyeroberts in #19484
- Make bert_japanese and cpm independent of their inherited modules by @Davidy22 in #19431
- Added tokenize keyword arguments to feature extraction pipeline by @quancore in #19382
- Adding the README_es.md and reference to it in the others files readme by @Oussamaosman02 in #19427
- [CvT] Tensorflow implementation by @mathieujouffroy in #18597
- python3 instead of python in push CI setup job by @ydshieh in #19492
- Update PT to TF CLI for audio models by @amyeroberts in #19465
- New by @IMvision12 in #19481
- Fix OPTForQuestionAnswering doctest by @ydshieh in #19479
- Use a dynamic configuration for circleCI tests by @sgugger in #19325
- Add multi-node conditions in trainer_qa.py and trainer_seq2seq.py by @regisss in #19502
- update doc for perf_train_cpu_many by @sywangyi in #19506
- Avoid Push CI failing to report due to many commits being merged by @ydshieh in #19496
- [Doctest] Add configuration_bert.py to doctest by @ydshieh in #19485
- Fix whisper doc by @ArthurZucker in #19518
- Syntax issue (line 497, 526) Documentation by @kant in #19442
- Fix pytorch seq2seq qa by @FilipposVentirozos in #19258
- Add depth estimation pipeline by @nandwalritik in #18618
- Adding links to pipelines parameters documentation by @AndreaSottana in #19227
- fix MarkupLMProcessor option flag by @davanstrien in #19526
- [Doctest] Bart configuration update by @imarekkus in #19524
- Remove roberta dependency from longformer fast tokenizer by @sirmammingtonham in #19501
- made tokenization_roformer independent of bert by @naveennamani in #19426
- Remove bert fast dependency from electra by @Threepointone4 in #19520
- [Examples] Fix typos in run speech recognition seq2seq by @sanchit-gandhi in #19514
- [X-CLIP] Fix doc tests by @NielsRogge in #19523
- Update Marian config default vocabulary size by @gante in #19464
- Make MobileBert tokenizers independent from Bert by @501Good in #19531
- [Whisper] Fix gradient checkpointing by @sanchit-gandhi in #19538
- Syntax issues (paragraphs 122, 130, 147, 155) Documentation: @sgugger by @kant in #19437
- using trunc_normal for weight init & cls_token by @mathieujouffroy in #19486
- Remove MarkupLMForMaskedLM from MODEL_WITH_LM_HEAD_MAPPING_NAMES by @ydshieh in #19534
- Image transforms library by @amyeroberts in #18520
- Add a decorator for flaky tests by @sgugger in #19498
- [Doctest] Add configuration_yolos.py by @daspartho in #19539
- Albert config update by @imarekkus in #19541
- [Doctest] Add configuration_whisper.py by @daspartho in #19540
- Throw an error if getattribute_from_module can't find anything by @ydshieh in #19535
- [Doctest] Beit Config for doctest by @daspartho in #19542
- Create the arange tensor on device for enabling CUDA-Graph for Clip Encoder by @RezaYazdaniAminabadi in #19503
- [Doctest] GPT2 Config for doctest by @daspartho in #19549
- Build Push CI images also in a daily basis by @ydshieh in #19532
- Fix checkpoint used in MarkupLMConfig by @ydshieh in #19547
- add a note to whisper docs clarifying support of long-form decoding by @akashmjn in #19497
- [Whisper] Freeze params of encoder by @sanchit-gandhi in #19527
- [Doctest] Fixing the Doctest for imageGPT config by @RamitPahwa in #19556
- [Doctest] Fixing mobile bert configuration doctest by @RamitPahwa in #19557
- [Doctest] Fixing doctest bert_generation configuration by @Threepointone4 in #19558
- [Doctest] DeiT Config for doctest by @daspartho in #19560
- [Doctest] Reformer Config for doctest by @daspartho in #19562
- [Doctest] RoBERTa Config for doctest by @daspartho in #19563
- [Doctest] Add configuration_vit.py by @daspartho in #19561
- [Doctest] bloom config update by @imarekkus in #19566
- [Re-submit] Compute true loss Flax examples by @duongna21 in #19504
- Fix fairseq wav2vec2-xls-r pretrained weights conversion scripts by @heatz123 in #19508
- [Doctest] CTRL config by @imarekkus in #19574
- [Doctest] Add configuration_canine.py by @IzicTemi in #19575
- [Doctests] Config files for ViTMAE and YOSO by @grgkaran03 in #19567
- Added type hints to DebertaV2ForMultipleChoice Pytorch by @IMvision12 in #19536
- [WIP] Add type hints for Lxmert (TF) by @elusenji in #19441
- [Doctests] add configuration_blenderbot.py by @grgkaran03 in #19577
- [Doctest] adds trajectory_transformer config to Docs test by @SD-13 in #19586
- [Doctests] add configuration_blenderbot_small.py by @grgkaran03 in #19589
- [Doctest] Swin V2 Config for doctest by @daspartho in #19595
- [Doctest] Swin Config for doctest by @daspartho in #19594
- [Doctest] SEW Config for doctest by @daspartho in #19597
- [Doctest] UniSpeech Config for doctest by @daspartho in #19596
- [Doctest] SEW-D Config for doctest by @daspartho in #19598
- [Doctest] fix doc test for megatron bert by @RamitPahwa in #19600
- Adding type hints for TFXLnet by @thliang01 in #19344
- [Doctest] Add configuration_bigbird_pegasus.py and configuration_big_bird.py by @Xabilahu in #19606
- Cast masks to np.unit8 before converting to PIL.Image.Image by @amyeroberts in #19616
- [Whisper] Don't return attention mask in feat extractor by @sanchit-gandhi in #19521
- [Time Series Transformer] Add doc tests by @NielsRogge in #19607
- fix BLOOM ONNX config by @NouamaneTazi in #19573
- Fix test_tf_encode_plus_sent_to_model for TAPAS by @ydshieh in #19559
- Allow usage of TF Text BertTokenizer on TFBertTokenizer to make it servable on TF Serving by @piEsposito in #19590
- add gloo backend support for CPU DDP by @sywangyi in #19555
- Fix ImageToTextPipelineTests.test_small_model_tf by @ydshieh in #19565
- Fix FlaubertTokenizer by @ydshieh in #19552
- Visual Bert config for doctest by @ztjhz in #19605
- GPTTokenizer dependency removed from deberta class by @RamitPahwa in #19551
- xlm roberta config for doctest by @ztjhz in #19609
- Ernie config for doctest by @ztjhz in #19611
- xlm roberta xl config for doctest by @ztjhz in #19610
- fix: small error by @0xflotus in #19612
- Improve error messaging for ASR pipeline. by @Narsil in #19570
- [Doctest] LeViT Config for doctest by @daspartho in #19622
- [Doctest] DistilBERT Config for doctest by @daspartho in #19621
- [Whisper] Fix gradient checkpointing (again!) by @sanchit-gandhi in #19548
- [Doctest] Add configuration_resnet.py by @daspartho in #19620
- Fix whisper doc by @ArthurZucker in #19608
- Sharding fails in TF when absolute scope was modified if . in layer name by @ArthurZucker in #19124
- [Doctest] Add configuration_vision_text_dual_encoder.py by @SD-13 in #19580
- [Doctest] Add configuration_vision_encoder_decoder.py by @SD-13 in #19583
- [Doctest] Add configuration_time_series_transformer.py by @SD-13 in #19582
- Tokenizer from_pretrained should not use local files named like tokenizer files by @sgugger in #19626
- [Doctest] CodeGen config for doctest by @AymenBer99 in #19633
- [Doctest] Add configuration_data2vec_text.py by @daspartho in #19636
- [Doctest] Conditional DETR config for doctest by @AymenBer99 in #19641
- [Doctest] XLNet config for doctest by @AymenBer99 in #19649
- [Doctest] Add configuration_trocr.py by @thliang01 in #19658
- Add doctest info in testingmdx by @ArthurZucker in #19623
- Add pillow to layoutlmv3 example requirements.txt by @Spacefish in #19663
- add return types for tf gptj, xlm, and xlnet by @sirmammingtonham in #19638
- Fix pipeline predict transform methods by @s-udhaya in #19657
- Type hints MCTCT by @rchan26 in #19618
- added type hints for Yolos Pytorch model by @WhiteWolf47 in #19545
- A few CI fixes for DocumentQuestionAnsweringPipeline by @ankrgyl in #19584
- Removed Bert interdependency from Funnel transformer by @mukesh663 in #19655
- fix warnings in deberta by @sanderland in #19458
- word replacement line #231 by @shreem-123 in #19662
- [Doctest] Add configuration_transfo_xl.py by @thliang01 in #19651
- Update perf_train_gpu_one.mdx by @cakiki in #19676
- object-detection instead of object_detection by @Spacefish in #19677
- add return_tensor parameter for feature extraction by @ajsanjoaquin in #19257
- Fix code examples of DETR and YOLOS by @NielsRogge in #19669
- Revert "add return_tensor parameter for feature extraction by @sgugger in #19257)"
- Fixed the docstring and type hint for forced_decoder_ids option in Ge… by @koreyou in #19640
- Add normalize to image transforms module by @amyeroberts in #19544
- [Doctest] Data2VecAudio Config for doctest by @daspartho in #19635
- Update ESM checkpoints to point to facebook/ by @Rocketknight1 in #19675
- Removed XLMModel inheritance from FlaubertModel(torch+tf) by @D3xter1922 in #19432
- [Examples] make default preprocessing_num_workers=1 by @Yang-YiFan in #19684
- [Doctest] Add configuration_convbert.py by @AymenBer99 in #19643
- [Doctest] Add configuration_realm.py by @ak04p in #19646
- Update CONTRIBUTING.md by @shreem-123 in #19689
- [Doctest] Add configuration_data2vec_vision.py by @daspartho in #19637
- Fix some CI torch device issues for PyTorch 1.13 by @ydshieh in #19681
- Fix checkpoint used in VisualBertConfig doc example by @ydshieh in #19692
- Fix dtype in radnomly initialized head by @sgugger in #19690
- fix tests by @ArthurZucker in #19670
- fix test whisper with new max length by @ArthurZucker in #19668
- check decoder_inputs_embeds is None before shifting labels by @ArthurZucker in #19671
- Fix docs by @NielsRogge in #19687
- update documentation by @ArthurZucker in #19706
- Improve DETR models by @NielsRogge in #19644
- Small fixes for TF-ESM1b and ESM-1b weight conversions by @Rocketknight1 in #19683
- Fix typo in perf docs by @cakiki in #19705
- Fix redundant normalization of OWL-ViT text embeddings by @alaradirik in #19712
- Allow user-managed Pool in Wav2Vec2ProcessorWithLM.batch_decode by @falcaopetri in #18351
- [Doctest] CVT config for doctest by @AymenBer99 in #19695
- [Doctest] Add configuration_wav2vec2.py to documentation_tests.py by @juancopi81 in #19698
- ]Fixed pegasus config doctest by @mukesh663 in #19722
- fix seq2seqtrainer predict without labels by @IvanSedykh in #19721
- add return_tensors parameter for feature_extraction 2 by @Narsil in #19707
- Improving image-segmentation pipeline tests. by @Narsil in #19710
- [Doctest] Adding config files for convnext by @soma2000-lang in #19717
- [Doctest] Fixing doctest configuration_pegasus_x.py by @mukesh663 in #19725
- Specify TF framework in TF-related pipeline tests by @ydshieh in #19719
- Add docs by @NielsRogge in #19729
- Fix activations being all the same module by @sgugger in #19728
- add accelerate support for Whisper by @younesbelkada in #19697
- Clean up deprecation warnings by @Davidy22 in #19654
- Repo utils test by @sgugger in #19696
- Add decorator to flaky test by @amyeroberts in #19674
- [Doctest] Add doctest for FlavaConfig and FNetConfig by @ndrohith09 in #19724
- Update contribution guide by @stevhliu in #19700
- [Doctest] Add wav2vec2_conformer for doctest by @juancopi81 in #19734
- [Doctest] XLM Config for doctest by @AymenBer99 in #19685
- [Doctest] Add configuration_clip.py by @daspartho in #19647
- [Doctest] GPTNeoConfig, GPTNeoXConfig, GPTNeoXJapaneseConfig by @ndrohith09 in #19741
- Update modeling_markuplm.py by @IMvision12 in #19723
- Fix issue #19300 by @raghavanone in #19483
- [Doctest] Add configuration_wavlm.py by @juancopi81 in #19749
- Specify TF framework explicitly in more pipeline tests by @ydshieh in #19748
- Fix cache version file creation by @sgugger in #19750
- Image transforms add center crop by @amyeroberts in #19718
- [Doctest] Add configuration_decision_transformer.py by @Xabilahu in #19751
- [Doctest] Add configuration_detr.py by @Xabilahu in #19752
- Fixed spacing errors by @shreya24ag in #19754
- All broken links were fixed in contributing file by @mdfaizanahmed786 in #19760
- [Doctest] SpeechToTextTransformer Config for doctest by @daspartho in #19757
- [Doctest] SqueezeBERT Config for doctest by @daspartho in #19758
- [Doctest] SpeechToTextTransformer2 Config for doctest by @daspartho in #19756
- [Doctest] OpenAIGPTConfig and OPTConfig by @ndrohith09 in #19763
- image-segmentation pipeline: re-enable small_model_pt test. by @Narsil in #19716
- Update modeling_layoutlmv3.py by @IMvision12 in #19753
- adding key pair dataset by @rohit1998 in #19765
- Fix exception thrown using MishActivation by @chinoll in #19739
- [FLAX] Add dtype to embedding for gpt2 model by @merrymercy in #18462
- TF: sample generation compatible with XLA and dynamic batch sizes by @gante in #19773
- Install tf2onnx dev version by @ydshieh in #19755
- Fix docker image build by @ydshieh in #19759
- PT <-> TF for composite models by @ydshieh in #19732
- Add warning about restarting runtime to import errors by @Rocketknight1 in #19774
- Added support for multivariate independent emission heads by @kashif in #19453
- Update ImageToTextPipelineTests.test_small_model_tf by @ydshieh in #19785
- Make public versions of private tensor utils by @sgugger in #19775
- Update training.mdx by @ftorres16 in #19791
- [ custom_models.mdx ] - Translated to Portuguese the custom models tutorial. by @davialvb in #19779
- Add sentencepiece to BertJapaneseTokenizer by @conan1024hao in #19769
- Fix CTRL test_torchscrip_xxx CI by updating _create_and_check_torchscript by @ydshieh in #19786
- Fix nightly test setup by @sgugger in #19792
- Fix image segmentation pipeline errors, resolve backward compatibility issues by @alaradirik in #19768
- Fix error/typo in docstring of TokenClassificationPipeline by @pchr8 in #19798
- Use None to detect if truncation was unset by @sgugger in #19794
- Generate: contrastive search test updates by @gante in #19787
- Run some TF Whisper tests in subprocesses to avoid GPU OOM by @ydshieh in #19772
- Added translation of run_scripts.mdx to Portuguese Issue #16824 by @davialvb in #19800
- Generate: minor docstring fix by @gante in #19801
- [Doctest] MaskFormerConfig doctest by @sha016 in #19817
- [Doctest] Add configuration_plbart.py by @ayaka14732 in #19809
- [Doctest] Add configuration_poolformer.py by @ayaka14732 in #19808
- [Doctest] Add configuration_electra.py by @ayaka14732 in #19807
- [Doctest] Add configuration_nezha.py by @ayaka14732 in #19810
- Display the number of trainable parameters when lauching a training by @regisss in #19835
- replace reference to Datasets in metrics deprecation with Evaluate by @angus-lherrou in #19812
- Fix OOM in Config doctest by @ydshieh in #19840
- fix broken links in testing.mdx by @XFFXFF in #19820
- fix image2test args forwarding by @kventinel in #19648
- Added translation of converting_tensorflow_models.mdx to Portuguese Issue #16824 by @davialvb in #19824
- Fix nightly CircleCI by @ydshieh in #19837
- fixed typo in fp16 training section for perf_train_gpu_one by @dsingal0 in #19736
- Update LEDModelIntegrationTests expected values by @ydshieh in #19841
- Improve check copies by @kventinel in #19829
- Fix doctest for MarkupLM by @ydshieh in #19845
- add small updates only by @stevhliu in #19847
- Refactor conversion function by @sgugger in #19799
- Spanish translation of multiple_choice.mdx, question_answering.mdx. by @alceballosa in #19821
- Fix doctest for GenerationMixin.contrastive_search by @ydshieh in #19863
- Add missing lang tokens in M2M100Tokenizer.get_vocab by @guillaumekln in #18416
- Added translation of serialization.mdx to Portuguese Issue #16824 by @davialvb in #19869
- Generate: contrastive search cosmetic tweaks by @gante in #19871
- [Past CI] Vilt only supports PT >= v1.10 by @LysandreJik in #19851
- Fix incorrect model<->tokenizer mapping in tokenization testing by @ydshieh in #19872
- Update doc for revision and token by @sgugger in #19793
- Factored out some code in the image-segmentation pipeline. by @Narsil in #19727
- [DOCTEST] Config doctest for MCTCT, MBart and LayoutLM by @Revanth2002 in #19889
- Fix LR by @regisss in #19875
- Correct README image text by @KayleeDavisGitHub in #19883
- No conv bn folding in ipex to avoid warning by @sanderland in #19870
- Add missing information on token_type_ids for roberta model by @raghavanone in #19766
- Change the import of kenlm from github to pypi by @raghavanone in #19770
- Update max_diff in test_save_load_fast_init_to_base by @ydshieh in #19849
- Allow flax subfolder by @patrickvonplaten in #19902
- accelerate support for RoBERTa family by @younesbelkada in #19906
- Add checkpoint links in a few config classes by @ydshieh in #19910
- Generate: contrastive search uses existing abstractions and conventions by @gante in #19896
- Convert None logits processor/stopping criteria to empty list. by @ccmaymay in #19880
- Some fixes regarding auto mappings and test class names by @ydshieh in #19923
- Fix bug in Wav2Vec2's GPU tests by @falcaopetri in #19803
- Fix warning when collating list of numpy arrays by @sgugger in #19846
- Add type hints to TFPegasusModel by @EdAbati in #19858
- Remove embarrassing debug print() in save_pretrained by @Rocketknight1 in #19922
- Add accelerate support for M2M100 by @younesbelkada in #19912
- Add RoBERTa resources by @stevhliu in #19911
- Add T5 resources by @stevhliu in #19878
- Add BLOOM resources by @stevhliu in #19881
- Add GPT2 resources by @stevhliu in #19879
- Let inputs of fast tokenizers be tuples as well as lists by @sgugger in #19898
- Add accelerate support for BART-like models by @younesbelkada in #19927
- Create dummy models by @ydshieh in #19901
- Support segformer fx by @dwlim-nota in #19924
- Use self._trial to generate trial_name for Trainer. by @reyoung in #19874
- Add Onnx Config for ImageGPT by @RaghavPrabhakar66 in #19868
- Update Code of Conduct to Contributor Covenant v2.1 by @pankali in #19935
- add resources for bart by @stevhliu in #19928
- add resources for distilbert by @stevhliu in #19930
- Add wav2vec2 resources by @stevhliu in #19931
- [Conditional, Deformable DETR] Add postprocessing methods by @NielsRogge in #19709
- Fix ONNX tests for ONNX Runtime v1.13.1 by @lewtun in #19950
- donut -> donut-swin by @ydshieh in #19920
- [Doctest] Add configuration_deberta.py by @Saad135 in #19968
- gradient checkpointing for GPT-NeoX by @chiaolun in #19946
- [modelcard] Update for ASR by @sanchit-gandhi in #19985
- [ASR] Update 'tasks' for model card by @sanchit-gandhi in #19986
- Tranformers documentation translation to Italian #17459 by @draperkm in #19988
- Pin torch to < 1.13 temporarily by @ydshieh in #19989
- Add support for gradient checkpointing by @NielsRogge in #19990
Significant community contributions
The following contributors have made significant changes to the library over the last release:
- @arnaudstiegler
- Make LayoutLM tokenizers independent from BertTokenizer (#19351)
- @asofiaoliveira
- Make XLMRoberta model and config independent from Roberta (#19359)
- @srhrshr
- Decouples XLMProphet model from Prophet (#19406)
- @Davidy22
- @mathieujouffroy
- @IMvision12
- @501Good
- Make MobileBert tokenizers independent from Bert (#19531)
- @mukesh663
- @D3xter1922
- Removed XLMModel inheritance from FlaubertModel(torch+tf) (#19432)
- @falcaopetri
- @gmftbyGMFTBY
- Adding the state-of-the-art contrastive search decoding methods for the codebase of generation_utils.py (#19477)
- @davialvb
- [ custom_models.mdx ] - Translated to Portuguese the custom models tutorial. (#19779)
- Added translation of run_scripts.mdx to Portuguese Issue #16824 (#19800)
- Added translation of converting_tensorflow_models.mdx to Portuguese Issue #16824 (#19824)
- Added translation of serialization.mdx to Portuguese Issue #16824 (#19869)
- @alceballosa
- Spanish translation of multiple_choice.mdx, question_answering.mdx. (#19821)