v4.26.0: Generation configs, image processors, backbones and plenty of new models!


GenerationConfig

The generate method has multiple arguments whose default values previously lived in the model config. These have now been decoupled into a separate generation config, which makes it easier to store different sets of parameters for a given model, each with a different generation strategy. While generate arguments in the model configuration will keep being supported for the foreseeable future, it is now recommended to use a generation config, as in the sketch after the list below. You can learn more about its uses here and its documentation here.

  • Generate: use GenerationConfig as the basis for .generate() parametrization by @gante in #20388
  • Generate: TF uses GenerationConfig as the basis for .generate() parametrization by @gante in #20994
  • Generate: FLAX uses GenerationConfig as the basis for .generate() parametrization by @gante in #21007
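To illustrate, here is a minimal sketch of the new workflow; the gpt2 checkpoint and the specific sampling parameters are arbitrary examples, not part of these notes:

```python
# Minimal sketch: parametrize .generate() with a GenerationConfig.
from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # example checkpoint
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Store one set of generation parameters independently of the model config.
generation_config = GenerationConfig(
    max_new_tokens=50,
    do_sample=True,
    top_k=50,
    pad_token_id=tokenizer.eos_token_id,
)

inputs = tokenizer("Hello, my dog is", return_tensors="pt")
# Pass the config explicitly instead of relying on model-config defaults.
outputs = model.generate(**inputs, generation_config=generation_config)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Because the parameters live in their own object, you can keep several configs (e.g. one for greedy decoding, one for sampling) side by side for the same model.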

ImageProcessor

In the vision integration, all feature extractor classes have been deprecated and renamed to ImageProcessor classes. The old feature extractors will be fully removed in version 5 of Transformers, and new vision models will only implement the ImageProcessor class, so be sure to switch your code to this new name sooner rather than later!
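Migrating is typically a one-line change; a minimal sketch, with the google/vit-base-patch16-224 checkpoint used purely as an example:

```python
# Minimal sketch of the feature extractor -> image processor rename.
import requests
from PIL import Image

from transformers import AutoImageProcessor

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

# Before: AutoFeatureExtractor.from_pretrained(...)  (deprecated for vision)
# Now:
image_processor = AutoImageProcessor.from_pretrained("google/vit-base-patch16-224")
inputs = image_processor(images=image, return_tensors="pt")
print(inputs["pixel_values"].shape)
```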

New models

AltCLIP

AltCLIP is a variant of CLIP obtained by swapping the text encoder for a pretrained multilingual text encoder (XLM-RoBERTa). Its performance is very close to CLIP's on almost all tasks, and it extends the original CLIP's capabilities to multilingual understanding.

BLIP

BLIP is a model that is able to perform various multi-modal tasks including visual question answering, image-text retrieval (image-text matching) and image captioning.
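As an illustration, image captioning with BLIP might look like the following minimal sketch; the Salesforce/blip-image-captioning-base checkpoint name is an assumed example, not taken from these notes:

```python
# Hedged sketch of image captioning with BLIP; checkpoint name is assumed.
import requests
from PIL import Image

from transformers import BlipProcessor, BlipForConditionalGeneration

checkpoint = "Salesforce/blip-image-captioning-base"  # assumed example checkpoint
processor = BlipProcessor.from_pretrained(checkpoint)
model = BlipForConditionalGeneration.from_pretrained(checkpoint)

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

# Unconditional captioning: generate a caption from the image alone.
inputs = processor(images=image, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=20)
print(processor.decode(out[0], skip_special_tokens=True))
```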

BioGPT

BioGPT is a domain-specific generative pre-trained Transformer language model for biomedical text generation and mining. BioGPT follows the Transformer language model architecture and is pre-trained from scratch on 15M PubMed abstracts.

BiT

BiT is a simple recipe for scaling up pre-training of ResNet-like architectures (specifically, ResNetv2). The method results in significant improvements for transfer learning.

EfficientFormer

EfficientFormer proposes a dimension-consistent pure transformer that can be run on mobile devices for dense prediction tasks like image classification, object detection and semantic segmentation.

GIT

GIT is a decoder-only Transformer that leverages CLIP’s vision encoder to condition the model on vision inputs in addition to text. The model obtains state-of-the-art results on image captioning and visual question answering benchmarks.

GPT-sw3

GPT-Sw3 is a collection of large decoder-only pretrained transformer language models that were developed by AI Sweden in collaboration with RISE and the WASP WARA for Media and Language. GPT-Sw3 has been trained on a dataset containing 320B tokens in Swedish, Norwegian, Danish, Icelandic, English, and programming code. The model was pretrained using a causal language modeling (CLM) objective utilizing the NeMo Megatron GPT implementation.

Graphormer

Graphormer is a Graph Transformer model, modified to allow computations on graphs instead of text sequences by generating embeddings and features of interest during preprocessing and collation, then using a modified attention mechanism.

Mask2Former

Mask2Former is a unified framework for panoptic, instance and semantic segmentation and features significant performance and efficiency improvements over MaskFormer.

OneFormer

OneFormer is a universal image segmentation framework that can be trained on a single panoptic dataset to perform semantic, instance, and panoptic segmentation tasks. OneFormer uses a task token to condition the model on the task in focus, making the architecture task-guided for training, and task-dynamic for inference.
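A hedged sketch of the task-token mechanism in practice; the shi-labs/oneformer_ade20k_swin_tiny checkpoint is an assumed example:

```python
# Hedged sketch of OneFormer's task conditioning; checkpoint name is assumed.
import requests
from PIL import Image

from transformers import OneFormerProcessor, OneFormerForUniversalSegmentation

checkpoint = "shi-labs/oneformer_ade20k_swin_tiny"  # assumed example checkpoint
processor = OneFormerProcessor.from_pretrained(checkpoint)
model = OneFormerForUniversalSegmentation.from_pretrained(checkpoint)

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

# The same weights handle all three tasks; only the task token changes
# ("semantic", "instance", or "panoptic").
inputs = processor(images=image, task_inputs=["semantic"], return_tensors="pt")
outputs = model(**inputs)

semantic_map = processor.post_process_semantic_segmentation(
    outputs, target_sizes=[image.size[::-1]]
)[0]
print(semantic_map.shape)
```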

RoBERTa-PreLayerNorm

The RoBERTa-PreLayerNorm model is identical to RoBERTa but corresponds to using the --encoder-normalize-before flag in fairseq, i.e. layer normalization is applied before the attention and feed-forward blocks rather than after.

Swin2SR

Swin2SR improves the SwinIR model by incorporating Swin Transformer v2 layers, which mitigates issues such as training instability, resolution gaps between pre-training and fine-tuning, and hunger for data.

TimeSformer

TimeSformer is the first video transformer, and it inspired many subsequent transformer-based video understanding and classification papers.

UPerNet

UPerNet is a general framework to effectively segment a wide range of concepts from images, leveraging any vision backbone like ConvNeXt or Swin.

ViT Hybrid

ViT Hybrid is a slight variant of the plain Vision Transformer that leverages a convolutional backbone (specifically, BiT) whose features are used as the initial “tokens” for the Transformer. It was the first architecture to attain results similar to those of familiar convolutional architectures.

Backbones

Bending the one-model-per-file policy a little, we introduce backbones (mainly for vision models), which can then be re-used in more complex models like DETR, MaskFormer, Mask2Former, etc.
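For instance, a backbone can be loaded on its own through the AutoBackbone API. A minimal sketch, with the microsoft/resnet-50 checkpoint and the stage names used as examples:

```python
# Minimal sketch of the backbone API; checkpoint and stage names are examples.
import torch

from transformers import AutoBackbone

backbone = AutoBackbone.from_pretrained(
    "microsoft/resnet-50", out_features=["stage2", "stage4"]
)

pixel_values = torch.randn(1, 3, 224, 224)
outputs = backbone(pixel_values)

# One feature map per requested stage, ready to feed a detection or
# segmentation head.
for feature_map in outputs.feature_maps:
    print(feature_map.shape)
```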

Bugfixes and improvements

Significant community contributions

The following contributors have made significant changes to the library over the last release:
