Pegasus, mBART, DPR, self-documented outputs and new pipelines
Pegasus
The Pegasus model from PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization by Jingqing Zhang, Yao Zhao, Mohammad Saleh, Peter J. Liu, was added to the library in PyTorch.
Model implemented as a collaboration between Jingqing Zhang and @sshleifer in #6340
- PegasusForConditionalGeneration (torch version) #6340
- add pegasus finetuning script #6811 (warning: very slow)
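Below is a minimal summarization sketch, not an official recipe: it assumes the google/pegasus-xsum checkpoint and the prepare_seq2seq_batch tokenizer helper described later in these notes.

```python
from transformers import PegasusForConditionalGeneration, PegasusTokenizer

# Assumption: the google/pegasus-xsum checkpoint; other Pegasus checkpoints
# follow the same pattern.
model_name = "google/pegasus-xsum"
tokenizer = PegasusTokenizer.from_pretrained(model_name)
model = PegasusForConditionalGeneration.from_pretrained(model_name)

text = "PG&E scheduled the blackouts in response to forecasts for high winds."
batch = tokenizer.prepare_seq2seq_batch([text], return_tensors="pt")
summary_ids = model.generate(**batch)  # generation settings come from the model config
print(tokenizer.batch_decode(summary_ids, skip_special_tokens=True))
```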
DPR
The DPR model from Dense Passage Retrieval for Open-Domain Question Answering by Vladimir Karpukhin, Barlas Oğuz, Sewon Min, Patrick Lewis, Ledell Wu, Sergey Edunov, Danqi Chen, Wen-tau Yih was added to the library in PyTorch.
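As a rough sketch of the two-tower retrieval usage (the facebook/dpr-* checkpoint names and the dot-product scoring are assumptions based on the released models and the paper, not an official example):

```python
import torch
from transformers import (
    DPRContextEncoder,
    DPRContextEncoderTokenizer,
    DPRQuestionEncoder,
    DPRQuestionEncoderTokenizer,
)

q_name = "facebook/dpr-question_encoder-single-nq-base"  # assumed checkpoint names
ctx_name = "facebook/dpr-ctx_encoder-single-nq-base"
q_tokenizer = DPRQuestionEncoderTokenizer.from_pretrained(q_name)
q_encoder = DPRQuestionEncoder.from_pretrained(q_name)
ctx_tokenizer = DPRContextEncoderTokenizer.from_pretrained(ctx_name)
ctx_encoder = DPRContextEncoder.from_pretrained(ctx_name)

# Each encoder returns its pooled embedding as the first element of the output.
question_emb = q_encoder(**q_tokenizer("Who wrote Hamlet?", return_tensors="pt"))[0]
passage_emb = ctx_encoder(
    **ctx_tokenizer("Hamlet is a tragedy written by William Shakespeare.", return_tensors="pt")
)[0]
score = torch.matmul(question_emb, passage_emb.T)  # dot-product relevance score
```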
DeeBERT
The DeeBERT model from DeeBERT: Dynamic Early Exiting for Accelerating BERT Inference by Ji Xin, Raphael Tang, Jaejun Lee, Yaoliang Yu, Jimmy Lin, has been added to the examples/ folder alongside its training script, in PyTorch.
Self-documented outputs
In addition to returning tuples, PyTorch and TensorFlow models can now return an appropriate subclass of ModelOutput. A ModelOutput is a dataclass containing all model returns, which allows for easier inspection and for self-documenting model outputs.
- Change model outputs types to self-document outputs #5438 (@sgugger)
- Tf model outputs #6247 (@sgugger)
Models return tuples by default, and return self-documented outputs if the return_dict configuration flag is set to True or if the return_dict=True keyword argument is passed to the forward/call method.
Summary of the behavior:
```python
from transformers import BertForSequenceClassification

# The new outputs are opt-in: you have to activate them explicitly with `return_dict=True`,
# either at instantiation
model = BertForSequenceClassification.from_pretrained('bert-base-cased', return_dict=True)
# or when calling the model (`inputs` is a dict of tensors produced by a tokenizer)
outputs = model(**inputs, return_dict=True)

# You can access the elements of the outputs with
# (1) named attributes
loss = outputs.loss
logits = outputs.logits

# (2) their names as strings, like a dict
loss = outputs["loss"]
logits = outputs["logits"]

# (3) their index as integers or slices, as in the pre-3.1.0 output tuples
loss = outputs[0]
logits = outputs[1]
loss, logits = outputs[:2]

# One **breaking behavior** of these new outputs (and the reason you have to opt in
# to use them): iterating over the outputs now returns the names (keys) instead of
# the values:
print([element for element in outputs])
>>> ['loss', 'logits']

# Thus you cannot unpack outputs as in pre-3.1.0 (you would get the string names
# instead of the values), but you can still query a slice as shown in (3) above:
loss_key, logits_key = outputs
```
Encoder-Decoder framework
The encoder-decoder framework has been enhanced to allow more encoder-decoder model combinations, e.g. Bert2Bert, Bert2GPT2, Roberta2Roberta, Longformer2Roberta, and more; a short usage sketch follows the list of related PRs.
- [EncoderDecoder] Add encoder-decoder for roberta/ vanilla longformer #6411 (@patrickvonplaten)
- [EncoderDecoder] Add Cross Attention for GPT2 #6415 (@patrickvonplaten)
- [EncoderDecoder] Add functionality to tie encoder decoder weights #6538 (@patrickvonplaten)
- Multiple combinations of EncoderDecoder models have been fine-tuned and evaluated on CNN/Daily-Mail summarization: https://huggingface.co/models?search=cnn_dailymail-fp16 (@patrickvonplaten)
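For illustration, a minimal Bert2Bert warm-start sketch (the checkpoint choice here is arbitrary):

```python
from transformers import BertTokenizer, EncoderDecoderModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
# Initialize the encoder and the decoder from two pretrained checkpoints.
model = EncoderDecoderModel.from_encoder_decoder_pretrained(
    "bert-base-uncased", "bert-base-uncased"  # encoder, decoder
)

inputs = tokenizer("A long article to condense.", return_tensors="pt")
# During training the decoder consumes the target sequence; here the source ids
# are reused only to show the forward signature.
outputs = model(input_ids=inputs["input_ids"], decoder_input_ids=inputs["input_ids"])
```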
TensorFlow as a first-class citizen
As we continue working towards having TensorFlow be a first-class citizen, we continually improve on our TensorFlow API and models.
- [Almost all TF models] TF clean up: add missing CLM / MLM loss; fix T5 naming and keras compile #5395 (@patrickvonplaten)
- [Benchmark] Add benchmarks for TF Training #5594 (@patrickvonplaten)
Machine Translation
MarianMTModel
- en-zh and 357 other checkpoints for machine translation were added from the Helsinki-NLP group's Tatoeba Project (@sshleifer + @jorgtied). There are now more than 1,300 supported pairs for machine translation; a short translation sketch follows this list.
- Marian converter updates #6342 (@sshleifer)
- Marian distill scripts + integration test #6799 (@sshleifer)
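A short translation sketch (the Helsinki-NLP/opus-mt-en-zh checkpoint name follows the group's naming scheme; treat it as an assumption):

```python
from transformers import MarianMTModel, MarianTokenizer

model_name = "Helsinki-NLP/opus-mt-en-zh"  # assumed en->zh checkpoint name
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = MarianMTModel.from_pretrained(model_name)

batch = tokenizer.prepare_seq2seq_batch(["How are you today?"], return_tensors="pt")
translated_ids = model.generate(**batch)
print(tokenizer.batch_decode(translated_ids, skip_special_tokens=True))
```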
mBART
The mBART model from Multilingual Denoising Pre-training for Neural Machine Translation can now be accessed through MBartForConditionalGeneration; a usage sketch follows the list of related PRs below.
- Add mbart-large-cc25, support translation finetuning #5129 (@sshleifer)
- [mbart] prepare_translation_batch passes **kwargs to allow DeprecationWarning #5581 (@sshleifer)
- MBartForConditionalGeneration #6441 (@patil-suraj)
- [fix] mbart_en_ro_generate test now identical to fairseq #5731 (@sshleifer)
- [Doc] explaining romanian postprocessing for MBART BLEU hacking #5943 (@sshleifer)
- [test] partial coverage for train_mbart_enro_cc25.sh #5976 (@sshleifer)
- MbartTokenizer: do not hardcode vocab size #5998 (@sshleifer)
- MBART: support summarization tasks where max_src_len > max_tgt_len #6003 (@sshleifer)
- Fix #6096: MBartTokenizer's mask token #6098 (@sshleifer)
- [s2s] Document better mbart finetuning command #6229 (@sshleifer)
- mBART Conversion script #6230 (@sshleifer)
- [s2s] add BartTranslationDistiller for distilling mBART #6363 (@sshleifer)
- [Doc] add more MBart and other doc #6490 (@patil-suraj)
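A minimal English-to-Romanian sketch (it assumes the facebook/mbart-large-en-ro checkpoint, whose config sets the Romanian decoder start token):

```python
from transformers import MBartForConditionalGeneration, MBartTokenizer

model_name = "facebook/mbart-large-en-ro"
tokenizer = MBartTokenizer.from_pretrained(model_name)
model = MBartForConditionalGeneration.from_pretrained(model_name)

batch = tokenizer.prepare_seq2seq_batch(
    ["UN Chief Says There Is No Military Solution in Syria"], return_tensors="pt"
)
translated_ids = model.generate(**batch)  # the decoder start token comes from the config
print(tokenizer.batch_decode(translated_ids, skip_special_tokens=True))
```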
examples/seq2seq
- examples/seq2seq/finetune.py supports --task translation
- All sequence-to-sequence tokenizers (T5, Bart, Marian, Pegasus) expose a prepare_seq2seq_batch method that makes batches for sequence-to-sequence training; a short sketch follows this list.
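A hedged sketch of building a training batch with the shared method, using T5; it assumes the behavior at the time of this release, where target texts come back under the labels key and the model derives decoder inputs from them:

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

batch = tokenizer.prepare_seq2seq_batch(
    src_texts=["translate English to German: How are you?"],
    tgt_texts=["Wie geht es dir?"],
    return_tensors="pt",
)
outputs = model(
    input_ids=batch["input_ids"],
    attention_mask=batch["attention_mask"],
    labels=batch["labels"],  # decoder_input_ids are created from labels internally
)
loss = outputs[0]  # models still return tuples unless return_dict=True
```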
PRs:
- Seq2SeqDataset uses linecache to save memory #5792 (@Pradhy729)
- [examples/seq2seq]: add --label_smoothing option #5919 (@sshleifer)
- seq2seq/run_eval.py can take decoder_start_token_id #5949 (@sshleifer)
- [examples (seq2seq)] fix preparing decoder_input_ids for T5 #5994 (@patil-suraj)
- [s2s] add support for overriding config params #6149 (@stas00)
- s2s: fix LR logging, remove some dead code. #6205 (@sshleifer)
- [s2s] tiny QOL improvement: run_eval prints scores #6341 (@sshleifer)
- [s2s] fix label_smoothed_nll_loss #6344 (@patil-suraj)
- [s2s] fix --gpus clarg collision #6358 (@sshleifer)
- [s2s] Script to save wmt data to disk #6403 (@sshleifer)
- rename prepare_translation_batch -> prepare_seq2seq_batch #6103 (@sshleifer)
- Mult rouge by 100: standard units #6359 (@sshleifer)
- allow spaces in bash args with "$@" #6521 (@sshleifer)
- [seq2seq] MAX_LEN env var for MT commands #5837 (@sshleifer)
- [seq2seq] distillation.py accepts trainer arguments #5865 (@sshleifer)
- [s2s]Use prepare_translation_batch for Marian finetuning #6293 (@sshleifer)
- [BartTokenizer] add prepare s2s batch #6212 (@patil-suraj)
- [T5Tokenizer] add prepare_seq2seq_batch method #6122 (@patil-suraj)
- [s2s] round runtime in run_eval #6798 (@sshleifer)
- [s2s README] Add more dataset download instructions #6737 (@sshleifer)
- [s2s] round bleu, rouge to 4 digits #6704 (@sshleifer)
- [s2s] command line args for faster val steps #6833
New documentation
Several new documentation pages have been added and older documentation has been tweaked to be more accurate and understandable. An "Open in Colab" button has been added on the tutorial pages.
- Guide to fixed-length model perplexity evaluation #5449 (@joeddav)
- Improvements to PretrainedConfig documentation #5642 (@sgugger)
- Document model outputs #5673 (@sgugger)
- docs(wandb): explain how to use W&B integration #5607 (@borisdayma)
- Model utils doc #6005 (@sgugger)
- ONNX documentation #5992 (@mfuntowicz)
- Tokenizer documentation #6110 (@sgugger)
- Pipeline documentation #6175 (@sgugger)
- Encoder decoder config docs #6195 (@afcruzs)
- Colab button #6389 (@sgugger)
- Generation documentation #6470 (@sgugger)
- Add custom datasets tutorial #6466 (@joeddav)
- Logging documentation #6852 (@sgugger)
Trainer updates
New additions to the Trainer (a hyperparameter-search sketch follows the list):
- Added data collator for permutation (XLNet) language modeling and related calls #5522 (@shngt)
- Trainer support for iterabledataset #5834 (@Pradhy729)
- Adding PaddingDataCollator #6442 (@sgugger)
- Add hyperparameter search to Trainer #6576 (@sgugger)
- [examples] Add trainer support for question-answering #4829 (@patil-suraj)
- Adds comet_ml to the list of auto-experiment loggers #6176 (@dsblank)
- Dataset and DataCollator for BERT Next Sentence Prediction (NSP) task #6644 (@HuangLianzhe)
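A hedged, self-contained sketch of the new Trainer.hyperparameter_search entry point (#6576), assuming optuna (the default backend) is installed and using a toy dataset in place of a real one:

```python
import torch
from transformers import BertForSequenceClassification, Trainer, TrainingArguments

class ToyDataset(torch.utils.data.Dataset):
    """A tiny stand-in dataset so the sketch is self-contained."""
    def __len__(self):
        return 8
    def __getitem__(self, i):
        return {"input_ids": torch.tensor([101, 2023, 102]), "labels": torch.tensor(0)}

def model_init():
    # The model is rebuilt from scratch for every trial of the search.
    return BertForSequenceClassification.from_pretrained("bert-base-cased")

trainer = Trainer(
    args=TrainingArguments(output_dir="hpo_out"),
    model_init=model_init,
    train_dataset=ToyDataset(),
    eval_dataset=ToyDataset(),
)
best_run = trainer.hyperparameter_search(n_trials=10, direction="minimize")
print(best_run.hyperparameters)
```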
New models & model architectures
The following model architectures have been added to the library:
- FlaubertForTokenClassification #5644 (@stas00)
- TFXLMForTokenClassification #5614 (@LysandreJik)
- TFXLMForMultipleChoice #5614 (@LysandreJik)
- TFFlaubertForTokenClassification #5614 (@LysandreJik)
- TFFlaubertForMultipleChoice #5614 (@LysandreJik)
- TFElectraForSequenceClassification #6227 (@jplu)
- TFElectraForMultipleChoice #6227 (@jplu)
- TF Longformer #5764 (@patrickvonplaten)
- CamembertForCausalLM #6577 (@patil-suraj)
Regression testing on TPU & TPU CI
Thanks to @zcain117 we now have access to TPU CI for the PyTorch/xla framework. This enables regression testing on the TPU aspects of the Trainer, and offers very simple regression testing on model training performance.
- Test XLA examples #5583
- Add setup for TPU CI to run every hour. #6219 (@zcain117)
- Add missing docker arg for TPU CI. #6393 (@zcain117)
- Get GKE logs via kubectl logs instead of gcloud logging read. #6446 (@zcain117)
New pipelines
New pipelines have been added (a zero-shot usage sketch follows the list):
- Zero shot classification pipeline #5760 (@joeddav)
- Addition of a DialoguePipeline #5516 (@guillaume-be)
- Add targets arg to fill-mask pipeline #6239 (@joeddav)
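A quick zero-shot sketch; the default model for the task is downloaded automatically, and the output is a dict with labels and scores sorted by score:

```python
from transformers import pipeline

classifier = pipeline("zero-shot-classification")
result = classifier(
    "The new Transformers release adds Pegasus, mBART and DPR.",
    candidate_labels=["software", "sports", "politics"],
)
print(result["labels"][0], result["scores"][0])  # highest-scoring label and its score
```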
Community notebooks
- Fine-tune Electra and interpret with Integrated Gradients #6321 (@elsanns)
- Update ONNX notebook to include section on quantization. #6831 (@mfuntowicz)
Centralized logging
Logging is now centralized. The library offers methods to control the verbosity level of all loggers contained in the library; see the logging documentation for details, and the short sketch after the PR below:
- Centralize logging #6434 (@LysandreJik)
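A sketch of the new verbosity controls; the transformers.utils.logging import path matches the module added in #6434 at the time of this release, and should be treated as an assumption for other versions:

```python
from transformers.utils import logging

logging.set_verbosity_info()     # show INFO-level messages from all library loggers
level = logging.get_verbosity()  # the current level as an int
logging.set_verbosity_error()    # silence everything below ERROR
```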
Bug fixes and improvements
- [Reformer] Adapt Reformer MaskedLM Attn mask #5560 (@patrickvonplaten)
- Make T5 compatible with ONNX #5518 (@abelriboulot)
- [Bart] enable test_torchscript, update test_tie_weights #5457 (@sshleifer)
- [docs] fix model_doc links in model summary #5566 (@patil-suraj)
- [Benchmark] Readme for benchmark #5363 (@patrickvonplaten)
- Fix Inconsistent NER Grouping (Pipeline) #4987 (@enzoampil)
- QA pipeline BART compatible #5496 (@mfuntowicz)
- More explicit error when failing to tensorize overflowing tokens #5633 (@LysandreJik)
- Should check that torch TPU is available #5636 (@LysandreJik)
- Fixed TextGenerationPipeline on torch + GPU #5629 (@TevenLeScao)
- Fixed use of memories in XLNet (caching for language generation + warning when loading improper memoryless model) #5632 (@TevenLeScao)
- Pipeline model type check #5679 (@JetRunner)
- rename the functions to match the rest of the test convention #5692 (@stas00)
- [Longformer] fix longformer global attention output #5659 (@patrickvonplaten)
- [Fix] github actions CI by reverting #5138 #5686 (@sshleifer)
- [Reformer classification head] Implement the reformer model classification head for text classification #5198 (@as-stevens)
- Cleanup bart caching logic #5640 (@sshleifer)
- [AutoModels] Fix config params handling of all PT and TF AutoModels #5665 (@patrickvonplaten)
- [cleanup] T5 test, warnings #5761 (@sshleifer)
- [fix] T5 ONNX test: model.to(torch_device) #5769 (@mfuntowicz)
- [Benchmark] fix benchmark non standard model #5801 (@patrickvonplaten)
- [Benchmark] Fix models without architectures param in config #5808 (@patrickvonplaten)
- [Longformer] fix longformer slow-down #5811 (@patrickvonplaten)
- [seq2seq] pack_dataset.py rewrites dataset in max_tokens format #5819 (@sshleifer)
- [seq2seq] Don't copy self.source in sortishsampler #5818 (@sshleifer)
- [cleanups] make Marian save as Marian #5830 (@sshleifer)
- [Reformer] - Cache hidden states and buckets to speed up inference #5578 (@patrickvonplaten)
- Update tokenizers to 0.8.1.rc to fix Mac OS X issues #5867 (@sepal)
- Xlnet outputs #5883 (@TevenLeScao)
- [cleanup] squad processor #5868 (@sshleifer)
- [Fix] seq2seq pack_dataset.py actually packs #5913 (@sshleifer)
- [CI] self-scheduled runner tests examples/ #5927 (@sshleifer)
- [CI] Install examples/requirements.txt #5956 (@sshleifer)
- Expose padding_strategy on squad processor to fix QA pipeline performance regression #5932 (@mfuntowicz)
- [docs] Add integration test example to copy pasta template #5961 (@sshleifer)
- Cleanup Trainer and expose customization points #5982 (@sgugger)
- Avoid unnecessary warnings when loading pretrained model #5922 (@sgugger)
- Ensure OpenAI GPT position_ids is correctly initialized and registered at init. #5773 (@mfuntowicz)
- [CI] Don't test apex #6021 (@sshleifer)
- add a summary report flag for run_examples on CI #6035 (@stas00)
- don't complain about missing W&B when WANDB_DISABLED=true #6036 (@stas00)
- Allow to set Adam beta1, beta2 in TrainingArgs #5592 (@gonglinyuan)
- Fix the return documentation rendering for all model outputs #6022 (@sgugger)
- Fix typo (model saving TF) #5734 (@Colanim)
- Add new AutoModel classes in pipeline #6062 (@patil-suraj)
- [pack_dataset] don't sort before packing, only pack train #5954 (@sshleifer)
- CL util to convert models to fp16 before upload #5953 (@sshleifer)
- [fix] no warning for position_ids buffer #6063 (@sshleifer)
- Pipelines should use tuples instead of namedtuples #6061 (@LysandreJik)
- Moving transformers package import statements to relative imports in some files #5796 (@afcruzs)
- github issue template suggests who to tag #5790 (@sshleifer)
- [s2s] Delete useless method, log tokens_per_batch #6081 (@sshleifer)
- Logs should not be hidden behind a logger.info #6097 (@LysandreJik)
- Fix zero-shot pipeline single seq output shape #6104 (@joeddav)
- [fix] add bart to LM_MAPPING #6099 (@sshleifer)
- [Fix] position_ids tests again #6100 (@sshleifer)
- Fix deebert tests #6102 (@sshleifer)
- Added capability to quantize a model while exporting through ONNX. #6089 (@mfuntowicz)
- XLNet PLM Readme #6121 (@LysandreJik)
- Test TF Flaubert + Add {XLM, Flaubert}{TokenClassification, MultipleChoice} #5614
- Actually the extra_id are from 0-99 and not from 1-100 #5967 (@orena1)
- Fix FlauBERT GPU test #6142 (@LysandreJik)
- Enable ONNX/ONNXRuntime optimizations through converter script #6131 (@mfuntowicz)
- Add Pytorch Native AMP support in Trainer #6151 (@prajjwal1)
- Replace mecab-python3 with fugashi for Japanese tokenization #6086 (@polm)
- parse arguments from dict #4869 (@patil-suraj)
- Add script to convert BERT tf2.x checkpoint to PyTorch #5791 (@mar-muel)
- Empty assert hunt #6056 (@TevenLeScao)
- Adds train_batch_size, eval_batch_size, and n_gpu to to_sanitized_dict output for logging. #5331 (@jaymody)
- [DataCollatorForLanguageModeling] fix labels #6213 (@patil-suraj)
- Fix _shift_right function in TFT5PreTrainedModel #6214 (@maurice-g)
- Remove outdated BERT tips #6217 (@JetRunner)
- run_hans label fix #6221 (@VictorSanh)
- Make the order of additional special tokens deterministic #5704 (@gonglinyuan)
- test_tokenization_common.py: Remove redundant coverage #6224 (@sshleifer)
- [Reformer] fix reformer fp16 test #6237 (@patrickvonplaten)
- [Reformer] Make random seed generator available on random seed and not on model device #6244 (@patrickvonplaten)
- Update to match renamed attributes in fairseq master #5972 (@LilianBordeau)
- [WIP] lightning_base: support --lr_scheduler with multiple possibilities #6232 (@stas00)
- Trainer + wandb quality of life logging tweaks #6241 (@TevenLeScao)
- Add strip_accents to basic BertTokenizer. #6280 (@PhilipMay)
- Argument to set GPT2 inner dimension #6296 (@TevenLeScao)
- [Reformer] fix default generators for pytorch < 1.6 #6300 (@patrickvonplaten)
- Remove redundant line in run_pl_glue.py #6305 (@xujiaze13)
- [Fix] text-classification PL example #6027 (@bhashithe)
- fix the shuffle agrument usage and the default #6307 (@stas00)
- CI dependency wheel caching #6287 (@LysandreJik)
- Patch GPU failures #6281 (@LysandreJik)
- fix consistency CrossEntropyLoss in modeling_bart #6265 (@idoh)
- Add a script to check all models are tested and documented #6298 (@sgugger)
- [examples] consistently use --gpus, instead of --n_gpu #6315 (@stas00)
- Patch models #6326 (@LysandreJik)
- Ci GitHub caching #6382 (@LysandreJik)
- [EncoderDecoderModel] add a add_cross_attention boolean to config #6377 (@patrickvonplaten)
- Feed forward chunking #6024 (@Pradhy729)
- testing utils: capturing std streams context manager #6231 (@stas00)
- [Performance improvement] "Bad tokens ids" optimization #6064 (@guillaume-be)
- pl version: examples/requirements.txt is single source of truth #6309 (@stas00)
- [pl] restore lr logging behavior for glue, ner examples #6314 (@stas00)
- lr_schedulers: add get_polynomial_decay_schedule_with_warmup #6361 (@stas00)
- [examples] add pytest dependency #6425 (@sshleifer)
- [test] replace capsys with the more refined CaptureStderr/CaptureStdout #6422 (@stas00)
- Fixes to make life easier with the nlp library #6423 (@sgugger)
- Move prediction_loss_only to TrainingArguments #6426 (@sgugger)
- Fix docs and bad word tokens generation_utils.py #6387 (@ZhuBaohe)
- Test model outputs equivalence #6445 (@LysandreJik)
- add LongformerTokenizerFast in AutoTokenizer #6463 (@patil-suraj)
- add BartTokenizerFast in AutoTokenizer #6464 (@patil-suraj)
- Add POS tagging and Phrase chunking token classification examples #6457 (@vblagoje)
- Clean directory after script testing #6453 (@JetRunner)
- Use hash to clean the test dirs #6475 (@JetRunner)
- Sort unique_no_split_tokens to make it deterministic #6461 (@lhoestq)
- Support additional dictionaries for BERT Japanese tokenizers #6515 (@singletongue)
- Remove deprecated assertEquals #6532 (@JetRunner)
- [testing] a new TestCasePlus subclass + get_auto_remove_tmp_dir() #6494 (@stas00)
- [sched] polynomial_decay_schedule use default power=1.0 #6473 (@stas00)
- Fix flaky ONNX tests #6531 (@mfuntowicz)
- [doc] make the text more readable, fix some typos, add some disambiguation #6508 (@stas00)
- [doc] multiple corrections to "Summary of the tasks" #6509 (@stas00)
- Fixed label datatype for STS-B #6492 (@amodaresi)
- [docs] Fix wrong newline in the middle of a paragraph #6573 (@romainr)
- [docs] Fix number of 'ug' occurrences in tokenizer_summary #6574 (@romainr)
- add BartConfig.force_bos_token_to_be_generated #6526 (@sshleifer)
- Fix bart base test #6587 (@sshleifer)
- Feed forward chunking others #6365 (@Pradhy729)
- tf generation utils: remove unused kwargs #6591 (@sshleifer)
- [BartTokenizerFast] add prepare_seq2seq_batch #6543 (@patil-suraj)
- [docs] Copy code button misses '...' prefixed code #6518 (@romainr)
- removed redundant arg in prepare_inputs #6614 (@prajjwal1)
- add intro to nlp lib & dataset links to custom datasets tutorial #6583 (@joeddav)
- Add tests/test_tokenization_reformer.py #6485 (@D-Roberts)
- [Tests] fix attention masks in Tests #6621 (@patrickvonplaten)
- XLNet Bug when training with apex 16-bit precision #6567 (@johndolgov)
- Move threshold up for flaky test with Electra #6622 (@sgugger)
- Regression test for pegasus bugfix #6606 (@sshleifer)
- Trainer automatically drops unused columns in nlp datasets #6449 (@sgugger)
- [Docs model summaries] Add pegasus to docs #6640 (@patrickvonplaten)
- [Doc model summary] add MBart model summary #6649 (@patil-suraj)
- Specify config filename in HfArgumentParser #6626 (@jarednielsen)
- Don't reset the dataset type + plug for rm unused columns #6683 (@sgugger)
- Fixed DataCollatorForLanguageModeling not accepting lists of lists #6685 (@TevenLeScao)
- [doc] remove BartForConditionalGeneration.generate #6659 (@stas00)
- [fixdoc] Add import to pegasus usage doc #6698 (@sshleifer)
- Remove hard-coded uses of float32 to fix mixed precision use #6648 (@schmidek)
- Add typing.overload for convert_ids_tokens #6637 (@tamuhey)
- Allow tests in examples to use cuda or fp16, if they are available #5512 (@Joel-hanson)
- ci/gh/self-scheduled: add newline to make examples tests run even if src/ tests fail #6706 (@sshleifer)
- tensor.nonzero() is deprecated in PyTorch 1.6 #6715 (@mfuntowicz)
- [Albert] Add position ids to allowed uninitialized weights #6719 (@patrickvonplaten)
- Fix ONNX test_quantize unittest #6716 (@mfuntowicz)
- [squad] make examples and dataset accessible from SquadDataset object #6710 (@lazovich)
- Fix pegasus-xsum integration test #6726 (@sshleifer)
- T5Tokenizer adds EOS token if not already added #5866 (@sshleifer)
- [Torchscript] Fix docs #6740 (@patrickvonplaten)
- Add "tie_word_embeddings" config param #6692 (@patrickvonplaten)
- Fix tf boolean mask in graph mode #6741 (@JayYip)
- [TF Longformer] Improve Speed for TF Longformer #6447 (@patrickvonplaten)
- [s2s] run_eval.py QOL improvements and cleanup #6746 (@sshleifer)
- s2s distillation uses AutoModelForSeqToSeqLM #6761 (@sshleifer)
- Add AdaFactor optimizer from fairseq #6722 (@moscow25)
- Adds Adafactor to the docs and slightly fixes the formatting #6765 (@LysandreJik)
- Fix the TF Trainer gradient accumulation and the TF NER example #6713 (@jplu)
- Fix run_squad.py to work with BART #6756 (@tomgrek)
- Add NLP install to self-scheduled CI #6767 (@sshleifer)
- [testing] replace hardcoded paths to allow running tests from anywhere #6523 (@stas00)
- [test schedulers] adjust to test the first step's reading #6429 (@stas00)
- PL: --adafactor option #6776 (@sshleifer)
- [style] set the minimal required version for black #6784 (@stas00)
- Transformer-XL: Improved tokenization with sacremoses #6322 (@RafaelWO)
- prepare_seq2seq_batch makes labels/ decoder_input_ids made later. #6654 (@sshleifer)
- t5 model should make decoder_attention_mask #6800 (@sshleifer)
- [s2s] Test hub configs in self-scheduled CI #6809 (@sshleifer)
- [bart] rename self-attention -> attention #6708 (@sshleifer)
- Fixed open in colab link #6825 (@PandaWhoCodes)
- clarify shuffle #6312 (@xujiaze13)
- TF Flaubert w/ pre-norm #6841 (@LysandreJik)
- Only access loss tensor every logging_steps #6802 (@jysohn23)
- Add checkpointing to Ray Tune HPO #6747 (@krfricke)
- Fix marian slow test #6854 (@sshleifer)
- Bart can make decoder_input_ids from labels #6758 (@sshleifer)
- Restore PaddingStrategy.MAX_LENGTH on QAPipeline while no v2. #6875 (@mfuntowicz)
- [Generate] Facilitate PyTorch generate using ModelOutputs #6735 (@patrickvonplaten)