v4.41.0: Phi3, JetMoE, PaliGemma, VideoLlava, Falcon2 and FalconVLM

New models

Phi3

The Phi-3 model was proposed in Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone by Microsoft.

TLDR; Phi-3 introduces new RoPE scaling methods that appear to scale fairly well! Phi-3-mini is available in two context-length variants, 4K and 128K tokens. It is the first model in its class to support a context window of up to 128K tokens, with little impact on quality.
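As a rough illustration of the RoPE-scaling family, here is plain linear position interpolation in pure Python (not necessarily Phi-3's exact LongRoPE variant; the dimensions and scale factor are illustrative):

```python
def rope_frequencies(dim, base=10000.0):
    # Standard RoPE inverse frequencies, one per pair of hidden dimensions.
    return [base ** (-2.0 * i / dim) for i in range(dim // 2)]

def rope_angles(position, dim, scale=1.0, base=10000.0):
    # Linear position interpolation: dividing the position by `scale`
    # squeezes a longer sequence into the rotation range the model
    # saw during training.
    return [(position / scale) * f for f in rope_frequencies(dim, base)]

# With scale=32, position 131072 lands on exactly the same angles
# as position 4096 did without scaling.
a = rope_angles(4096, dim=8)
b = rope_angles(131072, dim=8, scale=32.0)
```

The trade-off is resolution: nearby positions become harder to distinguish as the scale grows, which is why the more refined scaling schemes exist.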


JetMoE

JetMoe-8B is an 8B Mixture-of-Experts (MoE) language model developed by Yikang Shen and MyShell. JetMoe project aims to provide a LLaMA2-level performance and efficient language model with a limited budget. To achieve this goal, JetMoe uses a sparsely activated architecture inspired by the ModuleFormer. Each JetMoe block consists of two MoE layers: Mixture of Attention Heads and Mixture of MLP Experts. Given the input tokens, it activates a subset of its experts to process them. This sparse activation schema enables JetMoe to achieve much better training throughput than similar size dense models. The training throughput of JetMoe-8B is around 100B tokens per day on a cluster of 96 H100 GPUs with a straightforward 3-way pipeline parallelism strategy.
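The sparse-activation idea behind those MoE layers can be sketched in a few lines of pure Python: a router scores the experts, keeps the top-k, renormalizes their gate weights, and only those experts run (toy scalar functions stand in for the experts; JetMoE's actual routing is more involved):

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def route(logits, k=2):
    # Keep the top-k experts by router score and renormalize their
    # gate weights so they sum to 1; only these experts will run.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    return list(zip(top, softmax([logits[i] for i in top])))

def moe_layer(x, experts, logits, k=2):
    # Sparse activation: a gate-weighted sum over the selected experts only.
    return sum(g * experts[i](x) for i, g in route(logits, k))

# Toy experts: simple scalar functions standing in for MLP experts.
experts = [lambda x: x + 1, lambda x: 2 * x, lambda x: x * x, lambda x: -x]
out = moe_layer(3.0, experts, logits=[0.1, 2.0, 1.0, -1.0], k=2)
```

Because only k of the n experts execute per token, compute grows with k while parameter count grows with n, which is the source of the training-throughput advantage described above.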


PaliGemma

PaliGemma is a lightweight open vision-language model (VLM) inspired by PaLI-3, and based on open components like the SigLIP vision model and the Gemma language model. PaliGemma takes both images and text as inputs and can answer questions about images with detail and context, meaning that PaliGemma can perform deeper analysis of images and provide useful insights, such as captioning for images and short videos, object detection, and reading text embedded within images.

More than 120 checkpoints have been released; see the collection here!


VideoLlava

Video-LLaVA exhibits remarkable interactive capabilities between images and videos, despite the absence of image-video pairs in the dataset.

💡 Simple baseline, learning a united visual representation by alignment before projection
By binding unified visual representations to the language feature space, we enable an LLM to perform visual reasoning on both images and videos simultaneously.
🔥 High performance, complementary learning with video and image
Extensive experiments demonstrate the complementarity of the modalities, showing significant superiority over models designed specifically for either images or videos.


Falcon 2 and FalconVLM


Two new models from TII-UAE! They published a blog post with more details. Falcon2 introduces parallel MLP, and Falcon VLM uses the LLaVA framework.
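The "parallel MLP" idea can be sketched with toy scalar stand-ins for the sublayers: attention and the MLP both read the same normalized input, and their outputs are summed into a single residual update (a sketch of the general parallel-block pattern, not Falcon 2's exact implementation):

```python
def parallel_block(x, norm, attn, mlp):
    # Parallel formulation: attention and the MLP share one normalized
    # input; their outputs are summed into a single residual update,
    # so the two sublayers can run concurrently.
    h = norm(x)
    return x + attn(h) + mlp(h)

def sequential_block(x, norm, attn, mlp):
    # Conventional sequential formulation, for comparison.
    h = x + attn(norm(x))
    return h + mlp(norm(h))

# Toy scalar stand-ins for the sublayers.
out = parallel_block(1.0, norm=lambda v: v, attn=lambda v: 2 * v, mlp=lambda v: 3 * v)
```

The parallel form removes the dependency of the MLP on the attention output within a block, which helps overlap work at inference time.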

GGUF from_pretrained support


You can now load most GGUF quants directly with transformers' from_pretrained, converting them to classic PyTorch models. The API is simple:

from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF"
filename = "tinyllama-1.1b-chat-v1.0.Q6_K.gguf"

tokenizer = AutoTokenizer.from_pretrained(model_id, gguf_file=filename)
model = AutoModelForCausalLM.from_pretrained(model_id, gguf_file=filename)

We plan closer integrations with the llama.cpp / GGML ecosystem in the future; see #27712 for more details.
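For context, the GGUF container itself is a simple binary format. Below is a sketch of parsing just its fixed header, following the published GGUF layout (4-byte magic "GGUF", little-endian uint32 version, then uint64 tensor and metadata key/value counts); this is not transformers' loader, only an illustration:

```python
import struct

GGUF_MAGIC = b"GGUF"

def read_gguf_header(blob):
    # Per the GGUF spec, a file starts with the 4-byte magic "GGUF",
    # a little-endian uint32 version, then uint64 tensor and
    # metadata key/value counts.
    magic, version, n_tensors, n_kv = struct.unpack_from("<4sIQQ", blob, 0)
    if magic != GGUF_MAGIC:
        raise ValueError("not a GGUF file")
    return {"version": version, "tensor_count": n_tensors, "kv_count": n_kv}

# Exercise the parser on a synthetic header (the counts are made up).
fake = struct.pack("<4sIQQ", GGUF_MAGIC, 3, 201, 24)
header = read_gguf_header(fake)
```

The metadata key/value section that follows the header is what lets transformers recover the tokenizer and config without a separate JSON file.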

Quantization

New quant methods

In this release we support two new quantization methods, HQQ and EETQ, contributed by the community. Read more about how to quantize any transformers model using HQQ and EETQ in the dedicated documentation section.

dequantize API for bitsandbytes models

If you want to dequantize models that have been loaded with bitsandbytes, this is now possible through the dequantize API (e.g. to merge adapter weights).

  • FEAT / Bitsandbytes: Add dequantize API for bitsandbytes quantized models by @younesbelkada in #30806

API-wise, you can achieve that with the following:

from transformers import AutoModelForCausalLM, BitsAndBytesConfig, AutoTokenizer

model_id = "facebook/opt-125m"

model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=BitsAndBytesConfig(load_in_4bit=True))
tokenizer = AutoTokenizer.from_pretrained(model_id)

model.dequantize()

text = tokenizer("Hello my name is", return_tensors="pt").to(0)

out = model.generate(**text)
print(tokenizer.decode(out[0]))
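Conceptually, k-bit quantization stores low-precision integer codes plus a per-block scale, and dequantization reconstructs dense floating-point weights from them. A minimal pure-Python sketch of symmetric 4-bit block quantization (illustrative only; bitsandbytes' actual NF4/FP4 schemes are more sophisticated):

```python
def quantize4(xs):
    # Symmetric 4-bit quantization of one block: one float scale plus
    # integer codes clamped to [-8, 7]. The `or 1.0` guards an all-zero block.
    scale = max(abs(x) for x in xs) / 7 or 1.0
    codes = [max(-8, min(7, round(x / scale))) for x in xs]
    return scale, codes

def dequantize4(scale, codes):
    # Dequantization reconstructs dense floats, which is what makes
    # the weights mergeable/editable again.
    return [c * scale for c in codes]

scale, codes = quantize4([0.7, -0.35, 0.1, 0.0])
restored = dequantize4(scale, codes)
```

The round trip is lossy (error up to half a quantization step), which is why merging adapters into a dequantized model is not bit-identical to merging into the original weights.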

Generation updates

SDPA support

🚨 Might be breaking

  • 🚨🚨🚨 Deprecate evaluation_strategy to eval_strategy 🚨🚨🚨 by @muellerzr in #30190
  • 🚨 Add training compatibility for Musicgen-like models by @ylacombe in #29802
  • 🚨 Update image_processing_vitmatte.py by @rb-synth in #30566

Cleanups

  • Remove task guides auto-update in favor of links towards task pages by @LysandreJik in #30429
  • Remove add-new-model in favor of add-new-model-like by @LysandreJik in #30424
  • Remove mentions of models in the READMEs and link to the documentation page in which they are featured. by @LysandreJik in #30420

Not breaking but important for Llama tokenizers

Fixes

New Contributors

  • @joaocmd made their first contribution in #23342
  • @kamilakesbi made their first contribution in #30121
  • @dtlzhuangz made their first contribution in #30262
  • @steven-basart made their first contribution in #30405
  • @manju-rangam made their first contribution in #30457
  • @kyo-takano made their first contribution in #30494
  • @mgoin made their first contribution in #30488
  • @eitanturok made their first contribution in #30509
  • @clinty made their first contribution in #30512
  • @warner-benjamin made their first contribution in #30442
  • @XavierSpycy made their first contribution in #30438
  • @DarshanDeshpande made their first contribution in #30558
  • @frasermince made their first contribution in #29721
  • @lucky-bai made their first contribution in #30358
  • @rb-synth made their first contribution in #30566
  • @lausannel made their first contribution in #30573
  • @jonghwanhyeon made their first contribution in #30597
  • @mobicham made their first contribution in #29637
  • @yting27 made their first contribution in #30362
  • @jiaqianjing made their first contribution in #30664
  • @claralp made their first contribution in #30505
  • @mimbres made their first contribution in #30653
  • @sorgfresser made their first contribution in #30687
  • @nurlanov-zh made their first contribution in #30485
  • @zafstojano made their first contribution in #30678
  • @davidgxue made their first contribution in #30602
  • @rootonchair made their first contribution in #30698
  • @eigen2017 made their first contribution in #30552
  • @Nilabhra made their first contribution in #30771
  • @a8nova made their first contribution in #26870
  • @pashminacameron made their first contribution in #30790
  • @retarfi made their first contribution in #30763
  • @yikangshen made their first contribution in #30005
  • @ankur0904 made their first contribution in #30804
  • @conditionedstimulus made their first contribution in #30823
  • @nxphi47 made their first contribution in #29850
  • @Aladoro made their first contribution in #30652
  • @hyenal made their first contribution in #30555
  • @darshana1406 made their first contribution in #30870

Full Changelog: v4.40.2...v4.41.0
