A new model is added to transformers: Embedding Gemma
It is added on top of the v4.56.0 release, and can be installed from the following tag: v4.56.0-Embedding-Gemma-preview
.
In order to install this version, please install with the following command:
pip install git+https://github.com/huggingface/transformers@v4.56.0-Embedding-Gemma-preview
If fixes are needed, they will be applied to this release; this installation may therefore be considered as stable and improving.
As the tag implies, this tag is a preview of the EmbeddingGemma model. This tag is a tagged version of the main
branch and does not follow semantic versioning. This model will be included in the next minor release: v4.57.0
.
Embedding-Gemma

Today, Google releases EmbeddingGemma, a state-of-the-art multilingual embedding model perfect for on-device use cases. Designed for speed and efficiency, the model features a compact size of 308M parameters and a 2K context window, unlocking new possibilities for mobile RAG pipelines, agents, and more. EmbeddingGemma is trained to support over 100 languages and is the highest-ranking text-only multilingual embedding model under 500M on the Massive Text Embedding Benchmark (MTEB) at the time of writing.
Usage example
EmbeddingGemma can be found on the Huggingface Hub. It is integrated in sentence-transformers
which depends on transformers
.
See below for sentence-transformers
examples using the model:
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("google/embeddinggemma-300m")
# Run inference with queries and documents
query = "Which planet is known as the Red Planet?"
documents = [
"Venus is often called Earth's twin because of its similar size and proximity.",
"Mars, known for its reddish appearance, is often referred to as the Red Planet.",
"Jupiter, the largest planet in our solar system, has a prominent red spot.",
"Saturn, famous for its rings, is sometimes mistaken for the Red Planet."
]
query_embeddings = model.encode_query(query)
document_embeddings = model.encode_document(documents)
print(query_embeddings.shape, document_embeddings.shape)
# (768,) (4, 768)
# Compute similarities to determine a ranking
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities)
# tensor([[0.3011, 0.6359, 0.4930, 0.4889]])
# Convert similarities to a ranking
ranking = similarities.argsort(descending=True)[0]
print(ranking)
# tensor([1, 2, 3, 0])