pypi huggingface-hub 0.28.0
[v0.28.0]: Third-party Inference Providers on the Hub & multiple quality of life improvements and bug fixes

2 days ago

⚡️Unified Inference Across Multiple Inference Providers


The InferenceClient now supports third-party providers, offering a unified interface to run inference across multiple services while leveraging models from the Hugging Face Hub. This update enables developers to:

  • 🌐 Switch providers seamlessly - Transition between inference providers with a single interface.
  • 🔗 Unified model IDs - Always reference Hugging Face Hub model IDs, even when using external providers.
  • 🔑 Simplified billing and access management - Use your Hugging Face token to route requests to third-party providers (billed through your HF account).

A list of supported third-party providers can be found here.

Example of text-to-image inference with Replicate:

>>> from huggingface_hub import InferenceClient

>>> replicate_client = InferenceClient(
...    provider="replicate",
...    api_key="my_replicate_api_key", # Using your personal Replicate key
... )
>>> image = replicate_client.text_to_image(
...    "A cyberpunk cat hacking neural networks",
...    model="black-forest-labs/FLUX.1-schnell"
... )
>>> image.save("cybercat.png")

Another example of chat completion with Together AI:

>>> from huggingface_hub import InferenceClient
>>> client = InferenceClient(
...     provider="together",  # Use Together AI provider
...     api_key="<together_api_key>",  # Pass your Together API key directly
... )
>>> client.chat_completion(
...     model="deepseek-ai/DeepSeek-R1",
...     messages=[{"role": "user", "content": "How many r's are there in strawberry?"}],
... )
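
The two examples above differ only in the provider name and the key passed to the client. As a rough sketch of what "a single interface" can look like in practice, the hypothetical helper below (ask_each_provider is not part of huggingface_hub) sends one prompt to the same Hub model ID through several providers. It assumes that every listed provider serves the model, and that the reply text lives at choices[0].message.content of the chat completion output. The client_factory parameter is an illustration-only hook so the function can be tried with a stub instead of real network calls.

```python
from typing import Callable, Dict, Iterable, Optional


def ask_each_provider(
    providers: Iterable[str],
    model: str,
    prompt: str,
    client_factory: Optional[Callable] = None,
    **client_kwargs,
) -> Dict[str, str]:
    """Send one prompt to the same Hub model ID through several providers.

    Hypothetical helper for illustration; not part of huggingface_hub.
    """
    if client_factory is None:
        # Imported lazily so the function can be exercised with a stub factory.
        from huggingface_hub import InferenceClient

        client_factory = InferenceClient

    messages = [{"role": "user", "content": prompt}]
    replies = {}
    for provider in providers:
        # Only the provider name changes; the model ID and the call stay the same.
        client = client_factory(provider=provider, **client_kwargs)
        completion = client.chat_completion(model=model, messages=messages)
        replies[provider] = completion.choices[0].message.content
    return replies
```

With the library installed and valid credentials, a call such as ask_each_provider(["together"], "deepseek-ai/DeepSeek-R1", "How many r's are there in strawberry?", api_key="<together_api_key>") would return a dict mapping each provider name to its reply.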

When using external providers, you can choose between two access modes: either use the provider's native API key, as shown in the examples above, or route calls through Hugging Face infrastructure (billed to your HF account):

>>> from huggingface_hub import InferenceClient
>>> client = InferenceClient(
...    provider="fal-ai",
...    token="hf_****"  # Your Hugging Face token
... )

⚠️ Parameter availability may vary between providers; check each provider's documentation.
🔜 New providers/models/tasks will be added iteratively in the future.
👉 You can find a list of supported tasks per provider and more details here.
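
To make the two access modes concrete, here is a minimal sketch that builds the InferenceClient keyword arguments for exactly one mode at a time. The names build_client_kwargs, make_client, provider_api_key and hf_token are hypothetical and exist only in this example; the provider, api_key and token arguments are the ones shown in the snippets above.

```python
from typing import Optional


def build_client_kwargs(
    provider: str,
    *,
    provider_api_key: Optional[str] = None,
    hf_token: Optional[str] = None,
) -> dict:
    """Return InferenceClient keyword arguments for exactly one access mode.

    Hypothetical helper for illustration; not part of huggingface_hub.
    """
    if (provider_api_key is None) == (hf_token is None):
        raise ValueError("Pass exactly one of provider_api_key or hf_token.")
    if provider_api_key is not None:
        # Mode 1: use the provider's own API key.
        return {"provider": provider, "api_key": provider_api_key}
    # Mode 2: route through Hugging Face, billed to the HF account.
    return {"provider": provider, "token": hf_token}


def make_client(provider: str, **credentials):
    """Create an InferenceClient from one of the two credential styles."""
    # Imported lazily so build_client_kwargs works without the library installed.
    from huggingface_hub import InferenceClient

    return InferenceClient(**build_client_kwargs(provider, **credentials))
```

For instance, make_client("fal-ai", hf_token="hf_****") mirrors the routed example above, while make_client("together", provider_api_key="<together_api_key>") mirrors the direct one.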

✨ HfApi

The following change aligns the client with server-side updates by adding two new repository properties: usedStorage and resourceGroup.

[HfApi] update list of repository properties following server side updates by @hanouticelina in #2728

Empty-commit prevention now also covers file copy operations, keeping version histories clean when a copy changes nothing.

[HfApi] prevent empty commits when copying files by @hanouticelina in #2730

🌐 📚 Documentation

Thanks to @WizKnight, the Hindi translation is much better!

Improved Hindi Translation in Documentation📝 by @WizKnight in #2697

💔 Breaking changes

The like endpoint has been removed to prevent misuse. You can still remove existing likes using the unlike endpoint.

[HfApi] remove like endpoint by @hanouticelina in #2739
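
For code that still needs to clean up old likes, a minimal sketch using HfApi.unlike might look like the following. The wrapper remove_like and its api parameter are hypothetical; the parameter only exists so the function can be tried with a stand-in object instead of a real Hub call. A real call is assumed to need a valid Hugging Face token, for example one saved by a previous login.

```python
from typing import Optional


def remove_like(repo_id: str, repo_type: Optional[str] = None, api=None) -> None:
    """Remove the current user's like from a repo on the Hub.

    Hypothetical wrapper for illustration; it only forwards to HfApi.unlike.
    """
    if api is None:
        # Imported lazily so the wrapper can be exercised with a stand-in object.
        from huggingface_hub import HfApi

        api = HfApi()
    # repo_type=None lets the library fall back to its default repo type.
    api.unlike(repo_id, repo_type=repo_type)
```

Calling remove_like("username/some-model") would then drop the like, where "username/some-model" is a placeholder repo ID.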

🛠️ Small fixes and maintenance

😌 QoL improvements

🐛 Bug and typo fixes

🏗️ internal
