This is the final minor release before v1.0.0. This release focuses on performance optimizations to HfFileSystem and adds a new get_organization_overview API endpoint.
We'll continue to release security patches as needed, but there will be no v0.37: the next release will be 1.0.0. We’re also deeply grateful to the entire Hugging Face community for the feedback, bug reports, and suggestions that have shaped this library.
Full Changelog: v0.35.0...v0.36.0
📁 HfFileSystem
Major optimizations have been implemented in HfFileSystem:
- Cache is kept when pickling an fs instance. This is particularly useful when streaming datasets in a distributed training environment: each worker no longer has to rebuild its cache.
- Listing files with .glob() has been greatly optimized:
from huggingface_hub import HfFileSystem
HfFileSystem().glob("datasets/HuggingFaceFW/fineweb-edu/data/*/*")
# Before: ~100 /tree calls (one per subdirectory)
# Now: 1 /tree call
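To illustrate why keeping the cache across pickling matters for distributed workers, here is a minimal sketch using a toy class. `ToyFileSystem`, its `dircache` attribute, and the `remote_calls` counter are hypothetical stand-ins written for this example; they are not the actual HfFileSystem implementation.

```python
import pickle


class ToyFileSystem:
    """Hypothetical stand-in for a filesystem with a directory-listing cache."""

    def __init__(self, listings):
        self._listings = listings  # simulated remote: path -> list of entries
        self.dircache = {}         # path -> cached listing
        self.remote_calls = 0      # counts simulated network requests

    def ls(self, path):
        # Only hit the simulated remote when the listing is not cached yet.
        if path not in self.dircache:
            self.remote_calls += 1
            self.dircache[path] = list(self._listings[path])
        return self.dircache[path]

    def __getstate__(self):
        # Ship the cache with the pickled instance; reset the call counter.
        return {"_listings": self._listings, "dircache": self.dircache}

    def __setstate__(self, state):
        self._listings = state["_listings"]
        self.dircache = dict(state["dircache"])
        self.remote_calls = 0


fs = ToyFileSystem({"data": ["a.parquet", "b.parquet"]})
fs.ls("data")  # first listing: one simulated remote call

# Simulate sending the instance to a worker process.
worker_fs = pickle.loads(pickle.dumps(fs))
worker_fs.ls("data")  # served from the cache that travelled with the pickle
print(worker_fs.remote_calls)
```

If `__getstate__` dropped `dircache`, every worker would repeat the listing request after unpickling, which is the cost the change above is described as avoiding.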
Minor updates:
- add block_size in init by @lhoestq in #3425
- hffs minor fix by @lhoestq in #3449
- HTTP backoff: Retry on ChunkedEncodingError by @lhoestq in #3437
🌍 HfApi
It is now possible to get high-level information about an organization, the same way it is already possible to do with users:
>>> from huggingface_hub import get_organization_overview
>>> get_organization_overview("huggingface")
Organization(
avatar_url='https://cdn-avatars.huggingface.co/v1/production/uploads/1583856921041-5dd96eb166059660ed1ee413.png',
name='huggingface',
fullname='Hugging Face',
details='The AI community building the future.',
is_verified=True,
is_following=True,
num_users=198,
num_models=164,
num_spaces=96,
num_datasets=1043,
num_followers=64814
)
- Add client support for the organization overview endpoint by @BastienGimbert in #3436
🛠️ Small fixes and maintenance
🐛 Bug and typo fixes
- Add quotes for better shell compatibility by @aopstudio in #3369
- update the sentence_similarity docstring by @tolgaakar in #3374
- Do not retry on 429 (only on 5xx) by @Wauplin in #3377
- Use git xet transfer to check if xet is enabled by @hanouticelina in #3381
- Replace pkgx install instruction with uv by @gary149 in #3420
- The error message as previously displayed... by @goldnode in #3405
- Use all tools unless explicit allowed_tools by @Mithil467 in #3397
- [type validation] skip unresolved forward ref by @zucchini-nlp in #3376
- document job stage possible values by @hanouticelina in #3431
- update token parameter docstring by @hanouticelina in #3447
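The "Do not retry on 429 (only on 5xx)" entry above describes a retry policy keyed on the HTTP status code. A minimal sketch of such a policy follows; `should_retry` and `backoff_delays` are hypothetical helpers written for illustration, with assumed parameter values, and are not the library's actual code. Which requests the change applies to is not stated in these notes.

```python
def should_retry(status_code: int) -> bool:
    """Retry only on server errors (5xx); 429 and other 4xx are not retried."""
    return 500 <= status_code < 600


def backoff_delays(max_retries: int = 5, base: float = 1.0, cap: float = 8.0):
    """Exponential backoff delays in seconds, capped at `cap` (assumed values)."""
    return [min(cap, base * 2**attempt) for attempt in range(max_retries)]


for code in (429, 500, 503, 404):
    print(code, should_retry(code))

print(backoff_delays())
```

One common reason to treat 429 differently is that a rate-limited client that keeps retrying adds load exactly when the server asks it to slow down, so failing fast and surfacing the error can be the safer default.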
🏗️ internal
- bump to 0.36.0.dev0 by @Wauplin (direct commit on main)
- [Workflow] security fix by @glegendre01 in #3383
- migrate tip blocks by @hanouticelina in #3392
- [Internal] Fix ty quality by @hanouticelina in #3441
- backward compatible cli tracking (v0.x) by @Wauplin in #3460
Community contributions
The following contributors have made changes to the library over the last release. Thank you!
- @aopstudio
  * Add quotes for better shell compatibility (#3369)
- @tolgaakar
  * update the sentence_similarity docstring (#3374) (#3375)
- @Mithil467
  * Use all tools unless explicit allowed_tools (#3397)
- @goldnode
  * The error message as previously displayed... (#3405)
- @BastienGimbert
  * Add client support for the organization overview endpoint (#3436)