github huggingface/huggingface_hub v1.18.0
[v1.18.0] Unified file copying, web URL support, and storage usage

5 hours ago

🖥️ Unified hf cp command

A single hf cp command now handles all file-copy workflows (upload a local file, download from the Hub, or copy between two remote locations) with consistent hf:// URI syntax for both repositories and buckets. It is also available as hf repos cp and hf buckets cp; all three aliases are identical, so you can use whichever reads best for your workflow. You can stream from stdin (-) or to stdout (-), and a trailing / on the source path gives you rsync-style semantics (copy the folder contents, not the folder itself). Note that remote-to-remote copies only work within the same storage region, and bucket-to-repo is not yet supported.

# Upload a local file to a repo
hf cp ./model.safetensors hf://username/my-model/model.safetensors

# Download a file to stdout
hf cp hf://username/my-model/config.json - | jq .

# Copy between two Hub repos
hf cp hf://username/source-model/config.json hf://username/dest-model/config.json

📚 Documentation: CLI guide — Copy files

  • [CLI] Add unified hf cp command (aliased as hf repos cp and hf buckets cp) by @Wauplin in #4295
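
To make the trailing-slash rule concrete, here is a minimal sketch of rsync-style destination planning. It is an illustration only and not the code behind hf cp: the helper name plan_copy and its arguments are hypothetical, and it works on plain relative paths rather than hf:// URIs.

```python
from pathlib import PurePosixPath


def plan_copy(source_dir: str, files: list[str], dest_dir: str) -> dict[str, str]:
    """Map each file (relative to source_dir) to its destination path.

    Hypothetical helper illustrating rsync-style semantics:
    - "data/" copies the *contents* of data into dest_dir
    - "data"  copies the folder itself, producing dest_dir/data/...
    """
    dest = PurePosixPath(dest_dir)
    if not source_dir.endswith("/"):
        # No trailing slash: keep the source folder name in the destination.
        dest = dest / PurePosixPath(source_dir).name
    return {f: str(dest / f) for f in files}


files = ["a.txt", "sub/b.txt"]
print(plan_copy("data/", files, "out"))  # contents land directly in out/
print(plan_copy("data", files, "out"))   # contents land in out/data/
```

With a trailing slash the files land directly under the destination; without it, the source folder name is preserved as an extra path component.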

🥚 Easter egg: explore your storage usage

  • [CLI] Easter egg: city skyline in hf repos ls by @Wauplin in #4287

🔗 Paste web URLs directly

parse_hf_uri now accepts Hugging Face web URLs so you can paste a link straight into the CLI or the library and it "just works".

# Copy-paste a URL from the website
hf cp https://huggingface.co/nvidia/LocateAnything-3B/blob/main/config.json - | jq '.architectures'

📚 Documentation: HF URIs — Web URLs

  • [URIs] Parse web URLs in parse_hf_uri + add HfUri.to_url by @Wauplin in #4296
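
As a rough illustration of what such a web URL carries, the sketch below splits a "blob" link into repo id, revision, and file path using only the standard library. It is not the parse_hf_uri implementation: the function name split_hub_blob_url is hypothetical, it assumes a model URL of the form /namespace/name/blob/revision/path, and it assumes the revision contains no slash.

```python
from urllib.parse import urlparse


def split_hub_blob_url(url: str) -> dict[str, str]:
    """Split a Hub 'blob' web URL into its parts (hypothetical helper).

    Assumes the path looks like: /<namespace>/<name>/blob/<revision>/<path...>
    """
    parts = urlparse(url).path.strip("/").split("/")
    if len(parts) < 5 or parts[2] != "blob":
        raise ValueError(f"Not a recognized blob URL: {url}")
    return {
        "repo_id": f"{parts[0]}/{parts[1]}",
        "revision": parts[3],
        "path": "/".join(parts[4:]),
    }


info = split_hub_blob_url(
    "https://huggingface.co/nvidia/LocateAnything-3B/blob/main/config.json"
)
print(info)
```

The real parser may well cover more URL shapes (other repo types, buckets, revisions containing slashes); this sketch only shows the basic decomposition.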

🚨 Breaking change

On Lustre, GPFS, and some NFS mounts, flock(2) silently succeeds for every caller, which means filelock provides no mutual exclusion. When multiple hf_hub_download calls race for the same file, they can append to the same .incomplete file and silently corrupt the blob cache. This release fixes that by always downloading to a fresh temporary file instead of resuming an incomplete one, making the download path safe even when file locking is broken. filelock is still used as a "best-effort" hint to avoid unnecessary duplicate downloads, but correctness no longer depends on it. This is a breaking change: resuming a previously failed partial download is no longer possible. However, file resumability was already a niche use case only applicable when hf_xet is disabled.

  • [Fix] Make concurrent downloads safe even when file locking is broken by @Wauplin in #4306
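
The general pattern behind this fix can be sketched as follows: each writer gets its own uniquely named temporary file in the destination directory, and the finished file is moved into place with an atomic rename. This is a simplified illustration under stated assumptions, not the library's actual download code; the helper name write_atomically and the write_content callback are hypothetical.

```python
import os
import tempfile
from typing import BinaryIO, Callable


def write_atomically(dest_path: str, write_content: Callable[[BinaryIO], None]) -> None:
    """Write to a fresh temp file, then atomically move it to dest_path.

    Hypothetical helper: concurrent callers never share a partial file,
    so correctness does not rely on file locking.
    """
    dest_dir = os.path.dirname(os.path.abspath(dest_path))
    os.makedirs(dest_dir, exist_ok=True)
    # mkstemp creates a uniquely named file; same directory keeps the rename
    # on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, suffix=".incomplete")
    try:
        with os.fdopen(fd, "wb") as tmp:
            write_content(tmp)
        # os.replace overwrites any existing destination in a single step.
        os.replace(tmp_path, dest_path)
    except BaseException:
        # Clean up the partial file; nothing is ever resumed.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


with tempfile.TemporaryDirectory() as cache_dir:
    target = os.path.join(cache_dir, "blob.bin")
    write_atomically(target, lambda f: f.write(b"payload"))
    with open(target, "rb") as fh:
        print(fh.read())
```

If two callers race, each writes its own complete copy and the last rename wins, so the destination is always a whole file. The cost is the one noted above: a partial file is discarded rather than resumed.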

🖥️ CLI

🐛 Bug and typo fixes

  • Fix ~ user home not expanded in local_dir and cache_dir on file download by @Wauplin in #4293
  • Do not fail on repo/bucket creation if HTTP 401 and already exists by @Wauplin in #4294
  • Fix umask probe writing tmp file outside download dir by @Wauplin in #4305

📖 Documentation

  • [Docs] Document missing endpoint and template_str parameters by @aicayzer in #4298
  • [Docs] Document missing parameters in hf_hub_url and preupload_lfs_files by @aicayzer in #4300
  • [Docs] Mention storage region limitation for server-side copy by @Wauplin in #4302

🏗️ Internal

  • Post-release: bump version to 1.18.0.dev0 by @huggingface-hub-bot[bot] in #4291
  • Bump the actions group with 2 updates by @dependabot[bot] in #4309
