github PKHarsimran/website-downloader v2.2.0
Improved Path Normalization & Filename Handling

3 months ago

Summary

This release introduces a targeted improvement to how URL paths are converted into local filesystem paths when mirroring websites. It improves robustness when handling mixed encodings and malformed filenames, while keeping all existing safety protections intact.

Special thanks to @gsrec for reporting the edge cases and providing helpful examples that led to this improvement.


✨ Improvements

  • Decodes URL-encoded path segments (e.g., %20 → space)
  • Trims unnecessary whitespace from individual path segments
  • Collapses accidental multi-dot sequences (e.g., file....jpg → file.jpg)
  • Maintains traversal protection (../ prevention)
  • Preserves per-segment and overall path length limits
  • Keeps hashing fallback for overly long filenames
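
To make the rules above concrete, here is a minimal Python sketch of how such a normalization could look. It is not the project's actual code: the function names (`url_to_local_path`, `_shorten`), the length limits, the `index.html` default, and the exact hashing scheme are all assumptions chosen for illustration.

```python
import hashlib
import re
from urllib.parse import unquote, urlparse

# Hypothetical limits; the real project may use different values.
MAX_SEGMENT_LEN = 120
MAX_PATH_LEN = 240


def _shorten(name: str, limit: int) -> str:
    """Hashing fallback: truncate the name and append a short digest."""
    if len(name) <= limit:
        return name
    stem, dot, ext = name.rpartition(".")
    if not dot or len(ext) > 10:
        # No plausible extension; treat the whole name as the stem.
        stem, ext = name, ""
    digest = hashlib.sha1(name.encode("utf-8")).hexdigest()[:16]
    suffix = f".{ext}" if ext else ""
    keep = max(0, limit - len(digest) - len(suffix) - 1)
    return f"{stem[:keep]}-{digest}{suffix}"


def url_to_local_path(url: str) -> str:
    """Map a URL path to a relative, sanitized local path (sketch)."""
    segments = []
    for raw in urlparse(url).path.split("/"):
        seg = unquote(raw).strip()         # %20 -> space, then trim whitespace
        seg = re.sub(r"\.{2,}", ".", seg)  # "file....jpg" -> "file.jpg", ".." -> "."
        if seg in ("", "."):
            continue                       # drops empty and traversal segments
        # A decoded %2F or backslash must not introduce a new path separator.
        seg = seg.replace("/", "_").replace("\\", "_")
        segments.append(_shorten(seg, MAX_SEGMENT_LEN))
    if not segments:
        segments = ["index.html"]          # assumed default for bare directory URLs
    path = "/".join(segments)
    if len(path) > MAX_PATH_LEN:
        # Assumed fallback: flatten the path into one hashed filename.
        path = _shorten(path.replace("/", "_"), MAX_PATH_LEN)
    return path
```

In this sketch, decoding happens before the multi-dot collapse, so an encoded traversal attempt such as `%2e%2e` is reduced to `.` and dropped along with a literal `..`. For example, `url_to_local_path("https://example.com/my%20docs/file....jpg")` returns `my docs/file.jpg`.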

🔒 Compatibility

  • No changes to crawl logic
  • No changes to download behavior
  • No changes to threading or queue handling
  • No structural refactoring
  • Fully backward compatible

This is a minimal and safe enhancement focused strictly on improving path normalization and edge-case filename handling.

What's Changed

Full Changelog: v2.1.0...v2.2.0
