🚀 New Features & Enhancements
- Global Run Totals Summary: Added a comprehensive, multi-line end-of-run summary. The `StatsManager` now aggregates total downloads, data sizes (using humanfriendly auto-scaling), metadata updates, and text saves across all models into a clean report.
- API Gap Detection: Updated the `get_after()` logic across all API endpoints to be aware of your active media type filters. The scraper will no longer unnecessarily rewind scans for missing media if that media type was intentionally skipped via the `--mediatypes` filter.
- Text Download Mode Overhaul: Changed `--text` and `--text-only` from area selectors to global boolean flags for regular download mode. Text downloads are now directly linked to the active media type filters (Images/Videos/Audios) inside the `PostCollection` pipeline.
- Username Aliases: Added multiple new aliases for the `--username` CLI argument for better ease of use.
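
As a rough illustration of the run totals summary, here is a minimal sketch of how per-model counters could be folded into one report. `RunTotals`, its fields, and `format_size` are hypothetical names, not the project's `StatsManager` API, and the size formatter is a stdlib stand-in for the humanfriendly scaling mentioned above.

```python
from dataclasses import dataclass


def format_size(num_bytes: int) -> str:
    """Auto-scale a byte count using decimal units (1 KB = 1000 bytes)."""
    size = float(num_bytes)
    if size < 1000:
        return f"{int(size)} bytes"
    for unit in ("KB", "MB", "GB"):
        size /= 1000
        if size < 1000:
            return f"{size:.2f} {unit}"
    return f"{size / 1000:.2f} TB"


@dataclass
class RunTotals:
    """Counters for one model or for the whole run (hypothetical fields)."""
    downloads: int = 0
    bytes_downloaded: int = 0
    metadata_updates: int = 0
    text_saves: int = 0

    def add(self, other: "RunTotals") -> None:
        # Fold another set of counters into this one.
        self.downloads += other.downloads
        self.bytes_downloaded += other.bytes_downloaded
        self.metadata_updates += other.metadata_updates
        self.text_saves += other.text_saves

    def report(self) -> str:
        # Multi-line summary printed once at the end of the run.
        return "\n".join([
            "Run totals",
            f"  Downloads:        {self.downloads}",
            f"  Data size:        {format_size(self.bytes_downloaded)}",
            f"  Metadata updates: {self.metadata_updates}",
            f"  Text saves:       {self.text_saves}",
        ])


# Made-up per-model numbers, only for the demonstration.
per_model = {
    "model_a": RunTotals(12, 1_500_000, 3, 2),
    "model_b": RunTotals(5, 500_000, 1, 0),
}
grand = RunTotals()
for totals in per_model.values():
    grand.add(totals)
print(grand.report())
```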
⚡ Performance & Network Optimizations
- Fail-Fast CDN Recovery: Drastically reduced default connection and chunk read timeouts. Implemented native `sock_read` timeouts to handle CDN connection stalls gracefully. The scraper will now instantly drop and recover from dead streams rather than hanging for minutes.
- Improved Rate Limiting: Refactored `SessionSleep` to calculate elapsed time rather than enforcing flat delays, eliminating hours of artificial wait time. Replaced generic exponential backoffs with a tight 1-3s jitter buffer, preventing overlap with the custom API rate-limiter.
- Cloudflare Penalty Buffer: Increased the initial 403 Forbidden sleep timer to 30 seconds to guarantee temporary Cloudflare blocks expire before the scraper attempts a retry.
- Processing Speed: Removed a massive redundant processing block in `post.py` and updated session calls to use a custom `r.json_()` method.
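
The difference between a flat delay and an elapsed-time delay can be sketched as follows. This is an illustration under assumptions, not the real `SessionSleep`: `ElapsedSleep`, `jitter_delay`, and the injectable `clock`/`sleeper` parameters are hypothetical names chosen so the example runs without actually waiting.

```python
import random
import time


class ElapsedSleep:
    """Enforce a minimum interval between calls, crediting time already spent."""

    def __init__(self, min_interval, clock=time.monotonic, sleeper=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleeper = sleeper
        self._last = None  # time of the previous call, if any

    def remaining(self) -> float:
        if self._last is None:
            return 0.0
        elapsed = self._clock() - self._last
        # Only the part of the interval not yet elapsed still has to be waited.
        return max(0.0, self.min_interval - elapsed)

    def wait(self) -> float:
        delay = self.remaining()
        if delay > 0:
            self._sleeper(delay)
        self._last = self._clock()
        return delay


def jitter_delay(low: float = 1.0, high: float = 3.0) -> float:
    """Small random retry delay in place of a growing exponential backoff."""
    return random.uniform(low, high)


# Demo with a fake clock so nothing really sleeps.
now = [0.0]

def fake_sleep(seconds):
    now[0] += seconds

limiter = ElapsedSleep(2.0, clock=lambda: now[0], sleeper=fake_sleep)
limiter.wait()            # first call: nothing to wait for
now[0] += 1.5             # pretend the request itself took 1.5s
waited = limiter.wait()   # only the missing 0.5s is slept
print(f"slept {waited:.1f}s instead of a flat 2.0s")
```

When a request already takes longer than the interval, `wait()` returns immediately, which is where the saved time comes from.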
🛠 Database & SQLite Stability
- Deadlock Protection: Completely overhauled concurrent SQLite writes. Database connections are now safely queued using `BEGIN IMMEDIATE` transactions and a dedicated retry loop to eliminate `OperationalError: database is locked` crashes during high-speed downloads.
- Query & Schema Fixes: Fixed a critical deduplication bug where the append operation was placed outside of the conditional block. Resolved SQLite syntax errors (duplicate `WHERE` clauses) and fixed missing commas before `UNIQUE` constraints during table creation in `profile.py` and `labels.py`.
🐛 Critical Bug Fixes
@TesticularMass provided many of these bug fixes
- Infinite Network Hangs Resolved: Fixed an issue where syncing profile IDs would cause infinite network hangs. `get_id()` now properly utilizes the RAM-cached profile info instead of making redundant sync API calls.
- Logic & Evaluation Errors: Fixed a logic short-circuit where `('like' or 'unlike')` evaluations were always returning 'like'. Fixed a bug in `highlights.py` that caused `TypeError` crashes. Fixed shadowing of the built-in `reversed` in the filter pipeline, and patched a crash caused by disabling "self-media" filtering.
- Stats Tracking: Corrected text download stats so they print properly even when zero text files are found, and accurately track skipped text files by checking file existence before marking the attempt.
- Misc Fixes: Cleaned up minor string syntax issues (removed redundant double slashes and bad escapes). Fixed messages missing from `purchase_check`, and resolved keyboard input issues.
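
The `('like' or 'unlike')` short-circuit is a common Python pitfall, and a small example shows why it always yields 'like'. The function names below are hypothetical and the surrounding code in the project may look different. Only the expression itself comes from the notes above.

```python
# `or` returns its first truthy operand. A non-empty string is truthy,
# so the parenthesised expression collapses to 'like' before any comparison.
collapsed = ("like" or "unlike")


def is_valid_action_buggy(action: str) -> bool:
    # Effectively `action == "like"`: 'unlike' can never match.
    return action == ("like" or "unlike")


def is_valid_action(action: str) -> bool:
    # Membership test checks against both values.
    return action in ("like", "unlike")


print(collapsed)
print(is_valid_action_buggy("unlike"), is_valid_action("unlike"))
```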
🔐 Security & Logging
- Security & Fingerprinting: Enabled custom fingerprints.
- Network Visibility: Upgraded the download retry logic with a `before_sleep` hook to instantly log exactly when and why a connection drops mid-download.
- Log Cleanup: Cleaned up terminal spam by logging certificate changes only once per identity (instead of on every request) and added better debugging logs for retrieved usernames.
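
To show what a `before_sleep` hook does, here is a stdlib-only sketch of a retry loop that calls a callback just before each wait. The hook name resembles the `before_sleep` callback offered by the tenacity library, but these notes do not say which retry mechanism the project uses. The `retry` helper and the hook signature below are assumptions made for the example.

```python
import logging
import time

log = logging.getLogger("download")


def log_before_sleep(attempt, exc, delay):
    """Report why the attempt failed before the retry loop goes to sleep."""
    log.warning("attempt %d failed (%s: %s); retrying in %.1fs",
                attempt, type(exc).__name__, exc, delay)


def retry(func, attempts=3, delay=0.0, before_sleep=log_before_sleep,
          sleeper=time.sleep):
    """Call func until it succeeds or the attempts are used up."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except ConnectionError as exc:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            if before_sleep is not None:
                before_sleep(attempt, exc, delay)  # hook fires before waiting
            sleeper(delay)


# Demo: a download that stalls twice, then succeeds.
calls = {"n": 0}

def flaky_download():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("stream stalled")
    return "ok"

events = []
result = retry(flaky_download, attempts=3,
               before_sleep=lambda a, e, d: events.append((a, str(e))))
print(result, events)
```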
🛠️ Maintenance
- Contributions: Added a `contributing.md` file, cleaned up unused imports, and normalized legacy/inconsistent media type strings in the database using `MEDIA_ALIASES`.
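
Alias-based normalization of this kind can be sketched as a simple lookup. Only the name `MEDIA_ALIASES` comes from the notes above. The mapping contents, the canonical values, and the `normalize_media_type` helper are hypothetical.

```python
# Hypothetical alias table: legacy or inconsistent spellings -> canonical name.
MEDIA_ALIASES = {
    "image": "images",
    "photo": "images",
    "photos": "images",
    "video": "videos",
    "gif": "videos",
    "audio": "audios",
}


def normalize_media_type(raw: str) -> str:
    """Map a stored media type string to its canonical form."""
    key = raw.strip().lower()
    # Unknown values pass through (lowercased) rather than being dropped.
    return MEDIA_ALIASES.get(key, key)


for value in ("Photo", " GIF ", "videos", "Audio"):
    print(repr(value), "->", normalize_media_type(value))
```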