datawhores/OF-Scraper 3.14.4 on GitHub

🚀 New Features & Enhancements

Global Run Totals Summary: Added a comprehensive, multi-line end-of-run summary. The StatsManager now aggregates total downloads, data sizes (using humanfriendly auto-scaling), metadata updates, and text saves across all models into a clean report.
API Gap Detection: Updated the get_after() logic across all API endpoints to be aware of your active media type filters. The scraper will no longer unnecessarily rewind scans for missing media if that media type was intentionally skipped via the --mediatypes filter.
Text Download Mode Overhaul: Changed --text and --text-only from area selectors to global boolean flags for regular download mode. Text downloads are now directly linked to the active media type filters (Images/Videos/Audios) inside the PostCollection pipeline.
Username Aliases: Added multiple new aliases for the --username CLI argument for better ease of use.

⚡ Performance & Network Optimizations

Fail-Fast CDN Recovery: Drastically reduced default connection and chunk read timeouts. Implemented native sock_read timeouts to handle CDN connection stalls gracefully. The scraper will now instantly drop and recover from dead streams rather than hanging for minutes.
Improved Rate Limiting: Refactored SessionSleep to calculate elapsed time rather than enforcing flat delays, eliminating hours of artificial wait time. Replaced generic exponential backoffs with a tight 1-3s jitter buffer, preventing overlap with the custom API rate-limiter.
Cloudflare Penalty Buffer: Increased the initial 403 Forbidden sleep timer to 30 seconds to guarantee temporary Cloudflare blocks expire before the scraper attempts a retry.
Processing Speed: Removed a massive redundant processing block in post.py and updated session calls to use a custom r.json_() method.
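
The elapsed-time idea behind the SessionSleep change can be illustrated with a minimal sketch. This is not the project's actual code: the class name ElapsedAwareSleeper, its parameters, and the injectable clock are all hypothetical, chosen only to show why subtracting elapsed time avoids redundant waits.

```python
import time


class ElapsedAwareSleeper:
    """Hypothetical stand-in for a rate-limit sleeper (not OF-Scraper's SessionSleep).

    Instead of always sleeping a flat `min_interval`, it sleeps only for the
    part of the interval that has not already passed since the last call.
    """

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock      # injectable so the logic can be tested
        self._sleep = sleep
        self._last = None        # timestamp of the previous wait() call

    def remaining(self):
        """Seconds still owed before the next request is allowed."""
        if self._last is None:
            return 0.0
        elapsed = self._clock() - self._last
        return max(0.0, self.min_interval - elapsed)

    def wait(self):
        """Sleep only for the outstanding part of the interval; return it."""
        delay = self.remaining()
        if delay > 0:
            self._sleep(delay)
        self._last = self._clock()
        return delay
```

With a flat delay, a request that arrives ten seconds after the previous one would still sleep the full interval; here it proceeds immediately, which is where the saved wait time would come from.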

🛠 Database & SQLite Stability

Deadlock Protection: Completely overhauled concurrent SQLite writes. Database connections are now safely queued using BEGIN IMMEDIATE transactions and a dedicated retry loop to eliminate OperationalError: database is locked crashes during high-speed downloads.
Query & Schema Fixes: Fixed a critical deduplication bug where the append operation was placed outside of the conditional block. Resolved SQLite syntax errors (duplicate WHERE clauses) and fixed missing commas before UNIQUE constraints during table creation in profile.py and labels.py.
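
The BEGIN IMMEDIATE plus retry pattern mentioned under Deadlock Protection could look roughly like the sketch below, using only the standard sqlite3 module. The function name write_with_retry, the retry count, and the delays are assumptions for illustration and are not taken from the OF-Scraper source.

```python
import sqlite3
import time


def write_with_retry(db_path, sql, params=(), retries=5, base_delay=0.05):
    """Run one write inside a BEGIN IMMEDIATE transaction, retrying on lock.

    Hypothetical helper: returns the number of attempts it took.
    """
    # isolation_level=None puts the connection in autocommit mode, so the
    # explicit BEGIN IMMEDIATE / COMMIT statements below are the only
    # transaction control in play.
    conn = sqlite3.connect(db_path, timeout=1.0, isolation_level=None)
    try:
        for attempt in range(retries):
            try:
                # Take the write lock up front instead of upgrading later,
                # so contention surfaces here rather than mid-transaction.
                conn.execute("BEGIN IMMEDIATE")
            except sqlite3.OperationalError as exc:
                is_lock_error = "locked" in str(exc).lower()
                if not is_lock_error or attempt == retries - 1:
                    raise
                time.sleep(base_delay * (attempt + 1))  # simple linear backoff
                continue
            try:
                conn.execute(sql, params)
                conn.execute("COMMIT")
                return attempt + 1
            except Exception:
                conn.execute("ROLLBACK")
                raise
    finally:
        conn.close()
```

Acquiring the write lock at the start of the transaction means a busy database shows up as a retryable error on BEGIN, before any work has been done, rather than as a failure partway through a write.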

🐛 Critical Bug Fixes
@TesticularMass provided many of these bug fixes.

Infinite Network Hangs Resolved: Fixed an issue where syncing profile IDs would cause infinite network hangs. get_id() now properly utilizes the RAM-cached profile info instead of making redundant sync API calls.
Logic & Evaluation Errors: Fixed a logic short-circuit where ('like' or 'unlike') evaluations were always returning 'like'. Fixed a bug in highlights.py to prevent TypeError crashes. Fixed shadowing of the reversed built-in in the filter pipeline, and patched a crash caused by disabling "self-media" filtering.
Stats Tracking: Corrected text download stats so they print properly even when zero text files are found, and accurately track skipped text files by checking file existence before marking the attempt.
Misc Fixes: Cleaned up minor string syntax issues (removed redundant double slashes and bad escapes). Fixed messages missing from purchase_check, and resolved keyboard input issues.
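
The ('like' or 'unlike') short-circuit noted above is a general Python pitfall: `or` returns its first truthy operand, so the expression is always 'like'. The sketch below shows the pattern and one common fix; both function names are hypothetical and the project's actual code may differ.

```python
def is_like_action_buggy(action):
    # ("like" or "unlike") evaluates to "like" before the comparison runs,
    # so "unlike" can never match.
    return action == ("like" or "unlike")


def is_like_action_fixed(action):
    # Membership test compares against both values.
    return action in ("like", "unlike")
```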

🔐 Security & Logging

Security & Fingerprinting: Enabled custom fingerprints.
Network Visibility: Upgraded the download retry logic with a before_sleep hook to instantly log exactly when and why a connection drops mid-download.
Log Cleanup: Reduced terminal spam by logging certificate changes only once per identity (instead of on every request) and added better debugging logs for retrieved usernames.

🛠️ Maintenance

Contributions: Added a contributing.md file, cleaned up unused imports, and normalized legacy/inconsistent media type strings in the database using MEDIA_ALIASES.
