github D4Vinci/Scrapling v0.4.6
Release v0.4.6

10 hours ago

A focused update on browser stealth, privacy, and developer experience 🔒

🚀 New Stuff and quality of life changes

  • Added built-in ad blocking for browser fetchers. Pass block_ads=True to block requests to ~3,500 known ad and tracker domains at the route interception level, so blocked requests are aborted immediately with no DNS lookup and no TCP connection. Can be combined with blocked_domains for custom lists. The MCP server and CLI --ai-targeted mode enable this automatically to save tokens and speed up page loads.
    page = StealthyFetcher.fetch('https://example.com', block_ads=True)
  • Added DNS-over-HTTPS support to prevent DNS leaks when using proxies. Pass dns_over_https=True to route DNS queries through Cloudflare's DoH, so your real location isn't exposed through DNS resolution even when your HTTP traffic goes through a proxy.
    page = StealthyFetcher.fetch('https://example.com', proxy='http://proxy:8080', dns_over_https=True)
  • Added a page_setup callback for browser fetchers: a function that runs before page.goto(), so you can register event listeners, routes, or scripts that must be in place before the page navigates. It pairs with page_action, which runs after navigation. (Solves #237)
    def capture_websockets(page):
        page.on("websocket", lambda ws: print(f"WS: {ws.url}"))
    
    page = DynamicFetcher.fetch('https://example.com', page_setup=capture_websockets)
  • Added --block-ads and --dns-over-https CLI options to both fetch and stealthy-fetch commands.
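To illustrate what blocking at the route interception level means, here is a minimal, library-independent sketch of the kind of host check a route handler could run before aborting a request. It is not Scrapling's implementation: the `AD_DOMAINS` set, the `is_blocked` helper, and the subdomain-matching rule are all assumptions made for illustration.

```python
from urllib.parse import urlsplit

# Hypothetical stand-in for the built-in list of ~3,500 ad and tracker domains.
AD_DOMAINS = frozenset({"doubleclick.net", "googlesyndication.com"})


def is_blocked(url, ad_domains=AD_DOMAINS, blocked_domains=frozenset()):
    """Return True if the URL's host is a listed domain or a subdomain of one.

    Assumption: subdomains of a listed domain are blocked as well.
    """
    host = (urlsplit(url).hostname or "").lower()
    listed = set(ad_domains) | set(blocked_domains)
    parts = host.split(".")
    # Check every suffix of the host: a.b.example.com, b.example.com, example.com, com
    return any(".".join(parts[i:]) in listed for i in range(len(parts)))


if __name__ == "__main__":
    print(is_blocked("https://ad.doubleclick.net/pixel"))
    print(is_blocked("https://cdn.example.com/app.js", blocked_domains={"example.com"}))
    print(is_blocked("https://example.org/"))
```

In a real browser fetcher, a check like this would run inside the request interception callback, and a match would abort the request before any network activity starts.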

🐛 Bug Fixes

  • Fixed Seconds type alias rejecting float values. Passing wait=1.5 or timeout=500.0 to browser fetchers would fail with a type error because the type alias incorrectly treated float as metadata instead of a type. by @kuishou68 in #240
  • Fixed duplicate ID segments in full-path selector generation. Elements with id attributes had their selector appended twice when generating full CSS/XPath paths, producing selectors like body > #main > #main > #target > #target. Also fixed full-path XPath emitting bare [@id='x'] predicates (invalid XPath) instead of *[@id='x']. by @sjhddh in #241
  • Fixed missing shell signature parameters. The interactive shell was missing blocked_domains, block_ads, retries, retry_delay, capture_xhr, executable_path, and dns_over_https from its function signatures.
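The Seconds fix above is easier to see with a small sketch of how `typing.Annotated` separates the base type from its metadata. The alias definitions below are assumptions about the general shape of the bug and its fix, not Scrapling's actual source, and `accepts` is a hypothetical helper standing in for a real validator.

```python
from typing import Annotated, Union, get_args

# Assumed shape of the bug: float sits in the metadata slot, so the base type is int only.
BrokenSeconds = Annotated[int, float]

# Assumed shape of the fix: float is part of the type itself; metadata comes after it.
FixedSeconds = Annotated[Union[int, float], "seconds"]


def accepts(alias, value):
    """Hypothetical check: does the value match the alias's base type?"""
    base = get_args(alias)[0]            # first argument of Annotated is the real type
    allowed = get_args(base) or (base,)  # unpack Union members, or wrap a plain type
    return isinstance(value, allowed)


if __name__ == "__main__":
    print(BrokenSeconds.__metadata__)   # float is carried as metadata, not as a type
    print(accepts(BrokenSeconds, 1.5))
    print(accepts(FixedSeconds, 1.5))
```

A validator that reads only the base type would reject `wait=1.5` under the first alias and accept it under the second, which matches the behavior described in the fix.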

🙏 Special thanks to the community for all the continuous testing and feedback


