- `success: false` on 4xx targets: scraping a 403/404/429 target with a minimal body now correctly returns `success: false` with error details, instead of `success: true` with a warning. Targets with real content (custom error pages) still return `success: true` with a warning
- JS renderer fallback warning: when `renderJs: true` is requested but no CDP renderer is available, the response now includes `rendered_with: "http_only_fallback"` and a warning instead of silently falling back
- CDP health check: `is_available()` now runs a real `Browser.getVersion` command instead of just testing the WebSocket connection
- Specific error messages: unknown formats now return descriptive errors (e.g., `"Unknown format 'extract'. Valid formats: ..."`) instead of a generic 422
- `"extract"` format alias: `formats: ["extract"]` and `formats: ["llm-extract"]` are now accepted as aliases for `"json"` (Firecrawl compatibility)
- Chunk dedup by default: deduplication is now enabled by default for all chunking strategies; separator-only chunks (`---`, `***`) are filtered out
- Chunk relevance scores: chunks now return `{ content, score, index }` objects instead of plain strings when a query is provided
- Map timeout: `/v1/map` accepts a `timeout` parameter (default 120s, max 300s) to prevent 502s on large sites
- Stealth + JS rendering fix: `stealth: true` with `renderJs: true` no longer bypasses CDP; the shared renderer is used with stealth headers injected
- BM25 NaN guard: prevents `NaN` scores when all chunks are empty
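
To illustrate the format alias and the descriptive error message, here is a minimal sketch of how alias resolution could work. It is not the project's actual code: `normalize_formats`, `FORMAT_ALIASES`, and the contents of `VALID_FORMATS` (other than `"json"`) are assumptions made for the example.

```python
# Hypothetical sketch of format alias resolution. Only "json" and the two
# aliases come from the changelog; the other valid formats are assumed.
VALID_FORMATS = ("markdown", "html", "json")
FORMAT_ALIASES = {"extract": "json", "llm-extract": "json"}


def normalize_formats(formats):
    """Map aliases to canonical names and reject unknown formats."""
    normalized = []
    for name in formats:
        canonical = FORMAT_ALIASES.get(name, name)
        if canonical not in VALID_FORMATS:
            valid = ", ".join(VALID_FORMATS)
            # Descriptive message rather than a bare validation failure.
            raise ValueError(f"Unknown format '{name}'. Valid formats: {valid}")
        if canonical not in normalized:
            # Avoid requesting the same output twice via an alias.
            normalized.append(canonical)
    return normalized
```

With this sketch, `normalize_formats(["extract", "markdown"])` yields `["json", "markdown"]`, while an unrecognized name raises an error that lists the accepted values.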
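
The chunk deduplication and separator filtering described above can be sketched as follows. This assumes exact-match deduplication after whitespace normalization and treats a chunk as separator-only when it consists solely of three or more `-` or `*` characters; the real matching rules may differ, and `clean_chunks` is a hypothetical name.

```python
import re

# Assumed definition of a separator-only chunk: "---", "***", or longer runs.
SEPARATOR_ONLY = re.compile(r"-{3,}|\*{3,}")


def clean_chunks(chunks):
    """Drop separator-only chunks and duplicates, keeping first occurrences."""
    seen = set()
    kept = []
    for chunk in chunks:
        text = chunk.strip()
        if SEPARATOR_ONLY.fullmatch(text):
            continue
        # Collapse whitespace so trivially different copies compare equal.
        key = " ".join(text.split())
        if key in seen:
            continue
        seen.add(key)
        kept.append(chunk)
    return kept
```

For example, `clean_chunks(["Intro", "---", "Intro", "***", "Body"])` keeps only `"Intro"` and `"Body"`.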
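
The BM25 guard can be understood from the length-normalization term: when every chunk is empty, the average chunk length is zero and the term divides by it. With IEEE-754 floats that produces `NaN`; in Python the same division raises `ZeroDivisionError`, but the guard has the same shape. The sketch below is a simplified BM25 under assumed defaults (`k1=1.5`, `b=0.75`, whitespace tokenization, lowercase query terms) and is not the project's implementation.

```python
import math


def bm25_scores(query_terms, chunks, k1=1.5, b=0.75):
    """Score each chunk against lowercase query terms with a simple BM25."""
    docs = [chunk.lower().split() for chunk in chunks]
    n = len(docs)
    total_len = sum(len(doc) for doc in docs)
    # Guard: with no tokens anywhere, the average length is 0 and the
    # normalization below would divide by zero. Return zero scores instead.
    if n == 0 or total_len == 0:
        return [0.0] * n
    avgdl = total_len / n
    scores = []
    for doc in docs:
        score = 0.0
        for term in query_terms:
            tf = doc.count(term)
            if tf == 0:
                continue
            df = sum(1 for d in docs if term in d)
            idf = math.log(1 + (n - df + 0.5) / (df + 0.5))
            norm = tf + k1 * (1 - b + b * len(doc) / avgdl)
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    return scores
```

Calling `bm25_scores(["rust"], ["", "  "])` returns a zero score for each chunk rather than failing or producing a non-finite value.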