github unclecode/crawl4ai vr0.6.0rc1
Crawl4AI 0.6.0rc1


🚀 0.6.0rc1 — 22 Apr 2025

Highlights

  1. World‑aware crawlers: set geo_locale={"city":"Tokyo","lang":"ja","tz":"Asia/Tokyo"} to scrape the version of a page served to that region.
  2. Table‑to‑DataFrame extraction: set extract_tables=True to get CSV or pandas output without extra parsing.
  3. Crawler pool with pre‑warm: pages launch hot, giving lower P90 latency and lower memory use.
  4. Network and console capture: a full traffic log plus an MHTML snapshot for audits and debugging.
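
As an illustration of what the geo_locale setting in highlight 1 implies at the browser level, here is a minimal sketch that maps such a dict onto the locale, timezone_id, geolocation, and permissions keyword arguments accepted by Playwright's browser.new_context(). Whether Crawl4AI performs this mapping internally is an assumption; context_options and CITY_COORDS are hypothetical names, and the coordinates are approximate.

```python
# Hypothetical city lookup; a real crawler would need a proper geocoding source.
CITY_COORDS = {
    "Tokyo": {"latitude": 35.68, "longitude": 139.65},  # approximate
}


def context_options(geo_locale):
    """Translate a geo_locale dict into kwargs for Playwright's browser.new_context()."""
    options = {}
    if "lang" in geo_locale:
        options["locale"] = geo_locale["lang"]
    if "tz" in geo_locale:
        options["timezone_id"] = geo_locale["tz"]
    coords = CITY_COORDS.get(geo_locale.get("city"))
    if coords is not None:
        # Pages can only read the spoofed position if the permission is granted.
        options["geolocation"] = dict(coords)
        options["permissions"] = ["geolocation"]
    return options


opts = context_options({"city": "Tokyo", "lang": "ja", "tz": "Asia/Tokyo"})
print(opts)
```

With Playwright installed, the result could be passed on as browser.new_context(**opts); cities missing from the lookup simply omit the geolocation override.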

Added

  • Geolocation, locale, and timezone flags for every crawl.
  • Browser pooling with page pre‑warming.
  • Table extractor that exports to CSV or pandas.
  • Crawler pool manager in SDK and Docker API.
  • Network & console log capture, plus MHTML snapshot.
  • MCP socket and SSE endpoints with playground UI.
  • Stress‑test framework (tests/memory) for 1 k+ URL runs.
  • Docs v2: TOC, GitHub badge, copy‑code buttons, Docker API demo.
  • “Ask AI” helper button (work in progress, shipping soon).
  • New examples: geo location, network/console capture, Docker API, markdown source selection, crypto analysis.
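
To make the table extractor item concrete, the following sketch shows the general idea of turning an HTML table into CSV using only the Python standard library. It is not Crawl4AI's implementation; TableParser and table_to_csv are hypothetical names, the sample data is made up, and nested tables are not handled.

```python
import csv
import io
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the cell text of every <tr> in the input as a list of rows."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside a row
        self._cell = None  # text fragments of the open cell, or None

    def _close_cell(self):
        if self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._cell = None
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._close_cell()  # tolerate a missing </td>
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._close_cell()
        elif tag == "tr" and self._row is not None:
            self._close_cell()
            self.rows.append(self._row)
            self._row = None


def table_to_csv(html):
    """Return the table rows found in html as CSV text."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(parser.rows)
    return buffer.getvalue()


html = (
    "<table><tr><th>Name</th><th>Qty</th></tr>"
    "<tr><td>widget, large</td><td>3</td></tr></table>"
)
csv_text = table_to_csv(html)
print(csv_text)
```

If pandas is available, the CSV text can be loaded with pandas.read_csv(io.StringIO(csv_text)) to obtain a DataFrame.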

Changed

  • Browser strategies consolidated; legacy Docker modules removed.
  • ProxyConfig moved to async_configs.
  • Server migrated to pool‑based crawler management.
  • FastAPI validators replace custom query validation.
  • Docker build now uses a Chromium base image.
  • Repo cleanup: ≈36 k insertions and ≈5 k deletions across 121 files.
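
The move to pool‑based crawler management can be pictured with a small asyncio sketch: pages are created once up front (pre‑warmed) and then reused across requests instead of being launched per URL. This is an assumption about the general pattern, not the project's code; PagePool, FakePage, and crawl are hypothetical names, and the page start-up is simulated.

```python
import asyncio


class FakePage:
    """Stand-in for a browser page; a real pool would hold live browser pages."""

    def __init__(self, page_id):
        self.page_id = page_id


class PagePool:
    def __init__(self, size):
        self._size = size
        self._idle = asyncio.Queue()
        self.created = 0

    async def _launch(self):
        await asyncio.sleep(0)  # placeholder for the expensive page start-up
        self.created += 1
        return FakePage(self.created)

    async def prewarm(self):
        """Launch every page up front so later requests skip the start-up cost."""
        pages = await asyncio.gather(*(self._launch() for _ in range(self._size)))
        for page in pages:
            self._idle.put_nowait(page)

    async def acquire(self):
        # Waits until a page is idle, which also caps concurrency at the pool size.
        return await self._idle.get()

    def release(self, page):
        self._idle.put_nowait(page)


async def crawl(pool, url):
    page = await pool.acquire()
    try:
        await asyncio.sleep(0)  # placeholder for navigation and extraction
        return (url, page.page_id)
    finally:
        pool.release(page)


async def main():
    pool = PagePool(size=2)
    await pool.prewarm()
    urls = [f"https://example.com/{i}" for i in range(5)]
    results = await asyncio.gather(*(crawl(pool, url) for url in urls))
    return pool.created, results


created, results = asyncio.run(main())
print(created, len(results))
```

Five URLs are served by the two pre‑warmed pages, so no page is launched on the request path.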

Fixed

Removed

  • Obsolete modules in crawl4ai/browser/*.

Deprecated

  • Old markdown generator names now alias DefaultMarkdownGenerator and warn.
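
One common way to implement an alias that warns is sketched below. How Crawl4AI actually wires its aliases is not stated here, so treat this as an assumption: deprecated_alias and LegacyMarkdownGenerator are hypothetical names, and the DefaultMarkdownGenerator class is a stand-in defined only so the sketch runs on its own.

```python
import warnings


class DefaultMarkdownGenerator:
    """Stand-in for the real class, defined here only so the sketch runs."""

    def generate(self, html):
        return html.strip()


def deprecated_alias(old_name, target):
    """Create a subclass of target that warns whenever it is instantiated."""

    def __init__(self, *args, **kwargs):
        warnings.warn(
            f"{old_name} is deprecated; use {target.__name__} instead.",
            DeprecationWarning,
            stacklevel=2,  # point the warning at the caller, not at this helper
        )
        target.__init__(self, *args, **kwargs)

    return type(old_name, (target,), {"__init__": __init__})


# Hypothetical old name kept working for existing callers.
LegacyMarkdownGenerator = deprecated_alias(
    "LegacyMarkdownGenerator", DefaultMarkdownGenerator
)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    generator = LegacyMarkdownGenerator()

print(isinstance(generator, DefaultMarkdownGenerator), caught[0].category.__name__)
```

Because the alias subclasses the new class, existing isinstance checks keep passing while callers see a DeprecationWarning that names the replacement.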

Upgrade notes

  1. Update any imports from crawl4ai/browser/* to the new pooled browser modules.
  2. If you override AsyncPlaywrightCrawlerStrategy.get_page, adopt the new signature.
  3. Rebuild Docker images to pick up the Chromium layer.
  4. Switch to DefaultMarkdownGenerator to silence deprecation warnings.

121 files changed, ≈36 223 insertions, ≈4 975 deletions
