github unclecode/crawl4ai v0.5.0.post1
Crawl4AI v0.5.0.post1

one day ago

Crawl4AI v0.5.0.post1 Release

Release Theme: Power, Flexibility, and Scalability

Crawl4AI v0.5.0 is a major release focused on significantly enhancing the library's power, flexibility, and scalability.

Key Features

  1. Deep Crawling System - Explore websites beyond initial URLs with BFS, DFS, and BestFirst strategies, with page limiting and scoring capabilities
  2. Memory-Adaptive Dispatcher - Scale to thousands of URLs with intelligent memory monitoring and concurrency control
  3. Multiple Crawling Strategies - Choose between browser-based (Playwright) or lightweight HTTP-only crawling
  4. Docker Deployment - Easy deployment with FastAPI server, JWT authentication, and streaming/non-streaming endpoints
  5. Command-Line Interface - New crwl CLI provides convenient access to all features with intuitive commands
  6. Browser Profiler - Create and manage persistent browser profiles to save authentication states for protected content
  7. Crawl4AI Coding Assistant - Interactive chat interface for asking questions about Crawl4AI and generating Python code examples
  8. LXML Scraping Mode - Fast HTML parsing using the lxml library for 10-20x speedup with complex pages
  9. Proxy Rotation - Built-in support for dynamic proxy switching with authentication and session persistence
  10. PDF Processing - Extract and process data from PDF files (both local and remote)

Additional Improvements

  • LLM Content Filter for intelligent markdown generation
  • URL redirection tracking
  • LLM-powered schema generation utility for extraction templates
  • robots.txt compliance support
  • Enhanced browser context management
  • Improved serialization and config handling

Breaking Changes

This release contains several breaking changes. Please review the full release notes for migration guidance.

For complete details, visit: https://docs.crawl4ai.com/blog/releases/0.5.0/

Don't miss a new crawl4ai release

NewReleases is sending notifications on new releases.