Introducing LLMs.txt API
The /llmstxt endpoint allows you to transform any website into clean, LLM-ready text files. Simply provide a URL, and Firecrawl will crawl the site and generate both llms.txt and llms-full.txt files that can be used for training or analysis with any LLM.
Docs here: https://docs.firecrawl.dev/features/alpha/llmstxt
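As a rough sketch of what a request might look like, the snippet below builds (but does not send) a POST request using only Python's standard library. The base URL, the `/v1/llmstxt` path, the `maxUrls` field name, and the bearer-token header are assumptions for illustration; the docs linked above are the authority on the actual request shape.

```python
import json
import urllib.request

# Assumed base URL and path; confirm against the official docs.
API_BASE = "https://api.firecrawl.dev/v1"


def build_llmstxt_request(site_url: str, api_key: str, max_urls: int = 10) -> urllib.request.Request:
    """Build (without sending) a POST request asking for llms.txt generation."""
    payload = {
        "url": site_url,
        "maxUrls": max_urls,  # assumed field name
    }
    return urllib.request.Request(
        f"{API_BASE}/llmstxt",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


# "fc-YOUR-KEY" is a placeholder, not a real key.
req = build_llmstxt_request("https://example.com", api_key="fc-YOUR-KEY")
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would actually send it; the sketch stops short of that so it can be run without a key.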
Introducing Deep Research API (Alpha)
The /deep-research endpoint enables AI-powered deep research and analysis on any topic. Simply provide a research query, and Firecrawl will autonomously explore the web, gather relevant information, and synthesize findings into comprehensive insights.
Join the waitlist here: https://www.firecrawl.dev/deep-research
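Since the endpoint is in alpha, its request format may change. Purely as an illustration, a request body for a research query could be assembled like this; the `query` and `maxUrls` field names are assumptions (the notes below only mention that a max-URLs option was added), and `deep_research_body` is a hypothetical helper.

```python
import json


def deep_research_body(query: str, max_urls: int = 20) -> bytes:
    """Serialize a research query into a JSON request body."""
    if not query.strip():
        raise ValueError("query must not be empty")
    # "query" and "maxUrls" are assumed field names, not confirmed by these notes.
    return json.dumps({"query": query, "maxUrls": max_urls}).encode("utf-8")


body = deep_research_body("What are the latest approaches to web-scale crawling?")
print(body.decode("utf-8"))
```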
Official Firecrawl MCP Server
Introducing the Firecrawl MCP Server. Give Cursor, Windsurf, and Claude enhanced web extraction capabilities. Big thanks to @vrknetha and @cawstudios for the initial implementation!
See here: https://github.com/mendableai/firecrawl-mcp-server
Fixes & Enhancements
- Improved charset detection and re-decoding.
- Fixed extract token limit issues.
- Addressed issues with includes/excludes handling.
- Fixed AI SDK handling of JSON responses.
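To illustrate the general idea behind charset detection and re-decoding, here is a minimal sketch: decode with a default encoding, look for a charset declared in a `<meta>` tag, and decode again if it differs. This is not Firecrawl's implementation; the regex, the 4096-byte search window, and the `decode_html` helper are all assumptions made for the example.

```python
import re

# Matches <meta charset="..."> and <meta http-equiv=... content="...; charset=...">.
META_CHARSET_RE = re.compile(rb'<meta[^>]*?charset\s*=\s*["\']?\s*([\w-]+)', re.IGNORECASE)


def decode_html(raw: bytes, default: str = "utf-8") -> str:
    """Decode HTML bytes, re-decoding if a <meta> tag declares another charset."""
    text = raw.decode(default, errors="replace")
    match = META_CHARSET_RE.search(raw[:4096])  # declarations sit near the top
    if match:
        declared = match.group(1).decode("ascii").lower()
        if declared != default:
            try:
                text = raw.decode(declared, errors="replace")
            except LookupError:
                pass  # unknown charset name: keep the default decoding
    return text


page = '<html><head><meta charset="iso-8859-1"></head><body>café</body></html>'.encode("iso-8859-1")
print(decode_html(page))
```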
New Features & Improvements
- AI-SDK Migration – transitioned to AI-SDK.
- Auto-Recharge Emails – email users suggesting an upgrade when they hit auto-recharge.
- Fire-Index Added – introduced a new indexing system.
- Self-Hosting Enhancements – OpenAI-compatible API & Ollama env support.
- Batch Billing – streamlined billing processes.
- Supabase Read Replica Routing – improved database performance.
Crawler & AI Improvements
- Added Claude 3.7 and GPT-4.5 web crawler examples.
- Added Groq Web Crawler example.
- Updated the crawl-status WebSocket endpoint to ignore errors the same way regular crawl-status does.
- Improved cross-origin redirect handling.
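Handling cross-origin redirects differently first requires telling the two kinds apart. The sketch below compares scheme, host, and port, which is the usual definition of a web origin. How the crawler then treats each case is not described in these notes, so the example covers only the classification step; `origin` and `is_cross_origin_redirect` are hypothetical helper names.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url: str) -> Tuple[str, str, Optional[int]]:
    """Return the (scheme, host, port) triple that identifies a URL's origin."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    # Fill in the default port so "https://a.com" equals "https://a.com:443".
    port = parts.port or DEFAULT_PORTS.get(scheme)
    return scheme, (parts.hostname or "").lower(), port


def is_cross_origin_redirect(source_url: str, target_url: str) -> bool:
    """True when a redirect leaves the origin it started from."""
    return origin(source_url) != origin(target_url)


print(is_cross_origin_redirect("https://example.com/a", "https://docs.example.com/a"))
```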
Documentation & Maintenance
- Updated Dockerfile.
- Removed an undefined "required" field from the docs.
Detailed breakdown
Deep Research API & LLMs.txt API
- (feat/deep-research-alpha) Added Max URLs, Sources, and Fixes by @nickscamara in #1271
- (feat/deep-research) Alpha prep + Improvements by @nickscamara in #1284
- Truncate llmstxt cache based on max URLs limit & improve max URLs handling by @ericciarla in #1285
Fixes & Enhancements
- fix(scrapeURL/engines/fetch): Discover charset and re-decode by @mogery in #1221
- fix(crawl-redis): Ignore empty includes/excludes by @mogery in #1223
- fix(token-slicer): Fix extract token limit issues by @nickscamara in #1236
- fix(scraper): Improve charset detection regex to accurately parse meta tags by @GrassH in #1265
- fix(crawl): Includes/excludes fixes (FIR-1300) by @mogery in #1303
- Fix AI SDK being unable to handle the AI returning a JSON code block (FIR-1277) by @mogery in #1280
- Fix/p token by @nickscamara in #1305
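The JSON code block issue (FIR-1277) comes from models sometimes wrapping their JSON answer in a Markdown fence. One common way to cope with that, shown here as a sketch rather than the actual fix in #1280, is to strip an optional fence before parsing. `parse_model_json` is a hypothetical helper name.

```python
import json
import re

FENCE = "`" * 3  # a Markdown code fence, built this way to keep the example readable

# Optional opening fence (with or without a "json" tag), the body, and a closing fence.
FENCED_RE = re.compile(
    r"^\s*" + FENCE + r"(?:json)?\s*\n(.*?)\n?\s*" + FENCE + r"\s*$",
    re.DOTALL | re.IGNORECASE,
)


def parse_model_json(text: str):
    """Parse JSON from a model reply, tolerating a surrounding code fence."""
    match = FENCED_RE.match(text)
    body = match.group(1) if match else text
    return json.loads(body)


reply = FENCE + 'json\n{"ok": true}\n' + FENCE
print(parse_model_json(reply))
```

Replies that are already bare JSON pass straight through to `json.loads`, so the helper can be applied unconditionally.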
Features & Improvements
- (feat/ai-sdk) Migrate to AI-SDK by @nickscamara in #1220
- (feat/auto-recharge) Send email suggesting an upgrade when hitting auto recharges by @nickscamara in #1237
- feat(self-host/ai): Use any OpenAI-compatible API by @mogery in #1245
- feat(self-host/ai): Pass in the Ollama envs into Docker Compose by @brrock in #1269
- feat(v1/crawl-status-ws): Update behavior to ignore errors like regular crawl-status (FIR-1106) by @mogery in #1234
- feat(fire-index): Added new fire-index by @nickscamara in #1263
- feat(supabase): Add read replica routing by @mogery in #1274
- feat(crawler): Handle cross-origin redirects differently than same-origin redirects by @mogery in #1279
- (feat/batch-billing): Batch billing by @nickscamara in #1264
- feat(tests/snips): Add billing tests + misc billing fixes (FIR-1280) by @mogery in #1283
New Implementations
- Implemented GitHub analyzer by @aparupganguly in #1229
- Implemented Claude 3.7 web crawler by @aparupganguly in #1257
- examples/Add GPT-4.5 web crawler by @aparupganguly in #1276
- examples/Add Claude 3.7 web extractor by @aparupganguly in #1291
- Add groq_web_crawler example and dependencies by @ceewaigit in #1267
Documentation & Maintenance
- docs: Remove undefined "required" field by @jmporchet in #1282
- Update Dockerfile by @mogery in #1232
New Contributors
- @GrassH made their first contribution in #1265
- @brrock made their first contribution in #1269
- @jmporchet made their first contribution in #1282
- @ceewaigit made their first contribution in #1267
Full Changelog: v1.5.0...v1.6.0
What's Changed
- fix(scrapeURL/engines/fetch): discover charset and re-decode by @mogery in #1221
- fix(crawl-redis): ignore empty includes/excludes by @mogery in #1223
- Feat/added eval run after deploy workflow by @rafaelsideguide in #1224
- (feat/ai-sdk) Migrate to AI-SDK by @nickscamara in #1220
- Implemented github analyzer by @aparupganguly in #1229
- Update Dockerfile (#1231) by @mogery in #1232
- (fix/token-slicer) Fixes extract token limit issues by @nickscamara in #1236
- (feat/auto-recharge) Send email suggesting an upgrade when hitting auto recharges by @nickscamara in #1237
- feat(self-host/ai): use any OpenAI-compatible API by @mogery in #1245
- feat(v1/crawl-status-ws): update behavior to ignore errors like regular crawl-status (FIR-1106) by @mogery in #1234
- Implemented claude 3.7 web crawler by @aparupganguly in #1257
- (feat/fire-index) Added new fire-index by @nickscamara in #1263
- fix(scraper): improve charset detection regex to accurately parse met… by @GrassH in #1265
- feat(self-host/ai): pass in the ollama envs into docker compose by @brrock in #1269
- (feat/deep-research-alpha) Added Max Urls, Sources and Fixes by @nickscamara in #1271
- (feat/batch-billing) Batch billing by @nickscamara in #1264
- feat(supabase): add read replica routing by @mogery in #1274
- examples/Add GPT-4.5 web crawler by @aparupganguly in #1276
- feat(crawler): handle cross-origin redirects differently than same-origin redirects by @mogery in #1279
- docs: remove undefined "required" field by @jmporchet in #1282
- Add groq_web_crawler example and dependencies by @ceewaigit in #1267
- Fix AI SDK being unable to handle the AI returning a JSON code block (FIR-1277) by @mogery in #1280
- feat(tests/snips): add billing tests + misc billing fixes (FIR-1280) by @mogery in #1283
- (feat/deep-research) Alpha prep + Improvements by @nickscamara in #1284
- examples/Add Claude 3.7 web extractor by @aparupganguly in #1291
- Truncate llmstxt cache based on maxurls limit & improve maxurls handling by @ericciarla in #1285
- feat(crawl): includes/excludes fixes (FIR-1300) by @mogery in #1303
- Fix/p token by @nickscamara in #1305