- Document processing is now handled by Docling when possible and falls back to our old text-based mechanism otherwise. This greatly improves processing of complex PDFs and tables, and adds support for CSV, among other formats.
- URL processing has been extended: URLs can now be handled by either Firecrawl or Jina, in addition to the original HTTP-based approach, which fails quite often on JavaScript-heavy websites. Both tools have a generous free tier, so using them should be a no-brainer.
- New Settings page: we moved some content-processing settings to a new Settings page to unclutter the "add source" UI. You can use the new page to change the default engines for documents and URLs (although I suggest keeping them on auto). You can also decide when to embed content and whether to keep a local copy of the files you upload (I suggest you don't).
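
To illustrate the "try the richer engine first, then fall back" behaviour described above, here is a minimal sketch in Python. It is not the project's actual code: `extract_with_fallback`, `rich_parser`, `plain_text_parser`, and `ENGINES` are hypothetical names, and the two parsers are stand-ins that do not call Docling, Firecrawl, or Jina.

```python
from typing import Callable, Sequence

# An extractor takes a source (file path or URL) and returns its text.
Extractor = Callable[[str], str]


def extract_with_fallback(
    source: str, engines: Sequence[tuple[str, Extractor]]
) -> tuple[str, str]:
    """Try each engine in order; return (engine_name, text) from the first success."""
    errors = []
    for name, extractor in engines:
        try:
            text = extractor(source)
        except Exception as exc:  # any failure means "try the next engine"
            errors.append(f"{name}: {exc}")
            continue
        if text.strip():
            return name, text
        errors.append(f"{name}: empty result")
    raise RuntimeError("all engines failed: " + "; ".join(errors))


# Hypothetical stand-in for a rich document parser (assumption: it rejects
# formats it does not understand by raising an exception).
def rich_parser(source: str) -> str:
    if not source.endswith(".pdf"):
        raise ValueError("unsupported format")
    return f"structured text from {source}"


# Hypothetical stand-in for the older text-based mechanism.
def plain_text_parser(source: str) -> str:
    return f"plain text from {source}"


# Order matters: the preferred engine comes first, the fallback last.
ENGINES = [("rich", rich_parser), ("plain", plain_text_parser)]

if __name__ == "__main__":
    print(extract_with_fallback("report.pdf", ENGINES))
    print(extract_with_fallback("notes.txt", ENGINES))
```

Keeping the engines in an ordered list means an "auto" setting can simply use the full list, while a user-selected default engine could be expressed as a list with that engine first.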