github us/crw v0.0.10

3 months ago

Combined release for v0.0.9 and v0.0.10.

  • Crawl cancel endpoint: DELETE /v1/crawl/{id} cancels a running crawl job via AbortHandle
  • API rate limiting — token-bucket rate limiter (configurable rate_limit_rps, default 10). Returns 429 when exceeded
  • Machine-readable error codes — all error responses now include an error_code field
  • Map response envelope: /v1/map now returns { success, data: { links } } for consistency
  • Fenced code blocks — indented code blocks auto-converted to fenced for better LLM/RAG compatibility
  • Sphinx footer cleanup: "footer" added to exact-token noise patterns
  • renderedWith: "http" — HTTP-only fetches now report rendered_with: "http" in metadata
  • 405 JSON responses — structured JSON with error_code: "method_not_allowed"
  • Anchor link cleanup — empty anchor links and pilcrow/section signs stripped from Markdown
  • role="contentinfo" cleanup — ARIA landmark roles removed during cleaning
  • Tiny chunk merging — heading-only chunks merged with next chunk for better RAG quality
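
To illustrate the rate-limiting entry above, here is a minimal token-bucket sketch in Python. It is not crw's implementation. Only the rate_limit_rps setting (default 10), the 429 status, and the presence of an error_code field come from these notes. The class and function names, the burst capacity equal to the rate, the "rate_limited" code value, and the success field on error bodies are all assumptions made for the example.

```python
import time


class TokenBucket:
    """Token-bucket limiter that refills at `rate_limit_rps` tokens per second."""

    def __init__(self, rate_limit_rps=10.0, capacity=None, clock=time.monotonic):
        self.rate = float(rate_limit_rps)
        # Assumption: burst capacity equals the per-second rate unless given.
        self.capacity = float(rate_limit_rps if capacity is None else capacity)
        self.tokens = self.capacity  # start with a full bucket
        self.clock = clock           # injectable clock, handy for tests
        self.last = clock()

    def try_acquire(self):
        """Refill based on elapsed time, then spend one token if available."""
        now = self.clock()
        elapsed = now - self.last
        self.last = now
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def handle_request(bucket):
    """Return a (status, body) pair for one incoming request."""
    if bucket.try_acquire():
        return 200, {"success": True}
    # Hypothetical error_code value; the notes only say the field exists.
    return 429, {"success": False, "error_code": "rate_limited"}
```

With the default settings, a client that sends a burst larger than the bucket's capacity would receive 429 responses until enough time has passed for tokens to refill. Passing a fake clock makes that behaviour straightforward to check without sleeping.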
