- New extractors for LinkedIn, Threads, Bluesky, Discourse, Medium
- Footnotes refactor with sidenote support and more patterns
- Content boundary detection and eyebrow removals
- H1 fallback, title normalization
- Code blocks: fix duplicate language name (#235)
- Metadata: rel=author fallback, date deduplication, author name cleanup
- Keep content grids with lots of content
- Audio/video source parity, empty video placeholder removal
- X extractor refactored to use comments for replies