Improvements
- Flatten shadow DOM to extract content from web components (#80)
- Add pipeline toggles for diagnosing content extraction issues (#145)
- Improve content selection for pages with multiple article elements (#97)
- Improve scoring for bylines and related posts (#147)
- Add Tailwind hidden classes to hidden element detection
- Add retry for index pages
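
As a rough illustration of the Tailwind item above, hidden element detection by class name might look like the sketch below. The function name `hasHiddenUtility` and the set of class names are assumptions made for this example; they are not taken from the project's code, and the real detection may cover more classes or handle variants differently.

```typescript
// Hypothetical sketch: the class list is chosen for illustration only.
// "hidden" (display: none) and "invisible" (visibility: hidden) are Tailwind utilities.
const HIDDEN_UTILITIES: ReadonlySet<string> = new Set(["hidden", "invisible"]);

// Returns true when the class attribute contains an unprefixed hiding utility.
// Variant-prefixed tokens such as "md:hidden" are ignored in this sketch,
// because they only apply under a breakpoint or state.
function hasHiddenUtility(classAttr: string): boolean {
  return classAttr
    .split(/\s+/)
    .some((token) => HIDDEN_UTILITIES.has(token));
}

console.log(hasHiddenUtility("hidden p-4"));     // true
console.log(hasHiddenUtility("md:hidden flex")); // false
```

This sketch would also flag `hidden md:block`, which is visible at wider breakpoints, so a fuller check would likely need to account for responsive overrides.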
Fixes
- Clip individual Hacker News comments
- Add more removal patterns for common clutter elements