New
- Add includeReplies option to exclude replies from extractors (Reddit, HN, GitHub, Twitter/X)
- Standardize callouts (Obsidian Publish, GitHub, Bootstrap) (#182)
Improved
- YouTube: Improve mobile extraction
- Truncate descriptions to 300 words
- Use <base href> to resolve relative URLs (#179)
- Pass separateMarkdown in CLI when --markdown, --md, or --json is used (#164)
- Prefer highest resolution image (rel #177)
- Remove <wbr> tags to prevent unwanted spaces in Markdown (rel #172)
- Standardize removing anchors from headings
- Remove boundary patterns (#184)
- Exit with error when no content is extracted from CLI (rel #170)
- Use last match for metadata tags (#183)
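
For readers unfamiliar with how a <base href> changes URL resolution (#179), here is a minimal sketch of the idea. The helper name resolveWithBase and its parameters are hypothetical and are not taken from the library; only the standard URL constructor is assumed.

```typescript
// Hypothetical helper for illustration; not the library's actual API.
// Resolves a relative URL the way a browser would when the page
// declares a <base href>.
function resolveWithBase(
  relative: string,
  documentUrl: string,
  baseHref?: string
): string {
  try {
    // A <base href> may itself be relative, so resolve it against the
    // document URL first. Without a <base>, the document URL is the base.
    const base = baseHref ? new URL(baseHref, documentUrl).href : documentUrl;
    return new URL(relative, base).href;
  } catch (_err) {
    // Leave the value untouched if it cannot be parsed as a URL.
    return relative;
  }
}

const page = "https://example.com/blog/post.html";
const withBase = resolveWithBase("img/a.png", page, "/static/");
const withoutBase = resolveWithBase("img/a.png", page);
```

With the base set to /static/, the image resolves under https://example.com/static/ rather than under the page's own directory.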
Fixes
- Fix content extraction for pages without semantic entry points, e.g. Oxygen Builder
- Fix el.className for SVG elements where className is an SVGAnimatedString (#169)
- Unwrap <a> tags inside <code> elements to plain text before Markdown conversion (#168)
- Protect child elements inside code blocks from partial selector removal (#167)
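
The className issue (#169) arises because HTML elements expose className as a string, while SVG elements expose an SVGAnimatedString object whose text lives in baseVal. The sketch below shows one way to read the class safely. The getClassName helper and the ClassNameLike type are hypothetical names used for illustration, not the library's code.

```typescript
// Minimal structural type: a plain string for HTML elements,
// or an SVGAnimatedString-like object for SVG elements.
type ClassNameLike = string | { baseVal: string };

// Hypothetical helper: always returns the class list as a string.
function getClassName(el: { className?: ClassNameLike }): string {
  const cn = el.className;
  if (typeof cn === "string") return cn; // HTML element
  if (cn && typeof cn.baseVal === "string") return cn.baseVal; // SVG element
  return ""; // no class information available
}

const htmlLike = getClassName({ className: "post body" });
const svgLike = getClassName({ className: { baseVal: "icon" } });
```

In a real DOM, calling el.getAttribute("class") is another option that returns a string (or null) for both HTML and SVG elements.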