Breaking changes
- defuddle/node now accepts any DOM Document (linkedom, happy-dom, JSDOM, etc.), not just JSDOM.
- JSDOM is no longer a peer dependency. linkedom is now the recommended DOM parser.
- Passing a raw HTML string or JSDOM instance to defuddle/node is deprecated and will be removed in the next major version.
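
For call sites that still receive mixed inputs, one way to prepare for the removal is to normalize everything to a Document before calling Defuddle. The sketch below is only an illustration: toDocument is a hypothetical helper, not part of defuddle, and it assumes that a Document can be recognized by nodeType 9 and that a JSDOM instance exposes its document at window.document.

```javascript
// Hypothetical helper (not part of defuddle): turn the inputs that
// defuddle/node accepted historically into a plain DOM Document.
function toDocument(input) {
  // Raw HTML strings are deprecated: parse them first with a DOM library.
  if (typeof input === 'string') {
    throw new TypeError('Parse the HTML into a Document before calling Defuddle');
  }
  // Assumption: any object with nodeType 9 (DOCUMENT_NODE) is a Document.
  if (input && input.nodeType === 9) {
    return input;
  }
  // Assumption: JSDOM instances expose the document at dom.window.document.
  if (input && input.window && input.window.document) {
    return input.window.document;
  }
  throw new TypeError('Expected a DOM Document or a JSDOM instance');
}
```

With a helper like this, existing callers could switch to Defuddle(toDocument(input), url) and stop depending on the deprecated overloads.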
Recommended usage
import { parseHTML } from 'linkedom';
import { Defuddle } from 'defuddle/node';
const { document } = parseHTML(html);
const result = await Defuddle(document, 'https://example.com/article');

Passing a JSDOM instance still works but is deprecated:
// @deprecated — pass dom.window.document directly instead
const result = await Defuddle(dom, url);
// Preferred
const result = await Defuddle(dom.window.document, url);

Improvements
- Generic document support for non-HTML content (#166)
- YouTube: Use existing page transcript before fetching via API
- YouTube: Improved transcript grouping, sentence merging, and cross-environment support
- YouTube: Fix diarization stripping "-" speaker markers from auto-captions
- Add .post-body entry point for Ghost CMS sites
- Smarter retry for hidden content (#163)
- CJK word count support (#158)
- Precompile partial selector regex for faster parsing (#157)
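
To illustrate why CJK text needs its own word-count handling: splitting on whitespace counts a whole Chinese or Japanese sentence as a single word. The sketch below shows one common approach, counting each Han, Hiragana, or Katakana character as a word and splitting the rest on whitespace. It is an assumption about the general technique, not defuddle's actual implementation, and countWords is a hypothetical name. Hangul is deliberately left to whitespace splitting here, since Korean separates words with spaces.

```javascript
// Illustrative only: not defuddle's actual implementation.
// Matches a single Han, Hiragana, or Katakana character.
const CJK_CHAR = /[\p{Script=Han}\p{Script=Hiragana}\p{Script=Katakana}]/gu;

function countWords(text) {
  // Count each CJK character as one word.
  const cjkCount = (text.match(CJK_CHAR) || []).length;
  // Replace CJK characters with spaces, then count whitespace-separated tokens.
  const rest = text.replace(CJK_CHAR, ' ').trim();
  const otherCount = rest === '' ? 0 : rest.split(/\s+/).length;
  return cjkCount + otherCount;
}
```

The regex is built once at module level rather than inside the function, so repeated calls do not pay for recompiling it.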