kepano/defuddle 0.9.0
on GitHub

one month ago

Improvements

Async extraction support, e.g. for X URLs
Generic footnote detection fallback and backref cleanup (#138, #120)
Substack app support
Better Wikidot support
Better heading/code/pre preservation
Shiki language detection for code blocks
Improved scoring around code blocks and bios
Fixed nested list indentation

Fixes

Fix HTML element with id="menu" breaking content extraction (#106)
Fix page content not being able to start with a divider (#114)
Fix invalid CSS selector span.leading-tight,, img (#128)
Fix [href*="/category"] exact selector removing legitimate page content (#131)
Fix .hero exact selector removing primary content on documentation landing pages (#132)
Fix content of <time> element being removed (#136)
Fix DOMParser is not defined when running via defuddle/node (#137)
Fix content sanitization bypass via schema.org text fallback (#139)

Security

Fix XSS via attribute injection in image handling
Sanitize HTML to prevent unsafe elements in schema text fallback (#139)
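
To illustrate the general class of bug behind the attribute injection fix, here is a minimal sketch, not defuddle's actual code. The helper names `escapeAttr` and `buildImg` are hypothetical. It assumes the problem arises when an `<img>` tag is assembled as a string from untrusted values, and shows how escaping each attribute value keeps a hostile `alt` text from adding its own attributes.

```typescript
// Hypothetical helpers for illustration only; not part of defuddle's API.

/** Escape characters that could end or break out of a quoted attribute value. */
function escapeAttr(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

/** Build an <img> tag from untrusted values, escaping every attribute. */
function buildImg(src: string, alt: string): string {
  return `<img src="${escapeAttr(src)}" alt="${escapeAttr(alt)}">`;
}

// Untrusted alt text that tries to close the attribute and add an event handler.
const hostileAlt = '" onerror="alert(1)';

// Naive concatenation: the quote in hostileAlt ends the alt attribute early,
// so onerror becomes a real attribute on the element.
const unsafe = `<img src="a.png" alt="${hostileAlt}">`;

// Escaped version: the quotes become &quot; and stay inside the alt value.
const safe = buildImg("a.png", hostileAlt);
```

Escaping attribute values does not cover unsafe URL schemes such as `javascript:` in `src`; those need a separate check.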

Other

New website (#133), playground updates, README updates
