Added
- The `DomQuery` class (parent of `CssSelector` (`Dom::cssSelector`) and `XPathQuery` (`Dom::xPath`)) has a new method `formattedText()` that uses the new crwlr/html-2-text package to convert the HTML to formatted plain text. You can also provide a customized instance of the `Html2Text` class to the `formattedText()` method.
Fixed
- The `Http::crawl()` step won't yield a page again if a newly found URL responds with a redirect to a previously loaded URL.