Changed
- BREAKING: Removed methods `BaseStep::addToResult()`, `BaseStep::addLaterToResult()`, `BaseStep::addsToOrCreatesResult()`, `BaseStep::createsResult()`, and `BaseStep::keepInputData()`. These methods were deprecated in v1.8.0 and should be replaced with `Step::keep()`, `Step::keepAs()`, `Step::keepFromInput()`, and `Step::keepInputAs()`.
- BREAKING: With the removal of the `addToResult()` method, the library no longer uses `toArrayForAddToResult()` methods on output objects. Instead, please use `toArrayForResult()`. Consequently, `RespondedRequest::toArrayForAddToResult()` has been renamed to `RespondedRequest::toArrayForResult()`.
- BREAKING: Removed the `result` and `addLaterToResult` properties from `Io` objects (`Input` and `Output`). These properties were part of the `addToResult` feature and are now removed. Instead, use the `keep` property where kept data is added.
- BREAKING: The return type of the `Crawler::loader()` method no longer allows `array`. This means it's no longer possible to provide multiple loaders from the crawler. Instead, use the new functionality to directly provide a custom loader to a step described below.
- BREAKING: Refactored the abstract `LoadingStep` class to a trait and removed the `LoadingStepInterface`. Loading steps should now extend the `Step` class and use the trait. As multiple loaders are no longer supported, the `addLoader` method was renamed to `setLoader`. Similarly, the methods `useLoader()` and `usesLoader()` for selecting loaders by key are removed. Now, you can directly provide a different loader to a single step using the trait's new `withLoader()` method (e.g., `Http::get()->withLoader($loader)`).
- BREAKING: Removed the `PaginatorInterface` to allow for better extensibility. The old `Crwlr\Crawler\Steps\Loading\Http\Paginators\AbstractPaginator` class has also been removed. Please use the newer, improved version `Crwlr\Crawler\Steps\Loading\Http\AbstractPaginator`. This newer version has also changed: the first argument `UriInterface $url` is removed from the `processLoaded()` method, as the URL also is part of the request (`Psr\Http\Message\RequestInterface`) which is now the first argument. Additionally, the default implementation of the `getNextRequest()` method is removed. Child implementations must define this method themselves. If your custom paginator still has a `getNextUrl()` method, note that it is no longer needed by the library and will not be called. The `getNextRequest()` method now fulfills its original purpose.
- BREAKING: Removed methods from `HttpLoader`:
  - `$loader->setHeadlessBrowserOptions()` => use `$loader->browser()->setOptions()` instead
  - `$loader->addHeadlessBrowserOptions()` => use `$loader->browser()->addOptions()` instead
  - `$loader->setChromeExecutable()` => use `$loader->browser()->setExecutable()` instead
  - `$loader->browserHelper()` => use `$loader->browser()` instead
- BREAKING: Removed method `RespondedRequest::cacheKeyFromRequest()`. Use `RequestKey::from()` instead.
- BREAKING: The `HttpLoader::retryCachedErrorResponses()` method now returns an instance of the new `Crwlr\Crawler\Loader\Http\Cache\RetryManager` class. This class provides the methods `only()` and `except()` to restrict retries to specific HTTP response status codes. Previously, this method returned the `HttpLoader` itself (`$this`), so if you're using it in a chain and calling other loader methods after it, you will need to refactor your code.
- BREAKING: Removed the `Microseconds` class from this package. It has been moved to the `crwlr/utils` package, which you can use instead.
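
A few of the changes above can be illustrated with a short migration sketch. It is not taken from the library's documentation: the function names `configureLoader()` and `buildSteps()`, the browser option values, the executable name, the CSS selector, and the status codes are hypothetical, and it assumes that `only()` accepts an array of status codes. The v1 calls appear only in comments.

```php
<?php

declare(strict_types=1);

// Sketch only: assumes crwlr/crawler v2 is installed and autoloaded via Composer.

use Crwlr\Crawler\Loader\Http\HttpLoader;
use Crwlr\Crawler\Steps\Html;
use Crwlr\Crawler\Steps\Loading\Http;

/**
 * Hypothetical helper: headless browser settings now go through $loader->browser().
 */
function configureLoader(HttpLoader $loader): void
{
    // v1: $loader->setHeadlessBrowserOptions([...]);
    $loader->browser()->setOptions(['windowSize' => [1920, 1080]]); // illustrative option values

    // v1: $loader->setChromeExecutable('chromium');
    $loader->browser()->setExecutable('chromium'); // hypothetical executable name

    // retryCachedErrorResponses() now returns a RetryManager instead of the loader,
    // so no further HttpLoader methods can be chained after it.
    // Assumption: only() takes an array of HTTP status codes.
    $loader->retryCachedErrorResponses()->only([404, 503]);
}

/**
 * Hypothetical helper: builds two steps using the v2 replacements.
 *
 * @return array<int, object>
 */
function buildSteps(HttpLoader $customLoader): array
{
    return [
        // v1: a loader was selected by key with useLoader(); v2 passes it to the step directly.
        Http::get()->withLoader($customLoader),

        // v1: Html::root()->extract([...])->addToResult()
        Html::root()->extract(['title' => 'h1'])->keep(), // 'h1' is an illustrative selector
    ];
}
```

Because `retryCachedErrorResponses()` no longer returns the loader, the sketch places that call in its own statement rather than in the middle of a chain.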