- BREAKING: `Apify.utils.requestAsBrowser()` no longer aborts the request on status code 406
  or when a content type other than `text/html` is received. Use `options.abortFunction`
  if you want to retain this functionality.
- BREAKING: Added
  `useInsecureHttpParser` option to `Apify.utils.requestAsBrowser()`, which is `true`
  by default and forces the function to use an HTTP parser that is less strict than
  the default Node 12 parser, but also less secure. It is needed to bypass certain
  anti-scraping walls and to fetch websites that do not comply with the HTTP spec.
- BREAKING:
  `RequestList` now removes all the elements from the `sources` array on
  initialization. If you need to use the sources somewhere else, make a copy. This change
  was added as one of several measures to improve memory management of `RequestList`
  in scenarios with a very large number of `Request` instances.
- DEPRECATED:
  `RequestListOptions.persistSourcesKey` is now deprecated. Please use
  `RequestListOptions.persistRequestsKey`.
- `RequestListOptions.sources`
  can now be an array of `string` URLs as well.
- Added
  `sourcesFunction` to `RequestListOptions`. It enables dynamic fetching of sources
  and will only be called if persisted `Requests` were not retrieved from the key-value store.
  Use it to reduce memory spikes and to make sure that your sources are not re-created
  on actor restarts.
- Updated
  `stealth` hiding of `webdriver` to avoid recent detections.
- `Apify.utils.log`
  now points to an updated logger instance which prints colored logs (in TTY)
  and supports overriding with custom loggers.
- Improved
  `Apify.launchPuppeteer()` code so that it no longer passes more options than required
  to `puppeteer.launch()`, which could trigger bugs in Puppeteer.
- Documented
  `BasicCrawler.autoscaledPool` property, and added `CheerioCrawler.autoscaledPool`
  and `PuppeteerCrawler.autoscaledPool` properties.
- `SessionPool`
  now persists state on `teardown`. Before, it only persisted state every minute.
  This ensures that after a crawler finishes, the state is correctly persisted.
- Added TypeScript typings and typedef documentation for all entities used throughout the SDK.
- Upgraded `proxy-chain` NPM package from 0.2.7 to 0.4.1 and many other dependencies.
- Removed all usage of the now deprecated
  `request` package.
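
To retain the earlier abort behavior of `Apify.utils.requestAsBrowser()`, a predicate along the following lines could be passed as `options.abortFunction`. This is only a sketch: the helper name `abortOnNonHtml` is hypothetical, and it assumes the function receives a response object exposing `statusCode` and `headers` and that returning `true` aborts the request. Treating a missing `Content-Type` header as non-HTML is also an assumption.

```javascript
// Hypothetical helper (not part of the SDK): returns true when the request
// should be aborted, approximating the behavior described as removed above.
function abortOnNonHtml(response) {
    const { statusCode, headers = {} } = response;

    // Abort on 406 Not Acceptable.
    if (statusCode === 406) return true;

    // Compare only the MIME type, ignoring parameters such as "; charset=utf-8".
    const contentType = String(headers['content-type'] || '');
    const mimeType = contentType.split(';')[0].trim().toLowerCase();

    // Assumption: a missing Content-Type header counts as "not text/html".
    return mimeType !== 'text/html';
}

// Possible usage inside an actor (assumes the `apify` package is installed):
// const Apify = require('apify');
// const response = await Apify.utils.requestAsBrowser({
//     url: 'https://example.com',
//     abortFunction: abortOnNonHtml,
// });
```

Keeping the check in a standalone function makes it easy to loosen later, for example to also accept `application/xhtml+xml`.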
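
The `RequestList` changes above can be illustrated with a short sketch. The helper `buildSources`, the example URLs, and the key name `my-requests` are hypothetical; the commented usage assumes the `apify` package is installed and that `sourcesFunction` is an async function resolving to an array of sources.

```javascript
// Hypothetical helper: builds request sources on demand instead of keeping
// a large array alive in memory for the whole run.
function buildSources(ids) {
    return ids.map((id) => ({ url: `https://example.com/items/${id}` }));
}

// Candidate for `RequestListOptions.sourcesFunction`.
const sourcesFunction = async () => buildSources([1, 2, 3]);

// `sources` may mix plain string URLs and request-like objects.
const sources = ['https://example.com/a', { url: 'https://example.com/b' }];

// RequestList empties the array it is given on initialization, so keep a
// shallow copy if the sources are needed elsewhere.
const sourcesCopy = [...sources];

// Possible usage inside an actor:
// const Apify = require('apify');
// const requestList = new Apify.RequestList({
//     sourcesFunction,
//     persistRequestsKey: 'my-requests',
// });
// await requestList.initialize();
```

A shallow copy is enough when the source entries themselves are not mutated afterwards.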