github apify/crawlee v0.15.4

4 years ago
  • DEPRECATED: The dataset.delete(), keyValueStore.delete() and requestQueue.delete() methods have been deprecated in favor of the corresponding *.drop() methods. The name drop communicates more clearly that these methods drop / delete the storage itself, not individual elements in the storage.
  • Added the Apify.utils.requestAsBrowser() helper function, which lets you make HTTP(S) requests disguised as a browser (Firefox). This may help in overcoming certain anti-scraping and anti-bot protections.
  • Added options.gotoTimeoutSecs to PuppeteerCrawler to enable easier setting of navigation timeouts.
  • PuppeteerPool options that were deprecated from the PuppeteerCrawler constructor were finally removed. Please use maxOpenPagesPerInstance, retireInstanceAfterRequestCount, instanceKillerIntervalSecs, killInstanceAfterSecs and proxyUrls via the puppeteerPoolOptions object.
  • On the Apify Platform, a warning will now be printed when using an outdated apify package version.
  • Apify.utils.puppeteer.enqueueLinksByClickingElements() will now print a warning when the nodes it tries to click are modified (detached from the DOM). This is useful for debugging unexpected behavior.
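
The removal of the deprecated PuppeteerPool options from the PuppeteerCrawler constructor means that older option objects need those keys moved under puppeteerPoolOptions. The following is a minimal sketch of that move in plain JavaScript. It does not import the apify package, and migrateCrawlerOptions is a hypothetical helper written for illustration, not part of the SDK. Only the option names come from the notes above; the proxy URL and the timeout value are placeholders.

```javascript
// Option names listed in the release note as PuppeteerPool options.
const POOL_OPTION_KEYS = [
    'maxOpenPagesPerInstance',
    'retireInstanceAfterRequestCount',
    'instanceKillerIntervalSecs',
    'killInstanceAfterSecs',
    'proxyUrls',
];

// Hypothetical helper (not part of the apify package): returns a copy of
// the options with any top-level pool options moved under puppeteerPoolOptions.
function migrateCrawlerOptions(legacyOptions) {
    const { puppeteerPoolOptions = {}, ...rest } = legacyOptions;
    const pool = { ...puppeteerPoolOptions };
    const migrated = {};

    for (const [key, value] of Object.entries(rest)) {
        if (POOL_OPTION_KEYS.includes(key)) {
            // Assumption: a value already set under puppeteerPoolOptions
            // takes precedence over the legacy top-level value.
            if (!(key in pool)) pool[key] = value;
        } else {
            migrated[key] = value;
        }
    }

    // Only add the nested object when there is something to put in it.
    if (Object.keys(pool).length > 0) migrated.puppeteerPoolOptions = pool;
    return migrated;
}

// Example: legacy-style options with pool settings at the top level.
const migrated = migrateCrawlerOptions({
    gotoTimeoutSecs: 60, // placeholder value
    maxOpenPagesPerInstance: 5,
    proxyUrls: ['http://proxy.example.com:8000'], // placeholder URL
});
console.log(migrated);
```

The helper leaves unrelated options such as gotoTimeoutSecs at the top level, so its result could be passed on to the crawler constructor alongside the other required options.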
