github binux/pyspider v0.3.9

latest release: v0.3.10
6 years ago

New features:

  • Support for Python 3.6.
  • Auto Pause: the project is paused for scheduler.PAUSE_TIME (default: 5min) when the last scheduler.FAIL_PAUSE_NUM (default: 10) tasks have all failed, and scheduler.UNPAUSE_CHECK_NUM (default: 3) tasks are dispatched after scheduler.PAUSE_TIME. The project resumes if any one of the last scheduler.UNPAUSE_CHECK_NUM tasks succeeds.
  • Each callback now has a default 30s process time limit. (Platform support required) @beader
  • New JavaScript render engine - Splash support: enabled by the fetch argument --splash-endpoint=http://splash:8050/execute
  • Python3 webdav support.
  • Python3 from projects import project support.
  • A link to the corresponding task is added to the webui debug page when debugging an existing task in the webui.
  • New user_agent parameter in self.crawl; you can still set the User-Agent via headers.
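
The Auto Pause rule listed above can be sketched as a small state machine. This is a toy model written for illustration only, not pyspider's scheduler code: the class AutoPauseModel and its methods are hypothetical, timestamps are plain seconds, and only the three constant names and their defaults come from the release note. It also assumes each dispatched task reports its result before the next one is dispatched.

```python
from collections import deque

# Names and defaults taken from the release note.
PAUSE_TIME = 5 * 60        # seconds a project stays paused
FAIL_PAUSE_NUM = 10        # consecutive failures that trigger a pause
UNPAUSE_CHECK_NUM = 3      # trial tasks dispatched after the pause


class AutoPauseModel:
    """Hypothetical toy model of the Auto Pause rule (not pyspider code)."""

    def __init__(self):
        # Sliding window of the most recent results; True means success.
        self.results = deque(maxlen=FAIL_PAUSE_NUM)
        self.paused_at = None   # time the current pause began, or None
        self.checks = []        # results of trial tasks sent after a pause

    def can_dispatch(self, now):
        """Return True if a task may be dispatched at time `now`."""
        if self.paused_at is None:
            return True
        if now - self.paused_at < PAUSE_TIME:
            return False
        # Pause has elapsed: allow a limited number of trial tasks.
        return len(self.checks) < UNPAUSE_CHECK_NUM

    def record(self, success, now):
        """Record the result of a finished task."""
        if self.paused_at is None:
            self.results.append(success)
            window_full = len(self.results) == FAIL_PAUSE_NUM
            if window_full and not any(self.results):
                self.paused_at = now          # last N tasks all failed
            return
        # Paused: this result belongs to a trial task.
        self.checks.append(success)
        if success:
            # Any successful trial resumes the project.
            self.paused_at = None
            self.checks = []
            self.results.clear()
        elif len(self.checks) >= UNPAUSE_CHECK_NUM:
            # Every trial failed: start a fresh pause.
            self.paused_at = now
            self.checks = []


model = AutoPauseModel()
for t in range(FAIL_PAUSE_NUM):
    model.record(False, now=t)                # ten failures in a row
paused = not model.can_dispatch(now=100)      # still inside the pause window
trial_ok = model.can_dispatch(now=9 + PAUSE_TIME)
model.record(True, now=10 + PAUSE_TIME)       # one trial succeeds
resumed = model.paused_at is None
```

In this model a single success among the trial tasks is enough to resume, matching the wording of the note; what the real scheduler does with tasks still in flight when a pause begins is not covered here.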

Fix several bugs:

  • New webui dashboard frontend framework - vue.js, which improves performance when there is a large number of tasks (e.g.
  • Fix crawl_config not working in the webui while debugging a script.
  • Fix CSS Selector Helper not working. @ackalker
  • Fix connection_timeout not working.
  • Fix need_auth option not being applied to webdav.
  • Fix "fix can't dump counter to file: scheduler.all" error.
  • Some other fixes
