github binux/pyspider v0.3.2

Scheduler

  • The reported task queue size is now more accurate, so you can use it to determine whether the scheduler has finished all tasks.
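
A minimal sketch of how the queue size could be used to detect the "all done" state. The function name wait_until_done and its parameters are hypothetical, and get_size is a placeholder for however you read the queue size in your deployment (for example, a call to the scheduler's RPC interface). Requiring several consecutive zero readings is a design choice here, not documented pyspider behaviour: it guards against a momentarily empty queue while new tasks are still being produced.

```python
import time


def wait_until_done(get_size, interval=1.0, stable_checks=3, timeout=None):
    """Return True once get_size() reports 0 for `stable_checks` consecutive
    polls, or False if `timeout` seconds pass first.

    get_size is any zero-argument callable returning the current queue size
    (hypothetical; supply your own way of querying the scheduler).
    """
    deadline = None if timeout is None else time.monotonic() + timeout
    zeros = 0
    while True:
        # Count consecutive empty readings; any non-zero size resets the count.
        zeros = zeros + 1 if get_size() == 0 else 0
        if zeros >= stable_checks:
            return True
        if deadline is not None and time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Call it with a callable that returns the current size, e.g. wait_until_done(my_size_reader, interval=5, timeout=3600), where my_size_reader is your own function.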

Fetcher

  • Fix tornado losing cookies while following 30x redirects.
  • You can now use cookies and a Cookie header at the same time.
  • Fix the proxy not working.
  • Enable proxy by default.
  • Proxy now supports username and password authorization. @soloradish
  • ETag and Last-Modified headers are now disabled when the last crawl failed.
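
The ETag / Last-Modified rule in the last item can be sketched as follows. This is an illustration of the idea, not pyspider's actual fetcher code: the function conditional_headers and the shape of the track dict are assumptions made for the example. The point is that conditional request headers are only sent when the previous crawl succeeded, so a failed crawl is always retried with a full, unconditional request.

```python
def conditional_headers(track, use_etag=True, use_last_modified=True):
    """Build If-None-Match / If-Modified-Since headers from the previous crawl.

    `track` is a hypothetical record of the last crawl, e.g.
    {"ok": True, "headers": {"ETag": '"abc"', "Last-Modified": "..."}}.
    """
    headers = {}
    if not track or not track.get("ok"):
        # Last crawl failed (or never ran): send an unconditional request.
        return headers
    # Header names are case-insensitive, so normalise before looking them up.
    previous = {k.lower(): v for k, v in (track.get("headers") or {}).items()}
    if use_etag and previous.get("etag"):
        headers["If-None-Match"] = previous["etag"]
    if use_last_modified and previous.get("last-modified"):
        headers["If-Modified-Since"] = previous["last-modified"]
    return headers
```

With a successful previous crawl the function returns the validators to send; with a failed one it returns an empty dict, so the server cannot answer 304 Not Modified for a page that was never processed correctly.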

Databases

  • MySQL: default engine changed to InnoDB. @laapsaap
  • MySQL: result column enlarged to MEDIUMBLOB (up to 16 MB). @laapsaap

WebUI

  • WebUI now uses the same arguments as the fetcher, which fixes the proxy not working for WebUI.
  • Results are now sorted by updatetime.

One Mode

  • Script exception logs are now printed to the screen.

New Command send_message

You can use the command pyspider send_message [project] [message] to send a message to a project from the command line.

Other

  • Use locally hosted test web pages.
  • Remove the lxml version pin; you can install any version of lxml with apt-get.
