- The task queue size is now more accurate; you can use it to determine whether the scheduler has finished all tasks.
- Fix tornado losing cookies during 30x redirects.
- Fix a bug where the proxy did not work.
- Enable proxy by default.
- Proxy now supports username and password authentication. @soloradish
- Etag and Last-Modified headers are now disabled when the last crawl failed.
- MySQL default engine changed to InnoDB @laapsaap
- MySQL: result column enlarged to MEDIUMBLOB (up to 16 MB) @laapsaap
- WebUI now uses the same arguments as the fetcher, which fixes the proxy not working for WebUI.
- Results will be sorted in the order of updatetime.
- Script exception logs are now printed to the screen.
- You can use the command `pyspider send_message [project] [message]` to send a message to a project from the command line.
- Use locally hosted test web pages.
- Remove the version pin on lxml; you can now install any version of lxml, for example via apt-get.
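
The Etag and Last-Modified change above can be illustrated with a minimal sketch. This is not pyspider's actual implementation; the function name `conditional_headers` and the record keys `ok`, `etag`, and `last_modified` are hypothetical and chosen only for illustration. The idea is that conditional request headers are sent only when the previous crawl succeeded, so a failed crawl is always followed by a full, unconditional fetch.

```python
def conditional_headers(last_crawl):
    """Build conditional request headers from a previous crawl record.

    `last_crawl` is a hypothetical dict with the keys 'ok', 'etag' and
    'last_modified'; it is an assumption, not a pyspider data structure.
    """
    headers = {}
    # No previous crawl, or the previous crawl failed:
    # send an unconditional request so the page is fetched in full.
    if not last_crawl or not last_crawl.get("ok"):
        return headers
    # Previous crawl succeeded: let the server answer 304 if nothing changed.
    if last_crawl.get("etag"):
        headers["If-None-Match"] = last_crawl["etag"]
    if last_crawl.get("last_modified"):
        headers["If-Modified-Since"] = last_crawl["last_modified"]
    return headers
```

Without this guard, a failed crawl followed by a 304 response could leave the project with no usable copy of the page.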