pypi Scrapy 1.2.0

7 years ago

New Features

  • New FEED_EXPORT_ENCODING setting to customize the encoding
    used when writing items to a file.
    This can be used to turn off \uXXXX escapes in JSON output.
    This is also useful for those wanting something other than UTF-8
    for XML or CSV output (#2034).
  • startproject command now supports an optional destination directory
    to override the default one based on the project name (#2005).
  • New SCHEDULER_DEBUG setting to log request serialization
    failures (#1610).
  • JSON encoder now supports serialization of set instances (#2058).
  • Interpret application/json-amazonui-streaming as TextResponse (#1503).
  • scrapy is imported by default when using shell tools (shell,
    inspect_response) (#2248).
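
The effect of two of the features above (the \uXXXX escapes that FEED_EXPORT_ENCODING can turn off, and serialization of set instances) can be illustrated with the standard library alone. This is a minimal sketch, not Scrapy's own exporter code: SetFriendlyEncoder and the sample item are hypothetical names made up for the example, and sorting the set is a choice made here only to keep the output deterministic.

```python
import json


class SetFriendlyEncoder(json.JSONEncoder):
    """Hypothetical encoder (not Scrapy's): serializes sets as sorted lists."""

    def default(self, o):
        if isinstance(o, (set, frozenset)):
            return sorted(o)
        # Anything else falls back to the standard behavior (raises TypeError).
        return super().default(o)


# Sample item with non-ASCII text and a set-valued field.
item = {"city": "São Paulo", "tags": {"b", "a"}}

# Default JSON behavior: non-ASCII characters are written as \uXXXX escapes.
escaped = json.dumps(item, cls=SetFriendlyEncoder)

# With escaping disabled the text stays readable; the result can then be
# written to a file in an explicit encoding such as UTF-8.
readable = json.dumps(item, cls=SetFriendlyEncoder, ensure_ascii=False)

print(escaped)
print(readable)
```

In a Scrapy project the corresponding switch would be a single line in settings.py, for example FEED_EXPORT_ENCODING = 'utf-8'. Both strings above decode to the same data; only the on-disk representation differs.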

Bug fixes

  • DefaultRequestHeaders middleware now runs before UserAgent middleware
    (#2088). Warning: this is technically backwards incompatible,
    though we consider this a bug fix.
  • HTTP cache extension and plugins that use the .scrapy data directory now
    work outside projects (#1581). Warning: this is technically
    backwards incompatible, though we consider this a bug fix.
  • Selector does not allow passing both response and text anymore
    (#2153).
  • Fixed logging of wrong callback name with scrapy parse (#2169).
  • Fix for an odd gzip decompression bug (#1606).
  • Fix for selected callbacks when using CrawlSpider with scrapy parse
    (#2225).
  • Fix for invalid JSON and XML files when spider yields no items (#872).
  • Implement flush() for StreamLogger avoiding a warning in logs (#2125).
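
To show why a missing flush() matters for a stream-like logger such as the one in the last item, here is a minimal sketch of a file-like wrapper that forwards writes to the logging module. It is an illustration under assumptions, not Scrapy's StreamLogger: LineLogger, ListHandler, and the logger name are hypothetical. Any caller that invokes flush() on a stream replacement lacking that method gets an AttributeError, which is the kind of problem such a fix avoids.

```python
import logging


class LineLogger:
    """Hypothetical file-like object that logs each written line."""

    def __init__(self, logger, level=logging.INFO):
        self.logger = logger
        self.level = level

    def write(self, text):
        # Log one record per non-empty line of the written text.
        for line in text.rstrip().splitlines():
            self.logger.log(self.level, line.rstrip())

    def flush(self):
        # File-like objects are expected to provide flush();
        # delegate to the handlers attached to the logger.
        for handler in self.logger.handlers:
            handler.flush()


class ListHandler(logging.Handler):
    """Collects log messages in memory so the example can inspect them."""

    def __init__(self):
        super().__init__()
        self.lines = []

    def emit(self, record):
        self.lines.append(record.getMessage())


logger = logging.getLogger("stream_demo")
logger.setLevel(logging.INFO)
logger.propagate = False
handler = ListHandler()
logger.addHandler(handler)

stream = LineLogger(logger)
print("first line\nsecond line", file=stream)
stream.flush()  # works because flush() is implemented
```

After running this, handler.lines holds the two messages in order, and the explicit flush() call completes without error.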

Refactoring

Tests & Requirements

Scrapy's new requirements baseline is Debian 8 "Jessie". It was previously Ubuntu 12.04 Precise.
What this means in practice is that we run continuous integration tests with at least these versions of the main packages: Twisted 14.0, pyOpenSSL 0.14, lxml 3.4.

Scrapy may very well work with older versions of these packages (the code base still has switches for older Twisted versions, for example), but this is not guaranteed because those versions are no longer tested.
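
A quick way to compare an environment against the tested baseline above is a small version check. The sketch below is an assumption-laden illustration, not a tool shipped with Scrapy: the function names are hypothetical, the version parsing is deliberately simplified (it only looks at leading numeric components), and importlib.metadata requires a much newer Python than this release targeted.

```python
import re
from importlib import metadata

# Minimum versions taken from the baseline stated above.
MINIMUMS = {"Twisted": "14.0", "pyOpenSSL": "0.14", "lxml": "3.4"}


def version_tuple(version):
    """Turn '3.4.1' into (3, 4, 1), stopping at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if not match:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare two dotted versions, padding the shorter one with zeros."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_environment(minimums=MINIMUMS):
    """Map each package to True/False, or None if it is not installed."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None
            continue
        report[name] = meets_minimum(installed, minimum)
    return report


print(check_environment())
```

A result of False for a package means it is older than the tested baseline. As noted above, Scrapy may still work in that case, but nothing verifies it.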

Documentation

  • Grammar fixes: #2128, #1566.
  • Download stats badge removed from README (#2160).
  • New Scrapy architecture diagram (#2165).
  • Updated Response parameters documentation (#2197).
  • Reworded misleading RANDOMIZE_DOWNLOAD_DELAY description (#2190).
  • Add StackOverflow as a support channel (#2257).
