Release notes — Scrapy 0.22.0 documentation

Release notes

0.22.0 (released 2014-01-17)

Enhancements

Fixes

  • Update Selector class imports in CrawlSpider template (issue 484)
  • Fix nonexistent reference to engine.slots (issue 464)
  • Do not try to call body_as_unicode() on a non-TextResponse instance (issue 462); see the sketch after this list
  • Warn when subclassing XPathItemLoader, previously it only warned on instantiation. (issue 523)
  • Warn when subclassing XPathSelector, previously it only warned on instantiation. (issue 537)
  • Multiple fixes to memory stats (issue 531, issue 530, issue 529)
  • Fix overriding url in FormRequest.from_response() (issue 507)
  • Fix tests runner under pip 1.5 (issue 513)
  • Fix logging error when spider name is unicode (issue 479)
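
The body_as_unicode() fix above amounts to guarding text-only APIs behind an isinstance check. A minimal sketch of that pattern (the helper name is hypothetical, not Scrapy API):

    from scrapy.http import TextResponse

    def response_text(response):
        # body_as_unicode() exists only on TextResponse; binary responses
        # (images, archives, ...) must not be decoded blindly
        if isinstance(response, TextResponse):
            return response.body_as_unicode()
        return None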

0.20.2 (released 2013-12-09)

0.20.1 (released 2013-11-28)

  • include_package_data is required to build wheels from published sources (commit 5ba1ad5)
  • process_parallel was leaking the failures on its internal deferreds. closes #458 (commit 419a780)

0.20.0 (released 2013-11-08)

Enhancements

  • New Selector API, including CSS selectors (issue 395 and issue 426); see the example after this list
  • Request/Response url/body attributes are now immutable (modifying them had been deprecated for a long time)
  • ITEM_PIPELINES is now defined as a dict instead of a list (also shown in the example below)
  • Sitemap spider can fetch alternate URLs (issue 360)
  • Selector.remove_namespaces() now removes namespaces from element attributes (issue 416)
  • Paved the road for Python 3.3+ (issue 435, issue 436, issue 431, issue 452)
  • New item exporter using native python types with nesting support (issue 366)
  • Tune HTTP1.1 pool size so it matches concurrency defined by settings (commit b43b5f575)
  • scrapy.mail.MailSender now can connect over TLS or upgrade using STARTTLS (issue 327)
  • New FilesPipeline with functionality factored out from ImagesPipeline (issue 370, issue 409)
  • Recommend Pillow instead of PIL for image handling (issue 317)
  • Added debian packages for Ubuntu quantal and raring (commit 86230c0)
  • Mock server (used for tests) can listen for HTTPS requests (issue 410)
  • Remove multi spider support from multiple core components (issue 422, issue 421, issue 420, issue 419, issue 423, issue 418)
  • Travis-CI now tests Scrapy changes against development versions of w3lib and queuelib python packages.
  • Add pypy 2.1 to continuous integration tests (commit ecfa7431)
  • Cleaned up source with pylint and pep8, and removed old-style exceptions (issue 430, issue 432)
  • Use importlib for parametric imports (issue 445)
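
Two of the enhancements above, the CSS-capable Selector API and the dict-based ITEM_PIPELINES setting, look roughly like this in use. A hedged sketch; the project path, pipeline class and item class are hypothetical:

    # settings.py -- ITEM_PIPELINES is now a dict mapping pipeline path to order:
    ITEM_PIPELINES = {
        'myproject.pipelines.PricePipeline': 300,
    }

    # In a spider callback, the new Selector API exposes CSS alongside XPath:
    from scrapy.item import Item, Field
    from scrapy.selector import Selector

    class PageItem(Item):   # hypothetical item with a single field
        title = Field()

    def parse(self, response):
        sel = Selector(response)
        return PageItem(title=sel.css('title::text').extract())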
Bugfixes

  • Handle a regression introduced in Python 2.7.5 that affects XmlItemExporter (issue 372)
  • Bugfix crawling shutdown on SIGINT (issue 450)
  • Do not submit reset type inputs in FormRequest.from_response (commit b326b87)
  • Do not silence download errors when request errback raises an exception (commit 684cfc0)

Other

  • Dropped Python 2.6 support (issue 448)
  • Add cssselect python package as install dependency
  • Dropped libxml2 and the multi-selector backend support; lxml is now required.
  • Minimum Twisted version increased to 10.0.0, dropped Twisted 8.0 support.
  • Running test suite now requires mock python library (issue 390)

Thanks

Thanks to everyone who contributed to this release!

List of contributors sorted by number of commits:

69 Daniel Graña <dangra@...>
37 Pablo Hoffman <pablo@...>
13 Mikhail Korobov <kmike84@...>
 9 Alex Cepoi <alex.cepoi@...>
 9 alexanderlukanin13 <alexander.lukanin.13@...>
 8 Rolando Espinoza La fuente <darkrho@...>
 8 Lukasz Biedrycki <lukasz.biedrycki@...>
 6 Nicolas Ramirez <nramirez.uy@...>
 3 Paul Tremberth <paul.tremberth@...>
 2 Martin Olveyra <molveyra@...>
 2 Stefan <misc@...>
 2 Rolando Espinoza <darkrho@...>
 2 Loren Davie <loren@...>
 2 irgmedeiros <irgmedeiros@...>
 1 Stefan Koch <taikano@...>
 1 Stefan <cct@...>
 1 scraperdragon <dragon@...>
 1 Kumara Tharmalingam <ktharmal@...>
 1 Francesco Piccinno <stack.box@...>
 1 Marcos Campal <duendex@...>
 1 Dragon Dave <dragon@...>
 1 Capi Etheriel <barraponto@...>
 1 cacovsky <amarquesferraz@...>
 1 Berend Iwema <berend@...>

0.18.4 (released 2013-10-10)

  • IPython refuses to update the namespace. fix #396 (commit 3d32c4f)
  • Fix AlreadyCalledError replacing a request in shell command. closes #407 (commit b1d8919)
  • Fix start_requests lazyness and early hangs (commit 89faf52)

0.18.3 (released 2013-10-03)

0.18.2 (released 2013-09-03)

  • Backport scrapy check command fixes and backward compatible multi crawler process (issue 339)

0.18.1 (released 2013-08-27)

  • remove extra import added by cherry picked changes (commit d20304e)
  • fix crawling tests under twisted pre 11.0.0 (commit 1994f38)
  • py26 cannot format zero length fields {} (commit abf756f)
  • test PotentialDataLoss errors on unbound responses (commit b15470d)
  • Treat responses without content-length or Transfer-Encoding as good responses (commit c4bf324)
  • do not include ResponseFailed if http11 handler is not enabled (commit 6cbe684)
  • New HTTP client wraps connection losses in a ResponseFailed exception. fix #373 (commit 1a20bba)
  • limit travis-ci build matrix (commit 3b01bb8)
  • Merge pull request #375 from peterarenot/patch-1 (commit fa766d7)
  • Fixed so it refers to the correct folder (commit 3283809)
  • added quantal & raring to support ubuntu releases (commit 1411923)
  • fix retry middleware which didn’t retry certain connection errors after the upgrade to http1 client, closes GH-373 (commit bb35ed0)
  • fix XmlItemExporter in Python 2.7.4 and 2.7.5 (commit de3e451)
  • minor updates to 0.18 release notes (commit c45e5f1)
  • fix contributors list format (commit 0b60031)

0.18.0 (released 2013-08-09)

  • Lots of improvements to the testsuite run using Tox, including a way to test on pypi
  • Handle GET parameters for AJAX crawlable urls (commit 3fe2a32)
  • Use lxml recover option to parse sitemaps (issue 347)
  • Bugfix cookie merging by hostname and not by netloc (issue 352)
  • Support disabling HttpCompressionMiddleware using a flag setting (issue 359)
  • Support xml namespaces using iternodes parser in XMLFeedSpider (issue 12)
  • Support dont_cache request meta flag (issue 19)
  • Bugfix scrapy.utils.gz.gunzip broken by changes in python 2.7.4 (commit 4dc76e)
  • Bugfix url encoding on SgmlLinkExtractor (issue 24)
  • Bugfix TakeFirst processor shouldn’t discard zero (0) value (issue 59)
  • Support nested items in xml exporter (issue 66)
  • Improve cookies handling performance (issue 77)
  • Log dupe filtered requests once (issue 105)
  • Split redirection middleware into status and meta based middlewares (issue 78)
  • Use HTTP1.1 as default downloader handler (issue 109 and issue 318)
  • Support xpath form selection on FormRequest.from_response (issue 185); see the sketch after this list
  • Bugfix unicode decoding error on SgmlLinkExtractor (issue 199)
  • Bugfix signal dispatching on pypy interpreter (issue 205)
  • Improve request delay and concurrency handling (issue 206)
  • Add RFC2616 cache policy to HttpCacheMiddleware (issue 212)
  • Allow customization of messages logged by engine (issue 214)
  • Multiple improvements to DjangoItem (issue 217, issue 218, issue 221)
  • Extend Scrapy commands using setuptools entry points (issue 260)
  • Allow spider allowed_domains value to be a set/tuple (issue 261)
  • Support settings.getdict (issue 269)
  • Simplify internal scrapy.core.scraper slot handling (issue 271)
  • Added Item.copy (issue 290)
  • Collect idle downloader slots (issue 297)
  • Add ftp:// scheme downloader handler (issue 329)
  • Added downloader benchmark webserver and spider tools (see Benchmarking)
  • Moved persistent (on disk) queues to a separate project (queuelib) which scrapy now depends on
  • Add scrapy commands using external libraries (issue 260)
  • Added --pdb option to scrapy command line tool
  • Added XPathSelector.remove_namespaces(), which removes all namespaces from XML documents for convenience (to work with namespace-less XPaths). Documented in Selectors.
  • Several improvements to spider contracts
  • New default middleware named MetaRefreshMiddleware that handles meta-refresh html tag redirections
  • MetaRefreshMiddleware and RedirectMiddleware have different priorities to address #62
  • added from_crawler method to spiders
  • added system tests with mock server
  • more improvements to Mac OS compatibility (thanks Alex Cepoi)
  • several more cleanups to singletons and multi-spider support (thanks Nicolas Ramirez)
  • support custom download slots
  • added --spider option to “shell” command.
  • log overridden settings when scrapy starts
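
One item above, XPath form selection for FormRequest.from_response() (issue 185), can be sketched as follows. The formxpath argument name matches the later documentation and should be treated as an assumption here; the URL and field names are made up:

    from scrapy.http import FormRequest

    def parse_login_page(self, response):
        # pick the login form by XPath instead of by index or name
        return FormRequest.from_response(
            response,
            formxpath='//form[@id="login"]',
            formdata={'username': 'john', 'password': 'secret'},
            callback=self.after_login,
        )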

Thanks to everyone who contributed to this release. Here is a list of contributors sorted by number of commits:

130 Pablo Hoffman <pablo@...>
 97 Daniel Graña <dangra@...>
 20 Nicolás Ramírez <nramirez.uy@...>
 13 Mikhail Korobov <kmike84@...>
 12 Pedro Faustino <pedrobandim@...>
 11 Steven Almeroth <sroth77@...>
  5 Rolando Espinoza La fuente <darkrho@...>
  4 Michal Danilak <mimino.coder@...>
  4 Alex Cepoi <alex.cepoi@...>
  4 Alexandr N Zamaraev (aka tonal) <tonal@...>
  3 paul <paul.tremberth@...>
  3 Martin Olveyra <molveyra@...>
  3 Jordi Llonch <llonchj@...>
  3 arijitchakraborty <myself.arijit@...>
  2 Shane Evans <shane.evans@...>
  2 joehillen <joehillen@...>
  2 Hart <HartSimha@...>
  2 Dan <ellisd23@...>
  1 Zuhao Wan <wanzuhao@...>
  1 whodatninja <blake@...>
  1 vkrest <v.krestiannykov@...>
  1 tpeng <pengtaoo@...>
  1 Tom Mortimer-Jones <tom@...>
  1 Rocio Aramberri <roschegel@...>
  1 Pedro <pedro@...>
  1 notsobad <wangxiaohugg@...>
  1 Natan L <kuyanatan.nlao@...>
  1 Mark Grey <mark.grey@...>
  1 Luan <luanpab@...>
  1 Libor Nenadál <libor.nenadal@...>
  1 Juan M Uys <opyate@...>
  1 Jonas Brunsgaard <jonas.brunsgaard@...>
  1 Ilya Baryshev <baryshev@...>
  1 Hasnain Lakhani <m.hasnain.lakhani@...>
  1 Emanuel Schorsch <emschorsch@...>
  1 Chris Tilden <chris.tilden@...>
  1 Capi Etheriel <barraponto@...>
  1 cacovsky <amarquesferraz@...>
  1 Berend Iwema <berend@...>

0.16.5 (released 2013-05-30)

  • obey request method when scrapy deploy is redirected to a new endpoint (commit 8c4fcee)
  • fix inaccurate downloader middleware documentation. refs #280 (commit 40667cb)
  • doc: remove links to diveintopython.org, which is no longer available. closes #246 (commit bd58bfa)
  • Find form nodes in invalid html5 documents (commit e3d6945)
  • Fix typo labeling attrs type bool instead of list (commit a274276)

0.16.4 (released 2013-01-23)

  • fixes spelling errors in documentation (commit 6d2b3aa)
  • add doc about disabling an extension. refs #132 (commit c90de33)
  • Fixed error message formatting. log.err() doesn’t support cool formatting and when an error occurred, the message was: “ERROR: Error processing %(item)s” (commit c16150c)
  • lint and improve images pipeline error logging (commit 56b45fc)
  • fixed doc typos (commit 243be84)
  • add documentation topics: Broad Crawls & Common Practices (commit 1fbb715)
  • fix bug in scrapy parse command when spider is not specified explicitly. closes #209 (commit c72e682)
  • Update docs/topics/commands.rst (commit 28eac7a)

0.16.3 (released 2012-12-07)

0.16.2 (released 2012-11-09)

0.16.1 (released 2012-10-26)

  • fixed LogStats extension, which got broken after a wrong merge before the 0.16 release (commit 8c780fd)
  • better backwards compatibility for scrapy.conf.settings (commit 3403089)
  • extended documentation on how to access crawler stats from extensions (commit c4da0b5)
  • removed .hgtags (no longer needed now that scrapy uses git) (commit d52c188)
  • fix dashes under rst headers (commit fa4f7f9)
  • set release date for 0.16.0 in news (commit e292246)

0.16.0 (released 2012-10-18)

Scrapy changes:

  • added Spiders Contracts, a mechanism for testing spiders in a formal/reproducible way
  • added options -o and -t to the runspider command
  • documented AutoThrottle extension and added it to the extensions installed by default. You still need to enable it with AUTOTHROTTLE_ENABLED
  • major Stats Collection refactoring: removed separation of global/per-spider stats, removed stats-related signals (stats_spider_opened, etc). Stats are much simpler now, backwards compatibility is kept on the Stats Collector API and signals.
  • added process_start_requests() method to spider middlewares
  • dropped Signals singleton. Signals should now be accessed through the Crawler.signals attribute. See the signals documentation for more info.
  • dropped Stats Collector singleton. Stats can now be accessed through the Crawler.stats attribute. See the stats collection documentation for more info.
  • documented Core API
  • lxml is now the default selectors backend instead of libxml2
  • ported FormRequest.from_response() to use lxml instead of ClientForm
  • removed modules: scrapy.xlib.BeautifulSoup and scrapy.xlib.ClientForm
  • SitemapSpider: added support for sitemap urls ending in .xml and .xml.gz, even if they advertise a wrong content type (commit 10ed28b)
  • StackTraceDump extension: also dump trackref live references (commit fe2ce93)
  • nested items now fully supported in JSON and JSONLines exporters
  • added cookiejar Request meta key to support multiple cookie sessions per spider; see the sketch after this list
  • decoupled encoding detection code to w3lib.encoding, and ported Scrapy code to use that module
  • dropped support for Python 2.5. See http://blog.scrapy.org/scrapy-dropping-support-for-python-25
  • dropped support for Twisted 2.5
  • added REFERER_ENABLED setting, to control referer middleware
  • changed default user agent to: Scrapy/VERSION (+http://scrapy.org)
  • removed (undocumented) HTMLImageLinkExtractor class from scrapy.contrib.linkextractors.image
  • removed per-spider settings (to be replaced by instantiating multiple crawler objects)
  • USER_AGENT spider attribute will no longer work, use user_agent attribute instead
  • DOWNLOAD_TIMEOUT spider attribute will no longer work, use download_timeout attribute instead
  • removed ENCODING_ALIASES setting, as encoding auto-detection has been moved to the w3lib library
  • promoted DjangoItem to main contrib
  • LogFormatter methods now return dicts (instead of strings) to support lazy formatting (issue 164, commit dcef7b0)
  • downloader handlers (DOWNLOAD_HANDLERS setting) now receive settings as the first argument of the constructor
  • replaced memory usage accounting with (more portable) resource module, removed scrapy.utils.memory module
  • removed signal: scrapy.mail.mail_sent
  • removed TRACK_REFS setting, now trackrefs is always enabled
  • DBM is now the default storage backend for HTTP cache middleware
  • number of log messages (per level) is now tracked through Scrapy stats (stat name: log_count/LEVEL)
  • number of received responses is now tracked through Scrapy stats (stat name: response_received_count)
  • removed scrapy.log.started attribute
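
The cookiejar Request meta key added above keeps several independent cookie sessions within a single spider. A minimal sketch with made-up URLs:

    from scrapy.http import Request

    def start_requests(self):
        # three parallel sessions, each with its own cookie jar
        for i in range(3):
            yield Request('http://www.example.com/login',
                          meta={'cookiejar': i},
                          callback=self.parse_page)

    def parse_page(self, response):
        # the key is not sticky: pass it along explicitly on follow-ups
        return Request('http://www.example.com/private',
                       meta={'cookiejar': response.meta['cookiejar']})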

0.14.4

0.14.3

  • forgot to include pydispatch license. #118 (commit fd85f9c)
  • include egg files used by testsuite in source distribution. #118 (commit c897793)
  • update docstring in project template to avoid confusion with genspider command, which may be considered as an advanced feature. refs #107 (commit 2548dcc)
  • added note to docs/topics/firebug.rst about google directory being shut down (commit 668e352)
  • don’t discard slot when empty; just save it in another dict in order to recycle it if needed again. (commit 8e9f607)
  • do not fail handling unicode xpaths in libxml2 backed selectors (commit b830e95)
  • fixed minor mistake in Request objects documentation (commit bf3c9ee)
  • fixed minor defect in link extractors documentation (commit ba14f38)
  • removed some obsolete remaining code related to sqlite support in scrapy (commit 0665175)

0.14.2

  • move buffer pointing to start of file before computing checksum. refs #92 (commit 6a5bef2)
  • Compute image checksum before persisting images. closes #92 (commit 9817df1)
  • remove leaking references in cached failures (commit 673a120)
  • fixed bug in MemoryUsage extension: get_engine_status() takes exactly 1 argument (0 given) (commit 11133e9)
  • fixed struct.error on http compression middleware. closes #87 (commit 1423140)
  • ajax crawling wasn’t expanding for unicode urls (commit 0de3fb4)
  • Catch start_requests iterator errors. refs #83 (commit 454a21d)
  • Speed-up libxml2 XPathSelector (commit 2fbd662)
  • updated versioning doc according to recent changes (commit 0a070f5)
  • scrapyd: fixed documentation link (commit 2b4e4c3)
  • extras/makedeb.py: no longer obtaining version from git (commit caffe0e)

0.14.1

  • extras/makedeb.py: no longer obtaining version from git (commit caffe0e)
  • bumped version to 0.14.1 (commit 6cb9e1c)
  • fixed reference to tutorial directory (commit 4b86bd6)
  • doc: removed duplicated callback argument from Request.replace() (commit 1aeccdd)
  • fixed formatting of scrapyd doc (commit 8bf19e6)
  • Dump stacks for all running threads and fix engine status dumped by StackTraceDump extension (commit 14a8e6e)
  • added comment about why we disable ssl on boto images upload (commit 5223575)
  • SSL handshaking hangs when doing too many parallel connections to S3 (commit 63d583d)
  • change tutorial to follow changes on dmoz site (commit bcb3198)
  • Avoid _disconnectedDeferred AttributeError exception in Twisted>=11.1.0 (commit 98f3f87)
  • allow spider to set autothrottle max concurrency (commit 175a4b5)

0.14

New features and settings

  • Support for AJAX crawlable urls
  • New persistent scheduler that stores requests on disk, allowing crawls to be suspended and resumed (r2737)
  • added -o option to scrapy crawl, a shortcut for dumping scraped items into a file (or standard output using -)
  • Added support for passing custom settings to Scrapyd schedule.json API (r2779, r2783)
  • New ChunkedTransferMiddleware (enabled by default) to support chunked transfer encoding (r2769)
  • Add boto 2.0 support for S3 downloader handler (r2763)
  • Added marshal to formats supported by feed exports (r2744)
  • In request errbacks, offending requests are now received in failure.request attribute (r2738)
  • Big downloader refactoring to support per domain/ip concurrency limits (r2732)
  • Added builtin caching DNS resolver (r2728)
  • Moved Amazon AWS-related components/extensions (SQS spider queue, SimpleDB stats collector) to a separate project: scaws (https://github.com/scrapinghub/scaws) (r2706, r2714)
  • Moved spider queues to scrapyd: scrapy.spiderqueue -> scrapyd.spiderqueue (r2708)
  • Moved sqlite utils to scrapyd: scrapy.utils.sqlite -> scrapyd.sqlite (r2781)
  • Real support for returning iterators from the start_requests() method. The iterator is now consumed during the crawl, when the spider is idle (r2704)
  • Added REDIRECT_ENABLED setting to quickly enable/disable the redirect middleware (r2697)
  • Added RETRY_ENABLED setting to quickly enable/disable the retry middleware (r2694)
  • Added CloseSpider exception to manually close spiders (r2691); see the sketch after this list
  • Improved encoding detection by adding support for the HTML5 meta charset declaration (r2690)
  • Refactored close spider behavior to wait for all downloads to finish and be processed by spiders, before closing the spider (r2688)
  • Added SitemapSpider (see documentation in Spiders page) (r2658)
  • Added LogStats extension for periodically logging basic stats (like crawled pages and scraped items) (r2657)
  • Make handling of gzipped responses more robust (#319, r2643). Now Scrapy will try to decompress as much as possible from a gzipped response, instead of failing with an IOError.
  • Simplified MemoryDebugger extension to use stats for dumping memory debugging info (r2639)
  • Added new command to edit spiders: scrapy edit (r2636), and -e flag to genspider command that uses it (r2653)
  • Changed default representation of items to pretty-printed dicts (r2631). This improves default logging by making the log more readable in the default case, for both Scraped and Dropped lines.
  • Added spider_error signal (r2628)
  • Added COOKIES_ENABLED setting (r2625)
  • Stats are now dumped to the Scrapy log (the default value of the STATS_DUMP setting has been changed to True). This is to make Scrapy users more aware of Scrapy stats and the data that is collected there.
  • Added support for dynamically adjusting download delay and maximum concurrent requests (r2599)
  • Added new DBM HTTP cache storage backend (r2576)
  • Added listjobs.json API to Scrapyd (r2571)
  • CsvItemExporter: added join_multivalued parameter (r2578)
  • Added namespace support to xmliter_lxml (r2552)
  • Improved cookies middleware by making COOKIES_DEBUG nicer and documenting it (r2579)
  • Several improvements to Scrapyd and link extractors
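
The CloseSpider exception listed above lets a callback stop the whole crawl; a minimal sketch (the stop condition is hypothetical):

    from scrapy.exceptions import CloseSpider

    def parse(self, response):
        if 'Bandwidth exceeded' in response.body:    # hypothetical condition
            raise CloseSpider('bandwidth_exceeded')  # reason ends up in the log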

Code rearranged and removed

  • Merged item passed and item scraped concepts, as they have often proved confusing in the past. This means: (r2630)
    • original item_scraped signal was removed
    • original item_passed signal was renamed to item_scraped
    • old log lines Scraped Item... were removed
    • old log lines Passed Item... were renamed to Scraped Item... lines and downgraded to DEBUG level
  • Reduced Scrapy codebase by stripping part of Scrapy code into two new libraries:
    • w3lib (several functions from scrapy.utils.{http,markup,multipart,response,url}, done in r2584)
    • scrapely (was scrapy.contrib.ibl, done in r2586)
  • Removed unused function: scrapy.utils.request.request_info() (r2577)
  • Removed googledir project from examples/googledir. There’s now a new example project called dirbot available on github: https://github.com/scrapy/dirbot
  • Removed support for default field values in Scrapy items (r2616)
  • Removed experimental crawlspider v2 (r2632)
  • Removed scheduler middleware to simplify architecture. Duplicates filter is now done in the scheduler itself, using the same dupe filtering class as before (DUPEFILTER_CLASS setting) (r2640)
  • Removed support for passing urls to scrapy crawl command (use scrapy parse instead) (r2704)
  • Removed deprecated Execution Queue (r2704)
  • Removed (undocumented) spider context extension (from scrapy.contrib.spidercontext) (r2780)
  • Removed CONCURRENT_SPIDERS setting (use scrapyd maxproc instead) (r2789)
  • Renamed attributes of core components: downloader.sites -> downloader.slots, scraper.sites -> scraper.slots (r2717, r2718)
  • Renamed setting CLOSESPIDER_ITEMPASSED to CLOSESPIDER_ITEMCOUNT (r2655). Backwards compatibility kept.

0.12

The numbers like #NNN reference tickets in the old issue tracker (Trac) which is no longer available.

New features and improvements

  • Passed item is now sent in the item argument of the item_passed signal (#273)
  • Added verbose option to scrapy version command, useful for bug reports (#298)
  • HTTP cache now stored by default in the project data dir (#279)
  • Added project data storage directory (#276, #277)
  • Documented file structure of Scrapy projects (see command-line tool doc)
  • New lxml backend for XPath selectors (#147)
  • Per-spider settings (#245)
  • Support exit codes to signal errors in Scrapy commands (#248)
  • Added -c argument to scrapy shell command
  • Made libxml2 optional (#260)
  • New deploy command (#261)
  • Added CLOSESPIDER_PAGECOUNT setting (#253)
  • Added CLOSESPIDER_ERRORCOUNT setting (#254)

Scrapyd changes

  • Scrapyd now uses one process per spider
  • It stores one log file per spider run and rotates them, keeping the latest 5 logs per spider (by default)
  • A minimal web ui was added, available at http://localhost:6800 by default
  • There is now a scrapy server command to start a Scrapyd server of the current project

Changes to settings

  • added HTTPCACHE_ENABLED setting (False by default) to enable HTTP cache middleware
  • changed HTTPCACHE_EXPIRATION_SECS semantics: now zero means “never expire”; see the sketch after this list.
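
Together, the two cache settings above look like this in a project's settings.py:

    HTTPCACHE_ENABLED = True         # new in 0.12, disabled by default
    HTTPCACHE_EXPIRATION_SECS = 0    # zero now means "never expire"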

Deprecated/obsoleted functionality

  • Deprecated runserver command in favor of server command which starts a Scrapyd server. See also: Scrapyd changes
  • Deprecated queue command in favor of using Scrapyd schedule.json API. See also: Scrapyd changes
  • Removed the LxmlItemLoader (experimental contrib which never graduated to main contrib)

0.10

The numbers like #NNN reference tickets in the old issue tracker (Trac) which is no longer available.

New features and improvements

  • New Scrapy service called scrapyd for deploying Scrapy crawlers in production (#218) (documentation available)
  • Simplified Images pipeline usage, which no longer requires subclassing your own images pipeline (#217)
  • Scrapy shell now shows the Scrapy log by default (#206)
  • Refactored execution queue in a common base code and pluggable backends called “spider queues” (#220)
  • New persistent spider queue (based on SQLite) (#198), available by default, which allows starting Scrapy in server mode and then scheduling spiders to run.
  • Added documentation for Scrapy command-line tool and all its available sub-commands. (documentation available)
  • Feed exporters with pluggable backends (#197) (documentation available)
  • Deferred signals (#193)
  • Added two new methods to item pipelines, open_spider() and close_spider(), with deferred support (#195); see the sketch after this list
  • Support for overriding default request headers per spider (#181)
  • Replaced default Spider Manager with one with similar functionality but not depending on Twisted Plugins (#186)
  • Split the Debian package into two packages - the library and the service (#187)
  • Scrapy log refactoring (#188)
  • New extension for keeping persistent spider contexts among different runs (#203)
  • Added dont_redirect request.meta key for avoiding redirects (#233)
  • Added dont_retry request.meta key for avoiding retries (#234)
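
The new pipeline hooks from the list above (open_spider()/close_spider(), #195) fit together as sketched here; the pipeline class and file name are hypothetical, and the exact signatures are assumed from later documentation:

    class ExportPipeline(object):
        def open_spider(self, spider):
            # called when the spider opens; set up per-spider resources
            self.file = open('%s-items.txt' % spider.name, 'w')

        def close_spider(self, spider):
            # called when the spider closes; may return a Deferred
            self.file.close()

        def process_item(self, item, spider):
            self.file.write(repr(item) + '\n')
            return item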

Command-line tool changes

  • New scrapy command which replaces the old scrapy-ctl.py (#199)
    • there is only one global scrapy command now, instead of one scrapy-ctl.py per project
    • added scrapy.bat script for running more conveniently from Windows
  • Added bash completion to command-line tool (#210)
  • Renamed command start to runserver (#209)

API changes

  • url and body attributes of Request objects are now read-only (#230)
  • Request.copy() and Request.replace() now also copy their callback and errback attributes (#231)
  • Removed UrlFilterMiddleware from scrapy.contrib (already disabled by default)
  • Offsite middleware doesn’t filter out any request coming from a spider that doesn’t have an allowed_domains attribute (#225)
  • Removed Spider Manager load() method. Now spiders are loaded in the constructor itself.
  • Changes to Scrapy Manager (now called “Crawler”):
    • scrapy.core.manager.ScrapyManager class renamed to scrapy.crawler.Crawler
    • scrapy.core.manager.scrapymanager singleton moved to scrapy.project.crawler
  • Moved module: scrapy.contrib.spidermanager to scrapy.spidermanager
  • Spider Manager singleton moved from scrapy.spider.spiders to the spiders attribute of the scrapy.project.crawler singleton.
  • moved Stats Collector classes: (#204)
    • scrapy.stats.collector.StatsCollector to scrapy.statscol.StatsCollector
    • scrapy.stats.collector.SimpledbStatsCollector to scrapy.contrib.statscol.SimpledbStatsCollector
  • default per-command settings are now specified in the default_settings attribute of the command object class (#201)
  • changed arguments of the Item pipeline process_item() method from (spider, item) to (item, spider); see the sketch after this list
    • backwards compatibility kept (with deprecation warning)
  • moved scrapy.core.signals module to scrapy.signals
    • backwards compatibility kept (with deprecation warning)
  • moved scrapy.core.exceptions module to scrapy.exceptions
    • backwards compatibility kept (with deprecation warning)
  • added handles_request() class method to BaseSpider
  • dropped scrapy.log.exc() function (use scrapy.log.err() instead)
  • dropped component argument of scrapy.log.msg() function
  • dropped scrapy.log.log_level attribute
  • Added from_settings() class methods to Spider Manager, and Item Pipeline Manager
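
Of the API changes above, the process_item() argument swap is the one most likely to break existing pipelines; a brief before/after sketch (the class name is hypothetical):

    class MyPipeline(object):
        # before 0.10 (still accepted, with a deprecation warning):
        #   def process_item(self, spider, item): ...
        # from 0.10 on, the item comes first:
        def process_item(self, item, spider):
            return item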

Changes to settings

  • Added HTTPCACHE_IGNORE_SCHEMES setting to ignore certain schemes on HttpCacheMiddleware (#225)
  • Added SPIDER_QUEUE_CLASS setting which defines the spider queue to use (#220)
  • Added KEEP_ALIVE setting (#220)
  • Removed SERVICE_QUEUE setting (#220)
  • Removed COMMANDS_SETTINGS_MODULE setting (#201)
  • Renamed REQUEST_HANDLERS to DOWNLOAD_HANDLERS and made download handlers classes (instead of functions)

0.9

The numbers like #NNN reference tickets in the old issue tracker (Trac) which is no longer available.

New features and improvements

  • Added SMTP-AUTH support to scrapy.mail
  • New settings added: MAIL_USER, MAIL_PASS (r2065 | #149)
  • Added new scrapy-ctl view command - to view a URL in the browser, as seen by Scrapy (r2039)
  • Added web service for controlling Scrapy process (this also deprecates the web console) (r2053 | #167)
  • Support for running Scrapy as a service, for production systems (r1988, r2054, r2055, r2056, r2057 | #168)
  • Added wrapper induction library (documentation only available in source code for now). (r2011)
  • Simplified and improved response encoding support (r1961, r1969)
  • Added LOG_ENCODING setting (r1956, documentation available)
  • Added RANDOMIZE_DOWNLOAD_DELAY setting (enabled by default) (r1923, doc available)
  • MailSender is no longer IO-blocking (r1955 | #146)
  • Link extractors and the new CrawlSpider now handle relative base tag urls (r1960 | #148)
  • Several improvements to Item Loaders and processors (r2022, r2023, r2024, r2025, r2026, r2027, r2028, r2029, r2030)
  • Added support for adding variables to telnet console (r2047 | #165)
  • Support for requests without callbacks (r2050 | #166)

API changes

  • Changed Spider.domain_name to Spider.name (SEP-012, r1975)
  • Response.encoding is now the detected encoding (r1961)
  • HttpErrorMiddleware now returns None or raises an exception (r2006 | #157)
  • scrapy.command modules relocation (r2035, r2036, r2037)
  • Added ExecutionQueue for feeding spiders to scrape (r2034)
  • Removed ExecutionEngine singleton (r2039)
  • Ported S3ImagesStore (images pipeline) to use boto and threads (r2033)
  • Moved module: scrapy.management.telnet to scrapy.telnet (r2047)

Changes to default settings

  • Changed default SCHEDULER_ORDER to DFO (r1939)

0.8

The numbers like #NNN reference tickets in the old issue tracker (Trac) which is no longer available.

New features

  • Added DEFAULT_RESPONSE_ENCODING setting (r1809)
  • Added dont_click argument to FormRequest.from_response() method (r1813, r1816)
  • Added clickdata argument to FormRequest.from_response() method (r1802, r1803)
  • Added support for HTTP proxies (HttpProxyMiddleware) (r1781, r1785)
  • Offsite spider middleware now logs messages when filtering out requests (r1841)

Backwards-incompatible changes

  • Changed scrapy.utils.response.get_meta_refresh() signature (r1804)
  • Removed deprecated scrapy.item.ScrapedItem class - use scrapy.item.Item instead (r1838)
  • Removed deprecated scrapy.xpath module - use scrapy.selector instead (r1836)
  • Removed deprecated core.signals.domain_open signal - use core.signals.domain_opened instead (r1822)
  • log.msg() now receives a spider argument (r1822)
    • The old domain argument has been deprecated and will be removed in 0.9. For spiders, you should always use the spider argument and pass spider references. If you really want to pass a string, use the component argument instead.
  • Changed core signals domain_opened, domain_closed, domain_idle
  • Changed Item pipeline to use spiders instead of domains
    • The domain argument of the process_item() item pipeline method was changed to spider; the new signature is process_item(spider, item) (r1827 | #105)
    • To quickly port your code (to work with Scrapy 0.8) just use spider.domain_name where you previously used domain.
  • Changed Stats API to use spiders instead of domains (r1849 | #113); see the sketch after this list
    • StatsCollector was changed to receive spider references (instead of domains) in its methods (set_value, inc_value, etc).
    • added StatsCollector.iter_spider_stats() method
    • removed StatsCollector.list_domains() method
    • Also, Stats signals were renamed and now pass around spider references (instead of domains).
    • To quickly port your code (to work with Scrapy 0.8) just use spider.domain_name where you previously used domain. spider_stats contains exactly the same data as domain_stats.
  • CloseDomain extension moved to scrapy.contrib.closespider.CloseSpider (r1833)
    • Its settings were also renamed:
      • CLOSEDOMAIN_TIMEOUT to CLOSESPIDER_TIMEOUT
      • CLOSEDOMAIN_ITEMCOUNT to CLOSESPIDER_ITEMCOUNT
  • Removed deprecated SCRAPYSETTINGS_MODULE environment variable - use SCRAPY_SETTINGS_MODULE instead (r1840)
  • Renamed setting: REQUESTS_PER_DOMAIN to CONCURRENT_REQUESTS_PER_SPIDER (r1830, r1844)
  • Renamed setting: CONCURRENT_DOMAINS to CONCURRENT_SPIDERS (r1830)
  • Refactored HTTP Cache middleware: heavily refactored, retaining the same functionality except for the domain sectorization, which was removed (r1843)
  • Renamed exception: DontCloseDomain to DontCloseSpider (r1859 | #120)
  • Renamed extension: DelayedCloseDomain to SpiderCloseDelay (r1861 | #121)
  • Removed obsolete scrapy.utils.markup.remove_escape_chars function - use scrapy.utils.markup.replace_escape_chars instead (r1865)
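
The Stats API change above (spider references instead of domain strings) reads roughly as follows; the stat names are made up, and the singleton import reflects the pre-0.16 API, stated here as an assumption:

    from scrapy.stats import stats  # stats collector singleton of that era

    def parse(self, response):
        # set_value, inc_value, etc. now take a spider reference
        stats.inc_value('pages_crawled', spider=self)
        stats.set_value('last_url', response.url, spider=self)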

0.7

First release of Scrapy.

6],prentic:14,pages_crawl:46,charg:[25,14],exc:33,filesystem:[],field_out:0,clear:19,cover:[36,13,14,16,26,6,10],ext:37,clean:[1,17,6],latest:[12,30,41,3],optionpars:22,dont_click:[32,33],sector:33,aramberri:33,wsgi:38,get_css:0,session:[],entry_point:27,font:4,fine:16,find:[],penalti:[30,46],indexerror:22,pretti:[25,33,21,10],writer:10,bfo:26,factor:33,nenad:33,smtplib:40,darwin:17,httpcache_expiration_sec:[],urljoin:22,hit:[39,43,5,6],unus:[18,33],express:[],ec41673:33,images_expir:15,longest:35,is_idl:34,closespid:[],"2aa491b":33,handles_request:33,"2548dcc":33,item_id:25,statist:[12,5,37],taikano:33,art:[4,17],sep:33,memoryusag:[33,5,37],startup:[33,37],sex:[38,0],see:[],sec:5,arg:[25,42,30,9,21,22,10],reserv:39,sel:[14,30,4,17,43,25],r2006:33,someth:[23,15,25,38,26,42,43],xmlproductload:0,won:[11,36,1,35,23,15,25,38,18,19,31,46,43,37],"63bbfea82b8880ed33cdb762aa11fab722a90a24":15,httperror:[36,5],nope:43,spiderst:[33,5],list_domain:33,signatur:33,some_pag:32,javascript:[32,1,31],r2714:33,r2717:33,concurrent_requests_per_ip:[],solv:[],"06149e0":33,classnam:42,r2718:33,popul:[],both:[0,14,30,25,26,19,43,32,45,20,33,35],last:[36,0,23,25,42,43,9,32,46],delimit:25,boto:[33,8],ignorerequest:[],retent:15,foreign:[34,17],context:[],"2b00042f7481c7b056c4b410d28f33cf":15,whole:[25,30,35,10],load:[],replace_valu:0,simpli:38,point:[30,24,4,16,19,25,33,43,37],instanti:[0,14,30,25,5,40,19,31,33,37,10],schedul:[],header:[36,26,1,14,23,25,5,40,42,35,32,20,33,21,10],shutdown:[39,33,5,6],linux:[37,34,3],encoding_alias:33,mistak:33,throughout:35,simpler:[4,33,9,16],backend:[],qgvbmfybnaq:33,spidercontext:33,due:[16,26,35,47],empti:[36,0,23,5,42,8,30,31,33,32,10],runner:33,secret:[25,5,32],httpcache_storag:[],r1923:33,urllength:[36,5],nois:36,addison:14,closedomain_timeout:33,zamaraev:33,fire:[39,19,6],crawleabl:33,imag:[],great:[26,30],unnecessarili:16,coordin:19,understand:[18,12,42],func:22,luanpab:33,acount:33,look:[14,12,0,1,2,24,4,5,35,17,42,30,31,25,32,10],theref:4,tip:[1,6],batch:[],s3feedstorag:8,"while":[36,0,1,35,23,24,37,5,26,19,32,43,21],unifi:33,smart:43,itemscop:30,fun:4,closespider_pagecount:[],r2571:33,loop:[16,32],subsect:25,propag:32,increment:[19,46,10],readi:[18,30,42,19,17],readm:[33,29],jpg:[15,30],hover:4,cento:42,txrequest:22,scraped_data:17,limit:[36,33,5,23,16],rid:[12,30],seem:[4,14],corestat:[5,37],irrelev:37,minim:33,finish_reason:13,belong:[36,35,14,25],anniversari:17,almeroth:33,zope:3,inc_valu:[33,19,46],memusage_notify_mail:[],higher:[15,18,26,20,6],mybot:[42,5,43],optim:16,keyr:41,moment:[8,26],temporari:[43,23],stripe:33,cmd_help:22,koch:33,robust:[33,17],worstrat:30,wherev:33,typic:[39,25,0,2,3,15,4,5,35,6,18,7,42,19,9,16,32,43,37,10],recent:[15,12,9,33],travers:30,sha1:15,html_respons:30,older:23,find_packag:27,entri:[33,19,37],rfc2616polici:23,productxmlexport:10,pickl:[],person:38,expens:30,spend:46,downlod:42,propos:35,test_contrib_load:18,pywin32:[26,3],obscur:35,indic:[23,16,38,28,25,32],firefox:[],mandat:26,password:[32,8,40],min_free_memory_perc:46,r2789:33,roschegel:33,shortcut:[],rgb:15,sroth77:33,zuhao:33,telnetconsole_port:[],appli:[36,0,1,14,23,25,5,42,26,30,31,10],input:[],subsequ:[36,0,14,23,15,25,5,26],arijit:33,bin:[18,33,22,3],marco:33,march:30,response_download:[],big:[],fifomemoryqueu:26,selfish:21,r2780:33,inprogress:34,"56b45fc":33,formal:33,httpcompressionmiddlewar:[],lost:[45,33,23],root_el:10,resolv:[16,33,17,32],elaps:20,collect:[],api:[],closespider_itempass:33,spider_contract:[],popular:[7,30],filter_pric:0,sitemap:[25,37,17,33],often:[0,35,
23,16,38,26,6,18,8,30,25,32,33,46,10],lastest:33,creation:[],some:[],back:[36,14,23,26,33,32],itemprop:30,sampl:[],sight:4,scale:[38,10],pep:18,other_callback:11,prop:30,retri:[],gnosi:14,prog:22,sample1:30,proc:0,machin:6,object:[],martin:33,peterarenot:33,offsitemiddlewar:[],spider_fil:42,wget:5,major:[33,47],r2011:33,impos:[5,36,0],jsonrpc:22,constraint:[15,16,21],listjob:33,optpars:22,mycompani:6,preset:30,softwar:[26,14],block:[25,5,40,6,7,43,33],charl:17,"__future__":22,real:[],heavilti:33,pythonpath:38,"0x2dc2b10":34,logformatt:33,r2706:33,r2704:33,steven:33,contributor:[18,33],r2708:33,https_proxi:23,inclus:18,span:30,question:[],fast:[16,14,25],custom:[],includ:[28,1,2,23,15,25,31,5,37,18,7,30,9,33,47,32,21,34,10],suit:[13,16,26,31,25,33,10],parse_arg:22,etc:[12,0,1,23,15,5,41,18,26,33,32,34],xpathitemload:33,properli:[35,24,26,8,43,27],a4a9199:33,atop:30,lint:33,replace_xpath:0,skip:33,translat:30,second_spid:42,atom:30,depthmiddlewar:[],line:[],info:[36,0,14,23,15,16,5,42,17,18,26,13,43,25,45,32,33,37,35],concaten:[0,14],utf:[32,42,5,30,10],b1d8919:33,consist:[35,17],image4_thumb:30,nlao:33,getint:[19,37],someus:23,retailpricex:42,sitemap_alternate_link:25,r1939:33,similar:[36,0,14,30,3,25,5,18,42,19,9,33,43,35],ktharmal:33,inspect_respons:[24,43],constant:16,response_byt:13,parser:[33,30,10],repres:[32,12,37,14,25],"char":[18,10],incomplet:27,instock:30,field:[],curl:[41,6],proxyhub:33,coder:33,itemload:[],titl:[25,0,21,14,30],sequenti:2,process_parallel:33,diveintopython:33,smtptl:40,extensions_bas:[],priori:35,bear:38,s3downloadhandl:5,mock:33,deseri:23,formdata:[25,32],aggregater:30,finit:35,est:34,evan:[33,35],cookiesmiddlewar:[],mymodul:27,ago:35,domain_stat:33,mimetyp:40,dvd:10,robotstxt_obei:[],autothrottle_en:[],fresh:23,"200th":17,hello:[5,7,0,45],pluggabl:33,partial:[24,33],edg:12,queri:30,loginspid:32,last_upd:[0,9],"5ba1ad5":33,"40667cb":33,compact:35,pydispatch:33,privat:47,friendli:[23,10],send:[],update_telnet_var:34,c9b690d:33,estim:20,open_in_brows:[24,33],another_subcategori:4,through:[0,2,4,5,6,7,8,9,10,11,14,17,19,32,37,22,23,25,26,39,31,33,36,42,45,46],fields_to_export:10,whichev:32,get_input_processor:0,spider_error:[],mous:[4,1],listifi:21,electron:25,spoof:17,relev:[18,30],mywebsit:25,tonal:33,jsonencod:10,descend:[4,30,2],my_spid:26,"try":[],d52c188:33,"2fbd662":33,dcef7b0:33,particularli:21,resource1:22,pleas:[18,41,6],default_set:[33,5],somespid:11,smaller:18,fortun:24,paid:6,cfg:[42,14],focu:[26,6],possibl:[0,6,8,13,33,47],xmlfeed:42,odd:47,click:[32,29,14],append:[38,0,9,14],compat:[33,26,5,47],index:[12,35,23,25,30,16],resembl:18,page2:24,access:[],product3:9,product2:[15,9],product1:15,"_spider_nam":22,whatev:[25,46],c45e5f1:33,usag:[],len:34,renderget:22,bodi:[14,23,24,4,40,28,30,43,25,32,33],let:[0,2,25,38,35,17,42,30,14],ubuntu:[],r2643:33,ioerror:33,r2640:33,sinc:[0,1,25,26,41,30,27,32,43,10],de3e451:33,produc:[5,0,10],request_info:33,defaulthead:[5,23],fetch:[],"48c9c87349680":42,ctl:33,other_url:[25,32],chang:[],gene:21,chanc:[18,5,35],r2027:33,r2026:33,r2025:33,r2024:33,r2023:33,r2022:33,apr:23,"_sre":35,app:38,foundat:26,apt:41,r2028:33,ran:23,"boolean":[30,15,25,5,40,19,31,45,32,22],"__name__":22,multitask:20,input_processor:0,fed:[32,14],usd:32,from:[],zip:29,commun:17,upgrad:[33,40],iwema:33,websit:[],few:[11,25,30,35,40],log_stdout:[],doubt:6,usr:22,lock:[15,30],en_en:23,panel:3,sort:[36,33,22,23],clever:1,concurrent_requests_per_spid:33,datepublish:30,src:30,"0x2bed9d0":43,http_user:23,benchmark:[],r2732:33,memusage_limit_mb:[],r2737:3
3,getchild:22,dont_redirect:[32,33,23],gayotcom:34,r2738:33,clientform:33,concurrent_request:[],account:38,retriev:25,schorsch:33,alia:[9,35],iter_al:35,cumbersom:21,"0b60031":33,annoi:21,meet:15,scatter:35,aliv:[37,35],control:[],sqlite:33,process:[],httpauthmiddlewar:[],sdist:33,sudo:41,jsonresourc:22,high:[20,2],tag:[23,24,4,16,17,29,30,43,31,25,32,33],tarbal:[33,29],serial:[],delai:[5,23,15,26,6,33,20],print_funct:22,surfac:17,sit:7,six:30,httpdownloadhandl:5,lamp:30,subdirectori:23,scheduler_memory_queu:26,chri:33,stock:[0,9,30],link5:30,link4:30,link1:30,link3:30,link2:30,constructor:[0,25,31,40,28,9,33,32,37,22,10],iter_spider_stat:33,attent:14,discard:[36,33,16],aws_kei:8,retry_complet:23,alloc:35,loglevel:45,light:[23,3],counter:[46,37],robot:[25,5,17,23],element:[0,1,14,15,4,30,25,32,33,10],issu:[11,12,35,23,26,18,33,32],body_as_unicod:[32,33],mainten:0,allow:[36,25,26,1,14,23,15,4,5,17,32,24,8,30,43,31,16,20,33,21],movi:17,move:[33,1,35,44],followal:6,comma:25,pricepipelin:2,mcgrath:14,fe2ce93:33,directoryitem:4,blake:33,chosen:25,clickabl:32,"668e352":33,infrastructur:[5,6],request_bodi:23,strip_dash:0,therefor:[37,1],greater:[19,46],"80f9bb6":33,auto:[32,43,17,33],dan:33,autothrottl:[],handi:[30,1,14,3],auth:[33,23],mention:[14,35],fingerprint:[5,23],xhtml:[25,5],scrapeditem:33,olveyra:33,trac:[33,3],anyth:[0,46,34,30,23],edit:[],prepackag:3,mode:[15,20,33,5,10],lfd:3,biedrycki:33,bump:33,chunk:[33,5,23,10],getlist:19,e3d6945:33,meta:[],"static":[12,43],redirect_en:[],our:[],special:[],out:[],variabl:[],http11:33,attend:20,nramirez:33,identifi:[4,23,1,14,16],categori:[4,42,30,17,25],gohlk:3,suitabl:16,rel:[],libor:33,default_item_class:[],squeue:26,ref:33,common:[],clarifi:0,shut:[4,5,33],jsonwriterpipelin:2,insid:[36,0,14,24,4,5,17,6,7,42,30,31,35,37,34,10],workflow:15,manipul:30,parse_start_url:25,responsecheck:21,tempt:30,releas:[],bleed:12,tilden:33,log:[],unquot:30,could:[36,14,23,4,17,30,9,25,32,43,47,35],put:[15,18,32,14,6],sitespecificload:0,keep:[],length:[5,36,42,0,33],enforc:[33,5,40],outsid:[36,38,26,42],retain:[15,33],r2655:33,dirbot:[33,29,14],r2657:33,isbn:14,suffix:0,r2653:33,r2658:33,serialize_pric:10,date:[33,42,47,23],specificproduct:9,baryshev:33,"98f3f87":33,facil:[11,12,40,17,6,45,46],prioriti:[15,32,5,33],"long":[33,35],brunsgaard:33,start:[11,36,0,14,30,23,16,38,26,17,6,39,19,42,13,43,25,45,20,33,47],unknown:[32,9,22],licens:33,capac:[36,16],wrapper:[33,30],attach:[15,40],termin:37,queuelib:33,erron:36,"final":[36,0,14,3,4,23,17,18,43,35,25,32,37,10],b4fc359:33,request_head:23,shell:[],r2034:33,r2035:33,r2036:33,r2037:33,r2039:33,shallow:32,rfpdupefilt:[33,5],enqueu:[7,13],rst:33,exactli:[18,33,23],shelp:[43,14],cmd_list_run:22,bother:[36,23],structur:[],charact:[25,14],dai:15,mail_pass:[],sens:16,becom:[27,0,14],signifi:47,torrent:17,start_tim:[13,46,34],exhibit:17,loader_context:0,default_response_encod:33,domain_nam:33,deprec:[],clearli:4,correspond:[30,14,23],myproject:[36,0,2,23,25,5,6,42,21],r2728:33,spiders_dev:5,have:[0,2,3,4,5,6,7,10,12,13,14,15,16,18,19,20,21,23,24,25,26,39,30,32,33,35,36,38,42,43,47],"_out":0,need:[0,1,5,6,9,10,11,12,14,15,16,17,18,19,20,21,23,25,26,39,30,31,33,34,36,37,38],turn:[7,37,35,16],fltere:33,emanuel:33,dmozitem:14,min:[13,21],statscol:[19,5,46,33],expos:[25,22],accuraci:30,forbidden:23,log_fil:[],rational:[],scrapescontract:[5,21],brand:23,singl:[11,36,0,2,23,15,16,5,17,6,18,24,31,25,32,37,47],uppercas:32,googlesitemap_en:37,unless:[36,0,1,16,5,18,19,25,32,37,10],clash:30,alreadycallederror:33,"0de3fb4":33,customhead
:21,galleri:17,discov:[25,42,35,17,32],callabl:[0,25,19,31,32,10],sitemap_rul:25,why:[],"4b86bd6":33,url:[4,5,6,7,14,15,16,17,32,21,22,23,24,25,26,30,31,33,35,36,38,42,43],otherpag:23,hardcod:21,uri:[],face:4,inde:14,"_in":0,deni:[25,31],talk:25,determin:[2,6],fact:[26,14,35],dbm:[],text:[0,14,4,5,17,42,30,31,25,43],verbos:[24,33,42,5],urllib:[22,23],process_link:25,memusage_warning_mb:[],longer:[11,36,2,4,18,25,33,10],pre_process:21,redirect:[],textual:30,locat:[0,25,38,5,18,7,30],launchpad:3,scrapy3:6,scrapy2:6,scrapy1:6,jar:23,should:[],bytyp:35,suppos:[38,0,35,30],extract_regex:0,"7d97e98f8af710c7e7fe703abc8f639e0ee507c4":15,local:[],hall:14,export_empty_field:10,meant:[11,36,0,23,26,42,43,31,32],convert:[15,25,0,21,32],jsonitemexport:[],pypi:[33,3],pave:33,autom:0,acces:33,smtppass:40,image_info_or_error:15,increas:[],endless:36,domain_clos:33,unstructur:9,organ:0,cmd_get_spider_stat:22,lala:9,imagespipelin:[15,33],d20304e:33,stuff:[43,13],image_url:15,integr:[33,38,21],partit:[35,6],contain:[0,1,2,3,4,5,8,9,10,12,14,15,17,18,19,32,21,22,23,25,29,30,31,27,33,36,37,41,42,43,46],urllengthmiddlewar:[],view:[],altern:[25,33],legaci:33,project_nam:42,nolink:42,vkrest:33,knowledg:35,lazy:33,displai:[24,20,42,43],smtpuser:40,multipart:33,stack:[],closer:[36,23],unexist:33,default_selector_class:0,correctli:16,pattern:[4,31],sitemapspid:[],thumb:[15,33],spider_is_idl:34,favor:33,written:[13,14,25,5,18,7,8,26,35],boutel:5,post_process:21,progress:[27,14],neither:35,email:[12,5,37,40],getdict:33,image_path:15,kei:[],"1fbb715":33,ip_isocod:23,job:[],entir:[26,14,47,10],disconnect:[33,19],"454a21d":33,addit:[],"37c24e01d7":33,has_capac:34,image_id:15,b6bed44c:33,equal:32,response_status_count:13,april:30,instanc:[0,35,25,38,6,30,33,32,37,10],dupefilt:5,stats_spider_open:33,itemvalid:21,guidelin:18,arriv:[15,25],walk:14,ipython:[33,43,14],rpc:[],googledir:33,respect:[23,16,5,40,3,10],restrict_xpath:31,quit:[35,26,8,30,32,37],slowli:16,statscollector:[33,19,46],addition:2,spiders_prod:5,compos:0,cb_kwarg:[25,35],slashdot:43,treat:33,get_wsurl:22,cmd_list_resourc:22,r2625:33,presenc:21,camper:30,mnot:23,parse_shop:25,dont_filt:[32,36,25],"6cbe684":33,player:10,togeth:[0,30],scheduler_disk_queu:26,furnitur:42,present:[15,38,18,32,21,10],print_live_ref:35,replic:9,parse_nod:25,plain:[16,26,9,23],signalmanag:19,defin:[],telnetconsole_en:[],"6cb9e1c":33,intranet:23,returnscontract:[5,21],b15470d:33,purchas:30,duplicatespipelin:2,almost:46,demo:21,site:[11,36,25,0,1,14,23,15,4,5,17,6,26,16,32,20,33],"1ca5879492b8fd606df1964ea3c1e2f4520f076f":15,substanti:16,product_nam:0,revis:27,"912202e":33,price_in:0,welcom:[4,43],parti:35,juan:33,member:32,handl:[36,23,25,17,7,42,43,16,33,37],no_proxi:23,memusage_en:[],failur:[15,32,39,21,33],inc:14,infer:[32,30],difficult:[35,6],spell:33,redirect_url:[32,23],cubic:30,"14a8e6":33,effect:[36,5,19,35,10],whodatninja:33,export_item:10,logfil:45,php:[25,43,35,32],stats_dump:[],expand:33,off:[36,23,16],ajaxcrawl:23,sre_pattern:35,colour:[4,0],well:[13,35,23,16,38,26,17,6,18,30,25,10],subclas:35,thought:[18,42],part1:6,part2:6,part3:6,productload:0,choos:[42,26,14,30],undefin:30,rocio:33,sibl:4,usual:[36,0,46,16,5,6,19,32,37],retry_en:[],paus:[],pickled_meta:23,stop_on_non:0,obtain:[25,38,14,33],tcp:[20,34],paul:33,simultan:5,web:[],wed:42,priorit:36,drawback:30,bench:[],"58998f4":33,cleanup:33,bool:33,spider_open:[],kick:[36,23],gmt:[42,23],css3:30,rememb:[18,32,42,14,6],offist:33,dest:22,necessari:[24,16,23],robotstxt_en:5,know:[36,0,1,14,12,4,26,39,18,9,35,10],burden:0,p
ress:[11,37],remove_ent:0,loader:[],recurs:42,desc:14,insert:[36,37,23],resid:42,like:[0,1,2,3,5,6,7,8,9,10,11,12,13,14,15,17,19,32,37,23,25,26,30,33,34,35,36,38,40,42,43,46],success:[15,36,14,25],create_item_class:6,handle_httpstatus_list:[36,32],closespider_timeout:[],downloader_middlewares_bas:[],active_s:34,lose:0,polit:16,ajaxcrawlmiddlewar:[],default_request_head:[],diningc:23,exceed:28,drop:[],captur:14,cmdarg:22,gayotspid:34,opyat:33,contin:33,"0x3c44a10":43,b9628c4ab9b595f72f280b90c4fd093d:15,proper:[5,33,0,30],guarante:[15,32],amarquesferraz:33,tmp:8,win32:3,abf756f:33,afterword:0,broad:[],avoid:[],thank:[],overlap:27,"2b4e4c3":33,outgo:32,leav:[43,19],documentari:17,weslei:14,encourag:29,throttl:[],process_spider_output:36,simplifi:[33,46],bestrat:30,host:[36,44,22,34,40],isn:[30,9,10],obei:[33,23],although:[24,16,40,19,32,22],filter_world:0,stage:[15,16,39,28],about:[12,13,14,30,23,15,25,5,42,17,18,7,29,19,9,24,33,37,34,35],actual:[0,1,4,17,30,25,45,43],socket:46,testsuit:33,column:10,cleans:[7,2,17],commerci:6,fals:[36,0,30,23,15,16,31,5,40,32,8,19,9,25,20,33,38,34,10],codetyp:35,disabl:[],own:[],joehillen:33,process_spider_input:36,tight:22,easy_instal:[35,3],bb35ed0:33,"8e9f607":33,track_ref:33,merg:[36,35,23,32,18,33,27],whom:36,chunkedtransfermiddlewar:[],transfer:[33,23],museum:17,somearg:11,trigger:[7,37],downgrad:33,troubl:[15,12],stai:[9,35],vat:2,parse_row:25,"function":[],mailer:40,netherland:23,xmlfeedspid:[],shane:33,"0cb68af":33,keyerror:9,needs_backout:34,robotstxt_cachedir:5,everyon:33,overflow:36,loc:25,count:[36,19,35],image1:30,succe:32,made:[12,23,5,17,33,22],caffe0:33,whether:[36,23,5,18,8,31,46,10],simpledbstatscollector:33,redistribut:3,max_items_scrap:46,asynchron:[7,16,6],record:19,below:[39,0,35,24,25,38,5,18,7,42,30,31,32,46,22,10],"487b9b5":33,immut:33,otherwis:[39,25,14,30,3,15,4,5,42,18,29,19,16,45,37,10],problem:[],sitemap_url:25,r2632:33,r2630:33,discount_expiration_d:9,r2636:33,evalu:33,"int":[32,19,22,10],descript:[14,25,42,17,7,29,30,4,22,34],dure:[32,17,33],r2639:33,pid:[37,35],ellisd23:33,twist:[],register_namespac:[25,30],item_details_url:24,rule:[],userag:[5,23],pip:[33,3],itemproc_s:34,contributt:33,probabl:[26,13,35,23],hpy:[35,34],jobsbot:43,percent:[32,30],detail:[1,14,24,38,5,17,18,7,29,30,33,35],cmd_stop:22,preinstal:3,other:[],lookup:23,futur:[13,35,23,24,27,45,32,37],branch:[47,41],executionqueu:33,discount_perc:9,spider_queue_class:33,stat:[],repeat:7,star:[43,30],llonch:33,"class":[],scrapyd:[],preform:16,singleton:[33,26],debian:33,matrix:33,experienc:35,sphinx:18,deny_extens:31,parse_oth:25,download_delai:[],get_xpath:0},objtypes:{"0":"std:command","1":"std:setting","2":"std:reqmeta","3":"std:signal","4":"py:module","5":"py:class","6":"py:method","7":"py:exception","8":"py:function","9":"py:attribute","10":"py:classmethod","11":"py:data"},objnames:{"0":["std","command","command"],"1":["std","setting","setting"],"2":["std","reqmeta","reqmeta"],"3":["std","signal","signal"],"4":["py","module","Python module"],"5":["py","class","Python class"],"6":["py","method","Python method"],"7":["py","exception","Python exception"],"8":["py","function","Python function"],"9":["py","attribute","Python attribute"],"10":["py","classmethod","Python class method"],"11":["py","data","Python 
data"]},filenames:["topics/loaders","topics/firefox","topics/item-pipeline","intro/install","topics/firebug","topics/settings","topics/practices","topics/architecture","topics/feed-exports","topics/items","topics/exporters","topics/jobs","index","topics/benchmarking","intro/tutorial","topics/images","topics/broad-crawls","intro/overview","contributing","topics/api","topics/autothrottle","topics/contracts","topics/webservice","topics/downloader-middleware","topics/debug","topics/spiders","faq","experimental/index","topics/exceptions","intro/examples","topics/selectors","topics/link-extractors","topics/request-response","news","topics/telnetconsole","topics/leaks","topics/spider-middleware","topics/extensions","topics/djangoitem","topics/signals","topics/email","topics/ubuntu","topics/commands","topics/shell","topics/scrapyd","topics/logging","topics/stats","versioning"],titles:["Item Loaders","Using Firefox for scraping","Item Pipeline","Installation guide","Using Firebug for scraping","Settings","Common Practices","Architecture overview","Feed exports","Items","Item Exporters","Jobs: pausing and resuming crawls","Scrapy 0.22 documentation","Benchmarking","Scrapy Tutorial","Downloading Item Images","Broad Crawls","Scrapy at a glance","Contributing to Scrapy","Core API","AutoThrottle extension","Spiders Contracts","Web Service","Downloader Middleware","Debugging Spiders","Spiders","Frequently Asked Questions","Experimental features","Exceptions","Examples","Selectors","Link Extractors","Requests and Responses","Release notes","Telnet Console","Debugging memory leaks","Spider Middleware","Extensions","DjangoItem","Signals","Sending e-mail","Ubuntu packages","Command line tool","Scrapy shell","Scrapyd","Logging","Stats Collection","Versioning and API Stability"],objects:{"":{CLOSESPIDER_ITEMCOUNT:[37,1,1,"std:setting-CLOSESPIDER_ITEMCOUNT"],update_telnet_vars:[34,3,1,"std:signal-update_telnet_vars"],genspider:[42,0,1,"std:command-genspider"],dont_retry:[23,2,1,"std:reqmeta-dont_retry"],CONCURRENT_REQUESTS_PER_IP:[5,1,1,"std:setting-CONCURRENT_REQUESTS_PER_IP"],HTTPCACHE_DIR:[23,1,1,"std:setting-HTTPCACHE_DIR"],MEMUSAGE_REPORT:[5,1,1,"std:setting-MEMUSAGE_REPORT"],DOWNLOAD_HANDLERS_BASE:[5,1,1,"std:setting-DOWNLOAD_HANDLERS_BASE"],MAIL_FROM:[40,1,1,"std:setting-MAIL_FROM"],HTTPCACHE_EXPIRATION_SECS:[23,1,1,"std:setting-HTTPCACHE_EXPIRATION_SECS"],bench:[42,0,1,"std:command-bench"],DOWNLOAD_TIMEOUT:[5,1,1,"std:setting-DOWNLOAD_TIMEOUT"],MAIL_PASS:[40,1,1,"std:setting-MAIL_PASS"],MEMUSAGE_LIMIT_MB:[5,1,1,"std:setting-MEMUSAGE_LIMIT_MB"],EXTENSIONS:[5,1,1,"std:setting-EXTENSIONS"],DEPTH_PRIORITY:[5,1,1,"std:setting-DEPTH_PRIORITY"],dont_redirect:[23,2,1,"std:reqmeta-dont_redirect"],TELNETCONSOLE_HOST:[34,1,1,"std:setting-TELNETCONSOLE_HOST"],WEBSERVICE_HOST:[22,1,1,"std:setting-WEBSERVICE_HOST"],HTTPCACHE_IGNORE_MISSING:[23,1,1,"std:setting-HTTPCACHE_IGNORE_MISSING"],spider_error:[39,3,1,"std:signal-spider_error"],IMAGES_MIN_WIDTH:[15,1,1,"std:setting-IMAGES_MIN_WIDTH"],SPIDER_MODULES:[5,1,1,"std:setting-SPIDER_MODULES"],RETRY_TIMES:[23,1,1,"std:setting-RETRY_TIMES"],TELNETCONSOLE_PORT:[34,1,1,"std:setting-TELNETCONSOLE_PORT"],TELNETCONSOLE_ENABLED:[5,1,1,"std:setting-TELNETCONSOLE_ENABLED"],DOWNLOADER_MIDDLEWARES:[5,1,1,"std:setting-DOWNLOADER_MIDDLEWARES"],item_dropped:[39,3,1,"std:signal-item_dropped"],HTTPCACHE_DBM_MODULE:[23,1,1,"std:setting-HTTPCACHE_DBM_MODULE"],ROBOTSTXT_OBEY:[5,1,1,"std:setting-ROBOTSTXT_OBEY"],DEPTH_LIMIT:[5,1,1,"std:setting-DEPTH_LIMIT"],settings:[42,0,1,"std:command-set
tings"],edit:[42,0,1,"std:command-edit"],list:[42,0,1,"std:command-list"],close_spider:[2,6,1,""],CLOSESPIDER_PAGECOUNT:[37,1,1,"std:setting-CLOSESPIDER_PAGECOUNT"],view:[42,0,1,"std:command-view"],AUTOTHROTTLE_MAX_DELAY:[20,1,1,"std:setting-AUTOTHROTTLE_MAX_DELAY"],URLLENGTH_LIMIT:[5,1,1,"std:setting-URLLENGTH_LIMIT"],FEED_EXPORTERS:[8,1,1,"std:setting-FEED_EXPORTERS"],LOG_ENCODING:[5,1,1,"std:setting-LOG_ENCODING"],FEED_EXPORTERS_BASE:[8,1,1,"std:setting-FEED_EXPORTERS_BASE"],DOWNLOADER_DEBUG:[5,1,1,"std:setting-DOWNLOADER_DEBUG"],FEED_FORMAT:[8,1,1,"std:setting-FEED_FORMAT"],HTTPCACHE_IGNORE_SCHEMES:[23,1,1,"std:setting-HTTPCACHE_IGNORE_SCHEMES"],spider_idle:[39,3,1,"std:signal-spider_idle"],MEMDEBUG_ENABLED:[5,1,1,"std:setting-MEMDEBUG_ENABLED"],DNSCACHE_ENABLED:[5,1,1,"std:setting-DNSCACHE_ENABLED"],response_downloaded:[39,3,1,"std:signal-response_downloaded"],RETRY_HTTP_CODES:[23,1,1,"std:setting-RETRY_HTTP_CODES"],version:[42,0,1,"std:command-version"],MAIL_HOST:[40,1,1,"std:setting-MAIL_HOST"],MAIL_TLS:[40,1,1,"std:setting-MAIL_TLS"],CONCURRENT_REQUESTS:[5,1,1,"std:setting-CONCURRENT_REQUESTS"],TEMPLATES_DIR:[5,1,1,"std:setting-TEMPLATES_DIR"],CLOSESPIDER_ERRORCOUNT:[37,1,1,"std:setting-CLOSESPIDER_ERRORCOUNT"],COMMANDS_MODULE:[42,1,1,"std:setting-COMMANDS_MODULE"],AUTOTHROTTLE_ENABLED:[20,1,1,"std:setting-AUTOTHROTTLE_ENABLED"],item_scraped:[39,3,1,"std:signal-item_scraped"],open_spider:[2,6,1,""],response_received:[39,3,1,"std:signal-response_received"],FEED_STORAGES_BASE:[8,1,1,"std:setting-FEED_STORAGES_BASE"],REDIRECT_ENABLED:[23,1,1,"std:setting-REDIRECT_ENABLED"],SPIDER_MIDDLEWARES:[5,1,1,"std:setting-SPIDER_MIDDLEWARES"],AWS_ACCESS_KEY_ID:[5,1,1,"std:setting-AWS_ACCESS_KEY_ID"],AUTOTHROTTLE_DEBUG:[20,1,1,"std:setting-AUTOTHROTTLE_DEBUG"],NEWSPIDER_MODULE:[5,1,1,"std:setting-NEWSPIDER_MODULE"],DEPTH_STATS_VERBOSE:[5,1,1,"std:setting-DEPTH_STATS_VERBOSE"],CONCURRENT_ITEMS:[5,1,1,"std:setting-CONCURRENT_ITEMS"],DOWNLOADER_MIDDLEWARES_BASE:[5,1,1,"std:setting-DOWNLOADER_MIDDLEWARES_BASE"],WEBSERVICE_ENABLED:[22,1,1,"std:setting-WEBSERVICE_ENABLED"],WEBSERVICE_PORT:[22,1,1,"std:setting-WEBSERVICE_PORT"],AWS_SECRET_ACCESS_KEY:[5,1,1,"std:setting-AWS_SECRET_ACCESS_KEY"],MAIL_PORT:[40,1,1,"std:setting-MAIL_PORT"],REFERER_ENABLED:[36,1,1,"std:setting-REFERER_ENABLED"],HTTPCACHE_POLICY:[23,1,1,"std:setting-HTTPCACHE_POLICY"],STATS_DUMP:[5,1,1,"std:setting-STATS_DUMP"],MEMUSAGE_NOTIFY_MAIL:[5,1,1,"std:setting-MEMUSAGE_NOTIFY_MAIL"],DOWNLOAD_HANDLERS:[5,1,1,"std:setting-DOWNLOAD_HANDLERS"],IMAGES_MIN_HEIGHT:[15,1,1,"std:setting-IMAGES_MIN_HEIGHT"],redirect_urls:[23,2,1,"std:reqmeta-redirect_urls"],LOG_LEVEL:[5,1,1,"std:setting-LOG_LEVEL"],spider_closed:[39,3,1,"std:signal-spider_closed"],handle_httpstatus_list:[36,2,1,"std:reqmeta-handle_httpstatus_list"],REDIRECT_MAX_TIMES:[5,1,1,"std:setting-REDIRECT_MAX_TIMES"],REDIRECT_PRIORITY_ADJUST:[5,1,1,"std:setting-REDIRECT_PRIORITY_ADJUST"],cookiejar:[23,2,1,"std:reqmeta-cookiejar"],DUPEFILTER_CLASS:[5,1,1,"std:setting-DUPEFILTER_CLASS"],RETRY_ENABLED:[23,1,1,"std:setting-RETRY_ENABLED"],SPIDER_CONTRACTS:[5,1,1,"std:setting-SPIDER_CONTRACTS"],HTTPCACHE_ENABLED:[23,1,1,"std:setting-HTTPCACHE_ENABLED"],LOG_ENABLED:[5,1,1,"std:setting-LOG_ENABLED"],MAIL_USER:[40,1,1,"std:setting-MAIL_USER"],DEFAULT_ITEM_CLASS:[5,1,1,"std:setting-DEFAULT_ITEM_CLASS"],DEPTH_STATS:[5,1,1,"std:setting-DEPTH_STATS"],shell:[42,0,1,"std:command-shell"],runspider:[42,0,1,"std:command-runspider"],EXTENSIONS_BASE:[5,1,1,"std:setting-EXTENSIONS_BASE"],FEED_STORAGES:[8,1,1,
"std:setting-FEED_STORAGES"],BOT_NAME:[5,1,1,"std:setting-BOT_NAME"],startproject:[42,0,1,"std:command-startproject"],SPIDER_CONTRACTS_BASE:[5,1,1,"std:setting-SPIDER_CONTRACTS_BASE"],crawl:[42,0,1,"std:command-crawl"],AJAXCRAWL_ENABLED:[23,1,1,"std:setting-AJAXCRAWL_ENABLED"],IMAGES_EXPIRES:[15,1,1,"std:setting-IMAGES_EXPIRES"],HTTPCACHE_IGNORE_HTTP_CODES:[23,1,1,"std:setting-HTTPCACHE_IGNORE_HTTP_CODES"],WEBSERVICE_LOGFILE:[22,1,1,"std:setting-WEBSERVICE_LOGFILE"],engine_stopped:[39,3,1,"std:signal-engine_stopped"],MEMUSAGE_WARNING_MB:[5,1,1,"std:setting-MEMUSAGE_WARNING_MB"],MEMDEBUG_NOTIFY:[5,1,1,"std:setting-MEMDEBUG_NOTIFY"],FEED_STORE_EMPTY:[8,1,1,"std:setting-FEED_STORE_EMPTY"],fetch:[42,0,1,"std:command-fetch"],COOKIES_DEBUG:[23,1,1,"std:setting-COOKIES_DEBUG"],FEED_URI:[8,1,1,"std:setting-FEED_URI"],USER_AGENT:[5,1,1,"std:setting-USER_AGENT"],parse:[42,0,1,"std:command-parse"],COOKIES_ENABLED:[23,1,1,"std:setting-COOKIES_ENABLED"],process_item:[2,6,1,""],check:[42,0,1,"std:command-check"],ITEM_PIPELINES:[5,1,1,"std:setting-ITEM_PIPELINES"],SPIDER_MIDDLEWARES_BASE:[5,1,1,"std:setting-SPIDER_MIDDLEWARES_BASE"],METAREFRESH_ENABLED:[23,1,1,"std:setting-METAREFRESH_ENABLED"],STATS_CLASS:[5,1,1,"std:setting-STATS_CLASS"],MAIL_SSL:[40,1,1,"std:setting-MAIL_SSL"],HTTPERROR_ALLOW_ALL:[36,1,1,"std:setting-HTTPERROR_ALLOW_ALL"],DOWNLOAD_DELAY:[5,1,1,"std:setting-DOWNLOAD_DELAY"],COMPRESSION_ENABLED:[23,1,1,"std:setting-COMPRESSION_ENABLED"],IMAGES_THUMBS:[15,1,1,"std:setting-IMAGES_THUMBS"],deploy:[42,0,1,"std:command-deploy"],RANDOMIZE_DOWNLOAD_DELAY:[5,1,1,"std:setting-RANDOMIZE_DOWNLOAD_DELAY"],AUTOTHROTTLE_START_DELAY:[20,1,1,"std:setting-AUTOTHROTTLE_START_DELAY"],jDITOR:[5,1,1,"std:setting-jDITOR"],LOG_STDOUT:[5,1,1,"std:setting-LOG_STDOUT"],DOWNLOADER_STATS:[5,1,1,"std:setting-DOWNLOADER_STATS"],LOG_FILE:[5,1,1,"std:setting-LOG_FILE"],HTTPCACHE_STORAGE:[23,1,1,"std:setting-HTTPCACHE_STORAGE"],HTTPERROR_ALLOWED_CODES:[36,1,1,"std:setting-HTTPERROR_ALLOWED_CODES"],REDIRECT_MAX_METAREFRESH_DELAY:[5,1,1,"std:setting-REDIRECT_MAX_METAREFRESH_DELAY"],engine_started:[39,3,1,"std:signal-engine_started"],CONCURRENT_REQUESTS_PER_DOMAIN:[5,1,1,"std:setting-CONCURRENT_REQUESTS_PER_DOMAIN"],spider_opened:[39,3,1,"std:signal-spider_opened"],bindaddress:[32,2,1,"std:reqmeta-bindaddress"],DEFAULT_REQUEST_HEADERS:[5,1,1,"std:setting-DEFAULT_REQUEST_HEADERS"],CLOSESPIDER_TIMEOUT:[37,1,1,"std:setting-CLOSESPIDER_TIMEOUT"],IMAGES_STORE:[15,1,1,"std:setting-IMAGES_STORE"],SCHEDULER:[5,1,1,"std:setting-SCHEDULER"],ITEM_PIPELINES_BASE:[5,1,1,"std:setting-ITEM_PIPELINES_BASE"],STATSMAILER_RCPTS:[5,1,1,"std:setting-STATSMAILER_RCPTS"],MEMUSAGE_ENABLED:[5,1,1,"std:setting-MEMUSAGE_ENABLED"]},"scrapy.contrib.downloadermiddleware.cookies":{CookiesMiddleware:[23,5,1,""]},"scrapy.contrib.downloadermiddleware.stats":{DownloaderStats:[23,5,1,""]},"scrapy.http.Response":{status:[32,9,1,""],body:[32,9,1,""],url:[32,9,1,""],request:[32,9,1,""],replace:[32,6,1,""],headers:[32,9,1,""],meta:[32,9,1,""],flags:[32,9,1,""],copy:[32,6,1,""]},"scrapy.contrib.spidermiddleware.urllength":{UrlLengthMiddleware:[36,5,1,""]},"scrapy.contrib.linkextractors.sgml":{SgmlLinkExtractor:[31,5,1,""],BaseSgmlLinkExtractor:[31,5,1,""]},"scrapy.contrib.downloadermiddleware.defaultheaders":{DefaultHeadersMiddleware:[23,5,1,""]},"scrapy.contrib.webservice.enginestatus.scrapy.webservice.JsonResource":{ws_name:[22,9,1,""]},"scrapy.contrib.pipeline.images.ImagesPipeline":{get_media_requests:[15,6,1,""],item_completed:[15,6,1,""]},"scrapy.contrib.do
wnloadermiddleware.httpproxy":{HttpProxyMiddleware:[23,5,1,""]},"scrapy.utils":{trackref:[35,4,0,"-"]},"scrapy.contrib.loader.processor":{MapCompose:[0,5,1,""],Join:[0,5,1,""],Compose:[0,5,1,""],TakeFirst:[0,5,1,""],Identity:[0,5,1,""]},"scrapy.exceptions":{IgnoreRequest:[28,7,1,""],DropItem:[28,7,1,""],NotSupported:[28,7,1,""],CloseSpider:[28,7,1,""],NotConfigured:[28,7,1,""]},"scrapy.contrib":{downloadermiddleware:[23,4,0,"-"],exporter:[10,4,0,"-"],memusage:[37,4,0,"-"],webservice:[22,4,0,"-"],closespider:[37,4,0,"-"],memdebug:[37,4,0,"-"],loader:[0,4,0,"-"],statsmailer:[37,4,0,"-"],spidermiddleware:[36,4,0,"-"],logstats:[37,4,0,"-"],linkextractors:[31,4,0,"-"],debug:[37,4,0,"-"],corestats:[37,4,0,"-"],spiders:[25,4,0,"-"]},"scrapy.crawler":{Crawler:[19,5,1,""]},"scrapy.contrib.pipeline":{images:[15,4,0,"-"]},"scrapy.item.Item":{fields:[9,9,1,""]},"scrapy.contrib.closespider.scrapy.contrib.closespider":{CloseSpider:[37,5,1,""]},"scrapy.settings.Settings":{getfloat:[19,6,1,""],getlist:[19,6,1,""],get:[19,6,1,""],overrides:[19,9,1,""],getbool:[19,6,1,""],getint:[19,6,1,""]},"scrapy.contrib.downloadermiddleware.ajaxcrawl":{AjaxCrawlMiddleware:[23,5,1,""]},"scrapy.settings":{Settings:[19,5,1,""]},"scrapy.statscol.StatsCollector":{get_stats:[19,6,1,""],get_value:[19,6,1,""],max_value:[19,6,1,""],min_value:[19,6,1,""],inc_value:[19,6,1,""],close_spider:[19,6,1,""],open_spider:[19,6,1,""],clear_stats:[19,6,1,""],set_stats:[19,6,1,""],set_value:[19,6,1,""]},"scrapy.http.TextResponse":{body_as_unicode:[32,6,1,""],encoding:[32,9,1,""]},"scrapy.contrib.spiders.CSVFeedSpider":{headers:[25,9,1,""],delimiter:[25,9,1,""],parse_row:[25,6,1,""]},"scrapy.signals":{engine_started:[39,8,1,""],response_downloaded:[39,8,1,""],item_scraped:[39,8,1,""],spider_error:[39,8,1,""],engine_stopped:[39,8,1,""],response_received:[39,8,1,""],spider_closed:[39,8,1,""],spider_opened:[39,8,1,""],item_dropped:[39,8,1,""],spider_idle:[39,8,1,""]},"scrapy.contrib.spidermiddleware.httperror":{HttpErrorMiddleware:[36,5,1,""]},"scrapy.telnet":{update_telnet_vars:[34,8,1,""]},"scrapy.contrib.spiders.CrawlSpider":{rules:[25,9,1,""],parse_start_url:[25,6,1,""]},"scrapy.statscol":{MemoryStatsCollector:[46,5,1,""],StatsCollector:[19,5,1,""],DummyStatsCollector:[46,5,1,""]},"scrapy.spider.Spider":{start_urls:[25,9,1,""],allowed_domains:[25,9,1,""],parse:[25,6,1,""],make_requests_from_url:[25,6,1,""],start_requests:[25,6,1,""],log:[25,6,1,""],name:[25,9,1,""]},"scrapy.contrib.spidermiddleware.offsite":{OffsiteMiddleware:[36,5,1,""]},"scrapy.crawler.Crawler":{engine:[19,9,1,""],stats:[19,9,1,""],configure:[19,6,1,""],settings:[19,9,1,""],signals:[19,9,1,""],start:[19,6,1,""],extensions:[19,9,1,""],spiders:[19,9,1,""]},"scrapy.contrib.loader.ItemLoader":{context:[0,9,1,""],default_selector_class:[0,9,1,""],get_css:[0,6,1,""],add_value:[0,6,1,""],add_css:[0,6,1,""],get_output_processor:[0,6,1,""],default_input_processor:[0,9,1,""],replace_css:[0,6,1,""],replace_xpath:[0,6,1,""],get_output_value:[0,6,1,""],selector:[0,9,1,""],get_value:[0,6,1,""],get_collected_values:[0,6,1,""],replace_value:[0,6,1,""],item:[0,9,1,""],get_xpath:[0,6,1,""],default_item_class:[0,9,1,""],add_xpath:[0,6,1,""],get_input_processor:[0,6,1,""],default_output_processor:[0,9,1,""],load_item:[0,6,1,""]},"scrapy.contrib.spiders":{XMLFeedSpider:[25,5,1,""],CrawlSpider:[25,5,1,""],CSVFeedSpider:[25,5,1,""],Rule:[25,5,1,""],SitemapSpider:[25,5,1,""]},"scrapy.selector.SelectorList":{xpath:[30,6,1,""],re:[30,6,1,""],"__nonzero__":[30,6,1,""],extract:[30,6,1,""],css:[30,6,1,
""]},"scrapy.contrib.spiders.SitemapSpider":{sitemap_urls:[25,9,1,""],sitemap_rules:[25,9,1,""],sitemap_follow:[25,9,1,""],sitemap_alternate_links:[25,9,1,""]},"scrapy.contrib.spidermiddleware.depth":{DepthMiddleware:[36,5,1,""]},"scrapy.contrib.webservice":{enginestatus:[22,4,0,"-"],stats:[22,4,0,"-"],crawler:[22,4,0,"-"]},"scrapy.signalmanager":{SignalManager:[19,5,1,""]},"scrapy.spider":{Spider:[25,5,1,""]},"scrapy.contrib.webservice.enginestatus.scrapy.webservice.JsonRpcResource":{get_target:[22,6,1,""]},"scrapy.contrib.webservice.enginestatus":{EngineStatusResource:[22,5,1,""]},"scrapy.http":{HtmlResponse:[32,5,1,""],Request:[32,5,1,""],XmlResponse:[32,5,1,""],TextResponse:[32,5,1,""],FormRequest:[32,5,1,""],Response:[32,5,1,""]},"scrapy.contrib.statsmailer.scrapy.contrib.statsmailer":{StatsMailer:[37,5,1,""]},"scrapy.contrib.downloadermiddleware.DownloaderMiddleware":{process_response:[23,6,1,""],process_exception:[23,6,1,""],process_request:[23,6,1,""]},"scrapy.contrib.corestats":{CoreStats:[37,5,1,""]},"scrapy.contrib.spidermiddleware.referer":{RefererMiddleware:[36,5,1,""]},"scrapy.contracts":{"default":[21,4,0,"-"],Contract:[21,5,1,""]},"scrapy.contrib.spidermiddleware":{offsite:[36,4,0,"-"],depth:[36,4,0,"-"],SpiderMiddleware:[36,5,1,""],referer:[36,4,0,"-"],urllength:[36,4,0,"-"],httperror:[36,4,0,"-"]},"scrapy.contrib.downloadermiddleware.useragent":{UserAgentMiddleware:[23,5,1,""]},"scrapy.contrib.downloadermiddleware.downloadtimeout":{DownloadTimeoutMiddleware:[23,5,1,""]},"scrapy.webservice.scrapy.webservice":{WebService:[37,5,1,""]},"scrapy.contrib.linkextractors":{sgml:[31,4,0,"-"]},"scrapy.signalmanager.SignalManager":{disconnect_all:[19,6,1,""],send_catch_log_deferred:[19,6,1,""],disconnect:[19,6,1,""],connect:[19,6,1,""],send_catch_log:[19,6,1,""]},"scrapy.utils.trackref":{print_live_refs:[35,8,1,""],iter_all:[35,8,1,""],get_oldest:[35,8,1,""],object_ref:[35,5,1,""]},"scrapy.contrib.downloadermiddleware":{redirect:[23,4,0,"-"],httpcompression:[23,4,0,"-"],cookies:[23,4,0,"-"],retry:[23,4,0,"-"],stats:[23,4,0,"-"],ajaxcrawl:[23,4,0,"-"],httpcache:[23,4,0,"-"],robotstxt:[23,4,0,"-"],DownloaderMiddleware:[23,5,1,""],httpauth:[23,4,0,"-"],downloadtimeout:[23,4,0,"-"],defaultheaders:[23,4,0,"-"],httpproxy:[23,4,0,"-"],useragent:[23,4,0,"-"],chunked:[23,4,0,"-"]},"scrapy.contrib.exporter":{JsonLinesItemExporter:[10,5,1,""],PprintItemExporter:[10,5,1,""],XmlItemExporter:[10,5,1,""],BaseItemExporter:[10,5,1,""],PickleItemExporter:[10,5,1,""],JsonItemExporter:[10,5,1,""],CsvItemExporter:[10,5,1,""]},"scrapy.contrib.exporter.BaseItemExporter":{encoding:[10,9,1,""],serialize_field:[10,6,1,""],export_empty_fields:[10,9,1,""],export_item:[10,6,1,""],finish_exporting:[10,6,1,""],start_exporting:[10,6,1,""],fields_to_export:[10,9,1,""]},"scrapy.mail":{MailSender:[40,5,1,""]},"scrapy.contrib.downloadermiddleware.chunked":{ChunkedTransferMiddleware:[23,5,1,""]},"scrapy.contrib.loader":{processor:[0,4,0,"-"],ItemLoader:[0,5,1,""]},"scrapy.contracts.Contract":{post_process:[21,6,1,""],adjust_request_args:[21,6,1,""],pre_process:[21,6,1,""]},"scrapy.contrib.pipeline.images":{ImagesPipeline:[15,5,1,""]},"scrapy.contrib.downloadermiddleware.robotstxt":{RobotsTxtMiddleware:[23,5,1,""]},"scrapy.statscol.MemoryStatsCollector":{spider_stats:[46,9,1,""]},"scrapy.contrib.webservice.stats":{StatsResource:[22,5,1,""]},"scrapy.contrib.webservice.enginestatus.scrapy.webservice":{JsonResource:[22,5,1,""],JsonRpcResource:[22,5,1,""]},"scrapy.item":{Field:[9,5,1,""],Item:[9,5,1,""]},"scrapy.contrib.memde
bug.scrapy.contrib.memdebug":{MemoryDebugger:[37,5,1,""]},"scrapy.selector":{Selector:[30,5,1,""],SelectorList:[30,5,1,""]},"scrapy.contrib.memusage.scrapy.contrib.memusage":{MemoryUsage:[37,5,1,""]},"scrapy.contrib.spidermiddleware.SpiderMiddleware":{process_spider_input:[36,6,1,""],process_start_requests:[36,6,1,""],process_spider_output:[36,6,1,""],process_spider_exception:[36,6,1,""]},"scrapy.log":{INFO:[45,11,1,""],WARNING:[45,11,1,""],start:[45,8,1,""],CRITICAL:[45,11,1,""],ERROR:[45,11,1,""],DEBUG:[45,11,1,""],msg:[45,8,1,""]},"scrapy.contrib.webservice.crawler":{CrawlerResource:[22,5,1,""]},scrapy:{signalmanager:[19,4,0,"-"],http:[32,4,0,"-"],webservice:[37,4,0,"-"],settings:[19,4,0,"-"],contracts:[21,4,0,"-"],statscol:[46,4,0,"-"],spider:[25,4,0,"-"],telnet:[34,4,0,"-"],selector:[30,4,0,"-"],signals:[39,4,0,"-"],item:[9,4,0,"-"],exceptions:[28,4,0,"-"],mail:[40,4,0,"-"],crawler:[19,4,0,"-"],log:[45,4,0,"-"]},"scrapy.http.FormRequest":{from_response:[32,10,1,""]},"scrapy.contrib.debug.scrapy.contrib.debug":{Debugger:[37,5,1,""],StackTraceDump:[37,5,1,""]},"scrapy.contrib.downloadermiddleware.redirect":{MetaRefreshMiddleware:[23,5,1,""],RedirectMiddleware:[23,5,1,""]},"scrapy.http.Request":{body:[32,9,1,""],url:[32,9,1,""],replace:[32,6,1,""],headers:[32,9,1,""],meta:[32,9,1,""],copy:[32,6,1,""],method:[32,9,1,""]},"scrapy.telnet.scrapy.telnet":{TelnetConsole:[37,5,1,""]},"scrapy.contrib.downloadermiddleware.httpcache":{HttpCacheMiddleware:[23,5,1,""]},"scrapy.contrib.logstats":{LogStats:[37,5,1,""]},"scrapy.contrib.downloadermiddleware.retry":{RetryMiddleware:[23,5,1,""]},"scrapy.selector.Selector":{xpath:[30,6,1,""],register_namespace:[30,6,1,""],remove_namespaces:[30,6,1,""],"__nonzero__":[30,6,1,""],re:[30,6,1,""],extract:[30,6,1,""],css:[30,6,1,""]},"scrapy.contracts.default":{ScrapesContract:[21,5,1,""],ReturnsContract:[21,5,1,""],UrlContract:[21,5,1,""]},"scrapy.contrib.spiders.XMLFeedSpider":{process_results:[25,6,1,""],iterator:[25,9,1,""],adapt_response:[25,6,1,""],parse_node:[25,6,1,""],namespaces:[25,9,1,""],itertag:[25,9,1,""]},"scrapy.mail.MailSender":{from_settings:[40,10,1,""],send:[40,6,1,""]},"scrapy.contrib.downloadermiddleware.httpauth":{HttpAuthMiddleware:[23,5,1,""]},"scrapy.contrib.downloadermiddleware.httpcompression":{HttpCompressionMiddleware:[23,5,1,""]}},titleterms:{breadth:26,all:[12,26,9],code:[18,33,26],set_trac:26,serialize_field:10,consum:26,concept:12,per:[5,23],httpcompressionmiddlewar:23,follow:4,httpcache_en:23,send:[32,40],under:14,sent:26,global:5,rational:5,spider_error:39,robotstxtmiddlewar:23,util:35,trackref:35,depth_stat:5,level:[45,16],did:26,list:42,"try":14,item:[0,2,15,14,26,6,7,9,10],concurr:16,form:26,cooki:[11,16,26,23],feed_export:8,metarefreshmiddlewar:23,httpcache_ignore_schem:23,prevent:26,crawlspid:25,design:[20,5],pass:32,download:[15,7,26,23,16],run:[18,26,17,6],proxi:26,what:[26,14,17],compar:26,beautifulsoup:26,access:[5,9,34],spider_idl:39,version:[42,26,47],webservice_resources_bas:22,"new":33,method:10,redirect:16,memorystatscollector:46,gener:[15,37],error:26,concurrent_requests_per_ip:5,debugg:37,ubuntu:[41,3],redirect_max_tim:[5,23],valu:9,retrymiddlewar:23,dropitem:28,download_delai:5,bot:26,mail_ssl:40,compression_en:23,spider_clos:39,pick:17,chang:33,control:42,via:32,firefox:1,modul:[45,26,5,35],basesgmllinkextractor:31,httpcache_ignore_miss:23,deprec:33,api:[33,19,47],instal:3,autothrottle_start_delai:20,middlewar:[7,23,36],from:[45,43,26,6],item_drop:39,memori:[26,37,35],next:[14,17],websit:17,live:1,handler:[39,
26],call:26,recommend:26,cookies_debug:23,benchmark:13,enhanc:33,bot_nam:5,memusage_limit_mb:5,downloader_middlewar:5,concurrent_request:5,aws_secret_access_kei:5,httpcache_expiration_sec:23,work:[20,26,9,30],can:26,caveat:[38,1],depth_stats_verbos:5,purpos:37,fetch:42,overrid:[5,10],defer:[39,26],process:6,redirectmiddlewar:23,mailsend:40,want:17,download_handl:5,memusage_report:5,huge:26,lxml:26,multipl:[23,6],goal:20,download_timeout:5,tamper:1,write:[36,2,23,17,18,37,22],from_respons:32,jsonlinesitemexport:10,instead:26,csvitemexport:10,csv:[8,26],product:26,jsonitemexport:10,resourc:22,httpcache_ignore_http_cod:23,feed_uri:8,log_level:5,data:[1,14,4,26,17,7,32],stabil:47,practic:6,baseitemexport:10,closespider_errorcount:37,django:[38,26],caus:35,callback:32,webservice_port:22,order:26,extractor:31,help:12,stats_class:5,report:18,dynam:6,paramet:[8,26],style:18,thank:33,how:[11,5,26,45,20,34],polici:[18,23],fix:[33,26],platform:3,window:3,httperror_allow_al:36,persist:11,mail:40,closespider_itemcount:37,them:26,crash:26,python:26,autothrottl:20,introduct:[4,14],name:[26,5],edit:42,httpcachemiddlewar:23,authent:26,httperror_allowed_cod:36,retry_tim:23,autothrottle_en:20,timeout:16,debug:[24,26,37,35],feed_exporters_bas:8,mean:26,dnscache_en:5,resum:[11,34],spider:[36,14,23,24,25,26,37,17,6,7,43,45,21,22,35],autothrottle_max_delai:20,formrequest:32,meta:32,concurrent_item:5,redirect_en:23,our:14,happen:14,extract:[4,14,17],event:7,special:32,out:15,variabl:34,ftp:8,network:7,memdebug_en:5,feed_store_empti:8,rel:30,default_item_class:5,urllength_limit:5,concurrent_requests_per_domain:5,xpather:1,standard:8,quick:40,ajax:16,ask:26,redirect_priority_adjust:5,depth_limit:5,launch:43,where:26,keep:11,filter:[15,26,2],scrapyd:[33,44],frequent:26,first:[12,26,14],oper:30,instruct:26,construct:30,httperrormiddlewar:36,webservice_en:22,open:24,hood:14,differ:26,script:[22,6],system:15,messag:[45,26],checker:1,useragentmiddlewar:23,mail_port:40,store:14,too:35,shell:[24,42,43,14],consol:[37,34],dupefilter_class:5,namespac:30,tool:[33,42],metarefresh_en:23,djangoitem:38,user_ag:5,pars:[24,42,26],crawl:[11,14,16,26,6,42,25],newspider_modul:5,remov:[33,30],structur:42,exampl:[2,15,25,26,40,29,30,32,43,22,34,35],project:[42,26,5,14],mail_pass:40,reus:0,browser:[24,1],pre:3,response_receiv:39,spider_middlewares_bas:5,feed_storag:8,argument:[25,26],log_en:5,packag:41,expir:[11,15],notsupport:28,requisit:3,incompat:33,engin:[7,22,34],built:[36,0,12,25,5,39,23,28,30,31,37,10],retry_http_cod:23,note:[33,3],client:22,log_fil:5,which:35,gotcha:11,autothrottle_debug:20,pipelin:[15,7,2],distribut:6,trace:37,track:35,price:2,filesystem:[8,23],downloaderstat:23,regular:30,deploi:[42,26],selector:[26,14,30],"class":[40,6],dom:1,flow:7,uri:8,doe:26,dummi:23,declar:[0,9,10],webservice_logfil:22,statsmail:37,dbm:23,session:[43,23],memusage_warning_mb:5,find:26,xml:[8,26,30],configur:26,activ:[36,37,2,23],should:26,experiment:27,downloader_debug:5,local:8,contribut:18,get:[4,26,12,9,6],xmlrespons:32,stop:[26,34],nativ:26,cannot:26,increas:16,closespid:28,enabl:[15,16,37],sgmllinkextractor:31,htmlrespons:32,patch:18,common:[35,46,9,6],httpcache_dbm_modul:23,ban:[26,6],steal:26,view:[42,34],set:[42,36,26,30,23,38,5,40,8,19,9,33,45,20,37,22,34],dump:[26,37],spider_modul:5,see:26,respons:[32,43,26,30],close:37,best:26,statu:[26,22,34],kei:32,selectorlist:30,review:17,sitemapspid:25,yet:[],crawlabl:16,state:11,simplest:26,"import":26,thumbnail:15,commands_modul:42,attribut:26,defaultheadersmiddlewar:23,extend:[12,0,9],mai
l_tl:40,extens:[20,5,37,22,30],job:11,solv:12,cookies_en:23,addit:[15,32],cryptic:26,ignorerequest:28,contract:21,tutori:14,context:0,improv:33,login:[32,26],pdb:26,load:37,overview:7,rfc2616:23,loader:0,rpc:22,guid:3,backend:[8,23],compon:[7,2],json:[8,26,2,22],httpcache_storag:23,basic:[12,26],pickleitemexport:10,popul:[5,0,9],imag:15,telnet:[37,34],ani:26,downloader_stat:5,els:17,servic:[12,37,22],batch:11,memdebug_notifi:5,defin:[14,17],invok:43,telnetconsole_en:5,abov:3,mail_host:40,spider_middlewar:5,glanc:17,itself:26,"return":26,feed:[8,26],disabl:[16,37],downloadtimeoutmiddlewar:23,obsolet:33,receiv:26,redirect_max_metarefresh_delai:[5,23],make:26,same:6,html:30,memusage_en:5,document:[18,12,26],memusage_notify_mail:5,http:[32,26],nest:30,xmlitemexport:10,driven:7,user:[32,26],mani:35,extern:27,download_handlers_bas:5,stats_dump:5,stack:37,telnetconsole_host:34,task:9,feed_storages_bas:8,pickl:8,without:[26,35],command:[24,27,42,5,33],thi:26,english:26,urllengthmiddlewar:36,retry_en:23,paus:[11,34],just:14,less:26,templates_dir:5,rest:12,runspid:42,aws_access_key_id:5,languag:26,web:[37,22],except:28,shortcut:43,bench:42,add:[27,1],other:[33,9],telnetconsole_port:[5,34],schedul:[7,5],textrespons:32,input:0,spider_open:39,real:35,format:8,big:26,response_download:39,specif:[12,3],signal:[39,26,19,34],csvfeedspid:25,closespider_timeout:37,collect:46,downloader_middlewares_bas:5,spider_contract:5,output:[8,0],genspid:42,ajaxcrawlmiddlewar:23,page:[16,26],default_request_head:5,crawler:[26,19,22],"function":[32,33],drop:2,enginestatusresourc:22,item_pipelin:5,creation:6,some:26,httpcach:23,pprintitemexport:10,mail_us:40,"export":[8,26,10],httpcache_dir:23,small:15,item_pipelines_bas:5,librari:27,referer_en:36,ajaxcrawl_en:23,leak:[26,35],avoid:6,notconfigur:28,subclass:32,retri:16,larg:26,duplic:2,statsresourc:22,refer:[36,12,25,5,39,40,23,28,30,31,37,10],core:[19,37],object:[0,35,30,43,9,32],throttl:20,importerror:26,inspect:[43,1],usag:[15,32,37,34],offsitemiddlewar:36,step:[12,14],feed_format:8,post:32,exslt:30,between:11,item_scrap:39,httpauthmiddlewar:23,simul:[32,26],webservice_resourc:22,webservice_host:22,marshal:8,win32api:26,own:[36,37,2,23],httpcache_polici:23,guppi:35,automat:26,cookiesmiddlewar:23,contrib:18,referermiddlewar:36,storag:[15,8,23],your:[15,36,37,2,23],manag:[26,22],processor:0,log:[24,16,37,45],wai:26,chunkedtransfermiddlewar:23,support:26,question:26,log_stdout:5,submit:18,custom:[15,42,21],avail:[0,46,42,43,37,22,34],editor:5,xpath:[26,1,30],"__viewstat":26,xmlfeedspid:25,link:[4,31],httpproxymiddlewar:23,startproject:42,depthmiddlewar:36,line:[33,8,42],bug:18,engine_start:39,mail_from:40,"default":[33,42,5,23],bugfix:33,engine_stop:39,log_encod:5,sampl:37,problem:12,closespider_pagecount:37,featur:[15,27,33],creat:[42,26,9,14],request:[11,32,26],doesn:26,twist:26,implement:15,firebug:[4,1],file:[15,26,2],rearrang:33,check:42,scrapi:[12,14,3,24,26,17,6,18,7,42,43,45,22,34,35],itemload:0,collector:[19,46,22],dummystatscollector:46,depth_prior:5,scrape:[4,26,1,14,17],extensions_bas:5,field:[9,10],valid:2,test:18,you:17,architectur:7,stat:[46,19,37,22],offsit:26,why:26,spider_contracts_bas:5,express:30,releas:33,randomize_download_delai:5,serial:[11,8,10],reduc:16,statsmailer_rcpt:5,algorithm:20,directori:11,bindaddress:32,rule:25,depth:26,robotstxt_obei:5,broad:16,firecooki:1,backward:33}})PK&o1Dscrapy-0.22/py-modindex.html Python Module Index — Scrapy 0.22.0 documentation

Python Module Index

scrapy
    scrapy.contracts
    scrapy.contracts.default
    scrapy.contrib.closespider Close spider extension
    scrapy.contrib.corestats Core stats collection
    scrapy.contrib.debug Extensions for debugging Scrapy
    scrapy.contrib.downloadermiddleware
    scrapy.contrib.downloadermiddleware.ajaxcrawl
    scrapy.contrib.downloadermiddleware.chunked Chunked Transfer Middleware
    scrapy.contrib.downloadermiddleware.cookies Cookies Downloader Middleware
    scrapy.contrib.downloadermiddleware.defaultheaders Default Headers Downloader Middleware
    scrapy.contrib.downloadermiddleware.downloadtimeout Download timeout middleware
    scrapy.contrib.downloadermiddleware.httpauth HTTP Auth downloader middleware
    scrapy.contrib.downloadermiddleware.httpcache HTTP Cache downloader middleware
    scrapy.contrib.downloadermiddleware.httpcompression Http Compression Middleware
    scrapy.contrib.downloadermiddleware.httpproxy Http Proxy Middleware
    scrapy.contrib.downloadermiddleware.redirect Redirection Middleware
    scrapy.contrib.downloadermiddleware.retry Retry Middleware
    scrapy.contrib.downloadermiddleware.robotstxt robots.txt middleware
    scrapy.contrib.downloadermiddleware.stats Downloader Stats Middleware
    scrapy.contrib.downloadermiddleware.useragent User Agent Middleware
    scrapy.contrib.exporter Item Exporters
    scrapy.contrib.linkextractors Link extractors classes
    scrapy.contrib.linkextractors.sgml SGMLParser-based link extractors
    scrapy.contrib.loader Item Loader class
    scrapy.contrib.loader.processor A collection of processors to use with Item Loaders
    scrapy.contrib.logstats Basic stats logging
    scrapy.contrib.memdebug Memory debugger extension
    scrapy.contrib.memusage Memory usage extension
    scrapy.contrib.pipeline.images Images Pipeline
    scrapy.contrib.spidermiddleware
    scrapy.contrib.spidermiddleware.depth Depth Spider Middleware
    scrapy.contrib.spidermiddleware.httperror HTTP Error Spider Middleware
    scrapy.contrib.spidermiddleware.offsite Offsite Spider Middleware
    scrapy.contrib.spidermiddleware.referer Referer Spider Middleware
    scrapy.contrib.spidermiddleware.urllength URL Length Spider Middleware
    scrapy.contrib.spiders Collection of generic spiders
    scrapy.contrib.statsmailer StatsMailer extension
    scrapy.contrib.webservice Built-in web service resources
    scrapy.contrib.webservice.crawler Crawler JSON-RPC resource
    scrapy.contrib.webservice.enginestatus Engine Status JSON resource
    scrapy.contrib.webservice.stats Stats JSON-RPC resource
    scrapy.crawler The Scrapy crawler
    scrapy.exceptions Scrapy exceptions
    scrapy.http Request and Response classes
    scrapy.item Item and Field classes
    scrapy.log Logging facility
    scrapy.mail Email sending facility
    scrapy.selector Selector class
    scrapy.settings Settings manager
    scrapy.signalmanager The signal manager
    scrapy.signals Signals definitions
    scrapy.spider Spiders base class, spider manager and spider middleware
    scrapy.statscol Stats Collectors
    scrapy.telnet The Telnet Console
    scrapy.utils.trackref Track references of live objects
    scrapy.webservice Web service

scrapy-0.22/contributing.html Contributing to Scrapy — Scrapy 0.22.0 documentation

Contributing to Scrapy

There are many ways to contribute to Scrapy. Here are some of them:

  • Blog about Scrapy. Tell the world how you’re using Scrapy. This will help newcomers by giving them more examples, and will help the Scrapy project by increasing its visibility.
  • Report bugs and request features in the issue tracker, trying to follow the guidelines detailed in Reporting bugs below.
  • Submit patches for new functionality and/or bug fixes. Please read Writing patches and Submitting patches below for details on how to write and submit a patch.
  • Join the scrapy-developers mailing list and share your ideas on how to improve Scrapy. We’re always open to suggestions.

Reporting bugs

Well-written bug reports are very helpful, so keep in mind the following guidelines when reporting a new bug.

  • check the FAQ first to see if your issue is addressed in a well-known question
  • check the open issues to see if it has already been reported. If it has, don’t dismiss the report but check the ticket history and comments, you may find additional useful information to contribute.
  • search the scrapy-users list to see if it has been discussed there, or if you’re not sure if what you’re seeing is a bug. You can also ask in the #scrapy IRC channel.
  • write complete, reproducible, specific bug reports. The smaller the test case, the better. Remember that other developers won’t have your project to reproduce the bug, so please include all relevant files required to reproduce it.
  • include the output of scrapy version -v so developers working on your bug know exactly which version and platform it occurred on, which is often very helpful for reproducing it, or for knowing whether it has already been fixed.

Writing patches

The better written a patch is, the higher the chance that it will be accepted and the sooner it will be merged.

Well-written patches should:

  • contain the minimum amount of code required for the specific change. Small patches are easier to review and merge. So, if you’re doing more than one change (or bug fix), please consider submitting one patch per change. Do not collapse multiple changes into a single patch. For big changes consider using a patch queue.
  • pass all unit-tests. See Running tests below.
  • include one (or more) test cases that check the bug fixed or the new functionality added. See Writing tests below.
  • if you’re adding or changing a public (documented) API, please include the documentation changes in the same patch. See Documentation policies below.

Submitting patches

The best way to submit a patch is to issue a pull request on Github, optionally creating a new issue first.

Remember to explain what was fixed or what the new functionality is (what it does, why it’s needed, etc). The more info you include, the easier it will be for core developers to understand and accept your patch.

You can also discuss the new functionality (or bug fix) in scrapy-developers first, before creating the patch, but it’s always good to have a patch ready to illustrate your arguments and show that you have put some additional thought into the subject.

Finally, try to keep aesthetic changes (PEP 8 compliance, removal of unused imports, etc) in commits separate from functional changes, to make the pull request easier to review.

Coding style

Please follow these coding conventions when writing code for inclusion in Scrapy:

  • Unless otherwise specified, follow PEP 8.
  • It’s OK to use lines longer than 80 chars if it improves the code readability.
  • Don’t put your name in the code you contribute. Our policy is to keep the contributor’s name in the AUTHORS file distributed with Scrapy.

Scrapy Contrib

Scrapy contrib shares a rationale similar to that of Django contrib, which is explained in this post. If you are working on new functionality, please follow that rationale to decide whether it should be a Scrapy contrib. If unsure, you can ask in scrapy-developers.

Documentation policies

  • Don’t use docstrings for documenting classes or methods which are already documented in the official (sphinx) documentation. For example, the ItemLoader.add_value() method should be documented in the sphinx documentation, not in its docstring.
  • Do use docstrings for documenting functions not present in the official (sphinx) documentation, such as functions from scrapy.utils package and its sub-modules.

Tests

Tests are implemented using the Twisted unit-testing framework called trial.

Running tests

To run all tests go to the root directory of Scrapy source code and run:

bin/runtests.sh (on unix)

bin\runtests.bat (on windows)

To run a specific test (say scrapy.tests.test_contrib_loader) use:

bin/runtests.sh scrapy.tests.test_contrib_loader (on unix)

bin\runtests.bat scrapy.tests.test_contrib_loader (on windows)

Writing tests

All functionality (including new features and bug fixes) must include a test case to check that it works as expected, so please include tests for your patches if you want them to get accepted sooner.

Scrapy uses unit-tests, which are located in the scrapy.tests package (scrapy/tests directory). Their module name typically resembles the full path of the module they’re testing. For example, the item loaders code is in:

scrapy.contrib.loader

And their unit-tests are in:

scrapy.tests.test_contrib_loader
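
For illustration only, here is a minimal sketch of what a test module in that package might look like, using trial’s TestCase; the test name, item class and values below are hypothetical:

from twisted.trial import unittest

from scrapy.contrib.loader import ItemLoader
from scrapy.item import Item, Field


class TestItem(Item):
    name = Field()


class ItemLoaderTest(unittest.TestCase):

    def test_add_value(self):
        # collect a value and check the loader stored it
        loader = ItemLoader(item=TestItem())
        loader.add_value('name', u'example')
        self.assertEqual(loader.get_collected_values('name'), [u'example'])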
Index — Scrapy 0.22.0 documentation

Index

_ | A | B | C | D | E | F | G | H | I | J | L | M | N | O | P | R | S | T | U | V | W | X

_

__nonzero__() (scrapy.selector.Selector method)
(scrapy.selector.SelectorList method)

A

adapt_response() (scrapy.contrib.spiders.XMLFeedSpider method)
add_css() (scrapy.contrib.loader.ItemLoader method)
add_value() (scrapy.contrib.loader.ItemLoader method)
add_xpath() (scrapy.contrib.loader.ItemLoader method)
adjust_request_args() (scrapy.contracts.Contract method)
AJAXCRAWL_ENABLED
setting
AjaxCrawlMiddleware (class in scrapy.contrib.downloadermiddleware.ajaxcrawl)
allowed_domains (scrapy.spider.Spider attribute)
AUTOTHROTTLE_DEBUG
setting
AUTOTHROTTLE_ENABLED
setting
AUTOTHROTTLE_MAX_DELAY
setting
AUTOTHROTTLE_START_DELAY
setting
AWS_ACCESS_KEY_ID
setting
AWS_SECRET_ACCESS_KEY
setting

B

BaseItemExporter (class in scrapy.contrib.exporter)
BaseSgmlLinkExtractor (class in scrapy.contrib.linkextractors.sgml)
bench
command
bindaddress
reqmeta
body (scrapy.http.Request attribute)
(scrapy.http.Response attribute)
body_as_unicode() (scrapy.http.TextResponse method)
BOT_NAME
setting

C

check
command
ChunkedTransferMiddleware (class in scrapy.contrib.downloadermiddleware.chunked)
clear_stats() (scrapy.statscol.StatsCollector method)
close_spider()
(scrapy.statscol.StatsCollector method)
CloseSpider
CLOSESPIDER_ERRORCOUNT
setting
CLOSESPIDER_ITEMCOUNT
setting
CLOSESPIDER_PAGECOUNT
setting
CLOSESPIDER_TIMEOUT
setting
command
bench
check
crawl
deploy
edit
fetch
genspider
list
parse
runspider
settings
shell
startproject
version
view
COMMANDS_MODULE
setting
Compose (class in scrapy.contrib.loader.processor)
COMPRESSION_ENABLED
setting
CONCURRENT_ITEMS
setting
CONCURRENT_REQUESTS
setting
CONCURRENT_REQUESTS_PER_DOMAIN
setting
CONCURRENT_REQUESTS_PER_IP
setting
configure() (scrapy.crawler.Crawler method)
connect() (scrapy.signalmanager.SignalManager method)
context (scrapy.contrib.loader.ItemLoader attribute)
Contract (class in scrapy.contracts)
cookiejar
reqmeta
COOKIES_DEBUG
setting
COOKIES_ENABLED
setting
CookiesMiddleware (class in scrapy.contrib.downloadermiddleware.cookies)
copy() (scrapy.http.Request method)
(scrapy.http.Response method)
CoreStats (class in scrapy.contrib.corestats)
crawl
command
Crawler (class in scrapy.crawler)
CrawlerResource (class in scrapy.contrib.webservice.crawler)
CrawlSpider (class in scrapy.contrib.spiders)
CRITICAL (in module scrapy.log)
css() (scrapy.selector.Selector method)
(scrapy.selector.SelectorList method)
CSVFeedSpider (class in scrapy.contrib.spiders)
CsvItemExporter (class in scrapy.contrib.exporter)

D

DEBUG (in module scrapy.log)
default_input_processor (scrapy.contrib.loader.ItemLoader attribute)
DEFAULT_ITEM_CLASS
setting
default_item_class (scrapy.contrib.loader.ItemLoader attribute)
default_output_processor (scrapy.contrib.loader.ItemLoader attribute)
DEFAULT_REQUEST_HEADERS
setting
default_selector_class (scrapy.contrib.loader.ItemLoader attribute)
DefaultHeadersMiddleware (class in scrapy.contrib.downloadermiddleware.defaultheaders)
delimiter (scrapy.contrib.spiders.CSVFeedSpider attribute)
deploy
command
DEPTH_LIMIT
setting
DEPTH_PRIORITY
setting
DEPTH_STATS
setting
DEPTH_STATS_VERBOSE
setting
DepthMiddleware (class in scrapy.contrib.spidermiddleware.depth)
disconnect() (scrapy.signalmanager.SignalManager method)
disconnect_all() (scrapy.signalmanager.SignalManager method)
DNSCACHE_ENABLED
setting
dont_redirect
reqmeta
dont_retry
reqmeta
DOWNLOAD_DELAY
setting
DOWNLOAD_HANDLERS
setting
DOWNLOAD_HANDLERS_BASE
setting
DOWNLOAD_TIMEOUT
setting
DOWNLOADER_DEBUG
setting
DOWNLOADER_MIDDLEWARES
setting
DOWNLOADER_MIDDLEWARES_BASE
setting
DOWNLOADER_STATS
setting
DownloaderMiddleware (class in scrapy.contrib.downloadermiddleware)
DownloaderStats (class in scrapy.contrib.downloadermiddleware.stats)
DownloadTimeoutMiddleware (class in scrapy.contrib.downloadermiddleware.downloadtimeout)
DropItem
DummyStatsCollector (class in scrapy.statscol)
DUPEFILTER_CLASS
setting

E

edit
command
encoding (scrapy.contrib.exporter.BaseItemExporter attribute)
(scrapy.http.TextResponse attribute)
engine (scrapy.crawler.Crawler attribute)
engine_started
signal
engine_started() (in module scrapy.signals)
engine_stopped
signal
engine_stopped() (in module scrapy.signals)
EngineStatusResource (class in scrapy.contrib.webservice.enginestatus)
ERROR (in module scrapy.log)
export_empty_fields (scrapy.contrib.exporter.BaseItemExporter attribute)
export_item() (scrapy.contrib.exporter.BaseItemExporter method)
EXTENSIONS
setting
extensions (scrapy.crawler.Crawler attribute)
EXTENSIONS_BASE
setting
extract() (scrapy.selector.Selector method)
(scrapy.selector.SelectorList method)

F

FEED_EXPORTERS
setting
FEED_EXPORTERS_BASE
setting
FEED_FORMAT
setting
FEED_STORAGES
setting
FEED_STORAGES_BASE
setting
FEED_STORE_EMPTY
setting
FEED_URI
setting
fetch
command
Field (class in scrapy.item)
fields (scrapy.item.Item attribute)
fields_to_export (scrapy.contrib.exporter.BaseItemExporter attribute)
finish_exporting() (scrapy.contrib.exporter.BaseItemExporter method)
flags (scrapy.http.Response attribute)
FormRequest (class in scrapy.http)
from_response() (scrapy.http.FormRequest class method)
from_settings() (scrapy.mail.MailSender class method)

G

genspider
command
get() (scrapy.settings.Settings method)
get_collected_values() (scrapy.contrib.loader.ItemLoader method)
get_css() (scrapy.contrib.loader.ItemLoader method)
get_input_processor() (scrapy.contrib.loader.ItemLoader method)
get_media_requests() (scrapy.contrib.pipeline.images.ImagesPipeline method)
get_oldest() (in module scrapy.utils.trackref)
get_output_processor() (scrapy.contrib.loader.ItemLoader method)
get_output_value() (scrapy.contrib.loader.ItemLoader method)
get_stats() (scrapy.statscol.StatsCollector method)
get_target() (scrapy.contrib.webservice.enginestatus.scrapy.webservice.JsonRpcResource method)
get_value() (scrapy.contrib.loader.ItemLoader method)
(scrapy.statscol.StatsCollector method)
get_xpath() (scrapy.contrib.loader.ItemLoader method)
getbool() (scrapy.settings.Settings method)
getfloat() (scrapy.settings.Settings method)
getint() (scrapy.settings.Settings method)
getlist() (scrapy.settings.Settings method)

H

handle_httpstatus_list
reqmeta
headers (scrapy.contrib.spiders.CSVFeedSpider attribute)
(scrapy.http.Request attribute)
(scrapy.http.Response attribute)
HtmlResponse (class in scrapy.http)
HttpAuthMiddleware (class in scrapy.contrib.downloadermiddleware.httpauth)
HTTPCACHE_DBM_MODULE
setting
HTTPCACHE_DIR
setting
HTTPCACHE_ENABLED
setting
HTTPCACHE_EXPIRATION_SECS
setting
HTTPCACHE_IGNORE_HTTP_CODES
setting
HTTPCACHE_IGNORE_MISSING
setting
HTTPCACHE_IGNORE_SCHEMES
setting
HTTPCACHE_POLICY
setting
HTTPCACHE_STORAGE
setting
HttpCacheMiddleware (class in scrapy.contrib.downloadermiddleware.httpcache)
HttpCompressionMiddleware (class in scrapy.contrib.downloadermiddleware.httpcompression)
HTTPERROR_ALLOW_ALL
setting
HTTPERROR_ALLOWED_CODES
setting
HttpErrorMiddleware (class in scrapy.contrib.spidermiddleware.httperror)
HttpProxyMiddleware (class in scrapy.contrib.downloadermiddleware.httpproxy)

I

Identity (class in scrapy.contrib.loader.processor)
IgnoreRequest
IMAGES_EXPIRES
setting
IMAGES_MIN_HEIGHT
setting
IMAGES_MIN_WIDTH
setting
IMAGES_STORE
setting
IMAGES_THUMBS
setting
ImagesPipeline (class in scrapy.contrib.pipeline.images)
inc_value() (scrapy.statscol.StatsCollector method)
INFO (in module scrapy.log)
Item (class in scrapy.item)
item (scrapy.contrib.loader.ItemLoader attribute)
item_completed() (scrapy.contrib.pipeline.images.ImagesPipeline method)
item_dropped
signal
item_dropped() (in module scrapy.signals)
ITEM_PIPELINES
setting
ITEM_PIPELINES_BASE
setting
item_scraped
signal
item_scraped() (in module scrapy.signals)
ItemLoader (class in scrapy.contrib.loader)
iter_all() (in module scrapy.utils.trackref)
iterator (scrapy.contrib.spiders.XMLFeedSpider attribute)
itertag (scrapy.contrib.spiders.XMLFeedSpider attribute)

J

EDITOR
setting
Join (class in scrapy.contrib.loader.processor)
JsonItemExporter (class in scrapy.contrib.exporter)
JsonLinesItemExporter (class in scrapy.contrib.exporter)

L

list
command
load_item() (scrapy.contrib.loader.ItemLoader method)
log() (scrapy.spider.Spider method)
LOG_ENABLED
setting
LOG_ENCODING
setting
LOG_FILE
setting
LOG_LEVEL
setting
LOG_STDOUT
setting
LogStats (class in scrapy.contrib.logstats)

M

MAIL_FROM
setting
MAIL_HOST
setting
MAIL_PASS
setting
MAIL_PORT
setting
MAIL_SSL
setting
MAIL_TLS
setting
MAIL_USER
setting
MailSender (class in scrapy.mail)
make_requests_from_url() (scrapy.spider.Spider method)
MapCompose (class in scrapy.contrib.loader.processor)
max_value() (scrapy.statscol.StatsCollector method)
MEMDEBUG_ENABLED
setting
MEMDEBUG_NOTIFY
setting
MemoryStatsCollector (class in scrapy.statscol)
MEMUSAGE_ENABLED
setting
MEMUSAGE_LIMIT_MB
setting
MEMUSAGE_NOTIFY_MAIL
setting
MEMUSAGE_REPORT
setting
MEMUSAGE_WARNING_MB
setting
meta (scrapy.http.Request attribute)
(scrapy.http.Response attribute)
METAREFRESH_ENABLED
setting
MetaRefreshMiddleware (class in scrapy.contrib.downloadermiddleware.redirect)
method (scrapy.http.Request attribute)
min_value() (scrapy.statscol.StatsCollector method)
msg() (in module scrapy.log)

N

name (scrapy.spider.Spider attribute)
namespaces (scrapy.contrib.spiders.XMLFeedSpider attribute)
NEWSPIDER_MODULE
setting
NotConfigured
NotSupported

O

object_ref (class in scrapy.utils.trackref)
OffsiteMiddleware (class in scrapy.contrib.spidermiddleware.offsite)
open_spider()
(scrapy.statscol.StatsCollector method)
overrides (scrapy.settings.Settings attribute)

P

parse
command
parse() (scrapy.spider.Spider method)
parse_node() (scrapy.contrib.spiders.XMLFeedSpider method)
parse_row() (scrapy.contrib.spiders.CSVFeedSpider method)
parse_start_url() (scrapy.contrib.spiders.CrawlSpider method)
PickleItemExporter (class in scrapy.contrib.exporter)
post_process() (scrapy.contracts.Contract method)
PprintItemExporter (class in scrapy.contrib.exporter)
pre_process() (scrapy.contracts.Contract method)
print_live_refs() (in module scrapy.utils.trackref)
process_exception() (scrapy.contrib.downloadermiddleware.DownloaderMiddleware method)
process_item()
process_request() (scrapy.contrib.downloadermiddleware.DownloaderMiddleware method)
process_response() (scrapy.contrib.downloadermiddleware.DownloaderMiddleware method)
process_results() (scrapy.contrib.spiders.XMLFeedSpider method)
process_spider_exception() (scrapy.contrib.spidermiddleware.SpiderMiddleware method)
process_spider_input() (scrapy.contrib.spidermiddleware.SpiderMiddleware method)
process_spider_output() (scrapy.contrib.spidermiddleware.SpiderMiddleware method)
process_start_requests() (scrapy.contrib.spidermiddleware.SpiderMiddleware method)
Python Enhancement Proposals
PEP 8, [1]

R

RANDOMIZE_DOWNLOAD_DELAY
setting
re() (scrapy.selector.Selector method)
(scrapy.selector.SelectorList method)
REDIRECT_ENABLED
setting
REDIRECT_MAX_METAREFRESH_DELAY
setting, [1]
REDIRECT_MAX_TIMES
setting, [1]
REDIRECT_PRIORITY_ADJUST
setting
redirect_urls
reqmeta
RedirectMiddleware (class in scrapy.contrib.downloadermiddleware.redirect)
REFERER_ENABLED
setting
RefererMiddleware (class in scrapy.contrib.spidermiddleware.referer)
register_namespace() (scrapy.selector.Selector method)
remove_namespaces() (scrapy.selector.Selector method)
replace() (scrapy.http.Request method)
(scrapy.http.Response method)
replace_css() (scrapy.contrib.loader.ItemLoader method)
replace_value() (scrapy.contrib.loader.ItemLoader method)
replace_xpath() (scrapy.contrib.loader.ItemLoader method)
reqmeta
bindaddress
cookiejar
dont_redirect
dont_retry
handle_httpstatus_list
redirect_urls
Request (class in scrapy.http)
request (scrapy.http.Response attribute)
Response (class in scrapy.http)
response_downloaded
signal
response_downloaded() (in module scrapy.signals)
response_received
signal
response_received() (in module scrapy.signals)
RETRY_ENABLED
setting
RETRY_HTTP_CODES
setting
RETRY_TIMES
setting
RetryMiddleware (class in scrapy.contrib.downloadermiddleware.retry)
ReturnsContract (class in scrapy.contracts.default)
ROBOTSTXT_OBEY
setting
RobotsTxtMiddleware (class in scrapy.contrib.downloadermiddleware.robotstxt)
Rule (class in scrapy.contrib.spiders)
rules (scrapy.contrib.spiders.CrawlSpider attribute)
runspider
command

S

SCHEDULER
setting
ScrapesContract (class in scrapy.contracts.default)
scrapy.contracts (module)
scrapy.contracts.default (module)
scrapy.contrib.closespider (module)
scrapy.contrib.closespider.CloseSpider (class in scrapy.contrib.closespider)
scrapy.contrib.corestats (module)
scrapy.contrib.debug (module)
scrapy.contrib.debug.Debugger (class in scrapy.contrib.debug)
scrapy.contrib.debug.StackTraceDump (class in scrapy.contrib.debug)
scrapy.contrib.downloadermiddleware (module)
scrapy.contrib.downloadermiddleware.ajaxcrawl (module)
scrapy.contrib.downloadermiddleware.chunked (module)
scrapy.contrib.downloadermiddleware.cookies (module)
scrapy.contrib.downloadermiddleware.defaultheaders (module)
scrapy.contrib.downloadermiddleware.downloadtimeout (module)
scrapy.contrib.downloadermiddleware.httpauth (module)
scrapy.contrib.downloadermiddleware.httpcache (module)
scrapy.contrib.downloadermiddleware.httpcompression (module)
scrapy.contrib.downloadermiddleware.httpproxy (module)
scrapy.contrib.downloadermiddleware.redirect (module)
scrapy.contrib.downloadermiddleware.retry (module)
scrapy.contrib.downloadermiddleware.robotstxt (module)
scrapy.contrib.downloadermiddleware.stats (module)
scrapy.contrib.downloadermiddleware.useragent (module)
scrapy.contrib.exporter (module)
scrapy.contrib.linkextractors (module)
scrapy.contrib.linkextractors.sgml (module)
scrapy.contrib.loader (module)
scrapy.contrib.loader.processor (module)
scrapy.contrib.logstats (module)
scrapy.contrib.memdebug (module)
scrapy.contrib.memdebug.MemoryDebugger (class in scrapy.contrib.memdebug)
scrapy.contrib.memusage (module)
scrapy.contrib.memusage.MemoryUsage (class in scrapy.contrib.memusage)
scrapy.contrib.pipeline.images (module)
scrapy.contrib.spidermiddleware (module)
scrapy.contrib.spidermiddleware.depth (module)
scrapy.contrib.spidermiddleware.httperror (module)
scrapy.contrib.spidermiddleware.offsite (module)
scrapy.contrib.spidermiddleware.referer (module)
scrapy.contrib.spidermiddleware.urllength (module)
scrapy.contrib.spiders (module)
scrapy.contrib.statsmailer (module)
scrapy.contrib.statsmailer.StatsMailer (class in scrapy.contrib.statsmailer)
scrapy.contrib.webservice (module)
scrapy.contrib.webservice.crawler (module)
scrapy.contrib.webservice.enginestatus (module)
scrapy.contrib.webservice.stats (module)
scrapy.crawler (module)
scrapy.exceptions (module)
scrapy.http (module)
scrapy.item (module)
scrapy.log (module)
scrapy.mail (module)
scrapy.selector (module)
scrapy.settings (module)
scrapy.signalmanager (module)
scrapy.signals (module)
scrapy.spider (module)
scrapy.statscol (module), [1]
scrapy.telnet (module), [1]
scrapy.telnet.TelnetConsole (class in scrapy.telnet)
scrapy.utils.trackref (module)
scrapy.webservice (module)
scrapy.webservice.JsonResource (class in scrapy.contrib.webservice.enginestatus)
scrapy.webservice.JsonRpcResource (class in scrapy.contrib.webservice.enginestatus)
scrapy.webservice.WebService (class in scrapy.webservice)
Selector (class in scrapy.selector)
selector (scrapy.contrib.loader.ItemLoader attribute)
SelectorList (class in scrapy.selector)
send() (scrapy.mail.MailSender method)
send_catch_log() (scrapy.signalmanager.SignalManager method)
send_catch_log_deferred() (scrapy.signalmanager.SignalManager method)
serialize_field() (scrapy.contrib.exporter.BaseItemExporter method)
set_stats() (scrapy.statscol.StatsCollector method)
set_value() (scrapy.statscol.StatsCollector method)
setting
AJAXCRAWL_ENABLED
AUTOTHROTTLE_DEBUG
AUTOTHROTTLE_ENABLED
AUTOTHROTTLE_MAX_DELAY
AUTOTHROTTLE_START_DELAY
AWS_ACCESS_KEY_ID
AWS_SECRET_ACCESS_KEY
BOT_NAME
CLOSESPIDER_ERRORCOUNT
CLOSESPIDER_ITEMCOUNT
CLOSESPIDER_PAGECOUNT
CLOSESPIDER_TIMEOUT
COMMANDS_MODULE
COMPRESSION_ENABLED
CONCURRENT_ITEMS
CONCURRENT_REQUESTS
CONCURRENT_REQUESTS_PER_DOMAIN
CONCURRENT_REQUESTS_PER_IP
COOKIES_DEBUG
COOKIES_ENABLED
DEFAULT_ITEM_CLASS
DEFAULT_REQUEST_HEADERS
DEPTH_LIMIT
DEPTH_PRIORITY
DEPTH_STATS
DEPTH_STATS_VERBOSE
DNSCACHE_ENABLED
DOWNLOADER_DEBUG
DOWNLOADER_MIDDLEWARES
DOWNLOADER_MIDDLEWARES_BASE
DOWNLOADER_STATS
DOWNLOAD_DELAY
DOWNLOAD_HANDLERS
DOWNLOAD_HANDLERS_BASE
DOWNLOAD_TIMEOUT
DUPEFILTER_CLASS
EXTENSIONS
EXTENSIONS_BASE
FEED_EXPORTERS
FEED_EXPORTERS_BASE
FEED_FORMAT
FEED_STORAGES
FEED_STORAGES_BASE
FEED_STORE_EMPTY
FEED_URI
HTTPCACHE_DBM_MODULE
HTTPCACHE_DIR
HTTPCACHE_ENABLED
HTTPCACHE_EXPIRATION_SECS
HTTPCACHE_IGNORE_HTTP_CODES
HTTPCACHE_IGNORE_MISSING
HTTPCACHE_IGNORE_SCHEMES
HTTPCACHE_POLICY
HTTPCACHE_STORAGE
HTTPERROR_ALLOWED_CODES
HTTPERROR_ALLOW_ALL
IMAGES_EXPIRES
IMAGES_MIN_HEIGHT
IMAGES_MIN_WIDTH
IMAGES_STORE
IMAGES_THUMBS
ITEM_PIPELINES
ITEM_PIPELINES_BASE
LOG_ENABLED
LOG_ENCODING
LOG_FILE
LOG_LEVEL
LOG_STDOUT
MAIL_FROM
MAIL_HOST
MAIL_PASS
MAIL_PORT
MAIL_SSL
MAIL_TLS
MAIL_USER
MEMDEBUG_ENABLED
MEMDEBUG_NOTIFY
MEMUSAGE_ENABLED
MEMUSAGE_LIMIT_MB
MEMUSAGE_NOTIFY_MAIL
MEMUSAGE_REPORT
MEMUSAGE_WARNING_MB
METAREFRESH_ENABLED
NEWSPIDER_MODULE
RANDOMIZE_DOWNLOAD_DELAY
REDIRECT_ENABLED
REDIRECT_MAX_METAREFRESH_DELAY, [1]
REDIRECT_MAX_TIMES, [1]
REDIRECT_PRIORITY_ADJUST
REFERER_ENABLED
RETRY_ENABLED
RETRY_HTTP_CODES
RETRY_TIMES
ROBOTSTXT_OBEY
SCHEDULER
SPIDER_CONTRACTS
SPIDER_CONTRACTS_BASE
SPIDER_MIDDLEWARES
SPIDER_MIDDLEWARES_BASE
SPIDER_MODULES
STATSMAILER_RCPTS
STATS_CLASS
STATS_DUMP
TELNETCONSOLE_ENABLED
TELNETCONSOLE_HOST
TELNETCONSOLE_PORT, [1]
TEMPLATES_DIR
URLLENGTH_LIMIT
USER_AGENT
WEBSERVICE_ENABLED
WEBSERVICE_HOST
WEBSERVICE_LOGFILE
WEBSERVICE_PORT
EDITOR
settings
command
Settings (class in scrapy.settings)
settings (scrapy.crawler.Crawler attribute)
SgmlLinkExtractor (class in scrapy.contrib.linkextractors.sgml)
shell
command
signal
engine_started
engine_stopped
item_dropped
item_scraped
response_downloaded
response_received
spider_closed
spider_error
spider_idle
spider_opened
update_telnet_vars
SignalManager (class in scrapy.signalmanager)
signals (scrapy.crawler.Crawler attribute)
sitemap_alternate_links (scrapy.contrib.spiders.SitemapSpider attribute)
sitemap_follow (scrapy.contrib.spiders.SitemapSpider attribute)
sitemap_rules (scrapy.contrib.spiders.SitemapSpider attribute)
sitemap_urls (scrapy.contrib.spiders.SitemapSpider attribute)
SitemapSpider (class in scrapy.contrib.spiders)
Spider (class in scrapy.spider)
spider_closed
signal
spider_closed() (in module scrapy.signals)
SPIDER_CONTRACTS
setting
SPIDER_CONTRACTS_BASE
setting
spider_error
signal
spider_error() (in module scrapy.signals)
spider_idle
signal
spider_idle() (in module scrapy.signals)
SPIDER_MIDDLEWARES
setting
SPIDER_MIDDLEWARES_BASE
setting
SPIDER_MODULES
setting
spider_opened
signal
spider_opened() (in module scrapy.signals)
spider_stats (scrapy.statscol.MemoryStatsCollector attribute)
SpiderMiddleware (class in scrapy.contrib.spidermiddleware)
spiders (scrapy.crawler.Crawler attribute)
start() (in module scrapy.log)
(scrapy.crawler.Crawler method)
start_exporting() (scrapy.contrib.exporter.BaseItemExporter method)
start_requests() (scrapy.spider.Spider method)
start_urls (scrapy.spider.Spider attribute)
startproject
command
stats (scrapy.crawler.Crawler attribute)
STATS_CLASS
setting
STATS_DUMP
setting
StatsCollector (class in scrapy.statscol)
STATSMAILER_RCPTS
setting
StatsResource (class in scrapy.contrib.webservice.stats)
status (scrapy.http.Response attribute)

T

TakeFirst (class in scrapy.contrib.loader.processor)
TELNETCONSOLE_ENABLED
setting
TELNETCONSOLE_HOST
setting
TELNETCONSOLE_PORT
setting, [1]
TEMPLATES_DIR
setting
TextResponse (class in scrapy.http)

U

update_telnet_vars
signal
update_telnet_vars() (in module scrapy.telnet)
url (scrapy.http.Request attribute)
(scrapy.http.Response attribute)
UrlContract (class in scrapy.contracts.default)
URLLENGTH_LIMIT
setting
UrlLengthMiddleware (class in scrapy.contrib.spidermiddleware.urllength)
USER_AGENT
setting
UserAgentMiddleware (class in scrapy.contrib.downloadermiddleware.useragent)

V

version
command
view
command

W

WARNING (in module scrapy.log)
WEBSERVICE_ENABLED
setting
WEBSERVICE_HOST
setting
WEBSERVICE_LOGFILE
setting
WEBSERVICE_PORT
setting
ws_name (scrapy.contrib.webservice.enginestatus.scrapy.webservice.JsonResource attribute)

X

XMLFeedSpider (class in scrapy.contrib.spiders)
XmlItemExporter (class in scrapy.contrib.exporter)
XmlResponse (class in scrapy.http)
xpath() (scrapy.selector.Selector method)
(scrapy.selector.SelectorList method)
Frequently Asked Questions — Scrapy 0.22.0 documentation

Frequently Asked Questions

How does Scrapy compare to BeautifulSoup or lxml?

BeautifulSoup and lxml are libraries for parsing HTML and XML. Scrapy is an application framework for writing web spiders that crawl web sites and extract data from them.

Scrapy provides a built-in mechanism for extracting data (called selectors) but you can easily use BeautifulSoup (or lxml) instead, if you feel more comfortable working with them. After all, they’re just parsing libraries which can be imported and used from any Python code.

In other words, comparing BeautifulSoup (or lxml) to Scrapy is like comparing jinja2 to Django.
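
For illustration, a sketch of a spider that parses responses with BeautifulSoup instead of Scrapy selectors; the spider name, URL and item are hypothetical, and the bs4 package is assumed to be installed:

from bs4 import BeautifulSoup
from scrapy.spider import Spider
from scrapy.item import Item, Field


class PageItem(Item):
    title = Field()


class BSoupSpider(Spider):
    name = 'bsoup_example'  # hypothetical spider name
    start_urls = ['http://www.example.com/']

    def parse(self, response):
        # hand the raw body to BeautifulSoup instead of Scrapy selectors
        soup = BeautifulSoup(response.body)
        item = PageItem()
        item['title'] = soup.title.string if soup.title else None
        yield item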

What Python versions does Scrapy support?

Scrapy is supported under Python 2.7 only. Python 2.6 support was dropped starting at Scrapy 0.20.

Does Scrapy work with Python 3?

No, but there are plans to support Python 3.3+. At the moment, Scrapy works with Python 2.7.

Did Scrapy “steal” X from Django?

Probably, but we don’t like that word. We think Django is a great open source project and an example to follow, so we’ve used it as an inspiration for Scrapy.

We believe that, if something is already done well, there’s no need to reinvent it. This concept, besides being one of the foundations for open source and free software, not only applies to software but also to documentation, procedures, policies, etc. So, instead of going through each problem ourselves, we choose to copy ideas from those projects that have already solved them properly, and focus on the real problems we need to solve.

We’d be proud if Scrapy serves as an inspiration for other projects. Feel free to steal from us!

Does Scrapy work with HTTP proxies?

Yes. Support for HTTP proxies is provided (since Scrapy 0.8) through the HTTP Proxy downloader middleware. See HttpProxyMiddleware.
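
As a minimal illustration (the proxy URL below is hypothetical), a single request can be routed through a proxy via the proxy key of its meta dict; the middleware also picks up the standard http_proxy environment variable:

from scrapy.http import Request

# route this one request through a proxy; other requests are unaffected
request = Request('http://www.example.com/',
                  meta={'proxy': 'http://proxy.example.com:8080'})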

How can I scrape an item with attributes in different pages?

See Passing additional data to callback functions.

Scrapy crashes with: ImportError: No module named win32api

You need to install pywin32 because of this Twisted bug.

How can I simulate a user login in my spider?

See Using FormRequest.from_response() to simulate a user login.
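
As a rough sketch (the form field names, URLs and failure marker are hypothetical), a simulated login typically looks like this:

from scrapy.spider import Spider
from scrapy.http import FormRequest


class LoginSpider(Spider):
    name = 'login_example'  # hypothetical
    start_urls = ['http://www.example.com/users/login']

    def parse(self, response):
        # fill in and submit the login form found in the page
        return FormRequest.from_response(
            response,
            formdata={'username': 'john', 'password': 'secret'},
            callback=self.after_login)

    def after_login(self, response):
        if 'authentication failed' in response.body:
            self.log('Login failed')
            return
        # continue crawling with the authenticated session here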

Does Scrapy crawl in breadth-first or depth-first order?

By default, Scrapy uses a LIFO queue for storing pending requests, which basically means that it crawls in DFO (depth-first) order. This order is more convenient in most cases. If you do want to crawl in true BFO (breadth-first) order, you can do it by setting the following settings:

DEPTH_PRIORITY = 1
SCHEDULER_DISK_QUEUE = 'scrapy.squeue.PickleFifoDiskQueue'
SCHEDULER_MEMORY_QUEUE = 'scrapy.squeue.FifoMemoryQueue'

My Scrapy crawler has memory leaks. What can I do?

See Debugging memory leaks.

Also, Python has a built-in memory leak issue, which is described in Leaks without leaks.

How can I make Scrapy consume less memory?

See previous question.

Can I use Basic HTTP Authentication in my spiders?

Yes, see HttpAuthMiddleware.
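
For illustration, the middleware reads the credentials from spider attributes; a minimal sketch (the spider name, URL and credentials are hypothetical):

from scrapy.spider import Spider


class IntranetSpider(Spider):
    name = 'intranet_example'  # hypothetical
    # HttpAuthMiddleware uses these to build the Authorization header
    http_user = 'someuser'
    http_pass = 'somepass'
    start_urls = ['http://intranet.example.com/']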

Why does Scrapy download pages in English instead of my native language?

Try changing the default Accept-Language request header by overriding the DEFAULT_REQUEST_HEADERS setting.
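
For example, to ask for Spanish content you could override the setting in your project’s settings.py; a sketch (pick the language code you need):

DEFAULT_REQUEST_HEADERS = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    'Accept-Language': 'es',  # request Spanish content instead of the default 'en'
}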

Where can I find some example Scrapy projects?

See Examples.

Can I run a spider without creating a project?

Yes. You can use the runspider command. For example, if you have a spider written in a file named my_spider.py, you can run it with:

scrapy runspider my_spider.py

See runspider command for more info.
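
For reference, my_spider.py could be as small as this self-contained sketch (the spider name and URL are illustrative):

from scrapy.spider import Spider


class MySpider(Spider):
    name = 'myspider'
    start_urls = ['http://www.example.com/']

    def parse(self, response):
        # replace this with real extraction logic
        self.log('Visited %s' % response.url)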

I get “Filtered offsite request” messages. How can I fix them?

Those messages (logged with DEBUG level) don’t necessarily mean there is a problem, so you may not need to fix them.

Those messages are logged by the Offsite Spider Middleware, which is a spider middleware (enabled by default) whose purpose is to filter out requests to domains outside the ones covered by the spider.

For more info see: OffsiteMiddleware.

Can I use JSON for large exports?

It’ll depend on how large your output is. See this warning in the JsonItemExporter documentation.

Can I return (Twisted) deferreds from signal handlers?

Some signals support returning deferreds from their handlers, others don’t. See the Built-in signals reference to know which ones.

What does the response status code 999 mean?

999 is a custom response status code used by Yahoo sites to throttle requests. Try slowing down the crawling speed by using a download delay of 2 (or higher) in your spider:

class MySpider(CrawlSpider):
    name = 'myspider'
    download_delay = 2
    # [ ... rest of the spider code ... ]

Or by setting a global download delay in your project with the DOWNLOAD_DELAY setting.

Can I call pdb.set_trace() from my spiders to debug them?

Yes, but you can also use the Scrapy shell, which allows you to quickly analyze (and even modify) the response being processed by your spider; this is, quite often, more useful than plain old pdb.set_trace().

For more info see Invoking the shell from spiders to inspect responses.
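
A minimal sketch of that technique (the marker string, spider name and URL are hypothetical):

from scrapy.spider import Spider
from scrapy.shell import inspect_response


class DebuggedSpider(Spider):
    name = 'debugged_example'  # hypothetical
    start_urls = ['http://www.example.com/']

    def parse(self, response):
        if 'expected text' not in response.body:
            # drop into an interactive shell with this response loaded;
            # the crawl resumes when the shell exits
            inspect_response(response)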

Simplest way to dump all my scraped items into a JSON/CSV/XML file?

To dump into a JSON file:

scrapy crawl myspider -o items.json -t json

To dump into a CSV file:

scrapy crawl myspider -o items.csv -t csv

To dump into an XML file:

scrapy crawl myspider -o items.xml -t xml

For more information see Feed exports.

What’s this huge cryptic __VIEWSTATE parameter used in some forms?

The __VIEWSTATE parameter is used in sites built with ASP.NET/VB.NET. For more info on how it works see this page. Also, here’s an example spider which scrapes one of these sites.

What’s the best way to parse big XML/CSV data feeds?

Parsing big feeds with XPath selectors can be problematic, since they need to build the DOM of the entire feed, which can be quite slow and consume a lot of memory.

To avoid parsing the entire feed at once in memory, you can use the functions xmliter and csviter from the scrapy.utils.iterators module. In fact, this is what the feed spiders (see Spiders) use under the hood.
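
A minimal sketch of xmliter in a spider callback (the feed URL, node name and field are hypothetical):

from scrapy.spider import Spider
from scrapy.item import Item, Field
from scrapy.utils.iterators import xmliter


class ProductItem(Item):
    name = Field()


class BigFeedSpider(Spider):
    name = 'bigfeed_example'  # hypothetical
    start_urls = ['http://www.example.com/feed.xml']

    def parse(self, response):
        # xmliter yields one Selector per <product> node, so the whole
        # feed never has to be held in memory as a single DOM
        for node in xmliter(response, 'product'):
            item = ProductItem()
            item['name'] = node.xpath('name/text()').extract()
            yield item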

Does Scrapy manage cookies automatically?

Yes, Scrapy receives and keeps track of cookies sent by servers, and sends them back on subsequent requests, like any regular web browser does.

For more info see Requests and Responses and CookiesMiddleware.

How can I see the cookies being sent and received from Scrapy?

Enable the COOKIES_DEBUG setting.

How can I instruct a spider to stop itself?

Raise the CloseSpider exception from a callback. For more info see: CloseSpider.
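
A minimal sketch (the marker string, spider name and close reason are hypothetical):

from scrapy.spider import Spider
from scrapy.exceptions import CloseSpider


class StoppableSpider(Spider):
    name = 'stoppable_example'  # hypothetical
    start_urls = ['http://www.example.com/']

    def parse(self, response):
        if 'no longer available' in response.body:
            # stop the whole crawl gracefully; the reason is recorded in the stats
            raise CloseSpider('listing_removed')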

How can I prevent my Scrapy bot from getting banned?

See Avoiding getting banned.

Should I use spider arguments or settings to configure my spider?

Both spider arguments and settings can be used to configure your spider. There is no strict rule mandating one or the other, but settings are better suited for parameters that, once set, rarely change, while spider arguments are meant to change more often, even on each spider run, and are sometimes required for the spider to run at all (for example, to set the start url of a spider).

To illustrate, suppose you have a spider that needs to log into a site to scrape data, and you only want to scrape data from a certain section of the site (which varies each time). In that case, the credentials to log in would be settings, while the url of the section to scrape would be a spider argument.
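
A sketch of the spider-argument half of that split (the argument name and spider name are hypothetical):

from scrapy.spider import Spider


class SectionSpider(Spider):
    name = 'section_example'  # hypothetical

    def __init__(self, section_url=None, *args, **kwargs):
        super(SectionSpider, self).__init__(*args, **kwargs)
        # the per-run part arrives as a spider argument:
        #   scrapy crawl section_example -a section_url=http://www.example.com/reviews
        self.start_urls = [section_url] if section_url else []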

I’m scraping an XML document and my XPath selector doesn’t return any items

You may need to remove namespaces. See Removing namespaces.

I’m getting an error: “cannot import name crawler”

This is caused by Scrapy changes due to the removal of singletons. The error is most likely raised by a module (extension, middleware, pipeline or spider) in your Scrapy project that imports crawler from scrapy.project. For example:

from scrapy.project import crawler

class SomeExtension(object):
    def __init__(self):
        self.crawler = crawler
        # ...

This way of accessing the crawler object is deprecated; the code should be ported to use the from_crawler class method, for example:

class SomeExtension(object):

    @classmethod
    def from_crawler(cls, crawler):
        o = cls()
        o.crawler = crawler
        return o

The Scrapy command-line tool has some backwards compatibility in place to support the old import mechanism (with a deprecation warning), but this mechanism may not work if you use Scrapy differently (for example, as a library).

Versioning and API Stability — Scrapy 0.22.0 documentation

Versioning and API Stability

Versioning

Scrapy uses odd-numbered versions for development releases.

There are 3 numbers in a Scrapy version: A.B.C

  • A is the major version. This will rarely change and will signify very large changes. So far, only zero has been used for A, as Scrapy hasn’t yet reached 1.0.
  • B is the release number. This will include many changes, including features and things that possibly break backwards compatibility. Even-numbered Bs are stable branches, and odd-numbered Bs are development ones.
  • C is the bugfix release number.

For example:

  • 0.14.1 is the first bugfix release of the 0.14 series (safe to use in production)

API Stability

API stability is one of Scrapy’s major goals for the 1.0 release, which doesn’t have a due date scheduled yet.

Methods or functions that start with a single underscore (_) are private and should never be relied upon as stable. Besides those, the plan is to stabilize and document the entire API as we approach the 1.0 release.

Also, keep in mind that stable doesn’t mean complete: stable APIs could grow new methods or functionality but the existing methods should keep working the same way.

Scrapy 0.22 documentation — Scrapy 0.22.0 documentation

Scrapy 0.22 documentation

This documentation contains everything you need to know about Scrapy.

Getting help

Having trouble? We’d like to help!

First steps

Scrapy at a glance
Understand what Scrapy is and how it can help you.
Installation guide
Get Scrapy installed on your computer.
Scrapy Tutorial
Write your first Scrapy project.
Examples
Learn more by playing with a pre-made Scrapy project.

Basic concepts

Command line tool
Learn about the command-line tool used to manage your Scrapy project.
Items
Define the data you want to scrape.
Spiders
Write the rules to crawl your websites.
Selectors
Extract the data from web pages using XPath.
Scrapy shell
Test your extraction code in an interactive environment.
Item Loaders
Populate your items with the extracted data.
Item Pipeline
Post-process and store your scraped data.
Feed exports
Output your scraped data using different formats and storages.
Link Extractors
Convenient classes to extract links to follow from pages.

Built-in services

Logging
Understand the simple logging facility provided by Scrapy.
Stats Collection
Collect statistics about your scraping crawler.
Sending e-mail
Send email notifications when certain events occur.
Telnet Console
Inspect a running crawler using a built-in Python console.
Web Service
Monitor and control a crawler using a web service.

Solving specific problems

Frequently Asked Questions
Get answers to most frequently asked questions.
Debugging Spiders
Learn how to debug common problems of your Scrapy spider.
Spiders Contracts
Learn how to use contracts for testing your spiders.
Common Practices
Get familiar with some Scrapy common practices.
Broad Crawls
Tune Scrapy for crawling a lot of domains in parallel.
Using Firefox for scraping
Learn how to scrape with Firefox and some useful add-ons.
Using Firebug for scraping
Learn how to scrape efficiently using Firebug.
Debugging memory leaks
Learn how to find and get rid of memory leaks in your crawler.
Downloading Item Images
Download static images associated with your scraped items.
Ubuntu packages
Install the latest Scrapy packages easily on Ubuntu.
Scrapyd
Deploy your Scrapy project in production.
AutoThrottle extension
Adjust crawl rate dynamically based on load.
Benchmarking
Check how Scrapy performs on your hardware.
Jobs: pausing and resuming crawls
Learn how to pause and resume crawls for large spiders.
DjangoItem
Write scraped items using Django models.

Extending Scrapy

Architecture overview
Understand the Scrapy architecture.
Downloader Middleware
Customize how pages get requested and downloaded.
Spider Middleware
Customize the input and output of your spiders.
Extensions
Extend Scrapy with your custom functionality.
Core API
Use it from extensions and middlewares to extend Scrapy functionality.

Reference

Command line tool
Learn about the command-line tool and see all available commands.
Requests and Responses
Understand the classes used to represent HTTP requests and responses.
Settings
Learn how to configure Scrapy and see all available settings.
Signals
See all available signals and how to work with them.
Exceptions
See all available exceptions and their meaning.
Item Exporters
Quickly export your scraped items to a file (XML, CSV, etc).

All the rest

Release notes
See what has changed in recent Scrapy versions.
Contributing to Scrapy
Learn how to contribute to the Scrapy project.
Versioning and API Stability
Understand Scrapy versioning and API stability.
Experimental features
Learn about bleeding-edge features.
Experimental features — Scrapy 0.22.0 documentation

Experimental features

This section documents experimental Scrapy features that may become stable in future releases, but whose API is not yet stable. Use them with caution, and subscribe to the mailing lists to get notified of any changes.

Since it’s not revised as frequently, this section may contain documentation which is outdated, incomplete or overlapping with the stable documentation (until it’s properly merged). Use at your own risk.

Warning

This documentation is a work in progress. Use at your own risk.

Add commands using external libraries

You can also add Scrapy commands from an external library by adding a scrapy.commands section to entry_points in the library’s setup.py.

The following example adds a my_command command:

from setuptools import setup, find_packages

setup(
    name='scrapy-mymodule',
    entry_points={
        'scrapy.commands': [
            'my_command=my_scrapy_module.commands:MyCommand',
        ],
    },
)
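
And a minimal sketch of the command class that entry point refers to, assuming the ScrapyCommand base class from scrapy.command (the module path, description and output are illustrative):

# my_scrapy_module/commands.py
from scrapy.command import ScrapyCommand


class MyCommand(ScrapyCommand):

    def short_desc(self):
        # shown in the "scrapy -h" command listing
        return 'Print a greeting (example command)'

    def run(self, args, opts):
        print 'hello from my_command'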
G HBJ)k{.ſ_7c?=@aב t&dm^m5͙j$m[ 8e6n1lB@k՝fsfsHǤowP!(O_dQvdΰ?Px˂?g;6ضrJLB[-2\9}s$x溊V3.A3q^2|Špؓ:}dJڪQ\ޒ!C=D [f`qIy}8vR6m^YOY?>ӥ;G-eM6S2]+!U.d(ԈկiĹ(U.B={׎I_`]Gۆ.t /,-{􅲏1U &SO8c7V.4`ʉSn9ϗcamINqF}˩FxBOj9ӷ7)㹐hn[ PSsNEʼn٩spZ_SE9g G]_DsSSOxj֮>x=!ė _FQ/vz{MJj|o!bw?wyW]=L:I<)9ܯ0OzV8I`|qMߥfS24,v޹~isǁJZu-%ED>-Q[1UHJuOʸًmMSHB-Rw׿yֲfΞÉiƺ7][)rqkK:@{mf_+5kj7G,u۷rcP9h4ٳǜW芗?lP_mA#xo)%?}m ggO?f-3rȚOW?Ѧ mQ>G[`Ȫ/?[76xgQA!@fyu"JK+wʄּɸ ' vUwjEꀫ{jkZ֗Z)ۻm۶\6xYQ̡ǟu)c׼|?> ] w?^u9yLX K٧_w1X/b[T/oԩ̝1$gL߾x#zFzߺQn@se,N7 Ι|04Yaϕ2!$mV4ޔi?i;˶THpeؼo.y7Y7 i5=շWL1}tyT3'vFQ XիW޳ውBA_ a?9fV{M*7d/w1vGƦ.~zպ'5Kk-V Yt%cվ6s+?rlfzCTl0ۗ eeYiUuksR/W˞I 7tpeo4+ Z{]WŴ#^6 7SAG;´bÈX4Srh[xj7痦)4ohxizÉK]4,F,7k$7.!@w:f4氥 ( ,,:ã$4fzsjG +{x%_3{?z)6Nm[=6kl>on~ =!?oNC[vg,z/N?yV峷~wgYe/E _ϽX:zcrMk^~qʋB{q>sD^7l)[[Z*s~# ~z-}&3gL4}ֿp/c/~ܘլzﱟ]S~C.d7v5ӎ,@m(ֺ& ^2@$ Mmg ZA# /dvAĻ_45-m aj&Z0?Ts#,ҹԔlHwf7{ʭƭeި_1d,߬n 71mݹcz¢#-tDiQn0lafq7kJ0mmIE1}yCSBcK^ߺemJ3gJ}$cԺD ܰ=i[Zߺ.#C$Ioj\ > LӢ-5h+I mikY e*#=/s-m\{|iiSlrvAʅ6I+7umx=x4fRJH戃=ϥii1Ek--D"/X;v`dBkk^%s5<](;{cR }HiȾ!Rxr;ϟWu8sًjuK^&/{iiFs8ʻ:<]F zWMչY,ER yk|?I??h鿟x,/?; `~q6IWo.>?I8k /nդe7 VBdy)C]wBۚꫀ\&b(IÄ́a):'ȣ%+7כHX S O.W? +9oֈHcU"]GM)$-$ )AY$ @ƈ4uѶbz@JJ6F*Mn AVƿ_{‚Ss U66   ]p5K־w75uʴ1O9譓ҽTpCZZ6nHK=h=N&lQQF,/, ܒypDnHC f[UmwƟq}Cֿl۲HCg)qyynƉ r%!OV+qw4o C=ѣ7_ =#t)ùഔRRڟ@TLn4ͦd5-&|33 \E.^FBO1?Db{EHsgsGלּzVi@+PI!VXJ.1Zm<Ͱ[l˙68RK0|옺|)vͪUz<`kz5m')~ #U+Wċ7l#~4>qf zOzbfK٪JS ~ lƋgMG PTGӏҶZYL؆e,ɥ'_WV}K:긩 [RکK`#fCu8n172bnG*+HMO`E-P?*ے=ܣSPs׿q{[zي7]vř[bвGs[0iH4Ze$&6\2!Ѷ~oЌ~5y`x` "mѲf*L2ιOּ)Z}cU_sV (J O,1ťۖV5Z}qUЏgQAF hǎh8)PhT@8oXN!qNf2[i`oh~EeU%:BYuBt%vT$X!W||oiIBJ"#)ޜ4_XCr≿ꪊDb )yDz::/0vMYaO?0;c4vWdO~L%~vulO-}QdQ xIrXhUK9Aܕ%);@"yOfhlzJgtyg=- ma C@>M*Zѝ (45kV,d):3<$%snji leHjigO> Q}{1])qqnG%&4nGlq0mk\ޡCSͤ ʷ۸{¢m/4K]f8ܴdyB"60ç{d>N|f-) oP$j"UQ,<\yk˷U'NlIseqU_Q&NL\}I+7}ކm%qRfYd,P$``fmO++ YgM8qY>RnAn/ә&u"H3W}dWe'kNՠ~"A;- \%1epC\c.ߴv_,xϷhzlu#‰X5̕ȐKEMֵRC3yJquj?ME#Wmb3X믯x-[Nm],4i/)myu7ֿ# jGMڶ/З{c=$07-\cOw(fI],0r#P8vMx 7ҋ/95,>(/ztLFV9#2xgLjxkռ-&Wı7W:CҦ;ּ[t#mq̤˧(gPAY7k_G%R!ԁCn$dBz0%PЁq=6e#yc&?\0Λ*C&w,tUN-)$2۝8cNp@Š6E> >l_-[֪qΝ/_I )`Y՛7?rEqqzzW@DNYpYcg~ѐ,[G9}K//V B͆L?2Y- D*2P49 J k?rӉ7q Ӟ?Faj)?"L* m⻓5!΢`búm'=FƄOy~_~u_o%SIOc.Y^޴h޲MG]ߟ"@n-OXcʌ!`o~Xl=YDїvŧ jW3)9̝1izՒqWq'0ģKƌI_AKpC4נ!%,S ޢoM2=`?Kx$KMyt73}.89I1%䒨$hAn;%t=,ky>Oh\:)N6)(ty]FZ048Mhggg|7-cB·)Sv݈mӝ1v9Fr@xᅕ/С?߶mDsEEpjvT㠃ʥ)K#vJn;%%Krg|EN"k?)<ƍj1#R N "cdJL/6'æџ71ߟ/A=B^XI8GsgS % .?!>z“;~LDtt *s \^y?uθ;O[ȯq[ڏ@={Ƙ;o ?ݕ;j˿?sl1-iZb5Ps&:,Igifd""[XХpRrF$ك{;鿼M84j ]_]wuy$ IH@@z(4$:}uRC-|団Z~;׃ҹBSߘ={J/=YȒV4 "INӲ65 : Vu e9]bS:-sYYbL&n{QϿ T -ui^[9ýSrH6E6q끸1Y2BsF$ :7 MS·~o& ^@g1H!Af!U۲1;RȌw5K"Ƙ֞)))_QR90N:akICxsFoUQzUK6d8A;ܺJR ͷ7N2︣6sƬOwz8a"@DD8};p"nrnHig^ܩ(G#MWz'8*Fdtt6tC={شʲt]gA$bU$ǖ}5 xwIM*$ a+J;ݙDV[:2EZ?w>6z\xLMa`ҎEMu+-՗ّ +} \vӎҩ=|vvaa2q[[ﭭ}(uazzvΏp90cj' k/ &yiih23OOIxeI˺$+}4{֬=WT ";J׹oTQz4(ɣ["a pa3o޼ c"ؖ5wAZX왖r+4QD0r3%%2f4DXlU,VkY-BT$~O ^8pr@˲I|0{֬^[pp"F]q\Dm*hPfqK$6NQgz$Dad\!ڤ\N8*7v!R4o۶rZd̖ҩy픔q^o2²QR 9VD[!R=ATРt@Df3<'u뺆@RHѴ3KOw.Ibe^RěG^ \J Ltn&BOK0~jK4M;H4@eKazH4ݜM sp*hPsfDR)_잪*$+.h<^v0,۷Zz 73٤&M1I=4[TРt'DԹ@(uwpbI)S_۟x>M+\Qti2N!"AdI.""LALȵ ٸP7x{Je8xfIm:ǨQO{!3.1ƈiU_|}ǟz51##p݆!5 iu1"fYX},~}k&,.-_|+:(V"64O|KE[TРt''"!s@:(t 7pŋSOVT"mHdB>GQV76Xli$h4mءCq唃!n!$ih^OoRAҭ!B`Id9H)N:uT([z矯-/oŞݺ5nmL,_[[hk0/QпvjԋgN^4`Y۶cw 3K JrZDL JJ LL?ܻ(JV i9'׷Tz+)3T_>J J7;S Ja #L?*} nENЀQS Ja;eщ:'9*hPS9%%Q%@@ 9j9>JJw"HtiP MqwA[8 ggP 4(ݩ :zT̰i*ÑƖI RAҭli:v;TQÁP4N"$!2T-WS_N tzqA JoAӂ2rJJwD@\kp!JIӎݚ&]T Y*hP )Dtnh\2(EgEHETРtD"q7gN;@G>J Jw2]ZG]$ )mRk+}JQSj@sJyMPe>K JwaD$\WלA5Jo\O݇{_QAҝbfyugʥ*#}!d5Gw"{QAҝV%I5T̰?N7: g:tUQI5LJw$ya9pAuPŤ8lIWEcTȬt'+_жgD8XKX@JRu]XL"82I5/v HJ;z9UA^{G#1@-_ .,,|cG# μz=gTp( u4g}W_ ԯfE>CCe] 
DP35r PA0pab֟$)D Vp^5n&G#ϡ(  :D,#Б7*Iɡ.E9dTРHD@z[>~u2K+JCZ >W(ʗ(>;+wAD;?$XmYr{WeZ{SA P5Ѓ8u)^iC-MI6dtjL#TР67A{ܠt/AicH);;PH'LT2p`a.ͣ[ cV"Q*>B J! * jý747_W^~= zԴn~Ѳ@dD$xpi>[$ѲoPz=4( lh/\R= chaln,LpCd-jC ƙ3Cum%IH}(I(G4(lh0 @c(FeNOO-?PY`xwsר~'li\s΍N й9$;kH4֠ *hPz!d{Р4z@xcf+F])uB@[!#;?u!"W*hPz'D`ɡ%vH+уD# E@#:TBgA+"VqAB@rݰm3D "0^ӯȈd7''4\=R(} >tqg+Iy  gpiRBGDCzǏ?{:RėKAٓn*!O `Uވ@Yd;dy9u]xqgqK˯ o5Oׄ  ӳn*!Ù~H H@ZAԆ=Ly2E5w˶3}wSQ)4(}H{u&Nr=pAaMK-@$?c\OT/"=c 2*Ls6F霓(T_^ԒJk)I GDxDq5RcTР(W J&TrZUmTР(WEz@Dd43E~ƸF (_8ӊG#"v"?ehȓ!!*hPEjCr" iL b;*>)JEQ* y-;( I*hPEz )I;/J橵I (54ONhCF ؄AQksMS1Q罢(B04O/ReEQY{/pR((D (5<( pd<$ud"g1 9ɖ=+WAQc`I*p. @qY" VZAQCD -("c IkC$wx 1% K8ωyџ[42zlEEiyf%*(.TР(IMZ8 ʮL~DN":@yۇkȥC&qfs;?i@ɍ@26$&_y뚗qִ"vYI8}0B?V|s깮b& [0ĵBɎڵg|q%qi =ѸHz@ O(4(Jw. 7N~t?L=3{wT}d]4P.a $AeNi^xfm" (@8~@k|.@ V5eIM8Qȟuu=>w^}LD(҅ $mbxDD"\z߾Es2{<8Ⲍ=7ecd[;k*- ˍ@l˅~|)߻-xod}AO.lynA ԰=Qn}C߹=Y!2( 4(=IM&&CI` lC7&ņ[Ywp!C"ۤd6Kk|@6@6X &m@?m/R,:5/6\NDž+*uүj#/ Ĉ#C(a8Z4PC7-?[ JX$ijKEQD Jxqn눗nk!:k=2FBE@FoFgIDg5.;fT@@ PC1m B@<"5D5q`"HD4qu$"Mv+堠`9  1(g*hPzCׯ暼p[cb,8c?lDD8?"犾[ҡRtHd'ɴHr@cv-P`F8;7ovyL60@e(ʮTРH#G9rQX#hhG_?\<ÅFӑx|^JxTe_A鑤RJ)d566 [8]PL뷈ιl@j&EQ`QAAg_+vS`7B͎ý(=   ^zKRC~KREQ ކB`(]YY4(RA "Xi{EQz04("◪B2դKEQMlásSA҇ 2 0 .UpN EQz 4(}`C IGJI}w*M¶(rTР!yy |pʁB,X#-0p(՜uH&˗-oh=N=6"I9fX0(r  ,\"%tEXtdate:create2011-05-18T19:37:29-03:00՛z%tEXtdate:modify2011-05-18T19:37:29-03:00¾IENDB`PKT`1D7JF scrapy-0.22/_images/firebug2.pngPNG  IHDR{u~psRGB pHYs  tIME  );f IDATxyX7\ rVTūިX.ZVP*V[[ **x+hPFQ IIȹ?C,gQó;3;̻f Z}3FT`-BF}f m`U[v| AG/c34C!c~k3ۈLxvSW'2_\HiQfmʤy桑ǚ䨍TTAD:?mڴ#F` ^I+zv~1xz<8V.FњF+b,o;?g2ݎɈDC< |wpQ޽{0\{qs 㟔(>W0%O"nōy)r>vKܥ1F\ٿYoH$'7 &ÇY555d&,=;5?.n3?U)$q\J$9 PYeRb]T(p -̤5BywhD"Uۂ$suQɹ,؃U4U֘GۂWE`դBl˰04tPY6œz<,C;8 L;8d0. õѭO\GhDd0&:?Q]x3L%?嫁iXjf7WNj(T>pֽ= ۊ bFLY(cMrXVټ6MLL4ݚlٲe))) KDQTJ|̌[V??ݕ_l iv$ y¬h}(`Q&]ҿVee W/RVOD*bL xRO2OR͕s'/*K>}7:uJ(#dmPS%cb7qg 1Fح2d8vt&#p_&DM䉻;v {z' *H $Mcef[i {*;j 4f):|LY5[@2E"nx.UQ(VTؿZ 2tnVK a[l^*)؍5FFF*w)k׮ٳg#ᑐ3 9K|'5e7 **(Xhy+$?\t;<|{Wꀏ mf!mEG2ziH%GtoOD-e$sIj3*jkk7oތ H||K,Yn)LYh~|gzN{A?6:A[u~,<,Ct&'enI_{_|cSUkbw A3XfV:昫FZJeli߭^lrmo HU%hؘ$FFFcƌ{탂NfiSGw"#ɻy'^zpl`џ7|]vjnw'vx&'pcbmB%cU~-ڮL-lNNN e:F8qbYYHH nhhlٲ*_@)--uvv_Dfd o]4$lNLNגt<;HJ_\nV .*\T%5w%$W8))8$޾/$I:7QAM/2mN<ɓj |wpiFJ/ W8[as+;c ͭuI:# \@!(G6WtV;8XCORӋvS39ɫ &TB2-MxzzfgggϞeX >))K?}teeB4h~y.+Jo߾=uT߬૝׬a#+D/Q56MYeFdj0T=eR X:blͳuLBrkdgM0V+p]Xxq:O$PH97|d6r"z#DHt航1X%Oժ0 VP( ɸx%yZDz=dls9dQ! v2mN<ɷrݼ zzz0\{&|O|xC+qҦlqU[|}<st+r5ܐd\f_ォXm2{dLwգ0q޵.L-l JHHަSUUe``0`Kbx̙*cccooӧc...D˗/il6N_x1j'B5}Gor=9fۢ/6Yi%(~F,XrN'j\['"f_+w;[M]rf{ι.C 'U۲ )U_/olf}n)'ǝe eJŨ_;%S_ڑ[}"7T]UK8l/i8duĈ'""B*zxx,\/fwþertW(;&`|=ӣ='#0,6G7q޽9U'Ҧ|ॿ[2:DŽR:еh 88"a̖xewkC;)p Ȥh@R/[gd4*'JdRe&!O7^y#$''u \\\`@TOq \P/_ڦFnQF3.InYhZ9Ŀop=} -lth]9D覫=.$==߿sOKKJ0€T{|ٞn;RBHwEn0mŭ:sκߠxl&Xw@plFG:] 5E/pLV:?rCEqVmv˭zضد4-O`13LTussþ P(xbAA1ct:]__ڴi#Faðvqq2d 2?ĀU.]>;2rS_'Cy'"*PmvO7A#LаsPyłXC\jrsMϟ߽{TWWcFFFX ADi>~c 2y q_}$vc^(Ȭe!{!)8 ICu(>V"&ycIU^?Uk`;>DҥdxzSdĉ; ?) M_]8³mu3"ZIT cƍ->m}7:uJ(ZǃEt{ZwW~.0꒲a2o:~V+Y-Ʒ=<<b1~Gmm͛tҒ%K`=_w* HOnfbj5unت$'v&4NEEŵkVZCPaB;>[+/w#܊[dGpz?+-+qj\.Ҳ҂ .\cc 7k,AZYV$adͭ+W+36~7l$$&vsMhrsMإKZ[[*-naϭBG[RW-[[gy% y<9W̭*t~ cccӫWVϧ(훌=mB/?lqzhRҟ;)$ɉ`ໆ˖-ۿxx;nooVMKKKa='q%$pwöJҦ5&4 gΜYl?jPjCeFYz4;l'W呈Q]`FLm#2gWij:TVV!!!CʏMzY+\ @$<Z1ٺjD!G5Y&gvv6{y.+Jo߾=uT,'!!A( xOOV%!S LObn="(Wa[%Qn Mn鴐_;::h琱FqXӃ,:,IWgrpW+嗏7Cu0rh PPPJ +Dw_U_Uer z }ny{C.C. GB\=\[RaРA 555؛\\\:$\]]/_ncň#8NDDT*Xp!^s"ͦ/@l=Q% ^G9L']mDv&N ݻoh߾}T*PfBCcG1r925Т݁ϒuPǝ|ɘG`˽k9Z=L8QDB# RRR F~*::>~ﰛLAĉ'c? 
߁cs+Хh 88"N୏8 {]WsugϏP#TTTJJ #@ GH$ti?--M*@JGn(Z~Uǒ~Xwy4[We*k8l^m*߶~|Nҟ) ͉Fɼ e\$'`% _=̺ߠK4|}-'sǶUu~-hyیaP-Z:>n\yQ(xjPsssO!RQQHM6bA RSS ^~LMMaٗ/_Λ7AEoܸC 2jԨӧS(A]\\ ^ o=sVP$լluݫq(Ww/{TX$g1%We*lb(4:B"(7q6[9%7*r6(&܎%xQ&7Nͽ'kɮVy+}R&&)ZoJE?jW۵4*uFO<;wuuuqqqXѣGsssBacc;w:㏷n B0))ȷwpZZd@dJ9`f~epmmm|[ n޼AK.-Y A@dΜ9nBOT*ۦRb| (fWL̵[f^\-2nL̬tjY2ݶ62n#Z7KX˒zTX7KS[ռNG"FK_4>Gpz?+-댫Z*ef3q˶ڥ%,mu=} w1˵lZӛ5k aaaʅ###֬Y Hllw^ll_G4M%D*oG68=6K"Fu oԪWgt͑#l|N9]o\O<*wVqOmm.cˤ# Tɉ`h'a%%%'Oן>>x999xtUu7O7z \z 7 em=Kt|]=']]Q9|;{W%(|QeKl}U|Sl7GR؇z -ףiMc-9"˝J&Ql^Y*՝ߝO`_p} ϕ80sε4׺Y˿S^UaРA 555+9Gp"""R… b^^^111UUU Xt)x˗/"U[[?`K=at:}@#`g_W|n)::z lx.Af$vUbaLȩ&;OhA'6DD'''@ @ @ qqDDD'''gHHHWHEo_ @| J:;;ϙ3ֶXdddgIbbImkxw"N<~jll3g6lCDgٿԩS.\HR+**޽Kq=1Ġh'N}6P(n޼cX駟RT򵵵W^-..}bYMd/P &?4iѣ]gg?SZJ/^ͥRǏ~:V^*^r%''Aoo9sbuxC+yBB6112eȑ#1g Fnrrryyyhh+Wϟ\[t\.uֵkׂ"##U*Mܼya(ũ_mV[ollܾ}; Ν߼yGQ466͛ #44HY M={v|>x NC!!!!!!7o~ҥKǏϝ;̌JΜ93//O]WWFM>Pm̜3g9s+T"j=+++ hxٳgcR͝;7++ ʓ_WWO?'$DQܹsL&Ag BQ9N_~dJ$04T{aCCIˬnnnmb`0|>255%_en߾hٳg{yy@ཎ8P*((R&&&k׮Ufa:u* ÃJ͛7-cc㺺:+++Ajkk۷oߜcvE .4IellLl7M;99XEϟ?"Nm>N333WWLAF}…BQYYRX*]pOׯw53dȐk׮544444\vM<~~~)))> 2` ?ol0=[\E|i]=8nnnVVVnAAAjj*ׯ_@@) ϟ?{niiٳCBBBHZkkk!C @;5 th:_+^h|.:~D&EQB/EQF9ZgzNsv/m똤hu'pi3F.]u <{L3i7n\xx;O} &=1ĠP Je U l-zY3]jTrQ2ƞ?B/k!Gl; {˧%B6ʬ="K8u^H%軧c=zۻR):OK0Jf!'(ֹ9^[}lrzB&EcZN09{#w^v*kSK=*kᏗc 8Ӝq9+F͉{`w%SC[;~l=6UjҶuL.h ߍWt-#$b4d?jDnVXQ5^Ś k2&ZФ8IP>nQZ?\\\ofT*uXU<<o7eOK:mH )T45 YI̯gW-01S~(.#!)i ') _Ck/yVIm{MlVSC_&ٳǶnɯ~x!@ jm똜8׼Zk[+0Ǽ.>Ö%?-Tۭy~TQ- *%5X6A0ɸZTV&\#221===..n͚5xzqqqllҥK[5 Uz ߯q*#iYBH Ƕ5R wwm.գOXL&1Tj#H'~b8/S.Չ-WFe7"أۼ#(tY2aP=uQɹ,Mk f~6W+n%Ʃ6Ͼ5ɓ{+}t2OPIr,XHr!:>WP <&}T8Dj'l58I:N2'OT%H֯_߼X}2[qfdd: lTܯ7}Ag~b5 =_5ߊ%^R((_YGd0_28_Zm~yg8ɭqkdnlߘ\%J w_šuZv \inP*GzMGdԦdDjۯkc5ԪߦʻGwϞ=b E}}}RR~fZZZbb_HҀ999L@ګZ?6BSk9#*Aw&Moa++s?ZYW,ljդ"ks?jks9yWlxzW ANw}݈jɢ=_2~A;o`~:9+ٵCv;Xj_Vl*, &~>Oon{ [qB{ {.lP|Sp=_2;kzg}dҪp/f8k>#ӭۣcV-q uѻTH*>d6kUMw(3hР앜^^^111UUU Xt)V,>>A{۷J"J67%l:xb5hA/>k>)N,S!%%`,_댎vvvг"N!@?В'D ɹѻ+%̶l`6 3 ;0@ @ q}m/T77F;=bGEE@DGH$ti?--M*ӅoGFY/Z=^=UjJQe'vZ溹a}%􊊊D:?mڴ#F 7>|HPF5}t K~~!C8q BgZw"i\jm}Cgx|IffϞ|r@bG7oތ ȉ'?~:W8-ӱ ?ܲ6PwU.&ӊH kJtLʬ;9ۮLKI6ih}yd#hjuפ&BCCe2Y`U׮]{9@ bccO:( l[P]?D&x;^02֨enmUvpEWT{}ؙ*=lySEaP$P B4gX4iD3ВPZB:3|oHS&ZCؠ{s#ehK,5 !b hؘZZZo͚5A0,Jb X,RvdUnk#YˬLt<2nJN]؇ H]`bEZ[9PM Jpgbh-ǙV=.tMѓT~_[iGdkڊhwioo6ֶAr[[[<|z>]иYfZƸ~V%/кr[VW-;|[@zIAL(viM,B4M5]Ѫ j_YGd0_28_tMѡ_-:tϤy7 dkhfڌv陝}||BP(?裤$񒒒|||Crrrbd]x s^1_Ϻp{uʹ%6A\ VY6f_ժIER~kU ,kkU \ؠh PSS}Sm#8NDDT*Xp!VfԨQ?2 A6M/^ > =ܠ՗OC'%%`,_ڊvvv@ϧ8S$˟uO\ @8{'"nF L@ @ qqq@૲ݾRRR"=$==߿}IR "+xxa`50g'q o U ghzR3(,|FJ)\ IDAT>H\777ϟ?{niiٳD:?m4웙...C 3t]ƙ@d:EVUiZh }zfş={6p@|7--mĉw޲e~LL fF{:y@z{{?}NVFFAl"lW!G~W1&wAOK|9.1}s'=Q2>qH+؊YcGѲL{ET"0J&ڊoHW0ƙ3͉Xɐ4*W~ctyQ:k8tݬW8R n˨\eV1,#Y rn~vvߧTVP6ZG_6f_j&[|WSwHhfO;RA}PVVZÃJ͚5 ,oݺ5uԁ陛⇸∳'?cU=U3YɬT}7U`|]LHŔ?Gb(!3Cz%W܏vh]9puV~5KkEUec^YiqOm{(g.7'1pg:hORH{3oŚO7;wyJFZ(#Ok^W-{؀Y[+0ǶUpزDs=2ַVm_D2|Yx0K>}7:uJ(l9afft]ܠh0VX "lJ2W.=wPԲ XeDu~Jm؆}.frZ)^NfjfO띧ҐZ#(74KE:"4X*^:tc* %b 6I*?tmz{4ӣRgmJPŔ6?-!KmRXSwX* Fǖ}Nqrrb0gvvv||_|'kNEts6lg\ sX`ehqh˧%XK,)s֪OӱbJQhZb6GAʹ-뙼@Hy]ZSpO-rh}yW]3iyMA?QOC t?!K"@C9lY]@Tך:U<==ݴįZ9D'!!A( xOOO<+''Gy2Dd keF&6޺>~vӓ(+wعz>|[Gۣc~l儢AA=oy{^}yjen{hm6-j?Ls~D-:QxXnol39φWھԉ2h :^SSs8{?bA#FEDDܹS,ϛ7+ft\(sV_>} դ0˗hggg___0#t]u'|O\Wԑx}q*}V|DD'''w`0NDEE}GM#COAwDO\?--M*B?E zzӓmU)Oà=qrssܬ݂#Gnٲ%&&a...0E-,I}}oETxeR>ppA\du]q 9ZgzNsE;=Nٳgwݻ7nܸ;v۟99Zx뢒C6Yx(qꙮ#2{>x&=SeŔw5 /hjw(TBmgC<\ժqel,DԄ]9hQJF#XP}glD,D%Ğ,d@ϛ*4~8'6۽6Jp|&:r9p G~y){kk鵗^׊<$Ƕ&f^Ճ9ZWsү/~ʟ\m& #$8YMo}Ӹ}Gd&ڊ /vE5*ĭhkVVņow+MLLZcw)fffEvwg#;9HZtxuM%z1 EVYf8畾hʭL6rL4UB@}yjWe5bیBC.5 9z1?q̻p*ɔI/iP$[Ӭ.bV /h$%Qm[82Vz*6xxhhLz '88888x֭˟2Ly;XL{^{Pa6=ڼdwAj*nTMVWHR(Y&M 
2V[A%*wpnԾM^`>ce[թezaܨu,`8y~TQ- Xc^J#Scgv¦:Ne& cGZJ/ Qk6/ĭΈW[4u[!#6x{`lr--5k`\.Wj(@7 mP(KZa 9XnEة/Mmy+;ݓ8蒬_mmd*yO:]c4k13vcK.qCEDlaCFM Jpgbh-Ǚu>(<۵1r`G,3856-4Ys֡4 X«mS8j툎m0@wNNN C%DFM8199O)--uvvi.s䋜:T!GY#+FbY3Xmy5SUqW+ޙ7Qmf_Mto)Z,XP EdG@6}O TDe O@i e+ikI !L҅RO?ws9ng,)gZl)yƤVvIC'iuOݜ.T[6WDMY|!$i̥)y7,f N{ eiWj-&#vr?'[iU&uV^l^z!IFվR2薝jkݕJ/hhq{333oFVl6V?GEEIYYY<>mN7 2_y:Pop(J"]'ᔮ^;8cIP7yFlj.-Ҙr  [ 4]a#,,w>#yX{enx7i~ڱen9ENoejygzv!R7<'2QEi=o91:egzew Z9Zܞ]WVV {Yhƍz3|($޻ 𘒞T*g͚EgݑÆ sHh8 _*bG"xx''gN>̙3wnwڕ@8mL&ӯ:vؖnرϟ7`y6_uVAS^dKJ'{jd]q42իajj}P(\~=B͛R/7BH:t͛<oF"%ںu+ݽ{wZ估v V[>۵e<&!ͤ6U?nܸa3kkks f/w(;> .]ԧO\XPbbf9ݛB <;|#HϘxN45V5;kld]ٻlj̕3 LFe< h^+sDеH/~;Cr10Mab;-xq)yz/3.|!w4ia[V[¥ɺb/:TFEE9[~gΝۥKH$}b<͛/X,ɗ.]riTTTaa!+Zdl?stʧK*JMN|Y_3~;,jﻎ_Z,x`v?H=TިE]'rm=0֖ݺߗ錺@}.5)7%UqiuX UXuRut29111s{>cbbC {.--uu Gi#_S|dTJ.4"0'bO[ɼWw@[} 'Tr~rx}[pK<+.NXNٹo}oS7#vWsGx`_,弹>lF|!R-]z֥UIա8`v!C8ħ<|rr2өS~A~h4€ Eg޺yB1 )s٬^2)3dV*%,CwC@Ue<yEe'DiiOkOI(/6AijX[LfO:M j4?d T(n믿߿BM4+V H$.h4ټX(*Nh[`y}N?8A*3ڿCė|]VT4F?4_t?M9|g}g\`(!Iy 6?=J%uAYMkRqO04P(2dș3gC//3g/wi(((q=xW}wVmVL2XR׋}7-MU#I/TTXj*,[FMuOψ)]h*-OO>(mUl7Qڝ0tcw!45EV mX:%70Kkf+Op9 isM>߮];oFVl6V?/}l6ggg:ujȑ.kʊq=͸v}]Ym8M5!^p!4l|"[ɤ:۰7>tOBC9&"[,eW$:7)f, ڳ썡w*J̑Y˃wGnYXT:>/07m-RLyK_g쫜]v*++Or;?gϞ2Xܹs3fQQQ`h׮ݬYS"cI/2+a6\J7 o={,/M6pzzR5kV Ե{aÆ%y={Νew>!vҥ>}Зxu85Ќw>{sCXH;ÒpO~#dj~M9;kw֚JS - \,f$۬w^}FtJ^g%R{4*Z]^og;^_B#5fD 'j'%H>&1z iPPg\B@iŒF߭.8@ȿ+3&޺0**8|7:u$$ /O͛7_|EX,/]䲜B=ήHV^h9$}݄ė^ %y7 3;J]dr8?kn]tFE boX<0sYWxs=TިE]'rmYr+%~[lzH*HgRg%CjRnJLyꮑxx8Wv_;Wm8ճ]v:N&&;K.}׬Yn_CbCNviir|||jkka@`yX󠭱XR41F?aƘkV[Q!Ww20**vMcp(<yԪ2pU 8z: 0坆a Vz93~ӍPsυ_;CVSI#r)?i!^{ aY;+U[e&^yѢE9hʕVxΝiiix߿.˱X,-h4sRμuօbR4|YdRާgbBUjKX4-V߿[YjPX^lw8dȳZ埭(OPL=RӞ 'ɟGGGw*R7"ݕa!&}s^kNqNPo=2G(f[-.nKRF̹1cF@@ClvHHȴi֭[7n8ФIzdǻ( 4$ H \{X{ym9O~8K9 U$jpOR~B/[6|gRނOy8zuwVRiqfff=zW_ '=f7ӽfΜ/^㲜Hh8g?{̡jf*ӎ% }}7-MU#OȤX7/P985EV mX:%p9 e_~uy0O}*(N?v:A{[wb0 [zޤnʋk_/t)?i!*L d|J.񙙙;6g7q޽6r]v[Fc6OCZd IDAT:5rHrBYYY05Ќ{|hgוZ0نCx5!^p!4l|Nmtak;8cIО eo SQbZ~?ôEi=old;:ώ^|12Vv4˿prתPw~9\C/?i!*eaQP޴EkNt6[t~?yQкu 6l]tʤRibb1cԨ?`0kn֬YxCЖಹyeGD|JVu K,76;!W߬9QohECXlԓźb/uy6,BhH!s1ၱcȁDZ |Kw<aY_ڎzo|!6͂/M4jY‿fzW_gjE%Z-M:O/a<#=sL̘*CUY]ظD6b-0\{< =4_97ˍT>T|Ծ ϓ> dD[}*H-0g1]_ѥoP?gY^LNl?:@ŕ/t&݄D~x|:/fܷrXl[}.~}QxxFw޼c~|c΍ge8L{>a{ywٗT3N枴d Y\:`)v\rK#\ VKB"Sɝˍ-zsN1/^ V}ȤE99a~Xα Ng[#6Qzdg -/mlA_ w[GmkgJyOz-;m {޶Q۲ʲB7>ճ:O?F ;T ~k~_f7a hj3v 8W_c^#ؕKk:dO=wm||[D D Yɇ'w]LV / nQCqHNv5Cּ;.Kr~ӆқLjdbi]x0fߘweyΟtfd.[\ 2A/ƽ薎EBҺ[$|I#J}8_ Iۥu9 VZoo!4C>Ӌ*Ug#[R?ߦBo$"fKÞ=;>O\ ZXW,f&\&.!-ӁpYx4H[t3o%w4!T^_RשBPI%wd?EپFŧoaRXKҗܬI)yq*Dd:G.#snC =iB'q ޜkC*OMAz5iQ|"AK$=esDR*ۇpm|Ooۥm_)S~_>&-Ɩ?Lz)[rj&כ)=^,x"ğE.$+դIH6fm5K& ,+ $)$撻\L+]?lk_T@]?駩?}f>zw0Zxl=r]?:p ѯ#>y"0G0|Gʗ0=ei&݅ jj\ڡk- 7|[Fnyȭ/mrSʲK-ɥKTƎ}!? \|4L?+׾F}}=CTЯ&M?bЊeJC9,N;>2^~ \O.Kr~2Az뇭Wx)\s$0Ig&$Cc^gv\ށ7~"?½ir#zPU=fOxw6YM|j~q,, T\c9Gigaej6^ؘ1ɊY]+c׆a_Ld7MNwM($D-#m"Q[>CTЯ&MAowNl2>5jff.).]_:I%wd?EŰֵC~2 &Ȅg_8±c_'9_ PܑbsVҼuKFn#5~ڼ{=&hY ѯcμq0OjҪ$E 92^S_ m!*aQnøcmܟϺ(̆ҌqQ= mo5iUâ܆qNx[`Y0H` 0w/bٽo2]o,8 fuqytoߺCL*sBIyFm?=b!lR` ڇ~LH_g8& z=U3Cܩ86Q/ ٬X-A* o妵to_NmC=V[}9Nϭ.7;DV\Re%E?"FXVh2ld!rz`PoSvV J|xzv }nASjb%Ɯ,}N^uGjkVV*xA(f*x喦Raγ;\ZZ0)gAѴ168\q[[cxs<_/8l_TiK81Obrn]pYR㨩HzUA(:&7/r!|E&.BH%lXY\#WyRBH(b+fP/ !ϵZRsHԃ\]b! 
C5XF^ϲ16||Gpuu:̆b\U* σD>\ZZB^^^!\YY )887O]xpp0aZa"H.yV묋J:jr\\nXZ-~WaXMM^gR#d2 ?`0 pyܪЦ<΂f&HDrbłYټdH©z!f6ab-:[`8K%v1}2L-P7lxqK蘧 FEftB{pHBUWW8Y&I ؇BhxVFS#lh׳$NBMәL Eh4r^@2"8V(ZՊ;դjf9((biZ<?1(z  u8 uVPmU˱|Ղfn]9ݽn!cY-"ƝBP}}=OĝBϒ=Nluu>>>f{{{㛋lz餶l­t8z 發\FXbƴ/h9}GBs)bXx]*3ޏ>}絰X,;Xm$Ϗ>)j45+th2bA}4#VkYY[m6b!0BH(j4X\SSCY{]&D"PXL^SScDHUSS?lj{xkMM[:ښ"sˋ"=N<,u4@ (1^l?MH\^d0$#b|n"CrIԴr)Jj̆>!B痩Nohh}w k%5gIddZ+_&gBwMMMii)͒Ie $G&?!$WI?m;HӾрg_]w{aII ѩܼMkpS?nܼXܦ9^~i x@뢸(<ii 7lF0  0 l65N6|ZVWWiZŊ2, ##Z0fy{{GFFr8hTT:0TeU"-.,,\m2{AUummmqqQCl* Yp\f2C纊1 V(4 @;`8\!8 CSNDDaIFDDZ]f2BADDh,--5L"0*H$1.qnPZmryDDi !d2n߾_WWG,=Tai#-a$˳n逻Zlš%\*# 断\&.դ2K#sׯ_o߾=>B޽{ nهJ} a;6r:mDnxKC_*..&p$[6TH8 8u:]ll,UՅʸN2^ߩS'R#""BxG,**j׮}uh9ԩ3˥ ""bl6JJJb_WIIITf/=ДCjXz!I/K:vrrunnL&ر#~T*h ⠋Uu!T*KJKB) VTr(-- p85ړ'7at{nIZgݲZX,"2cr"ya8mD֤`4j1|duuu"d2T*SWW'ܵ0l&N=ds&B&nSͣ]":̨Ƿry}}h$lB D '44TR)ZKIHw <<2ffbH<:,,xjU]]EE|-^O!Rú%$B(22~! @^7.VVCL&pUlCCV[FR{&1,nO-IjnD3Qz`Lu^ 'FteF331|d:bcJSxp4r:`k1ޒKCٗw|#=1<%8]lpZ,L?xl6߸qt_Xl6O?KKKKKK9NXXBb<@ X,.y{&4 p$ДCjXt(ࠋU|B/م^AArT¥ԞLzrS.-$]yh&~K.s] ps]M&z14KcڗJ* _J룢ܵxp4r:`k1ޒKCPXXرcG.9UݲyGDIfJHH Zh4޾};$$xZK0ooocZ\x:tU*xGr& Wh4bl6\Y7FP4ݔT0ZIqR!Zi}B^&ɥ^>>>%%%uuuFR{zܓՎ AZWtKwv~nus@^ f8Lv0*5$]8@ aޥF rUMp0lFNtM=Zrtw_EEEIqlQ.pW///@%0?h4bf0x l62JDᩪ3 q"T*l6U*rH$J1LTuT*dZKwJ0HqR!ZaR5JX,}jupp0+}ړeOnv0lҺ햤/LfȭAu.HۋJ`y0iFd(դ0KL;LE&&U [ڇ^}z}95Ihɥ!Fp޽ D2nSͣ]"aǑPVVzuWΝ;fY(񑑑EEE&) gVPܺufy/}FP( 2 U m܇:22RTx *W;$8֭[o@`4 n9B2-¤꺺:)@"ܸqW}X,@ -BjO=ڑ°!Hr[R !!!P̻չTF m/*ĭ!ـuTfiiId"t/ѫϰ1\/o#'&-t;tI#=1<%5~ڼ{wUNxܹs'88X*:SJý{r/G! 6 A Sܺ`|@B9# 48='I-_keSG\rmiJ ɺzN 8sf; }\>q.捼{Pͣ9kBV 7 7}#IMMmKCicvp*׸q^2?PXa)o+=fBU <5.utW% FUrO 3/U6S8sꆹ z}*|j~k}:^?' l ro 4/pj"ez&YE+KfI%s~vF2{e{k&5ݻnxk^dzU؋A/-{xm镕Z9ܶyr4gb+MrW.VBs]etZw,)?r|!{_O\uowT|w)]7lgSsLD3aPyA[!խUʻAgʻ):dcx:Es2^~υOU`.iWǥ{ȣ"'''..9ܶyr4pKNڲ<!Ex{7<獧(CK>Hx{J#b3>^K>غhk${**N2yP"[ŢoyJo-{^3&S[e i#i/zߒ?DAp)m#[… Z6((hԩ&֭x8))ܹs&)11qĉ\i$55σ^}^W_ B]|w]v!f`Ög_<{sC}ꨗSܘ>Rڦ!"ܹs\p!e!ν{,Yڿ'iӦهݻh9ҵkW@o zRǵZ1 ۷oǎKߞ'O.^X ?xࢢ"g%''K$D2nܸ+Wl*i~~BDḸ\Е+Wx<^FFyPqI$T:~x_&LDƍ?^kX>; 2995iw\Vhj{rDOrDgZKvxJxs4UV@k+_ _2lу 23Ò}UjK@Jmid-~)qb} 6Sj3:cbb9SRRT*NaI t~~~:Z8eEZgc9X$YO{HòY1kxpG{wphgyJ“>H4g-w zc>^{#6f_O=eÒ9Xms@5>@n4a&R)233y<^||j%1"J HB&UU1 q瞉{d^^Uꥅ˧{rz!.kw,)^w0O8'p•{:vTX8k+B^# (G>Xs+MQ^F$"<Уǃg'>z}yyyTTB>ԩѣGٳ#G И$J$&&{iii.-@٫WÇqXwASqLHE~Ӟ׆\ݽt@͐!C6oL܎w8DEGG_?=zwލp8aP\\N޽;B{Zxӳ$&3F*ZjժU2l̘15jTPPƍWZ%ir:h 8?m;0C ŠAm'GSπ7(y?{ ~'3''g͓)xmpǎ{4h 8qM;' {)i+4@ ~S.Q,ÎQ/UIZY*1g:O-EFc[&9}o>{yaے w}f2i:}fؚࢢ"=ŋ3FV5RZ3x < Jr֬Y45v_~f/a'Oԩj̚5k%"{'Hɓ'Ho?s#G\hEEEuuu+WիC5I)&''ׯYF*N88۷/y$'K.1bJIMMui潔 Z[I$E۟9z& e&h.e{}LڂJSRӭx ۖ\nt)sv2; ѥ&:qrf< .44ݻx8''}6.++S(x8<<|ǎ*S__{o> ܹV__\!^$tR CBBܹCJ[UUErIUTvvP\ww.pHEe^*ɉ@XXXNNs0_I[1ݠZtp9XLR8MP=3\r>QU4f~&8<װm冉Hv.'g_&M' [F8 rdqr8łf3r/_}1>t…{yy9pXŒ}vE"H:J&i?t&vTRٙ :K%xj4V[[KlzjQQQF~ݻĥ H]Ak&FM* 3@*RIN> U SS .Te6[ZyTf.'.ev9QKg]ɤ?l[rQ^8=[iLDNʕ+?䓕+W[,SN]vݴi^/..~םOOIIYpaEEEyyyjj*Ϟ=޽{%;;{ʔ)D7oWlF0  Ba~~ٳ-[G-_瓞TQQ`Sҋ䬑'ONMM-/////?>չ0ю@R;3]=%>}oYTTh,X}z)i+4<l=\ C*~xwsSd5*.T2=邿_'n ;sfsy+?Q02ȭ ĭ?u@,w[݄4[T5]JT~SqX,%xVp_U1pu;l\X{76L x% O[+5F_u/0#)*ݕIj9ψ3+n[ؠ٧=YW?"j~k}:^? 
XYY٪Umm!z5㷺w' N39]Nyt6q˱_m(pH}ǭ>^窻\XRl.&$|!{_O\uowT|Ee2~NNN\\sՓ,z7\cdpOa 3F֖~q>.#٬~9o?xࢢ"g%''K$D2nܸ+Wػ8x ??_PD"p\\\nn.Bʕ+</##w^X,|>fcw|ZsXpLL̑#GBwJJJzzz~JԩS1 `ZϏFM :tn5{Aj+]v-b?XTTd2i -Amny 3Ly+p{#&˽2<~]`?h㘌KiW͊9,j'u4Iuu?J'q ??Lԩl>{o$14^&UUUP^]fJDiUUUlɄKX__ooܹS,2V4,X!z;u$F#0=>>o'|bن `*y晃&''ӧOO>!d6\.˭駟\6%) e`8_Ȏ,|iA 3gHG@D{?RS+{.TthOv.-YjxM[}1<>}V|u,Bʄae~rȐ!7o6p^~d֭ѣN{ntt4q#~ݻ#w~QqLϒMM!iQ;qFЈ#K#0f̘Ç㟮֭ۘ1cOZAAAbذa'|j aRRR=_z{{<833)Iqh#@Owx0p~فC) ?aVO @k8)ihMPzrrrm$6`xĸq *fyޫ )d 6lݺU&+L}vH̅R"7T*]tرc !C7h`oZ޾(oذI l۷744tvv޶mjϟoccӰaWj:a[:;;zzz޹sGmOZD&Zj.q7255_f.K!'OT>>ZDzyt;;;Ղ666}z<]VlْM6)ҴQ@Dq-v1cX]z/²ѣ>Q\\w;66iӦ:>hgM ڣG5kV^^βlyysRRRu&M$$$pt-oҤ|$cbbUoo;w @Suwt`cc#XD666} uttvZS ێlҤTVi]ӎiq*AZR>WyFQTΪ}Z_xaV솴>ijˬC)׌(eee+Wʎ*82s/^޽RORlڵknѢ)*lT6X. Pf͚,ۡCϏef͚2D$HhY) 0`Ϟ=,޽{\JQ_5r4M͡C\]]%ɸq6oެZPm<_گh…JW-Zߗ-Z,]TS֞cjgS>,ZSYG몫*Uʠu-)OՙkݩOUDEұ 򢬶%:KeZSJʾ8ˈ8+q:99}gϞ2PORRR򂃃+Q۠ROϟ?ĉD4a„srehBP(?^QMfMShPPPjffYec.X +++++k/qq#TũijYlڵk/_nhh%5hjډS]Z"ڳgO׮])󎗗WDD(5mܹs_)))}vPPЯ7pcƌ >駟Vך M]ZRy2״tkݩOUPӊU ?ijˬC) UK4Vc=)JWZռyssgBY=x`-5kyfyL~֭[͛7 FVvꪯ߮]'N6O?mԨQÆ JmT6h*S] i۷gdd߹sG *>H~`IhY+Nr`_~'O2pBn~gĉƶ+WOD~!Wp͚5ZgSò}\\\TO~j'NiUkdE^^^ǎSJ<|0wھOKKsssot\{J˽N/^8##ChIIɔ)SͧLRRR{5cU>kZK̵UTT牋Ϊ}Z["Ժ4feW/Z2Pk_5iqxSSFo \B}d-x޽WXѫW/<LLLk:tرcǍWKd2T ^^\kUiqcd3ghz(88/ ,6lXjL&_njS{[}}2%krq"[n]ZZ:d===gg爈"^ݺu[z5VBjT Ϊ@A@ 8q"N@*ʗnP&UqwE?xVǻ{H1sS ԽW\3"Ye>7wlwͅb]ktnu/_ڷ55yDxd2 SG7T|(9^fӭgZlo̥65;ֻ^m:82[8p^ƾ,L^.V|h׆9pSe{smwmȨ&:Wah;,Ňr2W~䶿!spz17?yTn"u@>74[tLjӿ[osPmy@ޔ9>Ȗ ¾Ԗ3hO읒'/?V_qF?eee2((:>>>iӦѵk׺tBD۶m#*<|rždgg8p >>^*L0̌"1bq "Lvر+Wۏ;ȈgCAP1M+JG~L& իO˃O>occ3z謬{{':88@mE[}<[~]{SOm x?4-*zTӵ/ٲqmHSr3Sʥk.OVItidll?DDD̟?2塒]NBB’%Kh׮]Ǐ|ȑ\ Õ+WFieeedd4xwo:|QFYYY>\knjceeehhسgOX(: |$LK*5R)m˚0ߎ1??E;5uk^2;ޢFmЦbZӟW(u㘘p7 e2]cF*bzkk낂 111J]\\OD7n7nܩS|}}ƏϲlRǏ9RVVFD p酅8VWiCvJ#.WO/R\cܟyJ9ebEe12)+cuf54nܘann6T<622}AvN KQ ݈b,6sq DPWXqQ@E%qjee 01Oɱlla=sMDaFmM\&?>y괾K2q'UaQ^Dd,Im~ F bc]\Z 8YR\$w5ӈ)v7ַuR\l0irr'OOUUvJ7=Mi55kv[5Y{ň3傘(4 )s1$C̚3-iBW95{*ՒV۩ճ'!~.{^W}bGU8޲QIژJ0_HXFzvlaIyUX71##; `}@T]}%ޱ5X`(57?OHH/x"vj㡧'7{G6m{ON]~=8u,N]y xqy{Ц{>8X :ҭooPݶg۷_ξ G+ՊW~mں{^ֿgO{-Zkݦo˖YTTK;KKKy`Yŵݷu*;QJ VZnuŵ}gjjNL|׽WYY.>եe gWl>]sn*7kp+Z_-ZSYvl^^䷽c<:ߵn.]bbN4FAVٹy-լi ^welbr>}z%'˪F$lӧ|ƍ[6{OGxvW>& THTZ# >[5{pgeثE2y4y+Ǩu] t端GDq2sҫfԾK?:dWϞwժӧ̜1*v傥0ƌߺM.4 15MKoLnE bIlާOM0`qWQ,˺uԽlvRISkMnYJmc<ƍ[ׯߐo߮gVիjy5ݵiԪQt ahhXSW")?tGƕ V׮ȵ][򪦩mRU r£5kZ5 T7tm{Ёz$}4EY[' O:c?ZhfٰgVؠUsW)f^xI$\tiӦۻwI$N:Kӹ߯^{qH$.]2mCϜŗ\r<)Kt;;۫WWvڔIoU辭rKDG.((\fOFt8~{vO$ܽRSӵnb҇[1_}7i҇ZKwRQJAVUfMKgyz7.ZjkwX,_+Ŝ:"dz36OnYJ9N SVbe 6ibu5.qCGeN= zR)J8kڇ7۫s؏M2)hgR̞?_MCÖ,]{iS'q^D5 vs-^hIff&˲| ^!T>]ѤSW|RNݷ5C֮lɲEΜ17snLTZxQ#6lHH`gM8~˅ ,,TDP&X"5)}~$Pվ8, p :YVv* G!VF؎aa؊3ۯC 0@ `el ގ;~4 ק ee^8ur՟G\тO ⬜9f];.Wp2 HIS'"ztJP׮]%>z|>ǩ6~9\KNIJlmNDii:oiaчҫfka###;[]6?$镫322֧sf4q;z$'"YjZ5Uթǻt#su /_ihhթyI(>+;-\ dR"ʤ ˾8R'SnԜw9^mݻDžJ/$aYր d2)22{"M^^׹"'>;ahKKKє^5թmOjufױ"֬YS.qo1CՠA<q ۹^t=---KJb;{MjZPQv6ժndeջsaa?.e\.٥l'\D")..173{brrrO>0̻|wݻsDD]`ҺpޑU `ڵtb.Z>xڵv6F]4lhٳ_LL ---n߹GD+Vz]SS^{;}}) J"6dOe2)CDRND2F0@ J:(%9o&˒L&cYP&&pyC"Q-WD&HqqS~}ӫFSm50пyKW% εCED66һura?+URkmg*ܪe 1G\x%_n-qBEȰ][n+пݻ\~58u3_;*.-U8WLƝH>iԠ_G6߯#eKbx r˲ \@ x-]2L&U\pW>w%=}}# qVAEK!s+} ୎8U/$_'w\>BɝX@&c$`eLE{+scb 99Ysg}}S7Wd2Ldbܠ~> ƍ[Ē֯m Ô>xp˫ gW];M233E oQ6\A5F=2[)dYT_yiiqȮ<rWUlTԋ%KXccݝy)Tߟ54d]\sԷA(,\\\&O)..,d{xl76^dY!y6Dl٨FkM{<@yUtՕᅧ۷omڨO\55e,aYݲ վNtaÍJM-chs*թG"#~_X'K#O$Z}{Tc&##Mҥgv_(:ՔSNx^䬱:w11\bttvοl̟}vtvLRX6W୎8KKE0'#ڈٳ gۙŚBFD^:8xZ--eWd"ΡCٝ;R EcG>oww;f̟^ݳjեR7\T/U_~bQQYDģTcS/jzehȊ,˲ݻb1kd1e٬,v4ĉ*L'O>*']\~ldarR+/JV.\Hvpim<˲|sTj%oLA~}Щ]V+O}^{}%K曋{V  ijf;t(eYu}tdž ן=+ * TqDJG|SbFk~W<Q͍cXy-?eY6?51+nb⸩~~ !BaCKVZ#΢2aVWuMi8YVMĩx,lZZ6kfCnnx*Hw;|zD/51Y|~ Qܐ&iTGx]^^ņt 4ZSSh5W4Վ8U#cu ZbP*^ѣ,_x͟(JeR,+++KLJ8" 
Fr>ڜSPx8 Db1YXho˒[zz$QZJ>98PIrUE=ڵt0ғ'dR(::{ʔׯ/ @8kT*+,̕J%eYYYH\Z6gR&O>fff0jmlLtߐW>͘A/MȘ@a;m8^3 rr99\aZ~(_mNGG`K& E\)=x9,(,wH0)4||*(<(4EY?vɓ_$G~/N vt4@m1ΚYR/)/'beY@e˖V˖.[ "oo^:KJ{A'L+Cڹ"q;^D@=JݺHsg2& ^1|q8&NirLN8Yx40 //ں8vCq D`Oӣ"õuK|O8'e?~q{7R멚f:Y:G&'K$&Zuco1 -YBffW۶;w2 7mˆ0LZN=)l{\\~NÄ޲Ը戈G#M!//24$//yĄ:u8Z=bDcر#Zs,~VZ?(Oq.RܹQ#F*>}:;,,*Fa+W狚5sl1 iP=EҨQ4w:X[Ϝy{7HM-j#?BHT4G)I1!Ϊf_{ttR:؂ q#[?Xƍ4.x7_BV3j:WD)x]^^)˲V,+~,˲yyl\b~>kbKŏe FsuT*+,̕J%eYYYH\Z6gR&O>fff0jmlLxj67׾ΝӋӋ;wL90hPKXja僚ʧugjj@Dzz wR:J_ӣB" Ba1偁ðjΪ9E9aj0-?WXR6YpK "=xP jCJllLb^Rj+ϴVN6aayϔF0)4OQYQn^.{7&aE m`qԼ>8 !@ 8q"NxT׆T^H> uUj=k,utaBs;W\)$ݣR"ŋ)4Υ"Nx\O5˯4w.mF{17TDd`PpV>ɤ02T&%EV᱃a*NopbcӓLLhRdelmSJJde&Sdҥy̡75n9"QG$$BCiРpf͢~h\\hQQcg<ٽ΍oadsYBv ;%)&dYk~VJ'Ril,yT.*걟EAPX{TcjW\ReJRƥ,+fuА+6'-(`ܙeYYٴ4ݫy(w}5Xf83TVX+JlrJEb9>x4y133Q\omccRM;hR y5 K).. t5 Uϊ rr9D+Lh}~B?yxGϏéBC604uڵƏoruԎ<_$&ux$h]ZZn,**S!(j;8W g3LD,./je Sq IDAT\~KdbBKHرj<9 |W(JJ sjsmx[?f:imm5e+Χop̡ѣ)7LL^$FE69d;7nZ9UXVbll6g]8l>յ/ddD_T$ؘ̜1sGtt)-|qR0W*2eee"qiڜIIGD<fff0j"*ZU"7eAZR <9hAJ?ӠA$K\*Νnw:EnO?Qh(]W(-v#4P0]3Ș@a=qB=QTSVZ*,K,+ˤjs::{{'$?ΝjsPXSh(hi)=x@UXCZLH{u^_v333_ؐXL~[/HWr~#4`P4lX9+/ɬvqG ײvL7dfg&ggddij2_M&޽dfFӏ?ji@Py{Wk֯=tÝ \\~Rrh~qq[+%:7'''-9^wQ@r/^nidDfUlRqsnJqqdk£J] Ν˖"xs]M}'Ԥ޽i #uΪv}\Y9g` MPpDӾ-+`4qԼ_G? P:!Q`q"L@ &+ly+߽SUFOÈ%8@z,^N`BBmEڼtݢɺ$(>*S.l:ҿפF7W}?,^nmqu -S2ވ+$BMRFvMK,LZ s+[Iz@]>|ο8w;6\o {x}SOVdOS8u'1O{zOx9~C._z=̜y~!Ff<3'ݏJ Zsjݺ 5m寰N^ڻkrzVȯR3s\1vV-CPhafŬ)˷ʹ "7,&bFhM۹Drur9ID =wg٫%.)-M~Ӎ!&>6g񏰩Q>f& zaMcrM|WabS>*m]B[yp,X;Ϭ1ز>zLƪM,(*{'Wr۩["]&)DN.򙁪ID"Xޣ{Ϧ_#Do>;ҝE~6iCNu6ofoG0'87jr=8UiEi?\l"Zm>>J}цP{ ℷ׎>>.jwxR$OIeBU1EIS Q)8iyb8v"z<B֩T@@HjEׇy`c8̬ٳDQ?׼x,k^{e3׶?Mœ>s<33 Y×9?Olcy)GF>#LKf:M=iuW6mp7l02䃝'oXpM >;"_: œ9*Q;]N/zG Dc^&Uvg^  zeܢEO?27fO!0 INz S#PUS[uW]г9bh 8$2uʠUm'/\W6 zy#g~wҦ&NN"Vji~^f8iR릒[E2AU gO/H}?:"_:EƓqQYC{9+*IoT͟uRE{{GaOS.ǥ"xaCzCܨ)a)p_5_8O8x;^HD͖^S \e6s:ҦCq0z+d?__qudwg~gV~ ǙPlPMCԌt|eL7ԹKKLdg] WK >n?+tL;GWo'Niމyֺ|gl28)l/ S0S 'WզK=GD_H$ ;R-M,-G7 ՉS"޴nYhgce5 F|͚zgɓҳǏv:^pB&fMC9ikޜ`;}+pjRrrGd֢'z}_:&9:nܪ#LDeu MGϭ[Z~z3HmC:;s =@dNPXgq}_1m^dOz_-1@q>F ?\ %ӽO ?*+j[dO 7G 65YhjR:Ž `wFkDT[uYYϬh(1qjU/:e2JHV'Hlmx?--^z陈YTԄ*6;Wu+)Sn]R٦x23ۨ*:JNwtn1$h}}ǿr]]Kee.ѣ8ȑ|}οqMsC5%=][9;O?B.& /h.rvfo 8~Sxߗ闻p4ӑ>8ᮚ,a8' @ 8'=+(q@ Wn=j򺘘}˵kƆ\\ҥu$ ;;7۲Z_\]_i/&Q@ @׮)m+?tp~/ZzU ]uɺl^yٯ98X/[9am ]S"} ߫ݻwSVB4h54Сt/\Xjooo3gիj>U"$5lɒ#vvs:8 b#P_'$={>nVU5>|)=]Ԥ tNO85]x[~ĩUU*?TS\9nM_YѳōV+,,,,,V޳g]&~C`!묱q} rqyͧ}sjʕ+csnrrrB@TzWsp(/)pqGn' JʊRR( %'zqJM%ͲOdd(FciDQss6QS(l47^|6;9+w9b..bb /'O<)\3--wYYY|IuuuhhG@@@QQ?uOXI\\ׂ tĉ@www***d2YPPPEEi : )I&iS|18h?]bb#<3JR4((d(t;v .44jkk}}}U*>}zѢE 3̚A|M:df뽜[L@*++,X 'sqf@v?  3C Wccxz;sg͛M_u˭_ϵp|ޮRɥrÆq5dР+ju>/n>|֭;V>b.~)G"b7o6Y䓦.y{{k/ښ: EFFFEEhS|ƒ2Of44v3;όkii?8qDyS4~Dɡ# ꫯ8[vmFFu633\gGdƮN3Lbw^~~2*ÒlM1G0 N34"V93t 8[F.kk8]G8ko$3N_ߔ1csRR.45 g^^fiJllh2KG%QSc&jjzĉ'NTGf@̟8N.NS7w\777B2NzԤ]RppylْRBOOORq\ccѿ: yyy544G|0Ʈyf eO켩Q?"R)FtĉŋN4Iu63̚A|M:dfwd&)q2\o2t2N"3Nfz9\]]q'_!eK;9 ic^=~עv*/[(P 4`)ssJou{6{K銦&u`sz,EEESLѼ?~DdffC33|+++jk׮0aZ>2N"jTt.ŝ:uJPddd&''_D"433xh}.N3;o(Jfffbf󽌒k۷o4iwu; Dg2b'][d?{o35t:_du9:څ/JLZUK5TS&jmxɌ֖JKniiKDDȢ&V ܹܰ_17Our6RwKR%&N=yrQxp˗73.[,//Ϥ {1Z]>dȐ1`]/^lѣG:t---m|O<Ғ"ܐwlll]]]ZONfG _p&СCj۰BIDATСCcǎnhڴi_+҃mČ'/ƮO6bnLO;vlzzZNKKc. 
skQ/7IS 3t?!FFz,1T!!Ύ22h#k[܌9 4`{N.~BDfo]s˗k8;zNO%%K"#=TrOnnIZvmeeV1cƓO>]9$$$,,L:;xev.bbb}0""";;+''g͚6l0}t{{{""":;;ߺx~~?ůyQbD|CDtY R'333""BL&{駵Wly3̚v>ۈٺI3Yd?dǎ1 řanMC?Xr?O>K1; 3t?IKDttt[no1v9dqJґ#G&$$=wuq/ۻ?_ޛ1/WkFonM$c'<@v޽gϞ+#0g\!)'\د]E=8~u-'''֭gVr.q\\dn8N}犊 LTQQ]9..kj{|'ſ#--wYYY|IuuuhhG@@@QQϟ)c~bX`\.߱cؙoEEuB0ǓT?J~DBTFGGO4a6 s~ Ȏ;r… +++g%f$åyC]ZQ^^SOSſ̿͞Z}ǂBAR)YYTJ ijǓ=8^e_LLAyyc/[[[}'MMM111 j'N {ä欬3gV:zhkkk~~YwttL: 2b DI{V@]] 45:a3$$d޽:#b1:m 1;/~.\*8sD(1;ό <9#ܑ vjclO~~w,yI+ozZ>Q\K :|)\j*7le9yi)SƞzGux8!,T*566zzzjWkjjGɟ@xϛ7o˖-*=lnnj)])˛;wvI~CLQYֶtҏ>H9Ld9L战6̆?aOCsR3G$~3S.z9x<=_xKw44 g\}=gm}RqI$BFYY~ M8 666/%IggJv)SSN)dS{emmN33|+++jk׮0aZDM g( Lׯ]zp!si( IF!f%f@>,.>2&m;N ~n~BHjjْ%G,up2Ć9 .%"277rQUU×MM@Y8 q}fчjkkKKK=z@'x"''%%%Exr|˖-+//7tI {1Z]>dȐ'C:o̘1Riii=Ⱦ}._gTa S;vlzzΈcccz6m:f@LoC;%fQ? D.:;]x[~ĩUU*?T"V0d@O2ʶTZڭ/J8Env82N@DDDvvWNNΚ5kjmذaF7@&ļg vJ???WZ5cƌ'|RrHHHXX,}w333}||ZZZ¼c#!!AP*\x3}W>v옏d.""3 @K 3 &ihȟ0uG`Fyf:CQ!'Y/88xҥKR|&6:clduܹAx !wUx'6:8'2N@ 8ݻޞD| ઇ~풨(ͥ}!Tdn=jvɍ 0]yy]LL>Mɯ6{3R"!8 O H*%45RSo//&@ kה۶g:8Xgg텡/wef -_NgSc#W[[K@7H3%-˜0m׮)E9|:a?t΃Z>…Vݦ[hcC7nРAT_OCRssWlm,-o3gիQ-|:a!eK;9 i"ڳ\`s:>C-{hjR:$sfx@ t|aID?]*E8ýqbӑuz2p@#G>7sK3$Z8;juu>]qVVV%]uKKˡC|饩)ν>*`iύ;eZjZY̚;?ǹ+kjj4ߗ|CyKlGG6Cqjx;믏/vٺy[>ޘ_{yG"ŗ$>aD?[~(ҥm@+*+u..c---ǍLSs-/MS>~Jg<2HR,-$"w me/i0q);N#x&3#F<{t淧Ϝqvat%9'}dc ND_yߢsgϞ[ԇV9RT*U~K x Whт3nY4T{굧r W~?)C8;v7cc<$sfxx'Y^u@ 8q2N@ `8Ą*IENDB`PKT`1Dh[Kagg scrapy-0.22/_images/firebug1.pngPNG  IHDRXt sRGB pHYs  tIME  Hb IDATxy|e'Mi)mI/J/.RCYY@AaA)\.R9$izG\c!dJ~dgyLy2IySdpTN@  g@ YrVY>$@!DzzbЏ(r,Cx'MA80 ݑٰ/!<8=i̳!O`FF/@<+tΙ3NFȬiYvzY wg|1#.w"xyt鱬cͼcYtf^;dɱ+*1褼<#*o._ ̩gʚL3x:?,=p(c3˕_?{2-1cn7PjY:Ӵ*q?}_?dvrhkݘt8A#ypŪ:H:e@jST[0-O-c5o`f"vfp pDT35+ϮȽ/|ʫ\7=eO fFkHnyx?'TV%Jl4]!H]:m})XDռ#G{G!xшERJHBB-6,PB0]'* ]粄eo:T\~|eiQW*z7q#4|av}¥\#G}5/on=qY!k2ԠnGRdu0Yzpqlg†m6cg]?=Ma:'+SgTU k$!W󫕭Ʈ+YQݨ{mڶu?V۸qzHr:[tXש%g-nf/N\&k xJEre& ⵟ_3-CgUCIFѯ~TO웘|v6̄/?w.ݨ%BG8v+:xwr\nrhk83ffGʫxek);bULFE:〾kYsOkUW<)tIl!t,acÀ%\>Ll%?B.,`xՠRTk#8<+?u\_džxmW۪;V)lYbq_ "`g!Km̿|ls ;0,S)_#O0煜}ӫrjsbzVbiaلs&u0XXzI"k{CSꔛ~UW:.#p6tQ$_/_j€ \ڜd7oe#?i?!dm~\$~Ɂ:T;f]GKl݇{;Z38K}"f3F{ =сI}f e0 3[/E.fZUMZjf+U:**[=r[}lHSqddHFbD4-,n,f2Bިؙ:.|gM'\VYkmb铥# #G:%! !KF.l,q|*W !':wx%l:6SxCj[T^9wjݏd5>kЪ֪'Nh"e*2%ՒӅxD> 4p27t†BfdqSVq>}5M.n.j(b|NBM?6%rYԦm|˓{g"狔R\S}"-M,/3ӌOg X__ZV7c{;~6V}"tI}җWV5մ՜/;~gl#YL4UF7۴mi5i/u9k8ZV#' ȢʽE,ʩu]L?`?1zܺ\M,:8.Oog)Y"|w޿~9UzziӄK?}NٻoI#}zlIIV!_H|:m4BȜ/):)5zoM|zWv ͰSH `K{Kb'8p"1eY ^Xg5[nP>:qDNmv#=ժk˞B?=W}q@}|:ՌK^Z4bQNɫ9gƯHZ16dH(hx{7$KJHӖ ~7vɆpe;Xv:w o=}K[}ASR WS7N?q"gsWȜ94in!" po8:̳rV9+rV@  g@ Y gY9+rV9+rV@ F.9+8kǣ@  g@ YrVDŽe?ttOBxgxx8 :k̻0 =O׮\wgw狧tzsfߝ@e2պu/TD =ʖ$l\=S>(*C֭ =#gdz!kMXZf2ǣ&|2)3J`*5?|T ۽|U> J AAcFqn = _}5Plj~fʔ"4(m'ytLIE7߬6o@hJ-cEQ^}ܳRjo{cnx.|)󴺸eWFoCgisTW>̙W bg 'O6sk}K{(ܜoK%&VVuu'ϞmNM: `uBܙ6L;fL>穩mmwxMX<.-nPEY>^yVa?Ȣ(<5ʉ FQDY1/QOV ⏮ ?ӵVP蓓=c1 תL-ˋjGB׏%ˋQy1W*+u jG ;p ͸n]%?i~۶#x`f@ݴ^(̌jGdfF'$y*~9+ءOAM*,lsp$b)o tű.^Y<wTVK <.* uEELښ~?u @ik~#{~[ܓy=-3f_8 ! ӅgV퀸č ԫ`EY,7WSŋ˘v<ًR*z͘ڽ[>rd.zzѢCڌϷr {B/.TVgyd0PgW9:a؟Co?kB3.U Xj;Ч,>7H $Uf%) vؼ9SSۢ}6o_ZSS␤4+#C͏>ڋKnͿ6xpv7GKo̳@o x䩡W=:o!5['@Oo~Ը >S&m$t8!,L~7 Y 80};_QQ..<^d%KЃӧ9jh055r$/))ID K{wƏ/\oXȩB{CedDߍ16`ҵnGs-fK23wB =$ ;w#̙ߖ_j@1cujgt„BBߩ0?1iy6((ЃL7|}g6.gdkY,8ErVaӧ{+~k#Kk}fnPUU+,-()/nX#xIk2͝glZ#rP77k+O?]ys`kk܅ /+mְeӦluZ!7nh1kGRK7@0~555qi/.b[gͲ<=ΆrsLX?8iދ+WR`1ގ]`9;ijǏuB! 
/Vm?ދOL)Z_&t]S]af_f3_z ?//~˗B-W:-]t"LOYNBQSQ5~|r 4|u߾ECB2t:#EQ:1,,tЌ<5FN?/cyEE{y`^j;zaf7@uc H$tte=\USN2v6-<;ijq֬EmV-(j|' (X֩xHH=rMppRiȿ~]#jEQjA,kGrno6O q49Fc L?z>00]1^|^3K„ Ǐ7PuXܹf>@bzn.zmʔB__YDD&x'mvٵ#'qYg4Ͷ8:^U{6оQ`նBB2(2$( hk38B'IwdsN5FFf9s}W޳zuY*fMH-[Y gY)zj@m[u[ﳃnNCѹE+*;t޸S:}) &Hɒ:l/]8jl>{7@9ܹ#6j;j ;j VDCbb.EQN5>PI߾Ǐ745FYٱh ^",,؃aOOܚI-.5~-Y9fY!JN7NY|VP( ~Ŋٳ}=֫ŸwÃnҥŋˮ_oL9sJrr4:EQDLmÆ)WKoim/}_| mSᥗ*8,iͶ6_\nqw}+h۶ի',x-骁ȎQ`k>xU*!<ͬ='j5suu*)ivj'uDܱ֯CaCPƤ$QXY2rĉFnT8fyVpyVkU*…ii Tfur2[CC3%g⑑Y$&&ٛ^xq̘1vM*UЌr{aj+}|divհbKg4f[kL,GD䨧M+;M($$~E}kEǹ:UQm(sljҔDQ%Lr>#"2|IHHƻ|/h80֠9MF{%)!!uZY9ft=O?6蓻jԢGaMOY~=\#kqݝms6t!Q?Moxd66+t/O&cG0!D.1?4]hX"22-E W*;9ofqBȜ9>zB.K/_2-RnLsu7Mϋ; }ϛ'{0~#{  oQ ░K1m8~|s-&LKZo˖ !UUJZԨCl}-8f ~F60++pGʻrt׃6n 6c،l6#˕T̜=n\4+Рӟ 6/w))B5nn)K,nH?vvr…VR* 2j׮#r:f9ӬRkTK>[Jȵunkƺ:݁ BMXe_k1/ krZ3ϔr-99/;[y㐝Vktkcf{lx6f9~]#cgmZ}㆖)|y9O7T*?LR1~<$eZ?Y',:|8ge|^?DQTTv~ӓ蜔$5;<ܥcIFAmi1xyɘ2uoins_k1﫯ϜYZfrrbvhbb9aGJqӦjt7o x:.j=v̳ڌl6爃a=r-ԩ>ڋQY|9*)I?w29+|vR-߹S>xkNN ]ɉ ^_ J 2^\,#GKСg񵘳#|G'[z=e\ž6qΜ>8`t(kv4~'[4j=v6f9P4Umg477 rϏaZc; }֏`C$(׃!ׯ߼N9(ؿ|[DѕCϟ8;\\xW9n]z倨(W-Y"t)˾%#.s@(!dʊ[kLWFM%99%%>ܽqb=w#m #ٳ} ruwwrr">>G脕#r%j˖8wŅqT;b&}֏yVpt߭cXn '~s#rVQ{3mL'θl$rV>N:9+Gthy<.>o4!!&qܪkn6\Y,/>2޲iӮ3A.VLz͜YLw_]?#+,l衂GUPwoHd"8Y̚UQ?4HϏ;7`0c]|1[XP[ 9+8j2Bi\\LS Di;vM˛mN?8s966G(e>\Ǭ2X$wtrnk0PU͞]jVG~SyzJ'L(T(u:jժbqI1.ZT%-^\UgHpai{;U 6mZ6ž^R>$W"͹x ʢ(3SucZJ풁c5V,U\ʚnݾ=8<܅ł3e˩Gx֢ /l6V@6ΟǏ70ۖk2m_ܹ@~555qi/.nXY[/)>ZlVӥ7]y.+WR`񶩦[k$T:X.vm:0L)Z_&t]MU괴%%Cnвߦ t1ڍU.efƔ Zu7:cYޝA8qPS3/.c:kIf!2U1kRǭX=93K5cx,^\vR6v|Գv֢%_m~֚Q<6njn6̌x^~}Wם^b[bͻhzr ryP_+@B?7PC?WXO~{+*-eVVj; Sӏkj,u4Tb{|نo nur߾L 4vt#$:*>5"fOul4 ;L1i_̶b l{,rP46_yFLL4$$cŊ=˩gmw֢%k흖~Y`/зo7j G3gRiruR(t5"fOu iIdIfX޹3K5{zoVVJJ̙,Y"6L44 _o~ ч\&dʯ_kU/O+Ǔ f|. @PRr8!D T7?gu1MH}̀a{3@72r΁;3/j477(M-.n~ourqqY,/- CǸq_|>bY[;¾l`wOHT;puՅkѕ ~ԩB޺u.-Y"U77?y29+89s|V++o s\2VV.-7)_("lG|}-]*^vTϞ]R'1ꎩՔ+թK1/Vܸmj2R{m6u,++ B_blf^;Ǿ1]nqش)h۶իJ,}‘Ůf!2w12kV *kͳWs7:оJQ"Sg궳ٰsgMd {ٳ}uu:<Y>p B]- Z\V~M'?kR2..G f߯`;V%HbbϞmnR!/ں:44Y}xK/^3f u0ƈL>_rZ>>2o]jo*aRO4Oϴ KU*S`j??t`GPE䨧M+;M($$~E>aUŭYZJu62o6{j_̶o9Տ?LVK3.,Ѳzvg-lF>L@$_^@4̛WL_|]_|0p`M䝻"gC=O?ݻMmm5DC"gcFqnKHH66VvsKsqFFfBիJDH7x)ʸvm%Ka$jc <رizǓDDdO1 49s|z ],_eV̙I|}ӝoYm4eeu@kرVgUʴ tB ҔM^~9`,%[xh7Y>XJ[Y3]ᔚ5tekM?ܹ3 ʡYmmHzvCIúhQu.Zԇ2o^ɑ# s9N9tnE<JXC˕T̜=n\xjteKwrY6r)`z<+-!$>ޝkU)dvDvvLrB*bGbBFbTwG9z|rGVV4Sa}~Ӧ[:9XĤWW_ԭ[PR2DQW76uCٳ8f۟n%\i0+q2Y]#F#BAP +oP6u8OB=Wj4g[g϶ |ܸ^ta5CB}|EE[֌W[5f!3gTq ׬߀u~RLЃjJf-t``FM.''f`WzIN&&&;0йj(<##z7BH[3"EF)P[v |^?w2uӓ2([Ǣ5h@$ѬY.D.,l裰}*!Cӣm̘;WA0lXNfZ.& 0 ~Xg\өS}sj~b1F;/yVpBBDb)SW' xE~\[{R11nW؛BG1sF6*np'l2k`tHI?:ԭpԩu* zDH y!U;;~~XlJe/NnVٚ!Cc Wz~~̾}!gxyU*eFC3r,ySY'MқyshL>BHaa;~L/u9p=y&9ڎ2Y1RyM/sR\3`v@džђDѕCϟZp|Rij>3&.c!jz*7WxqnV7+ Blnn6sZRRJ6 yɓ!))2Jڌiiݻ#G7&99CbO8ы}ݻvR-2tVzTL02aB,ɓ7bA ys󭵵zjj2|U!d W5p,\f90Y]z≢* Xڄ\(w>8,`[b|4pjk`>}{0=kڵ_ruqṸ"#],_ޘK-ӿͭV) }:1w8l[`3Ո11Q$kp|:{ϠANNNG製vrrXඤ`10z2kuXn={k":Bc,Y8e֎09+ g^ʗEϷdrN!gܻ֬nF ý߻/].w;sjn6\Y,/}L5IǓdSԭ%E>4yOㅦ-y3<}}]\ uǓsv ;=N :rVԬY:O4عs}x}̻:_$+`>=T# bZZ 9~qGQAjw>ph79+8S<=\]&*z~2$vjR(- 30 +;[M?zAvz,5 Ժu~~ig8Ǔ8P)Jre23Qa:Oc~fXb?nBܙ3e.ZT%-^\j6Kxk۷׬]<$!GL梬2Ct\e}~LsllP( <|i?Y.-\XN<,`ͦMUkrVp\S\P ˇEFZUrE)6 !7VKJ.\h#ܸ} Ž[k$T:X.vm%KkjjTciz/^\N< KͰX;Ś1c</.tIVOlPUU+,-()/nXeV^z6}B67Z,еwo!))%:eb3K7olmp!e%SՅ m1%%CrVlZ,Y;;|f prn<̒~;iwb!˖9lN1;"SS=k:T* b٪Rˬ7PC?WӛuY)ڶZ,:TKQ̙׿{͡yyjqM?0uLĬ{fͰV\:Q+7bb]]!!+V74UAAo_ 5>(kNq IDAT-(jkMZ,7=dfMźu7Zl{뗱gl/L 46O49ƭ6X#Ht:#{tHmm5dP5dHw5%'Qf^@ 1}bqY,V͞r&U?\#XR{*4c``ѣK[ 9.k3rۍ9j}~j۔)L!,+Y,1,*mgsVb=sV'L(<~c-vzuYfMEin#9k߾Ǐ745FYoi:UPa'N4$&R{TCw~2nq8瘳rܜ# J yĠ =]~oU-HmBfr9Ճgm6ϙw#O7w7-,0sc,d|Ϭ ̝[;rkqJ~\W㘳v#gѣs)5*W*Uv~9v}Ւ[Sŷ]9KDW <+# K\]yN%%Y+6gϪUzBr[תUBzg V?5/]*^vTϞ]l{>߻7??_å~.mnk >Q. 
TEW*GѫfYB++*fy,=ۦ˭=Wt߾bo6t+l6)tE tkk/T1wZ,1nblm[u vUOެv0;*RtiM.{WKq^?.νK۩N8~1)I&d):qiUnqGgǝg#"2|IHHƻʭW O?]&5ڴ4@ Q(tE:@":U@mZ,ͦ?(ŋcƌcu˖*)ZYmnkǖiӊ|Ipp…557T*…ii T-1K6EQ;~\zZ ]LS_`,}~X}dd@ >{~~2ww%Y,1n-e4srӦ&JrfR2..G f߯8ٓ,XСZ'k?Ulq MSBB.Ko%⎐tC=O?hu,iľL)MOY~==o~cνZ^ q tsow  p#8趖xN | RIptY9+rV@ 9+ g@ Y gYrV9+rV**w+~TQqqnS~>{xra޼DmSS=UJ5}xR!""2%\#ѹyyjָo^2jTn^Z2󎜐ktg-%ںu7JJڵZc]!&N, 8;KvB@)LkW͈9 'f}.]jc،h<}ՠR~eʔB]ylRi^-{35[<"%\KNVklurr!/1-qLJehhЯYSA/Al~]#cb~ଭ[o2lY5rM&SKnbz{Zmlk:~TļSW*6U}USEV@F032 q#==!2Yan <1 !AAպ䷴ gf==ѮIIY]0^+wqjcpV``FM.''fǒ tȱc=--//G ڋKp {:m`޼zӦBٶ@O B.]jc/lYΝvśDMĸ1:bc]WzBs^m^Z[ov jrEyTӺueQ=DQFcB1OxYk3Zr=!dw]if\b]]2pZc]-2M#!]p^ퟟo_>^^|ʸ|yCݚK#GJ#!둳@O4ɋ+U!p:D9v,-[[:VzL꼵O@).wdGHJGWV=>ĉez=aC%~X0o7 M8Fb49V%# 3Krm WWeǎgfFBJK9;Ϛb8~ee^9+U}|*>>˵:Sf7?"ٙWTԾ`Atd/BHJJL멶6cZjnȑt伃o~nttBftP *C@BQ >@DUȦt/t{$i6 ɹs7766*GΚ0c=j&cXfL;פg̙HY(>K- 4)BHJΤᡇ!+Vge &+Krea2l)?wI,V(L} !Sb: þP^r%|E5k. ٳ _Eu~ɢ}ju\(J19EE v=vl0Wo,_/{睊ի]?[hju6󄐰T)nlxg'~=='G~G?0p"'}G4ujKK9uVfddGXƆS΂1c_{=)){r[XPaa}֯w ,,@h˗orrЊ,_}U;)7-NNWn Z[,,cm 4o2؅K: ْ%ΪY) R(xI0eݱcلU?; u`fZe2zb')y/IrrEL~g#|C~|ӲjU_xӧ,@ |\9+ g0]~ف KpђR>4<@ H9+꟣cʜ9Z\ʽ쉭mɹ2.77W&a2}z𻗙3mmmmgOO/Sw'zRU98Ն|x:75]skȝUk [tΎ+n[TE;=4v^]0dHyoW_UFA^[pqIS6U)u=ZBi`\, >6LX=چ}"H7$77mӦ>z[U၁kזZz%))# !6ŊO7ߍaaBnV$&J++YY^{LEEC++7knu 7zv*|_vm٦qX-ܴIРyshzzȅ cdqpdڑ#t͡!w:W=Q*74Nlz g'p>y{ JŽSC;Y-MM^b71ȝ9#V߳zb'GG>[ Gܠ0uԌ%t5N.a ! oܐ/yz99S;nҢtqIQ5UV֦zԼ7'$AxxrwCՠOZv}\QoUrse^^iHTT־}5ZZ[PS#Wm%$[Pо,-m5h575ȳ Ӿiŭo%|Tvysq"SQ3(TUUɟzh֬|cdڰի^z}ݴR1h9x s.]jdJNl}OO3 ngOh:2?H@er*ee iUUrK h^Q!P@6ntryMnBS0 y O|};̙&B΂#5C]CtL9TKFHsvkLe_?,ywÃ1,,Ʊc MzZ<(24zBȡCuc `VD7( t5Ccma_ 8Y$d22I2ց f,DYgU=.-muwOmlTptC}֭6 6˓u\5fX7^oS_`3ha_~6-aɓs:ku Y\G@in^^}{ODveajmhwwO5b;Sk?0fL0Gg%%tVG]|G\PWd:jŒC8YT44(w l_4ZiҪ*EUK.tb˥RҒ\YE֮-UUիKu/6?P U\V,.(hU(t…XXtb u ƍW攞p+ܪVTW+:ܺ%jbivϡ!s̕>z|rC#7=v qcQgOΝ7{nsgDBW*Qmɑuzh |#^)!%;9O?ᇇ ;}+@q"SQXh`*'Z['M:{I"Q._~7%G 'ǕRҥEVVI);vT^p¸q:xCDD&G7TJ&6',qȐjUo/wuM zH*P@Zt;]gmiQ>d!;mj￯ڦ95d]sBH:Q*74IMLkm|JDRRRKxx@WjmVMRk57Ќ?l'Y9/<^.sL}!=͜5Fo%80oFZH $)TKb} ޻Em{1$H3oޮ%ڏJ4YJ5?xpPǫ;GU[d̙_|ccÏS ^3!{}9+rV@ 9+ g@ Y gYrVPT"E%z;j]sO0psozDW0L$@7:+t"a@ }T; @ }x(ILt|UWiY{'Y[!3sd /Qu6;|nܸd3$W1c_x9+t[kD6,hnSR$6YSjІ11 )k+tV(XqJK\^~WD =g9 PDݰ'#TVg”{ B{74FFYx@0{\ΔK$VVP[`P$Ҋ E/ +[YcAߚ*IJK:k_S^^7{BdۢB%I..fe# !-U=T:݊WQQY66ɪm0>S̒L߶V{CH$tllEddme9ZgǠ/͕B-r㛛S.Wi١CuX['$?p~vmqRRҶl͓<=Ӟ~fqq[s99Ӻ:եVVI/PrZm?5p+.ԳYcdPkrɖLVLCYZYΖܖzzyxi4]<考Ջ8+J%S^.3Vksb [G$UV&WƆwbаaVҰϻBa" ݝ֭7z&LȽqCQAώA_"<'M8V5ggAbb9!mȬ*zWWktExǖȜ:5ﯿ;TXgLT?"\.gCXED0ݠ Had0"BdIQtܩ*9f+kVÇ[ ! u>($y{nE~RyLa{Qж>ްLgn:~?[ZU깡y|D^[n!/ٗ| A>)\" ?s&G n_~$o4)7$$w*4BƎq#-"##tX[o 3QUbXPU*śo#\zzL!@2%<%%""n:+t:.lN&z7O„fD"y}Q@QXպZII`/|qR_Lt)0 }V^.X֖ގP[\qw6KJ{>R("!^^iee-? jj|S'U%% yGMb 7Bf,44L$FHsTZ/-?$څ IDATl( PuVtAuS~=RRʨjE|R5uލa"i:j7_?AM-W*+nԤhSϭF1&-}ZY h,Ppqv[NNg Xޞ/111B*~AqTfԤz 2F +Q6^%3CXZXh]_$ h(vԨ4ZgWqz7("xI'??cokc)B@$XVW;P#needȑFw KKϻw-y{BȒ%7SR$ L''K>rĈ,vN6T3ǑhQQ||CK -ї.5ϝ[UC/0lX-5 Ws=!DuJ kŊLBdeV,&LjǾCvl*o))6vl=Սʑ#f9̘aO:b묞fMMfZyʂV(Q5Zw<}Z?~ĉ N6Tٲ٦73fߍHOke믻?x944Cё曞4\o~[xv`T*˗[._nѨcffTPҒ(BQҒ"`@0ٙ"B!o@sּow +>us3KHZ=(‚-]._hĸdC''իA[z yر6GuU |…NB!#NN)Shz4jO(MnBoK: ْ%ׯ*pؾKA+V03kW_u۷omŸot=Y}pM~ńs>cy. *2C#M= KmC=e?ȑ7uV ]]y ~:+t(* Ԧ_)_>5X`pnWj,pa\ p2t[JD =?{q=4M~6^p(^B *ޫ4]{E%E২ O7atǸgt8. 
t].tm>2gNAQQ=|4ٽ[^.^;L:\Ӽ"~A_꭭セ_!6i:2'gHHygtGG/:; AWs50ٰ5U(L^Iᇕ<^&.M77O JI [g̷ML>=Ju/X[pqIS:S[psKSZwv0dHyoW_UsrX} +^{߿w\ ,06"1Q\Y9ʊkeW$%t{ͳg.^ 3aGWv^Y9<0rR{!89 sܳ.Sx#Fvtܵ֞ҥ7lhj ?>ʕBs:VU/]jf^Sx1GիviSX(*|tcfzK!MN.*zV[ t\ SHcH$ -)i۴I=KF;9==hhebfַk}F!GнMb"6bhI~k>=`ZQ&s0ݤ@"*J..)2%B4||Ҳ㊊67TU6U5Bjj@q--J{afeI Hiar7{zki|RYZڪވB1 3:5wztԼ㕓#8"/1.q\Tt\ ӠHc357ߝ$kmPל+Jƻ̈4:9zuPO{G7$ܓu2 I*OkךVQ8Xg{X,VXY%STUXؽ[̽IY<(({ZU\zM'?"_<]|}966hm\]WPehCƎ04vzj qXG!r~eڴ4ҥyEb@sA,/k-7nwY;>4(!Aie{>xF!Gʈ [J'Yd~ASO9GEYO(rq ݅]X-*{aˮnjtyISW4Рt 1c=:+)F/}s4rww-s둹2#Yi׮6i-5. E>sHmq3UuּJKYY%Qqmm/898:&A묆N/0NwzVƕ)$UqHH˗ߴMM^D%qP۾5E(Lz"w]F!GʠE=|Vk뤩SsUgdz-8o8ck=jG$w`M8qӦM> c6Y)*Q{"z.}@i"4~=hvog#zt pd!&BxԝܠuVskI g@ Y gY9+rV9+rV@  g@ Y gY/` nI`$@W9+ g@  gJ@)D:+??<@ +z g$`꨹Kboz&K@)#`9+|V@  gM`f-M<eQ_&JxZa0U\g-ϼyp]m}?g'Oc>JHHT(V50dx00#L 0Z?yJ`f~6+t # 07xBK)_}~cenn~AAN3`Ụ[UO K˜bNT VkJJK#isP=۽(v9: eV@Yso\n9*/*pJP Ϝ)+{BSD._USK F6,ۉߣ?J ЎSC01 87/$?Ow.+ͼī,ǣ >eWx BF, vwcϜ=/dHxXĨWֲ/>bUXĨаYsGj~4ezP'M=QO:謹acMꗈ:ufc ?tX݈S>ऩA!'M+ٓϞt~ACnj7L&Ky_uU^>j5uJu.onty9fi'N2|i<{Ec?֦#HF ;\ZzkE _ltMMAQn5H= ]nZ;G#]l],=glp@>0qjphÏINN=z 7#G3,ӛ9 \mJ ,J_("L wÞՖ- Դ6 !ϿғO.ÝYY_Mj ZvCS&Bǟm[!!UU[lѧ}r7wۢFDW|f uпNG \vFClKyu27IPlw4M敉*|}ǎ5f;NiGKB#tq-뚥G⋯v}^hhȍ׾jee5чuzРCBC~sٖ?>i҃]Ͽyǻޗ}Y;߹]n5(~uFF$L? 1B..>gVD2۽_Ozk|=5wI̡>4 jkk?y>@Q05+Σ6ʕ tQXע to`àe!~޶N hp@c~~OLOM 8}O77׎~Ƕ/Iq~h77o d_9{ޥ g!<ܪoq`Pu}|WޫFƨSϞ'gUQ*_ں{;@A czMY)8 ]۠AVÖu_ Ǿt?9F}Z¦o(B'ջK:|7(lƒS]{ 8ޭ&i>@8F1܍ta~t9cBϞ˫?!exHޟh;uֳ8'%e|1TFP=>1k|B=f;G]9Gfqq;;!wv1T7k {;N5٬GBK-뚥Q#T/92'w5Ga%>}vҤKKoRaKJJĞzs9ޭ&i>@G͠F0L?1BB5J29k^F 76uĠ~7UMwqWTk1y#)i{OylO˦m {! ͚WϞ;}Ŋ+nkW.#fͺon޸y h(dP]އlu{3OI'X^ 7v#0֠wF(t}Ԉ݈.=g zN)^RjzIY1E׎g[kw5*ީp;ήlJ w{ Y첏wO~[Uz|zАNkm*88իZ?"9R=j}';2c慅E͝i _X|Nn\.nY`SQpu͒kε;iS+*S.]2u.+ϙ㥧0ps[M0t}9 i4..y@YCGL;k>QTصx۝)v6agcUUf}M?}%T*H.^z-MKKW(ion,c_~9wxT*J/]ؗ^ͷ/_*ˋK^]-wssz>i^m~迯e+^괎`B_'N566c'GM''1G>Bo,ݷ3g/D^lVcJ9 wnwزYzzonojjZKom}饝?ԒW.??ښƦ :V4] ck gDKޭY_PmVayE- =jۼ&x|%E5uv.y"nTssq}zO^L?vO?Ӹl0Þ}i+/psu}UlWּ?UUU s~˰حM{饗_3z36oѨj{opss^|k|} dy._7w֚՘R}EƶkfϞ)J׼LԿsVuw{ϟO.7:mmx;\}fps[M0t}: h\#]83&]~n76֜$Nn="!Vcִ<>0'{p .pԺ3swiA'r67T."t}x(G1 #TUwW{vc*-g%y֎cƇczVVV}.9+a`7bj'=<gE9 )upf {)YWk?ٓ 0ݜr^<º'N"y7 ~k[Lzik=)g߭PI+,-R°ɫM|.Y e/!M4CRfEn<aYjeh4ݻ'rލiBC3 a(bSI[F :Y{5X MC(b? (B1,x<p嬦/)Bؖfhr{&ݑ>=9[oB^Y"fps:\.lmFFEkR!E{yBnjΞUh/=Uojjj/_VzT*pu7BHyy_'O {;.f*7N׶vZ,,,\]Fzfqɕ++!nnGE P߰υeK!j^z-###$$8==MX5a:J(a3bFJAUV<,Z͞K}}ÕMuŏ]tUՎrIk}Y kWPWW:tdx G~♥_|׹T2QuuK*6a.edf;Z L&k]\LP`grww2°YsJ!A[[KrL!v/\$Ȣiz('G=awz##LT1zdWSoE8VZd=W@yDǒb_Vر={>_j7aԴde^y,)>poDp srjɳYcF;vSdC UuXf`hO>?_}yΏ!_}ݑ!}ZY)&O\_eyܙ8r.Y##}qgΕWT7_H9L$*yJKok4/-E(ʰkkjk\\ ڐLF+.(Y%pX2=~W]Q2%([>x59YLg`qfIii>/YZZLǗ*ۿ[D1 REE%!ͽ*1!UPFEEy䴩caIɩ"*KKK|6O@Q<=k^t5Ffdp{{$7/?2|8!ET^!Wiz;98ljj>sBkk#nnDŁm%_UX۟:$.ZK|bO,U& (V-rO߷_'N|t[}}Wf>2r磣Gpwү3۔rj-pFfVbB򸱣y(kNiaM6MbBhiB֝:uƍa?n̏?|܅Y3&;wqchUaaÇq<744ҥ+ ̺v-Կu<P CCC ٴA@/#GG ɻz=L`oon6|EtB{ذ!v||xpHh %_BfϜgVRR(QiJ㩞%nRBao)`ffNJt_ԓ_|R͕݄VG9">48Z>m 1)+c/B\\M}h"[>n+WG#عŋ hU ?v_|Mqww[c!wa9+0 z-Ain\~ v]8=Ԕ)߷[ ߷Cy<$fڥxI)<:U&S[ax<޽ M %"t r{Yťߔɓ~{z0"_ `Ꞝ=E{B(Ǧ^[kbV@]߻RVvmbO%eO@ P#$0c{6] hsB5la!dv%rVc܇{]:+B 򋟎&W`6/; kN\J$ g0i8%9+@[wȴ>\%M|bY-~#o'Қ ڹnzvϸ^S|fhԻ=!@ ^p)Uy'-̙4+lvʥ:Mlt`0:{?Gb♒KD>]ı#x~BH܁3U)㡿.=@/ !N6kSU/!jtL7gAQe},[5N:~qQۿYTUkP#t8u>';)|BfEu>Sw5شNk rVȐֶK)o|OxTV]aG 6Be?l$e>\>gO^I{{/=ʷT;R_fK샎y~?@ G'[8RJΝ27~=[.mm[{s&^`O~hc8FuxٱK'䍏 Yp9+@ol'G^򨴵brf%~v7>X݂MNin+ԟ4vz-uV~!)Kc_FzgZG1uEIΧ] dia~!)+æ3]utZ` MGO_=z[_z/ M3jG)EO>r%^xG~! 
GO]}dBTv-U"̂RC:"]SW륾k6RR.an} IYquF1`::}0ttYzrq?/!'6z}?G?Xkn& ^na\xpAiEUmC5Ŀl>pj7GU6B_WVYW mStMwbK sr::$w`"7dD>,2oo1E@L-IQPUQVzVD]Ctė04rĵEDԔ R_K|!CIuy~..s>9;?ϙAS^C ?f۾apo*l{ħ` R*X4 gڔm QhfMkmm~4NrVլcgMK(hXrV g,BFe/O䬀Z6s\9+` nFEe~'M2,Z$66VK~~UZ)**&h*EEcb?93czdîجٳݭc4Iªj';8t MUTtz)&N Kټ@(==ǩT]KKﹹ;UX!n=On+Z| @+Ya g%% ck P;wf#"7 t1$).sg~JJ^Y60P2z66<) c5~~I"RQ?>g`rNjOU_Rq&!ȃ#S=l )t{w;{eoҦD:wJj@ X.sݺ&:Ht4S X rV9+@ΊA mNFe/O䬀Z6s\9+Є nFEe՛EFjF V|؋o8qmVo((IT gAQ혘,ofLsp蘞>Y_og'%%2ir PMllVVU`';8t MUTt8*(&N Kټ@(==ǩT]Wܺ%RZ*RVT꡴Ξ koyn+Z|B &LUu1$aO'wvOI++RRgZB99ن{=:%!aLq/I_/K|;B|Μƕ~~IŚ1GL pr% ̳%"+"K_,)ΝEdsGؑޣW/{ 9+дt:덭[nH"9+`22 ֳ g9+@ -ٰADh+B9uEq9OÇE~>FOuq^9+Pߝ/Zëׯzr֫W3amjKM/ТXfUYYS(1HAͨ z>aO7d`i^JBAzJp;vQ\PDM*h+U+Wn8;o8tfZ)**& f0ϊfjEDD'"Z}_JklYTt;99wN''To!&ȑ#j8PU_R".ݲ`L\ Nݽ;`P'kn+o ylw}YM_}u)$$5?FU~'iMy@j$''[USLy=@3X_uhg\caޖ#WP*W{z:&&SڇNzU-qt;wܾ-:IE(RYY9rO?97+G߯5j[Ok:uZ{pGGkYH rssoff^j3hVRRQQޥKo\Wsyn?5 IDATnY- g%% ck P;wfScRNDD硫{'k}wh{퟈eRh4:t6t]DOٹ3?%%LJIv-//"%W˵[:9ن{{|m%egrPpC븀NqsR嗟=YgƏaʕN:?VVɎŋ/Ymk۾^jK tIHӻwm`mϪxrVUӵkguMu>s?JN'[iwޑkEcS?-jeӧkT0ge%kv]+_3c}є7ZjX_9+Pz?+O4Y@  gY@ -#㲗קr]9wq䬰72Mo8qmP_*66.WQ7EEe$MvLLg3g98tLOl5{Pck+%% aaU5:l**MH(&N Kټ@Q(==ǩT]KKﹹ;U#O!n=On+Z| QaᬤsAA{lm*nlDdS.ՄNWQ\|g2m`*%e9M/U5~~IŚ1GL pr%@gEݻKDWDؑޣW/{6%%ҹT\UD@u̳ɭ[W+Elqv&HY22̳e g@g-"kV-'UD-X\ C"}Gp'Z&J YEg7=: ѤiaX KWkZYY ,4g{?%@hq5@II롳OXY_H Y׬~_@h`9O>1GXh*":tXiJ`9k7 rVBтѲgSL-q-ŲO7%Xt*"3 Xt*"X urVIs݈QS MCb \Z©S9cǽZ[MMjkF/F\ǞԴǎ'XJFN󬻾/ T ?:e2gu[H]1;MK;vIʕC_f N9k o`rے.}{yd ےGwQcS2Lu50 4juyg-HziW7+?q5z¤f_u\g~3gW{}a`pH?U}6h@Y7lظ݈'rm;닱.1~1qݚU9'b֮߄}M gζW*U}L=:w^b"r'ңG70{R?ޘa욜Y֬xca裍k?9fut|G].3>|øGJپ5?BLl\f}اO%֔Ƭ>PTvu_9ZϮEۿ?R9`@Ko8{}Wk\>z4CDRPڐA[2g}g ۝;k4zv=xacpnnhProZB㢯ѣ߈H/'N!aAn^AyxzC<=sw3TwvGWov\!͜]Dy~SN&EnmԳ&=OOåK?̘v[JJJ2h Bh+8w.WD\]]j,zzx5O?9a̚q_\S^^Gc5hYosgTf>oa>GINr؃>,hR+Bj+HVփo33\\T5c;vpuUĮ>\]T رCc50pQ9why>mmfaA]f:4x_mWWo^oT͟H盉gBg7y_Fl`?-\sLyyy3K<4)K.?yTlj\1mku2ӡ9a+_{OFh4G3 1 }%29j /`Qgz.øtLVFxI[^rՅ wٳgy1ՙ?K7='EAOLZ";?%ihuk /9.] 
O UV!]→T*>߶m!d0{5קOCEmذH ܹ3t- n(ϋ|L6j©KK& L:!0P}H[ի:t NcӧOWWWǧN۷o||!xk-m.\⑪CEAAAllkf?~itƌ:tD駟.**ӯ]ϋbXR•pgB,aLT?=v`7r ɎK?o(E$_,弹*dOjbR-XqޭUIա8`v;?9}Qv`0& : 3X2b CוkU l:he52G ?Dt4ְ7i0Ou{AN I>"~PVdpHg\owr2 :c: P6T邂9++KP(J6l?s̮]z-иqٳxbPػwoDVܹ C#{ +U7/JGˠ2̺vz ĥ稫ʭUC'Og ]*Nm'Qͤi=``!4zzi&6.x!O71M˧\opT9^כr5>ߪU+n~x~t%;;ѣC q[cVVVbb"+Z>a{ۃx=Ve3Ϸ8VƲpRޒqIWc>חxM5<:[,eRY}eCּ>fy%p0 2tf̙3#FhFqii}ϺBXt >Rx p-X¥K∟=~ɓ'ܹڵk7rHD~s`ػwkx<^>}J_NHHHlllvvvΝϋ|L6j©KK}Njk)N TpFzvի:t NO>ݿ8rȑ۷Ϙ1)v|B(===z/'99ʕ+O4~ j\J!A_n_2ۗ?k4ގ[-9C.g}O${mؖw).=)\BF\ʺTα ˵6GfaʜrO1&=\=.ZO~FJl4NT*B%CQisOI3f5얶adfgLH~3@_oAAAll,q믷kN H${.??k׮=bX,5nˉ-((COw|R|u&ҧˋ-?Jf{x]5lw0B]_\hؙxi" .a_}Xb__}X:K!8} ;%߳|h3 WgKw/ &?fIT~&x'G2䦦)u)ocPlpƨ~,<%H-7G LkvaoI Ǐ?W8UTIDUoq|uխ4\`lθ뢞*q 5YhJOgff8pW^"f>m9w܉@cӈ=}*mnJz\֖Yeֵԃ_ӟћBsζnکCOS6٬Xnq yx4ʩw ߧ, nM+x|m;o&}Kc cy Kئ軵{YepIdB_.7-rȞI>}oScǎ^QQk׮;?NX,G2d}9D46>a{ۃx=V/XzWBX~7_q^ c{aNN}7ˋ-1mS0qrbZ;ρAQ{#獺]xVV4pjEo;﫢 L oY?dt"7q؞*];v+**OP8p!rJ"իA;F#J_㏍FcVNP(tr_z%46g۱jΈC^I n>8qBRM: ڶm[LL<4Hx !IX=ThՕu1,jO\bw?h]aD 4%l0@< O4O4@< 4@< O@< O4OO\c-aXixl1-Ac2z&<74)|^2rro˟ΙLP4hNBMU!:*;fMϺjc+jE[V^;{g>4L\67oV^,7}*HϗTqUB> F$A֕^+60*jICXdɴ#iE5E_6[ܬO?ߪ?~L3#b"k +l];YVuM];usv+>pfiӻB!NwzKN/ ((iLk-2k KLkǴ@gi= :rW+5 w>1}l,ĚkN[ yOۗ:tE&UjVYL0T}mDbZ4l1f{Vvɗ)_2[=޺vW_&Yk1=8f;,4Ղ> ,K%w;e?Id3m<!'c:U ?("}3I#j>`v}Os LZ\NGNӇ1y9q`ѽq.guUW3^5p՜^sKvDo P|a 3 sЍYuOco]=#{N?!hu?MjSm;,fl՛NF÷ttTo~s[&n!-Ӊ>JNUW X^Jپk_Γ︽\rSBRb`R#O0qؽ;ogGIR$={ `X?d~=ѱ(\/:[aV KjKW\%#?/9p8,6 ިQR[ze,3X m@~_sͭ0TT+k[^suʯ[gmG=Y!4rCy|F<@j vgw= 3@Ffe X54fxdɏgeW4rC<1_%Sךk!'c2L'BBBuNbĪk]^C{j\:k]_s{n8)9zTMnz׷ܮҩzmuQRi̋Zs-~\MbX\zέ0Y]dv$ 2X nas͈#xlBԝSxHY$Bqqǂ ?^|La6uz_+^B<,Sh' _?STi{{+0 ed'N 4mqu`WU9FSwNlѽw=LSr-R-e:Dɢ¤aB<1#\*+C֖""n/1'e4HU[XO̿V~JST*c낮Ps+ng.&>!z&weTw,Rj~ b \rSEzoɥ-)R~w-nQW<)sss< YOsڎi?nm?C:@ KvckA5= 뫅\+/AX=hu(0X~fjǧ?|;>W_Cf$7bv B]KXLOK//xa*!4!Wo#K%w;e?I@k:>_GùYo[\كns̷`2M|^k[9K|_o_*QA6:ߜ^sk?ʿV\rS1ZmX1`'?w }L8L=mA5]Ce^0㭒%yUy+\9(zOs< YY'ܷxzW/R`3J2FIzW]j\.l\.W.[V^aVUUe0lT*%0auuu!Lgo4!\u2/M<}NJ* %MpB1jŬcj~2BH$Vd1aG]'NRvqh .s;lZzUf[M&Rd555UUUJ锸+TjAlBd2x4@< Mŋ_PPSSݢTn 4i!-멚Y|N3NaꄉQ2@󢨨0**K.`f;0 0O{Fzlcc4{*ϡaz[p@rOðȊ n_"nWPhzY\X,H5av?&& L&Z0L*x<Q|ȲRҥK M</,, YݮVBr<**fe:>uj+)JIҝF0G\)'::հBX#::Tc6[BAttd*))1"06H$1.qm􈈈R.ˣqH&'''$$$00`6oܸX[[K,=Tai#-[[:va+JvR2i{ :0nmk VM*xݾ4xʕ{ וAAA!x$C>ϰ86p=KEEETv ˡI-iyH&3-T5|tur9 QS555m۶rv!Fc0ڵkpjuQQQtt4B(777:::..wVZ9@*߉vs\ܹoZKJJnP\\lX őfqXU\\gH*isRah!5,]mڴre2M6JJHH1.V][[۾}{J*.)+,,LVr|DPp8II@>1,rKҺsj]\\lZHˤɽnۋT`qaw[h$5KCڗ.jkkE"lVV&yj*Ia؎>N|Ǜr9 NG[6PO g@TN.=BT]]'L&ih>fJ`$zTTTp8:](Jl6JBZQQQQUf-j111xVx</**JպDZ.F0GDrH 두(6^ `7.VYEL&p3z}}^_P(HjOOa=[TkVK4o!Q ^paޔS]5R,//w ]vj) xBOFWjtL!]ulmիOTEEEn3ݗCV:..DFF V@ 3VwwPA#K" M9HHB7,Z'nC4ߒ ^pa]!fnLJJ",h2nܸNRu;mۖh0Hcٲ|.Ԟ^{ڑ°!H[zyTR ̰:TfIm#lIINEo''єS]-///66QNDݲq{b'#߃*???@'DŽɄah1 c,d2T*9N}}}C:??h4OsޔS]x 4kla"rg۱TostU͛aaaR)~nΕd?p@0 ~sËAihiӦ`VYYY_o叞v-J0evaP3O@ ## sh> *x` ֧ix ~tʺť/e C/g 5%cr)._o*.knvi=ު7L|_u;Io[V=(hxcyi* k,.`]=M@dRD_xKى3 Թ&W<ցv`x%q7ffHQ[jYFAKgWJם 6+".glse2w}\:v$ScIq3>.%*bXw"XE_|X[jȞT1_'/_7IkRS0+Ʒt@WTZ[ UQ zi{8zh.]zhr4 %7 {v{ϋ]';3}޶y~z΄.yw~\WA5ǐ&|'j<- -Jc\7#{R ƐO,zwCu<^q;K^q461V%~+x44t„ j4 N㔔SNcrw4<YxٲeJhh(B޽;BHlݺ!ťEU!dXߟJNN5jå;vk~Gl6/׏"'əFfaDžmڴyPZ %ⶌ?jg$ғH&'~*+جB)O;s4Li,@1y;3 ŋҘוu|Tӟ S7WG:^bЦwn榥DSNٳgΜ9x MN8oߞ?>Bh׮]NIIq C&NxpPN; \|A׻KnA:tH/Zð;w:thĈ49rHIIɼy#G+rhQ~R]iM&6dVKkRrLk;trngTfP(T(f &ILOE % rI s{6j[@1?[, $4*AKO65 ƎWXXQFI$D2rȋAJH$r>~ .8qW^*j„ yq:555AAA555Ľn+rhQtCOzg% Y+Iò0kس_o}ŸwKYjP$ |bzc$CSZCl~N$9^=M%Ò{ߣuL9[hԡ#6V%R)xZ effxD͖yO*%$ LVY hx>}v0rK by!}nҍF=1T~Nᬏ"x|֡ﵹ$Ya,7G5}ͩ TeSU[/LOE07߾ʙ[8GhB @BVï}CzFm:ʭ(gDŽW'G8V:ڿBsQ^F$) SO?tro:`0"Bڵ;pB]vݿ>}r $)zvzzzrr[ evm߾}i-N8i (`}%6C. 
mY7BK YGgξOoHڷVG|;_For-G⇤zW"X o, wd5Mi U_l[ri`ȥm5twwrԴo,|1سՂ-"aQ[S58W{ok qhԡ߿u눍"NUV}~~~Æ s֭[qqq!PSSӹsgPΝz= ڻKL>|T*]tҥKe2Ù;tЏ>hҥr&@5z};6!`{*ʾ}:?ڴMW}DJ^0Es򜜜zѦh @<&{|x f˻OL#,#M̙3Ր2KQݯ9v3춤eFu&}i xS?ҶX7oޖ-[0 ߃D<葐OA̘1UV|>?00pذa!܁t4'JoiScvv/ =⵴ҋDMw{bqƕ7A>楒80ͳgXv[0RVh4A 6Sv.'Q*<^$wJ z2.h42=%\yn۔ w}f2iѥ&:>d~~~ZVWYYjoW(Z֣pi: ͛7|RF3tP"?4hZ'O~}׫TSؾ}`0tYf9r]v6 3uUVTZZ3? !4~2 ~gyWsȐ!suvɒ%ݺuc&v4RIq{=޵QF-_\*;8ٳ'y$',X0x`Z]UU>̽MЂԤzTe 0F$N=g_\NGw]&Q=.Vf[Js 4ڛmFMz;n1G&m,4MUѣG{aXϞ=322t,YO>YdI~~j=z(qcǎk׮5 EEEӧOw=55uΜ9eeeiiiDӧMvm՚ /]?m[VFP( ӦMs|]p5k-ZIE*//={ ErUǧ͚5^BhGS AO47,,,tg>T^J qՂTqHUw>Lr(C CQLh!͹6`E[PQ 5T {1RFFFrr2ˍOIZ}饗D"Bp|!fXk5)) Wmr#G_=\?PD=yV/o3AAz5CNNN۶mߏpBjj'zR&La F냂h$PSSC$xO֭[tt4ZzŊDEKaalvL|'i>x 3bΈۃ37+5^k?陟uN;4p&ǿ~lnw DR4&vA (((33%&&=z4;;;88㱩 F.*++CB(H3HRJRll%sonٲE,V4̞=)o}۵k'L& -4>v{K6V~|ݚԩ 3~\49/ 6@&o^2~Jsra3}Ge۹toBO S8u{mn8GO gn"LP"~TK{ݺ,66!xj׮݁ ڵӐK#M֭۾}RSSĎ@ xzzzrr$NׯhLOOwm@@o':ه'|rϞ=F ,--=vؤIBr\V믿mJR-4n8~m/dǵ8[ܔ`Fqю5(WOOxe\h㯭,(^fŒzM/-2fk<>{,ZRPx׭[g2!j*ܩSaÆ9~֭8tJHH:wܹm]r65`C=xG}S͝:u>|Ǐ߽{ѣGx%ǫ3gOl6[hh}8p'zBAzo߿_~MIS#kęvlC OF8w^R UKXw[MMONNSO=z ۪%M O4`i RXMW{3Qi0p3gDGG3?p7$bO-bQ2Λ7o˖-A+@?vwqM]? {l"*(Y:pUj[7G:*`]b *E$q1ɽ77D|?|{< MX, 277՝0aByy9Y6|pmm̂GO;wr\Q._pEjjj,Ybfffbb}vei9x𠃃Ǐ)g*7b! 'N 㝜(pBBB,,,fΜ)?KOLbE(] p8"33bi͢" b@@0m4===KKː6HD3f eYck.kkkRV=O9l:\|dsѣVZ޽eQ{YfYFGΙ3hܹBE>%7K\RYƒԚbN_SFJө煋H/5(-aB6tTu =:KXRW)fpttrww>~ I\\BYXNR1:|H c'֧I֭[뛞^VV6}?,o߾׫JJJ-Z4i$NN`yyỵl̙3BիWfR4oժU~~~Ł* ZFFFEEw} s!n``PVVFľ}K6 СCssssss W_)ΚR =3XV7:06o޼-[Hڼy3P0eҥ#F6lR˗/'4h\m(]*:f͒g~6kl!C=z cǎ 2\~733 .=իW,--a0dқ\lYӵU)&wQ|#(M,vشs:2Ԙ.>{)1E>URzݣe*5]}bB7zׯ_WVVnܸ_zڽ{w5tbtgƑi%>!!<ɱPlXYYiff%833SZN7޽{lnժUbb"nP6@aat<M>Mلr_uAlٲA7ŋld8))UV,/J{fpnt.`vvv555A888z>a Mjj*y>S*-kO<)-kH3@g?KRSSA,--y.W)M䱵uRRԛ666gMV.٬Mн肍M>-kXi/^g#Ԙ.>{)1zE>URzݣe*5]}cFNirwv: (; )Fqo1Vy}Y~޽{ɕÆ 355utt$K9444ȴLAY76PQ^^nggGD~={aggWQQF%kjj}:c`A3 }BPm\s…mN8qϞ= ){f9g8,[L˙D!])]T6k^QvUPKb^_SFQ=҅O}J-a#eC)SZ/JMW_QV˔6PBջٿqOOfddP~ /))H$=ktɤ($?KO,eiCtC|]v%k׮.\ׯKd?&''ˮ.0g.Vȃ+Wу ݻ3 u(?+ e$^S5C˗/wppD,{f9oirAŋҒ4r5\ ==)$$n2M]f͔)Sa Hfmm4ȩMi*cIp1WהQTtbSwi6A4جO4%,2fOq+D8{*3o޼s禦M0AM^[[[[[ŋsΕ766&SN۶m̜7oL8ٳggZņӦM[`AFFFIIŋU tʹ>C 4hɒ%'OI&͟?vFXdI~~~~~ŋ^SCit.f{zz_0aҥK ,Y>'NH*Xs Yz֭[kMMMJK{f9tbT+nbxxxn%[򊈈`PUVQQQ?͛Վ=~aݺud}-Ritm`Sڥ4ޟ9],5e:\TݧU.ްGg 0SUj* }*#( fAw;cP*Fqoq}ZiXi&{{{>N=Aϟwttа۳g 111<~txF k۶-sssT4zѢE-Z011ٱcJL|"66sssy<ǏY*|M&]T=ӹXnv.`a.{ ::W^&OcnnqFZQ!NJ6 Q5]C/ =3tS:N.DҥKr/^$D9iٮ֭c{rwA}}N:}WWfΜi`````0s̪*kۀt4Jm`%M{1%6"US CZ2F7Uv7:KXRW)f肓JBYbtgƑ3f3?Ug@nr'f͚~1IHH2dHZZf#&L0qF_"|XDF]ر.߰x7XzoXngO#HE(>\=ACC!""Boԇ^zmٲ#ynb A)=AAD}C<  `>   O#    4   O7q$bH~*Xb1"@ˁj  4_(M~D񥫧Mrw^JCo䫧Ϲ4^1t ]? 򾁿rwYLd_JO}Kwļ愅>KjYI DAirn],ic*e_:3Ksi2-θv6I1QS&Ԏ/lu/X_'_VId_%8?i86k1\=׀ʭ~z㏁gAcSAA0nvf'w"߻V>`1ec]+o {WOdysE9/J_{7:i_Y&޻2SC _OZךFWzu +/$&&kNi*/  B=TPQ*>Y/4Ws^WjCUj5FT覜9k2:=#sv/ctP;g+-cnz+Z1jNKU}k,--lllȵUv;% _UWW{zzxuaH!3ݻoذaٖp]@NNΑ#G@kٹ;w.%%E,;;;O4I__jjjΞ=c=z4WH$K.ݽ{W$O0AKKa r/X,=tĺ311F7003f ޽;fccc--aÆ=}yc5JǏ766۷H$+zAA)߃sg%0]8R0itoplEqlIaƹ$wC AKr71sk]8S]]]@SSS"ahQԴLBbbb۶m厝Ϟ= >8qkzի xIn/^oprSSS,u(--ݴiCPsYYH) I;;/AA0VNm q/:HwKdKȜ 4+g#>R;zQ $"_6B:\ DLp58NFד-[XKK466绹]r%..e˖dFK?~|ȑZZZ"ܑR ) pFFF,fffo}SrE/  f>}#3m=tihFs%M\ﱝ?ij+-,Ey+>!{[;FyoFw8@A;d, kr.\'Plnϕ8wN^^9)?~ĉA.m' WUU9881^puO>>/̗ kjjx<+**ߥ<==ϝ;Gg;wӓz=zt-rssiӦ1 D' R㽽Ϝ93qDr4CmGAD %lw޸ bw*x'֋βAϊ~VG7ݿ;~pfO~ؐE2C%C߯E0_wx~fgҍ"re˖u HSyAA̧?T5kP#MAAdH AA|AAA0FAA̧s9,{LzO ۄslw0c"`>Dio˗/ %y}{? j. 
0.22.0 (released 2014-01-17)

Enhancements

* [Backwards incompatible] Switched HTTPCacheMiddleware backend to filesystem (issue 541). To restore the old backend, set HTTPCACHE_STORAGE to scrapy.contrib.httpcache.DbmCacheStorage (see the settings sketch after this list)
* Proxy https:// urls using CONNECT method (issue 392, issue 397)
* Add a middleware to crawl ajax crawleable pages as defined by google (issue 343)
* Rename scrapy.spider.BaseSpider to scrapy.spider.Spider (issue 510, issue 519); a migration sketch follows this list
* Selectors register EXSLT namespaces by default (issue 472)
* Unify item loaders similar to selectors renaming (issue 461)
* Make RFPDupeFilter class easily subclassable (issue 533)
* Improve test coverage and forthcoming Python 3 support (issue 525)
* Promote startup info on settings and middleware to INFO level (issue 520)
* Support partials in get_func_args util (issue 506, issue 504)
* Allow running individual tests via tox (issue 503)
* Update extensions ignored by link extractors (issue 498)
* Add middleware methods to get files/images/thumbs paths (issue 490)
* Improve offsite middleware tests (issue 478)
* Add a way to skip default Referer header set by RefererMiddleware (issue 475)
* Do not send x-gzip in default Accept-Encoding header (issue 469)
* Support defining http error handling using settings (issue 466)
* Use modern python idioms wherever you find legacies (issue 497)
* Improve and correct documentation (issue 527, issue 524, issue 521, issue 517, issue 512, issue 505, issue 502, issue 489, issue 465, issue 460, issue 425, issue 536)
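Restoring the old cache backend is a one-line settings change. A minimal settings.py sketch (HTTPCACHE_ENABLED is shown only for context; enable the cache however your project already does):

    # settings.py -- keep the pre-0.22 DBM cache backend instead of the
    # new filesystem default.
    HTTPCACHE_ENABLED = True
    HTTPCACHE_STORAGE = 'scrapy.contrib.httpcache.DbmCacheStorage'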
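The BaseSpider rename only affects the import and the base class. A before/after sketch with invented spider names:

    # Before 0.22 (deprecated, but the old name still imports):
    from scrapy.spider import BaseSpider

    class OldSpider(BaseSpider):
        name = 'old'

    # 0.22 and later:
    from scrapy.spider import Spider

    class NewSpider(Spider):
        name = 'new'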
0.20.2 (released 2013-12-09)

* Update CrawlSpider Template with Selector changes (commit 6d1457d)
* fix method name in tutorial. closes GH-480 (commit b4fc359)
0.18.3 (released 2013-10-03)

* Fix permission and set umask before generating sdist tarball (commit 06149e0)

0.18.2 (released 2013-09-03)

* Backport scrapy check command fixes and backward compatible multi crawler process (issue 339)

0.18.1 (released 2013-08-27)

* remove extra import added by cherry picked changes (commit d20304e)
* fix crawling tests under twisted pre 11.0.0 (commit 1994f38)
* py26 can not format zero length fields {} (commit abf756f)
* test PotentialDataLoss errors on unbound responses (commit b15470d)
* Treat responses without content-length or Transfer-Encoding as good responses (commit c4bf324)
* do not include ResponseFailed if http11 handler is not enabled (commit 6cbe684)
* New HTTP client wraps lost connections in a ResponseFailed exception. fix #373 (commit 1a20bba)
* limit travis-ci build matrix (commit 3b01bb8)
* Merge pull request #375 from peterarenot/patch-1 (commit fa766d7)
* Fixed so it refers to the correct folder (commit 3283809)
* added quantal & raring to support ubuntu releases (commit 1411923)
* fix retry middleware which didn't retry certain connection errors after the upgrade to http1 client, closes GH-373 (commit bb35ed0)
* fix XmlItemExporter in Python 2.7.4 and 2.7.5 (commit de3e451)
* minor updates to 0.18 release notes (commit c45e5f1)
* fix contributors list format (commit 0b60031)

0.18.0 (released 2013-08-09)

* Lot of improvements to testsuite run using Tox, including a way to test on pypi
* Handle GET parameters for AJAX crawleable urls (commit 3fe2a32)
* Use lxml recover option to parse sitemaps (issue 347)
* Bugfix cookie merging by hostname and not by netloc (issue 352)
* Support disabling HttpCompressionMiddleware using a flag setting (issue 359)
* Support xml namespaces using iternodes parser in XMLFeedSpider (issue 12)
* Support dont_cache request meta flag (issue 19); see the request sketch after this list
* Bugfix scrapy.utils.gz.gunzip broken by changes in python 2.7.4 (commit 4dc76e)
* Bugfix url encoding on SgmlLinkExtractor (issue 24)
* Bugfix TakeFirst processor shouldn't discard zero (0) value (issue 59)
* Support nested items in xml exporter (issue 66)
* Improve cookies handling performance (issue 77)
* Log dupe filtered requests once (issue 105)
* Split redirection middleware into status and meta based middlewares (issue 78)
* Use HTTP1.1 as default downloader handler (issue 109 and issue 318)
* Support xpath form selection on FormRequest.from_response (issue 185); see the form sketch after this list
* Bugfix unicode decoding error on SgmlLinkExtractor (issue 199)
* Bugfix signal dispatching on pypy interpreter (issue 205)
* Improve request delay and concurrency handling (issue 206)
* Add RFC2616 cache policy to HttpCacheMiddleware (issue 212)
* Allow customization of messages logged by engine (issue 214)
* Multiple improvements to DjangoItem (issue 217, issue 218, issue 221)
* Extend Scrapy commands using setuptools entry points (issue 260)
* Allow spider allowed_domains value to be set/tuple (issue 261)
* Support settings.getdict (issue 269)
* Simplify internal scrapy.core.scraper slot handling (issue 271)
(hCX)hDj ubeubaubh)r }r (hCX Added `Item.copy` (:issue:`290`)r hDjhMhPhWhhY}r (h]]h^]h\]h[]h_]uhbNhchhd]r h)r }r (hCj hDj hMhPhWhhY}r (h]]h^]h\]h[]h_]uhbKhd]r (htXAdded r r }r (hCXAdded hDj ubh)r }r (hCX `Item.copy`hY}r (h]]h^]h\]h[]h_]uhDj hd]r htX Item.copyr r }r (hCUhDj ubahWhubhtX (r r }r (hCX (hDj ubh)r }r (hCX :issue:`290`hY}r (UrefuriX+https://github.com/scrapy/scrapy/issues/290h[]h\]h]]h^]h_]uhDj hd]r htX issue 290r r }r (hCUhDj ubahWhubhtX)r }r (hCX)hDj ubeubaubh)r }r (hCX,Collect idle downloader slots (:issue:`297`)r hDjhMhPhWhhY}r (h]]h^]h\]h[]h_]uhbNhchhd]r h)r }r (hCj hDj hMhPhWhhY}r (h]]h^]h\]h[]h_]uhbKhd]r (htXCollect idle downloader slots (r r }r (hCXCollect idle downloader slots (hDj ubh)r }r (hCX :issue:`297`hY}r (UrefuriX+https://github.com/scrapy/scrapy/issues/297h[]h\]h]]h^]h_]uhDj hd]r htX issue 297r r }r (hCUhDj ubahWhubhtX)r }r (hCX)hDj ubeubaubh)r }r (hCX5Add `ftp://` scheme downloader handler (:issue:`329`)r hDjhMhPhWhhY}r (h]]h^]h\]h[]h_]uhbNhchhd]r h)r }r (hCj hDj hMhPhWhhY}r (h]]h^]h\]h[]h_]uhbKhd]r (htXAdd r r }r (hCXAdd hDj ubh)r }r (hCX`ftp://`hY}r (h]]h^]h\]h[]h_]uhDj hd]r htXftp://r r }r (hCUhDj ubahWhubhtX scheme downloader handler (r r }r (hCX scheme downloader handler (hDj ubh)r }r (hCX :issue:`329`hY}r (UrefuriX+https://github.com/scrapy/scrapy/issues/329h[]h\]h]]h^]h_]uhDj hd]r htX issue 329r r }r (hCUhDj ubahWhubhtX)r }r (hCX)hDj ubeubaubh)r }r (hCXIAdded downloader benchmark webserver and spider tools :ref:`benchmarking`r hDjhMhPhWhhY}r (h]]h^]h\]h[]h_]uhbNhchhd]r h)r }r (hCj hDj hMhPhWhhY}r (h]]h^]h\]h[]h_]uhbKhd]r (htX6Added downloader benchmark webserver and spider tools r r }r (hCX6Added downloader benchmark webserver and spider tools hDj ubcsphinx.addnodes pending_xref r )r }r (hCX:ref:`benchmarking`r hDj hMhPhWU pending_xrefr hY}r (UreftypeXrefUrefwarnr U reftargetr X benchmarkingU refdomainXstdr h[]h\]U refexplicith]]h^]h_]Urefdocr Xnewsr uhbKhd]r cdocutils.nodes emphasis r )r }r (hCj hY}r (h]]h^]r (Uxrefr j Xstd-refr eh\]h[]h_]uhDj hd]r htX benchmarkingr r }r (hCUhDj ubahWUemphasisr ubaubeubaubh)r }r (hCX_Moved persistent (on disk) queues to a separate project (queuelib_) which scrapy now depends onr hDjhMhPhWhhY}r (h]]h^]h\]h[]h_]uhbNhchhd]r h)r }r! (hCj hDj hMhPhWhhY}r" (h]]h^]h\]h[]h_]uhbKhd]r# (htX9Moved persistent (on disk) queues to a separate project (r$ r% }r& (hCX9Moved persistent (on disk) queues to a separate project (hDj ubh)r' }r( (hCX queuelib_Uresolvedr) KhDj hWhhY}r* (UnameXqueuelibr+ Urefurir, X"https://github.com/scrapy/queuelibr- h[]h\]h]]h^]h_]uhd]r. htXqueuelibr/ r0 }r1 (hCUhDj' ubaubhtX) which scrapy now depends onr2 r3 }r4 (hCX) which scrapy now depends onhDj ubeubaubh)r5 }r6 (hCX;Add scrapy commands using external libraries (:issue:`260`)r7 hDjhMhPhWhhY}r8 (h]]h^]h\]h[]h_]uhbNhchhd]r9 h)r: }r; (hCj7 hDj5 hMhPhWhhY}r< (h]]h^]h\]h[]h_]uhbKhd]r= (htX.Add scrapy commands using external libraries (r> r? 
}r@ (hCX.Add scrapy commands using external libraries (hDj: ubh)rA }rB (hCX :issue:`260`hY}rC (UrefuriX+https://github.com/scrapy/scrapy/issues/260h[]h\]h]]h^]h_]uhDj: hd]rD htX issue 260rE rF }rG (hCUhDjA ubahWhubhtX)rH }rI (hCX)hDj: ubeubaubh)rJ }rK (hCX6Added ``--pdb`` option to ``scrapy`` command line toolrL hDjhMhPhWhhY}rM (h]]h^]h\]h[]h_]uhbNhchhd]rN h)rO }rP (hCjL hDjJ hMhPhWhhY}rQ (h]]h^]h\]h[]h_]uhbKhd]rR (htXAdded rS rT }rU (hCXAdded hDjO ubcdocutils.nodes literal rV )rW }rX (hCX ``--pdb``hY}rY (h]]h^]h\]h[]h_]uhDjO hd]rZ htX--pdbr[ r\ }r] (hCUhDjW ubahWUliteralr^ ubhtX option to r_ r` }ra (hCX option to hDjO ubjV )rb }rc (hCX ``scrapy``hY}rd (h]]h^]h\]h[]h_]uhDjO hd]re htXscrapyrf rg }rh (hCUhDjb ubahWj^ ubhtX command line toolri rj }rk (hCX command line toolhDjO ubeubaubh)rl }rm (hCXAdded :meth:`XPathSelector.remove_namespaces` which allows to remove all namespaces from XML documents for convenience (to work with namespace-less XPaths). Documented in :ref:`topics-selectors`.rn hDjhMhPhWhhY}ro (h]]h^]h\]h[]h_]uhbNhchhd]rp h)rq }rr (hCjn hDjl hMhPhWhhY}rs (h]]h^]h\]h[]h_]uhbKhd]rt (htXAdded ru rv }rw (hCXAdded hDjq ubj )rx }ry (hCX':meth:`XPathSelector.remove_namespaces`rz hDjq hMhPhWj hY}r{ (UreftypeXmethj j XXPathSelector.remove_namespacesU refdomainXpyr| h[]h\]U refexplicith]]h^]h_]j j Upy:classr} NU py:moduler~ NuhbKhd]r jV )r }r (hCjz hY}r (h]]h^]r (j j| Xpy-methr eh\]h[]h_]uhDjx hd]r htX!XPathSelector.remove_namespaces()r r }r (hCUhDj ubahWj^ ubaubhtX~ which allows to remove all namespaces from XML documents for convenience (to work with namespace-less XPaths). Documented in r r }r (hCX~ which allows to remove all namespaces from XML documents for convenience (to work with namespace-less XPaths). Documented in hDjq ubj )r }r (hCX:ref:`topics-selectors`r hDjq hMhPhWj hY}r (UreftypeXrefj j Xtopics-selectorsU refdomainXstdr h[]h\]U refexplicith]]h^]h_]j j uhbKhd]r j )r }r (hCj hY}r (h]]h^]r (j j Xstd-refr eh\]h[]h_]uhDj hd]r htXtopics-selectorsr r }r (hCUhDj ubahWj ubaubhtX.r }r (hCX.hDjq ubeubaubh)r }r (hCX(Several improvements to spider contractsr hDjhMhPhWhhY}r (h]]h^]h\]h[]h_]uhbNhchhd]r h)r }r (hCj hDj hMhPhWhhY}r (h]]h^]h\]h[]h_]uhbKhd]r htX(Several improvements to spider contractsr r }r (hCj hDj ubaubaubh)r }r (hCXdNew default middleware named MetaRefreshMiddldeware that handles meta-refresh html tag redirections,r hDjhMhPhWhhY}r (h]]h^]h\]h[]h_]uhbNhchhd]r h)r }r (hCj hDj hMhPhWhhY}r (h]]h^]h\]h[]h_]uhbKhd]r htXdNew default middleware named MetaRefreshMiddldeware that handles meta-refresh html tag redirections,r r }r (hCj hDj ubaubaubh)r }r (hCXVMetaRefreshMiddldeware and RedirectMiddleware have different priorities to address #62r hDjhMhPhWhhY}r (h]]h^]h\]h[]h_]uhbNhchhd]r h)r }r (hCj hDj hMhPhWhhY}r (h]]h^]h\]h[]h_]uhbKhd]r htXVMetaRefreshMiddldeware and RedirectMiddleware have different priorities to address #62r r }r (hCj hDj ubaubaubh)r }r (hCX$added from_crawler method to spidersr hDjhMhPhWhhY}r (h]]h^]h\]h[]h_]uhbNhchhd]r h)r }r (hCj hDj hMhPhWhhY}r (h]]h^]h\]h[]h_]uhbKhd]r htX$added from_crawler method to spidersr r }r (hCj hDj ubaubaubh)r }r (hCX#added system tests with mock serverr hDjhMhPhWhhY}r (h]]h^]h\]h[]h_]uhbNhchhd]r h)r }r (hCj hDj hMhPhWhhY}r (h]]h^]h\]h[]h_]uhbKhd]r htX#added system tests with mock serverr r }r (hCj hDj ubaubaubh)r }r (hCX=more improvements to Mac OS compatibility (thanks Alex Cepoi)r hDjhMhPhWhhY}r (h]]h^]h\]h[]h_]uhbNhchhd]r h)r }r (hCj hDj hMhPhWhhY}r (h]]h^]h\]h[]h_]uhbKhd]r htX=more improvements to Mac OS compatibility (thanks 
Alex Cepoi)r r }r (hCj hDj ubaubaubh)r }r (hCXUseveral more cleanups to singletons and multi-spider support (thanks Nicolas Ramirez)r hDjhMhPhWhhY}r (h]]h^]h\]h[]h_]uhbNhchhd]r h)r }r (hCj hDj hMhPhWhhY}r (h]]h^]h\]h[]h_]uhbKhd]r htXUseveral more cleanups to singletons and multi-spider support (thanks Nicolas Ramirez)r r }r (hCj hDj ubaubaubh)r }r (hCXsupport custom download slotsr hDjhMhPhWhhY}r (h]]h^]h\]h[]h_]uhbNhchhd]r h)r }r (hCj hDj hMhPhWhhY}r (h]]h^]h\]h[]h_]uhbKhd]r htXsupport custom download slotsr r }r (hCj hDj ubaubaubh)r }r (hCX)added --spider option to "shell" command.r hDjhMhPhWhhY}r (h]]h^]h\]h[]h_]uhbNhchhd]r h)r }r (hCj hDj hMhPhWhhY}r (h]]h^]h\]h[]h_]uhbKhd]r htX)added --spider option to "shell" command.r r }r (hCj hDj ubaubaubh)r }r (hCX+log overridden settings when scrapy starts hDjhMhPhWhhY}r (h]]h^]h\]h[]h_]uhbNhchhd]r h)r }r (hCX*log overridden settings when scrapy startsr hDj hMhPhWhhY}r (h]]h^]h\]h[]h_]uhbKhd]r htX*log overridden settings when scrapy startsr r }r (hCj hDj ubaubaubeubh)r }r (hCXoThanks to everyone who contribute to this release. Here is a list of contributors sorted by number of commits::hDjhMhPhWhhY}r (h]]h^]h\]h[]h_]uhbKhchhd]r htXnThanks to everyone who contribute to this release. Here is a list of contributors sorted by number of commits:r r }r (hCXnThanks to everyone who contribute to this release. Here is a list of contributors sorted by number of commits:hDj ubaubcdocutils.nodes literal_block r )r }r (hCX130 Pablo Hoffman 97 Daniel Graña 20 Nicolás Ramírez 13 Mikhail Korobov 12 Pedro Faustino 11 Steven Almeroth 5 Rolando Espinoza La fuente 4 Michal Danilak 4 Alex Cepoi 4 Alexandr N Zamaraev (aka tonal) 3 paul 3 Martin Olveyra 3 Jordi Llonch 3 arijitchakraborty 2 Shane Evans 2 joehillen 2 Hart 2 Dan 1 Zuhao Wan 1 whodatninja 1 vkrest 1 tpeng 1 Tom Mortimer-Jones 1 Rocio Aramberri 1 Pedro 1 notsobad 1 Natan L 1 Mark Grey 1 Luan 1 Libor Nenadál 1 Juan M Uys 1 Jonas Brunsgaard 1 Ilya Baryshev 1 Hasnain Lakhani 1 Emanuel Schorsch 1 Chris Tilden 1 Capi Etheriel 1 cacovsky 1 Berend Iwema hDjhMhPhWU literal_blockr hY}r (U xml:spacer! Upreserver" h[]h\]h]]h^]h_]uhbKhchhd]r# htX130 Pablo Hoffman 97 Daniel Graña 20 Nicolás Ramírez 13 Mikhail Korobov 12 Pedro Faustino 11 Steven Almeroth 5 Rolando Espinoza La fuente 4 Michal Danilak 4 Alex Cepoi 4 Alexandr N Zamaraev (aka tonal) 3 paul 3 Martin Olveyra 3 Jordi Llonch 3 arijitchakraborty 2 Shane Evans 2 joehillen 2 Hart 2 Dan 1 Zuhao Wan 1 whodatninja 1 vkrest 1 tpeng 1 Tom Mortimer-Jones 1 Rocio Aramberri 1 Pedro 1 notsobad 1 Natan L 1 Mark Grey 1 Luan 1 Libor Nenadál 1 Juan M Uys 1 Jonas Brunsgaard 1 Ilya Baryshev 1 Hasnain Lakhani 1 Emanuel Schorsch 1 Chris Tilden 1 Capi Etheriel 1 cacovsky 1 Berend Iwema r$ r% }r& (hCUhDj ubaubeubhE)r' }r( (hCUhDhKhMhPhWhehY}r) (h]]h^]h\]h[]r* Ureleased-2013-05-30r+ ah_]r, h7auhbMhchhd]r- (hm)r. }r/ (hCX0.16.5 (released 2013-05-30)r0 hDj' hMhPhWhqhY}r1 (h]]h^]h\]h[]h_]uhbMhchhd]r2 htX0.16.5 (released 2013-05-30)r3 r4 }r5 (hCj0 hDj. ubaubh)r6 }r7 (hCUhDj' hMhPhWhhY}r8 (hX-h[]h\]h]]h^]h_]uhbMhchhd]r9 (h)r: }r; (hCXZobey request method when scrapy deploy is redirected to a new endpoint (:commit:`8c4fcee`)r< hDj6 hMhPhWhhY}r= (h]]h^]h\]h[]h_]uhbNhchhd]r> h)r? }r@ (hCj< hDj: hMhPhWhhY}rA (h]]h^]h\]h[]h_]uhbMhd]rB (htXHobey request method when scrapy deploy is redirected to a new endpoint (rC rD }rE (hCXHobey request method when scrapy deploy is redirected to a new endpoint (hDj? 
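The from_crawler hook mentioned above lets a spider see the running crawler (and, through it, the settings, stats and signals) at construction time. A minimal sketch, assuming the delegating signature documented in later Scrapy versions and a hypothetical SOME_SETTING key; the import path matches the 0.22 layout:

    from scrapy.spider import Spider

    class MySpider(Spider):
        name = "myspider"

        @classmethod
        def from_crawler(cls, crawler, *args, **kwargs):
            # Build the spider as usual, then pull whatever is needed
            # from the crawler. SOME_SETTING is illustrative only.
            spider = super(MySpider, cls).from_crawler(crawler, *args, **kwargs)
            spider.some_value = crawler.settings.get("SOME_SETTING")
            return spider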
Thanks to everyone who contributed to this release. Here is a list of contributors, sorted by number of commits:

    130 Pablo Hoffman
     97 Daniel Graña
     20 Nicolás Ramírez
     13 Mikhail Korobov
     12 Pedro Faustino
     11 Steven Almeroth
      5 Rolando Espinoza La fuente
      4 Michal Danilak
      4 Alex Cepoi
      4 Alexandr N Zamaraev (aka tonal)
      3 paul
      3 Martin Olveyra
      3 Jordi Llonch
      3 arijitchakraborty
      2 Shane Evans
      2 joehillen
      2 Hart
      2 Dan
      1 Zuhao Wan
      1 whodatninja
      1 vkrest
      1 tpeng
      1 Tom Mortimer-Jones
      1 Rocio Aramberri
      1 Pedro
      1 notsobad
      1 Natan L
      1 Mark Grey
      1 Luan
      1 Libor Nenadál
      1 Juan M Uys
      1 Jonas Brunsgaard
      1 Ilya Baryshev
      1 Hasnain Lakhani
      1 Emanuel Schorsch
      1 Chris Tilden
      1 Capi Etheriel
      1 cacovsky
      1 Berend Iwema

0.16.5 (released 2013-05-30)

  * obey request method when scrapy deploy is redirected to a new endpoint (commit 8c4fcee)
  * fix inaccurate downloader middleware documentation. refs #280 (commit 40667cb)
  * doc: remove links to diveintopython.org, which is no longer available. closes #246 (commit bd58bfa)
  * Find form nodes in invalid html5 documents (commit e3d6945; see the sketch after this list)
  * Fix typo labeling attrs type bool instead of list (commit a274276)
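FormRequest.from_response is the API whose form lookup the html5 fix above hardens. A minimal login sketch; the field names and callback are hypothetical:

    from scrapy.http import FormRequest

    def parse_login_page(self, response):
        # from_response locates the form node in the page (now also in
        # slightly invalid HTML5) and merges formdata into its fields.
        return FormRequest.from_response(
            response,
            formdata={"username": "user", "password": "secret"},
            callback=self.after_login,
        )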
0.16.4 (released 2013-01-23)

  * fixes spelling errors in documentation (commit 6d2b3aa)
  * add doc about disabling an extension. refs #132 (commit c90de33; see the sketch after this list)
  * Fixed error message formatting. log.err() doesn't support cool formatting and when an error occurred, the message was: "ERROR: Error processing %(item)s" (commit c16150c)
  * lint and improve images pipeline error logging (commit 56b45fc)
  * fixed doc typos (commit 243be84)
  * add documentation topics: Broad Crawls & Common Practices (commit 1fbb715)
  * fix bug in scrapy parse command when spider is not specified explicitly. closes #209 (commit c72e682)
  * Update docs/topics/commands.rst (commit 28eac7a)
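For reference, disabling an extension (the subject of the doc added above) comes down to mapping its path to None in the EXTENSIONS setting. A minimal sketch; the extension path shown is illustrative for the scrapy.contrib layout of this era:

    # settings.py
    EXTENSIONS = {
        # None (instead of an order number) disables the extension.
        "scrapy.contrib.corestats.CoreStats": None,
    }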
0.16.3 (released 2012-12-07)

  * Remove concurrency limitation when using download delays and still ensure inter-request delays are enforced (commit 487b9b5; see the sketch after this list)
  * add error details when image pipeline fails (commit 8232569)
  * improve mac os compatibility (commit 8dcf8aa)
  * setup.py: use README.rst to populate long_description (commit 7b5310d)
  * doc: removed obsolete references to ClientForm (commit 80f9bb6)
  * correct docs for default storage backend (commit 2aa491b)
  * doc: removed broken proxyhub link from FAQ (commit bdf61c4)
  * Fixed docs typo in SpiderOpenCloseLogging example (commit 7184094)
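With the concurrency fix above, a download delay and a per-domain concurrency level can be tuned independently. A minimal settings sketch with illustrative values:

    # settings.py
    DOWNLOAD_DELAY = 0.5                 # minimum delay between requests to a slot
    CONCURRENT_REQUESTS_PER_DOMAIN = 8   # no longer forced down to 1 by the delay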
0.16.2 (released 2012-11-09)

  * scrapy contracts: python2.6 compat (commit a4a9199)
  * scrapy contracts verbose option (commit ec41673)
  * proper unittest-like output for scrapy contracts (commit 86635e4)
  * added open_in_browser to debugging doc (commit c9b690d; see the sketch after this list)
  * removed reference to global scrapy stats from settings doc (commit dd55067)
  * Fix SpiderState bug in Windows platforms (commit 58998f4)
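open_in_browser, now covered in the debugging doc, shows a response in a local browser exactly as Scrapy received it. A minimal sketch inside a hypothetical callback:

    from scrapy.utils.response import open_in_browser

    def parse(self, response):
        # Handy when a page looks different to the crawler than to a
        # regular browser (cookies, headers, user agent).
        open_in_browser(response)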
0.16.1 (released 2012-10-26)

  * fixed LogStats extension, which got broken after a wrong merge before the 0.16 release (commit 8c780fd)
  * better backwards compatibility for scrapy.conf.settings (commit 3403089)
  * extended documentation on how to access crawler stats from extensions (commit c4da0b5; see the sketch after this list)
  * removed .hgtags (no longer needed now that scrapy uses git) (commit d52c188)
  * fix dashes under rst headers (commit fa4f7f9)
  * set release date for 0.16.0 in news (commit e292246)
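The documented way for an extension to reach crawler stats is through the crawler object handed to from_crawler. A sketch of a hypothetical ItemCounter extension; the stat key is illustrative:

    from scrapy import signals

    class ItemCounter(object):

        def __init__(self, stats):
            self.stats = stats

        @classmethod
        def from_crawler(cls, crawler):
            # The crawler exposes both the stats collector and signals.
            ext = cls(crawler.stats)
            crawler.signals.connect(ext.item_scraped, signal=signals.item_scraped)
            return ext

        def item_scraped(self, item, spider):
            self.stats.inc_value("itemcounter/items_scraped", spider=spider)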
0.16.0 (released 2012-10-18)

Scrapy changes:

  * added spider contracts, a mechanism for testing spiders in a formal/reproducible way (see the sketch after this list)
  * added options -o and -t to the runspider command
  * documented the AutoThrottle extension and added it to the extensions installed by default. You still need to enable it with AUTOTHROTTLE_ENABLED
  * major Stats Collection refactoring: removed separation of global/per-spider stats, removed stats-related signals (stats_spider_opened, etc). Stats are much simpler now, backwards compatibility is kept on the Stats Collector API and signals.
  * added process_start_requests method to spider middlewares
  * dropped Signals singleton. Signals should now be accessed through the Crawler.signals attribute. See the signals documentation for more info.
  * dropped Stats Collector singleton. Stats can now be accessed through the Crawler.stats attribute. See the stats collection documentation for more info.
  * documented the Core API
  * lxml is now the default selectors backend instead of libxml2
  * ported FormRequest.from_response() to use lxml instead of ClientForm
  * removed modules: scrapy.xlib.BeautifulSoup and scrapy.xlib.ClientForm
  * SitemapSpider: added support for sitemap urls ending in .xml and .xml.gz, even if they advertise a wrong content type (commit 10ed28b)
  * StackTraceDump extension: also dump trackref live references (commit fe2ce93)
  * nested items now fully supported in JSON and JSONLines exporters
  * added cookiejar Request meta key to support multiple cookie sessions per spider
  * decoupled encoding detection code to w3lib.encoding, and ported Scrapy code to use that module
  * dropped support for Python 2.5. See http://blog.scrapy.org/scrapy-dropping-support-for-python-25
  * dropped support for Twisted 2.5
  * added REFERER_ENABLED setting, to control referer middleware
  * changed default user agent to: Scrapy/VERSION (+http://scrapy.org)
  * removed (undocumented) HTMLImageLinkExtractor class from scrapy.contrib.linkextractors.image
  * removed per-spider settings (to be replaced by instantiating multiple crawler objects)
  * USER_AGENT spider attribute will no longer work, use user_agent attribute instead
  * DOWNLOAD_TIMEOUT spider attribute will no longer work, use download_timeout attribute instead
  * removed ENCODING_ALIASES setting, as encoding auto-detection has been moved to the w3lib library
  * promoted DjangoItem to main contrib
  * LogFormatter methods now return dicts (instead of strings) to support lazy formatting (issue 164, commit dcef7b0)
  * downloader handlers (DOWNLOAD_HANDLERS setting) now receive settings as the first argument of the constructor
  * replaced memory usage accounting with the (more portable) resource module, removed scrapy.utils.memory module
  * removed signal: scrapy.mail.mail_sent
  * removed TRACK_REFS setting, now trackrefs is always enabled
  * DBM is now the default storage backend for HTTP cache middleware
  * number of log messages (per level) is now tracked through Scrapy stats (stat name: log_count/LEVEL)
  * number of received responses is now tracked through Scrapy stats (stat name: response_received_count)
  * removed scrapy.log.started attribute
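Spider contracts live in a callback's docstring and are checked with the scrapy check command. A minimal sketch; the URL and field names are hypothetical:

    def parse_product(self, response):
        """Parse a product page.

        @url http://www.example.com/product/some-product
        @returns items 1 1
        @returns requests 0 0
        @scrapes name price
        """
        # ... extraction code would populate and return one item here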
0.14.4

  * added precise to supported ubuntu distros (commit b7e46df)
  * fixed bug in json-rpc webservice reported in https://groups.google.com/d/topic/scrapy-users/qgVBmFybNAQ/discussion. also removed no longer supported 'run' command from extras/scrapy-ws.py (commit 340fbdb)
  * meta tag attributes for content-type http equiv can be in any order. #123 (commit 0cb68af)
  * replace "import Image" by more standard "from PIL import Image". closes #88 (commit 4d17048; see the sketch after this list)
  * return trial status as bin/runtests.sh exit value. #118 (commit b7b2e7f)
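The import change above in a few lines; the filename is illustrative:

    # Old style, which relied on PIL installing its modules at the top level:
    #   import Image
    # New style, which works with both PIL and Pillow installs:
    from PIL import Image

    img = Image.open("sample.jpg")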
0.14.3

  * forgot to include pydispatch license. #118 (commit fd85f9c)
  * include egg files used by testsuite in source distribution. #118 (commit c897793)
  * update docstring in project template to avoid confusion with genspider command, which may be considered as an advanced feature. refs #107 (commit 2548dcc)
  * added note to docs/topics/firebug.rst about google directory being shut down (commit 668e352)
  * don't discard slot when empty, just save it in another dict in order to recycle it if needed again. (commit 8e9f607)
  * do not fail handling unicode xpaths in libxml2 backed selectors (commit b830e95; see the sketch after this list)
  * fixed minor mistake in Request objects documentation (commit bf3c9ee)
  * fixed minor defect in link extractors documentation (commit ba14f38)
  * removed some obsolete remaining code related to sqlite support in scrapy (commit 0665175)
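The unicode-xpath fix in terms of the selector API of this era; the class name inside the XPath is illustrative:

    # -*- coding: utf-8 -*-
    from scrapy.selector import HtmlXPathSelector

    def parse(self, response):
        hxs = HtmlXPathSelector(response)
        # A non-ASCII XPath no longer crashes the libxml2 backend.
        titles = hxs.select(u'//h1[@class="título"]/text()').extract()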
obsolete remaining code related to sqlite support in scrapy (hDj ubh)r'}r((hCX:commit:`0665175`hY}r)(UrefuriX/https://github.com/scrapy/scrapy/commit/0665175h[]h\]h]]h^]h_]uhDj hd]r*htXcommit 0665175r+r,}r-(hCUhDj'ubahWhubhtX)r.}r/(hCX)hDj ubeubaubeubeubhE)r0}r1(hCUhDhKhMhPhWhehY}r2(h]]h^]h\]h[]r3Uid4r4ah_]r5h%auhbMhchhd]r6(hm)r7}r8(hCX0.14.2r9hDj0hMhPhWhqhY}r:(h]]h^]h\]h[]h_]uhbMhchhd]r;htX0.14.2r<r=}r>(hCj9hDj7ubaubh)r?}r@(hCUhDj0hMhPhWhhY}rA(hX-h[]h\]h]]h^]h_]uhbMhchhd]rB(h)rC}rD(hCX]move buffer pointing to start of file before computing checksum. refs #92 (:commit:`6a5bef2`)rEhDj?hMhPhWhhY}rF(h]]h^]h\]h[]h_]uhbNhchhd]rGh)rH}rI(hCjEhDjChMhPhWhhY}rJ(h]]h^]h\]h[]h_]uhbMhd]rK(htXKmove buffer pointing to start of file before computing checksum. refs #92 (rLrM}rN(hCXKmove buffer pointing to start of file before computing checksum. refs #92 (hDjHubh)rO}rP(hCX:commit:`6a5bef2`hY}rQ(UrefuriX/https://github.com/scrapy/scrapy/commit/6a5bef2h[]h\]h]]h^]h_]uhDjHhd]rRhtXcommit 6a5bef2rSrT}rU(hCUhDjOubahWhubhtX)rV}rW(hCX)hDjHubeubaubh)rX}rY(hCXOCompute image checksum before persisting images. closes #92 (:commit:`9817df1`)rZhDj?hMhPhWhhY}r[(h]]h^]h\]h[]h_]uhbNhchhd]r\h)r]}r^(hCjZhDjXhMhPhWhhY}r_(h]]h^]h\]h[]h_]uhbMhd]r`(htX=Compute image checksum before persisting images. closes #92 (rarb}rc(hCX=Compute image checksum before persisting images. closes #92 (hDj]ubh)rd}re(hCX:commit:`9817df1`hY}rf(UrefuriX/https://github.com/scrapy/scrapy/commit/9817df1h[]h\]h]]h^]h_]uhDj]hd]rghtXcommit 9817df1rhri}rj(hCUhDjdubahWhubhtX)rk}rl(hCX)hDj]ubeubaubh)rm}rn(hCX@remove leaking references in cached failures (:commit:`673a120`)rohDj?hMhPhWhhY}rp(h]]h^]h\]h[]h_]uhbNhchhd]rqh)rr}rs(hCjohDjmhMhPhWhhY}rt(h]]h^]h\]h[]h_]uhbMhd]ru(htX.remove leaking references in cached failures (rvrw}rx(hCX.remove leaking references in cached failures (hDjrubh)ry}rz(hCX:commit:`673a120`hY}r{(UrefuriX/https://github.com/scrapy/scrapy/commit/673a120h[]h\]h]]h^]h_]uhDjrhd]r|htXcommit 673a120r}r~}r(hCUhDjyubahWhubhtX)r}r(hCX)hDjrubeubaubh)r}r(hCXnfixed bug in MemoryUsage extension: get_engine_status() takes exactly 1 argument (0 given) (:commit:`11133e9`)rhDj?hMhPhWhhY}r(h]]h^]h\]h[]h_]uhbNhchhd]rh)r}r(hCjhDjhMhPhWhhY}r(h]]h^]h\]h[]h_]uhbMhd]r(htX\fixed bug in MemoryUsage extension: get_engine_status() takes exactly 1 argument (0 given) (rr}r(hCX\fixed bug in MemoryUsage extension: get_engine_status() takes exactly 1 argument (0 given) (hDjubh)r}r(hCX:commit:`11133e9`hY}r(UrefuriX/https://github.com/scrapy/scrapy/commit/11133e9h[]h\]h]]h^]h_]uhDjhd]rhtXcommit 11133e9rr}r(hCUhDjubahWhubhtX)r}r(hCX)hDjubeubaubh)r}r(hCXQfixed struct.error on http compression middleware. closes #87 (:commit:`1423140`)rhDj?hMhPhWhhY}r(h]]h^]h\]h[]h_]uhbNhchhd]rh)r}r(hCjhDjhMhPhWhhY}r(h]]h^]h\]h[]h_]uhbMhd]r(htX?fixed struct.error on http compression middleware. closes #87 (rr}r(hCX?fixed struct.error on http compression middleware. 
- AJAX crawling wasn't expanding for unicode URLs (commit 0de3fb4)
- Catch start_requests iterator errors. refs #83 (commit 454a21d)
- Speed up libxml2 XPathSelector (commit 2fbd662)
- Updated versioning doc according to recent changes (commit 0a070f5)
- scrapyd: fixed documentation link (commit 2b4e4c3)
- extras/makedeb.py: no longer obtaining version from git (commit caffe0e)

0.14.1

- extras/makedeb.py: no longer obtaining version from git (commit caffe0e)
- Bumped version to 0.14.1 (commit 6cb9e1c)
- Fixed reference to tutorial directory (commit 4b86bd6)
- doc: removed duplicated callback argument from Request.replace() (commit 1aeccdd)
- Fixed formatting of scrapyd doc (commit 8bf19e6)
- Dump stacks for all running threads and fix engine status dumped by StackTraceDump extension (commit 14a8e6e)
- Added comment about why we disable SSL on boto images upload (commit 5223575)
- SSL handshaking hangs when doing too many parallel connections to S3 (commit 63d583d)
- Changed tutorial to follow changes on the dmoz site (commit bcb3198)
- Avoid _disconnectedDeferred AttributeError exception in Twisted>=11.1.0 (commit 98f3f87)
- Allow spider to set autothrottle max concurrency (commit 175a4b5)

0.14

New features and settings

- Support for AJAX crawleable urls
- New persistent scheduler that stores requests on disk, allowing crawls to be suspended and resumed (r2737)
- Added -o option to scrapy crawl, a shortcut for dumping scraped items into a file (or standard output using -)
- Added support for passing custom settings to Scrapyd schedule.json API (r2779, r2783)
- New ChunkedTransferMiddleware (enabled by default) to support chunked transfer encoding (r2769)
- Added boto 2.0 support for the S3 downloader handler (r2763)
- Added marshal to the formats supported by feed exports (r2744)
- In request errbacks, offending requests are now received in the failure.request attribute (r2738)
- Big downloader refactoring to support per-domain/IP concurrency limits (r2732)
  - the CONCURRENT_REQUESTS_PER_SPIDER setting has been deprecated and replaced by CONCURRENT_REQUESTS, CONCURRENT_REQUESTS_PER_DOMAIN and CONCURRENT_REQUESTS_PER_IP (see the settings sketch after this list)
  - check the documentation for more details
- Added builtin caching DNS resolver (r2728)
- Moved Amazon AWS-related components/extensions (SQS spider queue, SimpleDB stats collector) to a separate project: scaws (https://github.com/scrapinghub/scaws) (r2706, r2714)
- Moved spider queues to scrapyd: scrapy.spiderqueue -> scrapyd.spiderqueue (r2708)
- Moved sqlite utils to scrapyd: scrapy.utils.sqlite -> scrapyd.sqlite (r2781)
- Real support for returning iterators from the start_requests() method. The iterator is now consumed during the crawl, when the spider is getting idle (r2704)
- Added REDIRECT_ENABLED setting to quickly enable/disable the redirect middleware (r2697)
- Added RETRY_ENABLED setting to quickly enable/disable the retry middleware (r2694)
- Added CloseSpider exception to manually close spiders (r2691); see the sketch after this list
- Improved encoding detection by adding support for the HTML5 meta charset declaration (r2690)
- Refactored close-spider behavior to wait for all downloads to finish and be processed by spiders before closing the spider (r2688)
- Added SitemapSpider (see documentation in the Spiders page, and the sketch after this list) (r2658)
- Added LogStats extension for periodically logging basic stats (like crawled pages and scraped items) (r2657)
- Made handling of gzipped responses more robust (#319, r2643). Now Scrapy will try to decompress as much as possible from a gzipped response, instead of failing with an IOError.
- Simplified MemoryDebugger extension to use stats for dumping memory debugging info (r2639)
- Added new command to edit spiders: scrapy edit (r2636), and an -e flag to the genspider command that uses it (r2653)
- Changed default representation of items to pretty-printed dicts (r2631). This improves default logging by making the log more readable in the default case, for both Scraped and Dropped lines.
- Added spider_error signal (r2628)
- Added COOKIES_ENABLED setting (r2625)
- Stats are now dumped to the Scrapy log (the default value of the STATS_DUMP setting has been changed to True). This is to make Scrapy users more aware of Scrapy stats and the data that is collected there.
- Added support for dynamically adjusting the download delay and maximum concurrent requests (r2599)
- Added new DBM HTTP cache storage backend (r2576)
- Added listjobs.json API to Scrapyd (r2571)
- CsvItemExporter: added join_multivalued parameter (r2578)
- Added namespace support to xmliter_lxml (r2552)
- Improved cookies middleware by making COOKIES_DEBUG nicer and documenting it (r2579)
- Several improvements to Scrapyd and link extractors
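
To illustrate the new concurrency settings mentioned in the list above, here is a minimal settings.py sketch; the values shown are illustrative examples, not Scrapy's defaults:

    # settings.py -- illustrative values, not Scrapy's defaults
    CONCURRENT_REQUESTS = 16             # global cap on in-flight requests
    CONCURRENT_REQUESTS_PER_DOMAIN = 8   # cap per target domain
    CONCURRENT_REQUESTS_PER_IP = 0       # 0 disables the per-IP cap; a non-zero
                                         # value takes precedence over the per-domain cap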
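The CloseSpider exception can be raised from any callback to stop the whole crawl. A minimal sketch, assuming the 0.14-era module layout (scrapy.spider.BaseSpider, scrapy.exceptions.CloseSpider) and a purely hypothetical "site down" check:

    from scrapy.spider import BaseSpider
    from scrapy.exceptions import CloseSpider

    class QuotaSpider(BaseSpider):
        name = 'quota'
        start_urls = ['http://www.example.com/']

        def parse(self, response):
            # Hypothetical guard: abort the whole crawl once the site
            # starts serving an error page instead of real content.
            if 'temporarily unavailable' in response.body:
                raise CloseSpider('site_down')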
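Likewise, a minimal SitemapSpider sketch; the sitemap URL and the /products/ rule are hypothetical placeholders:

    from scrapy.contrib.spiders import SitemapSpider

    class ExampleSitemapSpider(SitemapSpider):
        name = 'example-sitemap'
        sitemap_urls = ['http://www.example.com/sitemap.xml']
        # Route sitemap entries matching /products/ to parse_product;
        # entries matching no rule are ignored.
        sitemap_rules = [('/products/', 'parse_product')]

        def parse_product(self, response):
            pass  # extract items here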

Code rearranged and removed

- Merged item passed and item scraped concepts, as they have often proved confusing in the past. This means: (r2630)
  - original item_scraped signal was removed
  - original item_passed signal was renamed to item_scraped
  - old log lines "Scraped Item..." were removed
  - old log lines "Passed Item..." were renamed to "Scraped Item..." lines and downgraded to DEBUG level
- Reduced Scrapy codebase by stripping part of the Scrapy code into two new libraries:
  - w3lib (several functions from scrapy.utils.{http,markup,multipart,response,url}, done in r2584)
  - scrapely (was scrapy.contrib.ibl, done in r2586)
- Removed unused function: scrapy.utils.request.request_info() (r2577)
- Removed googledir project from examples/googledir. There's now a new example project called dirbot available on github: https://github.com/scrapy/dirbot
- Removed support for default field values in Scrapy items (r2616)
- Removed experimental crawlspider v2 (r2632)
- Removed scheduler middleware to simplify architecture. Duplicates filtering is now done in the scheduler itself, using the same dupe filtering class as before (DUPEFILTER_CLASS setting) (r2640)
- Removed support for passing urls to the scrapy crawl command (use scrapy parse instead) (r2704)
- Removed deprecated Execution Queue (r2704)
- Removed (undocumented) spider context extension (from scrapy.contrib.spidercontext) (r2780)
- Removed CONCURRENT_SPIDERS setting (use scrapyd maxproc instead) (r2789)
- Renamed attributes of core components: downloader.sites -> downloader.slots, scraper.sites -> scraper.slots (r2717, r2718)
- Renamed setting CLOSESPIDER_ITEMPASSED to CLOSESPIDER_ITEMCOUNT (r2655). Backwards compatibility kept.

0.12

The numbers like #NNN reference tickets in the old issue tracker (Trac) which is no longer available.

New features and improvements

- Passed item is now sent in the item argument of the item_passed signal (#273)
- Added verbose option to the scrapy version command, useful for bug reports (#298)
- HTTP cache now stored by default in the project data dir (#279)
- Added project data storage directory (#276, #277)
- Documented file structure of Scrapy projects (see command-line tool doc)
- New lxml backend for XPath selectors (#147)
- Per-spider settings (#245)
- Support exit codes to signal errors in Scrapy commands (#248)
- Added -c argument to the scrapy shell command
- Made libxml2 optional (#260)
- New deploy command (#261)
- Added CLOSESPIDER_PAGECOUNT setting (#253)
- Added CLOSESPIDER_ERRORCOUNT setting (#254); see the sketch after this list
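
Both close-spider settings are plain settings.py entries; a minimal sketch with illustrative threshold values (not the defaults, which leave them disabled):

    # settings.py -- illustrative thresholds for closing spiders early
    CLOSESPIDER_PAGECOUNT = 1000   # close the spider after ~1000 crawled responses
    CLOSESPIDER_ERRORCOUNT = 10    # close the spider after 10 errors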
Scrapyd changes

- Scrapyd now uses one process per spider
- It stores one log file per spider run, and rotates them, keeping the latest 5 logs per spider (by default)
- A minimal web UI was added, available at http://localhost:6800 by default
- There is now a scrapy server command to start a Scrapyd server of the current project

Changes to settings

- Added HTTPCACHE_ENABLED setting (False by default) to enable the HTTP cache middleware; see the sketch after this list
- Changed HTTPCACHE_EXPIRATION_SECS semantics: now zero means "never expire".
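
A minimal sketch of the cache-related settings described above:

    # settings.py -- opt in to the HTTP cache middleware
    HTTPCACHE_ENABLED = True        # False by default
    HTTPCACHE_EXPIRATION_SECS = 0   # zero now means "never expire"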
Deprecated/obsoleted functionality

* Deprecated runserver command in favor of the server command, which starts a Scrapyd server. See also: Scrapyd changes
* Deprecated queue command in favor of using Scrapyd's schedule.json API (see the sketch after this list). See also: Scrapyd changes
* Removed the LxmlItemLoader (an experimental contrib that never graduated to main contrib)
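For reference, a minimal sketch of scheduling a run through that API from Python, assuming a Scrapyd server on the default port; the project and spider names are hypothetical, and the third-party requests library is used for brevity:

    # Schedule a spider run via Scrapyd's schedule.json endpoint.
    # "myproject" and "somespider" are hypothetical names.
    import requests

    resp = requests.post(
        "http://localhost:6800/schedule.json",
        data={"project": "myproject", "spider": "somespider"},
    )
    print(resp.json())  # e.g. {"status": "ok", "jobid": "..."}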
0.10

The numbers like #NNN reference tickets in the old issue tracker (Trac), which is no longer available.

New features and improvements

* New Scrapy service called scrapyd for deploying Scrapy crawlers in production (#218) (documentation available)
* Simplified Images pipeline usage: subclassing your own images pipeline is no longer required (#217)
* Scrapy shell now shows the Scrapy log by default (#206)
* Refactored execution queue into common base code with pluggable backends called "spider queues" (#220)
* New persistent spider queue (based on SQLite) (#198), available by default, which allows starting Scrapy in server mode and then scheduling spiders to run
* Added documentation for the Scrapy command-line tool and all its available sub-commands (documentation available)
* Feed exporters with pluggable backends (#197) (documentation available)
* Deferred signals (#193)
* Added two new item pipeline methods, open_spider() and close_spider(), with deferred support (#195)
* Support for overriding default request headers per spider (#181)
* Replaced the default Spider Manager with one of similar functionality that does not depend on Twisted Plugins (#186)
* Split the Debian package into two packages - the library and the service (#187)
* Scrapy log refactoring (#188)
* New extension for keeping persistent spider contexts among different runs (#203)
* Added dont_redirect request.meta key for avoiding redirects (#233)
* Added dont_retry request.meta key for avoiding retries (#234); both keys are shown in the sketch after this list
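A minimal sketch of the two request.meta keys in use; the spider name and URL are hypothetical, and the import paths match the 0.10-era API:

    # Opt individual requests out of redirect and retry handling
    # via the new request.meta keys.
    from scrapy.http import Request
    from scrapy.spider import BaseSpider

    class PingSpider(BaseSpider):
        name = "ping"  # hypothetical spider

        def start_requests(self):
            yield Request("http://example.com/ping",
                          meta={"dont_redirect": True, "dont_retry": True})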
Command-line tool changes

* New scrapy command which replaces the old scrapy-ctl.py (#199)
  - there is only one global scrapy command now, instead of one scrapy-ctl.py per project
  - Added scrapy.bat script for running it more conveniently from Windows
* Added bash completion to the command-line tool (#210)
* Renamed command start to runserver (#209)
API changes

* url and body attributes of Request objects are now read-only (#230)
* Request.copy() and Request.replace() now also copy their callback and errback attributes (#231)
* Removed UrlFilterMiddleware from scrapy.contrib (it was already disabled by default)
* Offsite middleware doesn't filter out any request coming from a spider that doesn't have an allowed_domains attribute (#225)
* Removed the Spider Manager load() method; spiders are now loaded in the constructor itself
* Changes to Scrapy Manager (now called "Crawler"):
  - scrapy.core.manager.ScrapyManager class renamed to scrapy.crawler.Crawler
  - scrapy.core.manager.scrapymanager singleton moved to scrapy.project.crawler
* Moved module: scrapy.contrib.spidermanager to scrapy.spidermanager
* Spider Manager singleton moved from scrapy.spider.spiders to the spiders attribute of the scrapy.project.crawler singleton
* Moved Stats Collector classes (#204):
  - scrapy.stats.collector.StatsCollector to scrapy.statscol.StatsCollector
  - scrapy.stats.collector.SimpledbStatsCollector to scrapy.contrib.statscol.SimpledbStatsCollector
* Default per-command settings are now specified in the default_settings attribute of the command object class (#201)
* Changed the arguments of the item pipeline process_item() method from (spider, item) to (item, spider) (see the pipeline sketch after this list)
  - backwards compatibility kept (with deprecation warning)
* Moved scrapy.core.signals module to scrapy.signals
  - backwards compatibility kept (with deprecation warning)
* Moved scrapy.core.exceptions module to scrapy.exceptions
  - backwards compatibility kept (with deprecation warning)
* Added handles_request() class method to BaseSpider
* Dropped scrapy.log.exc() function (use scrapy.log.err() instead)
)rz"}r{"(hCX``scrapy.log.exc()``hY}r|"(h]]h^]h\]h[]h_]uhDjs"hd]r}"htXscrapy.log.exc()r~"r"}r"(hCUhDjz"ubahWj^ ubhtX function (use r"r"}r"(hCX function (use hDjs"ubjV )r"}r"(hCX``scrapy.log.err()``hY}r"(h]]h^]h\]h[]h_]uhDjs"hd]r"htXscrapy.log.err()r"r"}r"(hCUhDj"ubahWj^ ubhtX instead)r"r"}r"(hCX instead)hDjs"ubeubaubh)r"}r"(hCX?dropped ``component`` argument of ``scrapy.log.msg()`` functionr"hDj hMhPhWhhY}r"(h]]h^]h\]h[]h_]uhbNhchhd]r"h)r"}r"(hCj"hDj"hMhPhWhhY}r"(h]]h^]h\]h[]h_]uhbM[hd]r"(htXdropped r"r"}r"(hCXdropped hDj"ubjV )r"}r"(hCX ``component``hY}r"(h]]h^]h\]h[]h_]uhDj"hd]r"htX componentr"r"}r"(hCUhDj"ubahWj^ ubhtX argument of r"r"}r"(hCX argument of hDj"ubjV )r"}r"(hCX``scrapy.log.msg()``hY}r"(h]]h^]h\]h[]h_]uhDj"hd]r"htXscrapy.log.msg()r"r"}r"(hCUhDj"ubahWj^ ubhtX functionr"r"}r"(hCX functionhDj"ubeubaubh)r"}r"(hCX*dropped ``scrapy.log.log_level`` attributer"hDj hMhPhWhhY}r"(h]]h^]h\]h[]h_]uhbNhchhd]r"h)r"}r"(hCj"hDj"hMhPhWhhY}r"(h]]h^]h\]h[]h_]uhbM\hd]r"(htXdropped r"r"}r"(hCXdropped hDj"ubjV )r"}r"(hCX``scrapy.log.log_level``hY}r"(h]]h^]h\]h[]h_]uhDj"hd]r"htXscrapy.log.log_levelr"r"}r"(hCUhDj"ubahWj^ ubhtX attributer"r"}r"(hCX attributehDj"ubeubaubh)r"}r"(hCXUAdded ``from_settings()`` class methods to Spider Manager, and Item Pipeline Manager hDj hMhPhWhhY}r"(h]]h^]h\]h[]h_]uhbNhchhd]r"h)r"}r"(hCXTAdded ``from_settings()`` class methods to Spider Manager, and Item Pipeline ManagerhDj"hMhPhWhhY}r"(h]]h^]h\]h[]h_]uhbM]hd]r"(htXAdded r"r"}r"(hCXAdded hDj"ubjV )r"}r"(hCX``from_settings()``hY}r"(h]]h^]h\]h[]h_]uhDj"hd]r"htXfrom_settings()r"r"}r"(hCUhDj"ubahWj^ ubhtX; class methods to Spider Manager, and Item Pipeline Managerr"r"}r"(hCX; class methods to Spider Manager, and Item Pipeline ManagerhDj"ubeubaubeubeubhE)r"}r"(hCUhHKhDjhMhPhWhehY}r"(h]]r"jah^]h\]h[]r"Uid10r"ah_]uhbM`hchhd]r"(hm)r"}r"(hCXChanges to settingsr"hDj"hMhPhWhqhY}r"(h]]h^]h\]h[]h_]uhbM`hchhd]r"htXChanges to settingsr"r"}r"(hCj"hDj"ubaubh)r"}r"(hCUhDj"hMhPhWhhY}r"(hX-h[]h\]h]]h^]h_]uhbMbhchhd]r"(h)r"}r"(hCXcAdded ``HTTPCACHE_IGNORE_SCHEMES`` setting to ignore certain schemes on !HttpCacheMiddleware (#225)r"hDj"hMhPhWhhY}r"(h]]h^]h\]h[]h_]uhbNhchhd]r"h)r"}r"(hCj"hDj"hMhPhWhhY}r"(h]]h^]h\]h[]h_]uhbMbhd]r"(htXAdded r"r"}r"(hCXAdded hDj"ubjV )r"}r"(hCX``HTTPCACHE_IGNORE_SCHEMES``hY}r"(h]]h^]h\]h[]h_]uhDj"hd]r"htXHTTPCACHE_IGNORE_SCHEMESr"r"}r"(hCUhDj"ubahWj^ ubhtXA setting to ignore certain schemes on !HttpCacheMiddleware (#225)r"r#}r#(hCXA setting to ignore certain schemes on !HttpCacheMiddleware (#225)hDj"ubeubaubh)r#}r#(hCXQAdded ``SPIDER_QUEUE_CLASS`` setting which defines the spider queue to use (#220)r#hDj"hMhPhWhhY}r#(h]]h^]h\]h[]h_]uhbNhchhd]r#h)r#}r#(hCj#hDj#hMhPhWhhY}r #(h]]h^]h\]h[]h_]uhbMchd]r #(htXAdded r #r #}r #(hCXAdded hDj#ubjV )r#}r#(hCX``SPIDER_QUEUE_CLASS``hY}r#(h]]h^]h\]h[]h_]uhDj#hd]r#htXSPIDER_QUEUE_CLASSr#r#}r#(hCUhDj#ubahWj^ ubhtX5 setting which defines the spider queue to use (#220)r#r#}r#(hCX5 setting which defines the spider queue to use (#220)hDj#ubeubaubh)r#}r#(hCX#Added ``KEEP_ALIVE`` setting (#220)r#hDj"hMhPhWhhY}r#(h]]h^]h\]h[]h_]uhbNhchhd]r#h)r#}r#(hCj#hDj#hMhPhWhhY}r#(h]]h^]h\]h[]h_]uhbMdhd]r #(htXAdded r!#r"#}r##(hCXAdded hDj#ubjV )r$#}r%#(hCX``KEEP_ALIVE``hY}r&#(h]]h^]h\]h[]h_]uhDj#hd]r'#htX KEEP_ALIVEr(#r)#}r*#(hCUhDj$#ubahWj^ ubhtX setting (#220)r+#r,#}r-#(hCX setting (#220)hDj#ubeubaubh)r.#}r/#(hCX(Removed ``SERVICE_QUEUE`` setting (#220)r0#hDj"hMhPhWhhY}r1#(h]]h^]h\]h[]h_]uhbNhchhd]r2#h)r3#}r4#(hCj0#hDj.#hMhPhWhhY}r5#(h]]h^]h\]h[]h_]uhbMehd]r6#(htXRemoved r7#r8#}r9#(hCXRemoved 
Changes to settings

* Added HTTPCACHE_IGNORE_SCHEMES setting to ignore certain schemes on HttpCacheMiddleware (#225)
* Added SPIDER_QUEUE_CLASS setting, which defines the spider queue to use (#220)
* Added KEEP_ALIVE setting (#220)
* Removed SERVICE_QUEUE setting (#220)
* Removed COMMANDS_SETTINGS_MODULE setting (#201)
* Renamed REQUEST_HANDLERS to DOWNLOAD_HANDLERS and made download handlers classes (instead of functions)

0.9

The numbers like #NNN reference tickets in the old issue tracker (Trac), which is no longer available.

New features and improvements

* Added SMTP-AUTH support to scrapy.mail (see the sketch after this list)
* New settings added: MAIL_USER, MAIL_PASS (r2065 | #149)
* Added new scrapy-ctl view command - to view a URL in the browser, as seen by Scrapy (r2039)
* Added web service for controlling the Scrapy process (this also deprecates the web console) (r2053 | #167)
* Support for running Scrapy as a service, for production systems (r1988, r2054, r2055, r2056, r2057 | #168)
* Added wrapper induction library (documentation only available in source code for now) (r2011)
* Simplified and improved response encoding support (r1961, r1969)
* Added LOG_ENCODING setting (r1956, documentation available)
* Added RANDOMIZE_DOWNLOAD_DELAY setting (enabled by default) (r1923, doc available)
* MailSender is no longer IO-blocking (r1955 | #146)
* Link extractors and the new CrawlSpider now handle relative base tag URLs (r1960 | #148)
* Several improvements to Item Loaders and processors (r2022, r2023, r2024, r2025, r2026, r2027, r2028, r2029, r2030)
* Added support for adding variables to the telnet console (r2047 | #165)
* Support for requests without callbacks (r2050 | #166)
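A sketch of SMTP-authenticated, non-blocking mail via scrapy.mail; the host, addresses and credentials are placeholders, and the constructor arguments shown follow the MailSender API as later documented:

    # Non-blocking, SMTP-authenticated mail via scrapy.mail.
    # Host, addresses and credentials are placeholders.
    from scrapy.mail import MailSender

    mailer = MailSender(smtphost="mail.example.com",
                        mailfrom="scrapy@example.com",
                        smtpuser="user",      # typically taken from MAIL_USER
                        smtppass="secret")    # typically taken from MAIL_PASS
    mailer.send(to=["someone@example.com"],
                subject="Crawl finished",
                body="All done.")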
API changes

* Changed Spider.domain_name to Spider.name (SEP-012, r1975); see the porting sketch after this list
* Response.encoding is now the detected encoding (r1961)
* HttpErrorMiddleware now returns None or raises an exception (r2006 | #157)
* scrapy.command modules relocation (r2035, r2036, r2037)
* Added ExecutionQueue for feeding spiders to scrape (r2034)
* Removed ExecutionEngine singleton (r2039)
* Ported S3ImagesStore (images pipeline) to use boto and threads (r2033)
* Moved module: scrapy.management.telnet to scrapy.telnet (r2047)
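A before/after sketch of the Spider.domain_name rename (SEP-012); the spider itself is hypothetical and the import path is the 0.9-era one:

    from scrapy.spider import BaseSpider

    # Before 0.9:
    # class ExampleSpider(BaseSpider):
    #     domain_name = "example.com"

    # From 0.9 on:
    class ExampleSpider(BaseSpider):
        name = "example.com"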
Changes to default settings

* Changed default SCHEDULER_ORDER to DFO (r1939)

0.8

The numbers like #NNN reference tickets in the old issue tracker (Trac), which is no longer available.

New features

* Added DEFAULT_RESPONSE_ENCODING setting (r1809)
* Added dont_click argument to the FormRequest.from_response() method (r1813, r1816)
* Added clickdata argument to the FormRequest.from_response() method (r1802, r1803); both new from_response() arguments are shown in the sketch after this list
* Added support for HTTP proxies (HttpProxyMiddleware) (r1781, r1785)
* Offsite spider middleware now logs messages when filtering out requests (r1841)
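A sketch of the two new from_response() arguments; the form fields, control name and URL are hypothetical, spider boilerplate is omitted, and the keyword names follow the API as later documented:

    # The two new FormRequest.from_response() arguments.
    from scrapy.http import FormRequest

    def parse_login(response):
        # dont_click=True submits the form data without simulating
        # a click on any submit control
        return FormRequest.from_response(
            response, formdata={"user": "u", "pass": "p"}, dont_click=True)

    def parse_search(response):
        # clickdata selects which clickable control is used for submission
        return FormRequest.from_response(
            response, clickdata={"name": "search_button"})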
Backwards-incompatible changes

* Changed scrapy.utils.response.get_meta_refresh() signature (r1804)
* Removed deprecated scrapy.item.ScrapedItem class - use scrapy.item.Item instead (r1838)
* Removed deprecated scrapy.xpath module - use scrapy.selector instead (r1836)
* Removed deprecated core.signals.domain_open signal - use core.signals.domain_opened instead (r1822)
* log.msg() now receives a spider argument (r1822)
  - The old domain argument has been deprecated and will be removed in 0.9. For spiders, you should always use the spider argument and pass spider references. If you really want to pass a string, use the component argument instead.
* Changed core signals domain_opened, domain_closed and domain_idle
* Changed the Item pipeline to use spiders instead of domains
  - The domain argument of the process_item() item pipeline method was changed to spider; the new signature is process_item(spider, item) (r1827 | #105)
  - To quickly port your code (to work with Scrapy 0.8) just use spider.domain_name where you previously used domain.
* Changed the Stats API to use spiders instead of domains (r1849 | #113)
  - StatsCollector was changed to receive spider references (instead of domains) in its methods (set_value, inc_value, etc.)
  - Added StatsCollector.iter_spider_stats() method
  - Removed StatsCollector.list_domains() method
  - Also, Stats signals were renamed and now pass around spider references (instead of domains)
  - To quickly port your code (to work with Scrapy 0.8) just use spider.domain_name where you previously used domain. spider_stats contains exactly the same data as domain_stats.
- Changed Stats API to use spiders instead of domains (r1849 | #113)
  - StatsCollector was changed to receive spider references (instead of domains) in its methods (set_value, inc_value, etc.); see the sketch below
  - added StatsCollector.iter_spider_stats() method
  - removed StatsCollector.list_domains() method
  - Also, Stats signals were renamed and now pass around spider references (instead of domains).
  - To quickly port your code (to work with Scrapy 0.8), just use spider.domain_name where you previously used domain. spider_stats contains exactly the same data as domain_stats.
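A sketch of the ported stats calls, assuming the 0.8-era scrapy.stats singleton; the stat key is hypothetical:

    from scrapy.stats import stats

    def count_crawled_page(spider):
        # Pre-0.8 calls were keyed by a domain string:
        #     stats.inc_value('pages_crawled', domain='example.com')
        # From 0.8 on, pass the spider reference instead:
        stats.inc_value('pages_crawled', spider=spider)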
- CloseDomain extension moved to scrapy.contrib.closespider.CloseSpider (r1833)
  - Its settings were also renamed (see the settings sketch below):
    - CLOSEDOMAIN_TIMEOUT to CLOSESPIDER_TIMEOUT
    - CLOSEDOMAIN_ITEMCOUNT to CLOSESPIDER_ITEMCOUNT
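A project settings module using the renamed extension settings might look like this; the values are illustrative only:

    # settings.py
    # Pre-0.8 names, now gone:
    #     CLOSEDOMAIN_TIMEOUT = 3600
    #     CLOSEDOMAIN_ITEMCOUNT = 1000
    CLOSESPIDER_TIMEOUT = 3600    # close the spider after an hour
    CLOSESPIDER_ITEMCOUNT = 1000  # or after 1000 scraped items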
- Removed deprecated SCRAPYSETTINGS_MODULE environment variable; use SCRAPY_SETTINGS_MODULE instead (r1840)
- Renamed setting: REQUESTS_PER_DOMAIN to CONCURRENT_REQUESTS_PER_SPIDER (r1830, r1844)
- Renamed setting: CONCURRENT_DOMAINS to CONCURRENT_SPIDERS (r1830); both renames are illustrated below
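The renamed concurrency settings would likewise be declared in the project's settings module; the values are again invented:

    # settings.py
    # Old names: REQUESTS_PER_DOMAIN and CONCURRENT_DOMAINS
    CONCURRENT_REQUESTS_PER_SPIDER = 8  # parallel requests per running spider
    CONCURRENT_SPIDERS = 4              # spiders crawled in parallel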
- Refactored HTTP Cache middleware
- HTTP Cache middleware has been heavily refactored, retaining the same functionality except for the domain sectorization, which was removed (r1843)
- Renamed exception: DontCloseDomain to DontCloseSpider (r1859 | #120)
- Renamed extension: DelayedCloseDomain to SpiderCloseDelay (r1861 | #121)
- Removed obsolete scrapy.utils.markup.remove_escape_chars function; use scrapy.utils.markup.replace_escape_chars instead (r1865). A usage sketch follows below.
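A small sketch of the surviving helper; the default keyword behaviour shown is an assumption about the 0.8-era scrapy.utils.markup API:

    from scrapy.utils.markup import replace_escape_chars

    raw = u'one\r\ntwo\tthree'
    # By default the escape characters \n, \t and \r are removed,
    # which covers what the deleted remove_escape_chars() did
    cleaned = replace_escape_chars(raw)
    # A replacement string can also be supplied:
    spaced = replace_escape_chars(raw, replace_by=u' ')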
0.7

First release of Scrapy.

Bugfixes (0.20.0)

- Fix tests under Django 1.6 (commit b6bed44c)
- Lot of bugfixes to retry middleware under disconnections using HTTP 1.1 download handler
- Fix inconsistencies among Twisted releases (issue 406)
- Fix scrapy shell bugs (issue 418, issue 407)
- Fix invalid variable name in setup.py (issue 429)
- Fix tutorial references (issue 387)
- Improve request-response docs (issue 391)
- Improve best practices docs (issue 399, issue 400, issue 401, issue 402)
- Improve django integration docs (issue 404)
- Document bindaddress request meta (commit 37c24e01d7); a hedged example follows below
- Improve Request class documentation (issue 226)
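An example of the bindaddress request meta documented above; the URL and local address are invented, and the (host, port) tuple form is an assumption about how the downloader binds the client socket:

    from scrapy.http import Request

    # Bind the outgoing connection for this download to a specific
    # local IP address; port 0 lets the OS pick the source port
    request = Request(
        'http://www.example.com/page.html',
        meta={'bindaddress': ('192.168.1.100', 0)},
    )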
Contributing to Scrapy

There are many ways to contribute to Scrapy. Here are some of them:

- Blog about Scrapy. Tell the world how you're using Scrapy. This will help newcomers with more examples and the Scrapy project to increase its visibility.
- Report bugs and request features in the issue tracker, trying to follow the guidelines detailed in Reporting bugs below.
- Submit patches for new functionality and/or bug fixes. Please read Writing patches and Submitting patches below for details on how to write and submit a patch.
- Join the scrapy-users mailing list and share your ideas on how to improve Scrapy. We're always open to suggestions.

Reporting bugs

Well-written bug reports are very helpful, so keep in mind the following guidelines when reporting a new bug.

- check the FAQ first to see if your issue is addressed in a well-known question
- check the open issues to see if it has already been reported. If it has, don't dismiss the report, but check the ticket history and comments; you may find additional useful information to contribute.
- search the scrapy-users list to see if it has been discussed there, or if you're not sure whether what you're seeing is a bug
- write complete, reproducible, specific bug reports. The smaller the test case, the better. Remember that other developers won't have your project to reproduce the bug, so please include all relevant files required to reproduce it.
- include the output of scrapy version -v so developers working on your bug know exactly which version and platform it occurred on

Writing patches

The better written a patch is, the higher the chance that it'll get accepted and the sooner it will be merged.

Well-written patches should:

- contain the minimum amount of code required for the specific change. Small patches are easier to review and merge. So, if you're doing more than one change (or bug fix), please consider submitting one patch per change. Do not collapse multiple changes into a single patch. For big changes consider using a patch queue.
- pass all unit-tests. See Running tests below.
- include one (or more) test cases that check the bug fixed or the new functionality added. See Writing tests below.
- if you're adding or changing a public (documented) API, please include the documentation changes in the same patch. See Documentation policies below.

Submitting patches

The best way to submit a patch is to issue a pull request.

Remember to explain what was fixed or the new functionality (what it is, why it's needed, etc).
The more info you include, the easier it will be for core developers to understand and accept your patch.

You can also discuss the new functionality (or bug fix) in scrapy-developers before creating the patch, but it's always good to have a patch ready to illustrate your arguments.

Finally, try to keep aesthetic changes (PEP 8 compliance, unused imports removal, etc.) in separate commits from functional changes, to make the pull request easier to review.

Coding style

Please follow these coding conventions when writing code for inclusion in Scrapy:

- Unless otherwise specified, follow PEP 8.
- It's OK to use lines longer than 80 chars if it improves the code readability.
- Don't put your name in the code you contribute. Our policy is to keep the contributor's name in the AUTHORS file distributed with Scrapy.

Scrapy Contrib

Scrapy contrib shares a similar rationale as Django contrib, which is explained in this post: http://jacobian.org/writing/what-is-django-contrib/. If you are working on a new functionality, please follow that rationale to decide whether it should be a Scrapy contrib.
If unsure, you can ask in scrapy-developers.

Documentation policies

- Don't use docstrings for documenting classes or methods which are already documented in the official (sphinx) documentation. For example, the ItemLoader.add_value() method should be documented in the sphinx documentation, not in its docstring.
- Do use docstrings for documenting functions not present in the official (sphinx) documentation, such as functions from the scrapy.utils package and its sub-modules.

Tests

Tests are implemented using the Twisted unit-testing framework.

Running tests

To run all tests go to the root directory of Scrapy source code and run:

    bin/runtests.sh (on unix)

    bin\runtests.bat (on windows)

To run a specific test (say scrapy.tests.test_contrib_loader) use:

    bin/runtests.sh scrapy.tests.test_contrib_loader (on unix)

    bin\runtests.bat scrapy.tests.test_contrib_loader (on windows)

Writing tests

All functionality (including new features and bug fixes) must include a test case to check that it works as expected, so please include tests for your patches if you want them to get accepted sooner. A minimal example module is sketched below.

Scrapy uses unit-tests, which are located in the scrapy.tests package. For example, the item loaders code is in:

    scrapy.contrib.loader

And their unit-tests are in:

    scrapy.tests.test_contrib_loader
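As a hedged illustration of that layout, a minimal trial-style test module; the module name and assertion are invented for the example:

    # scrapy/tests/test_utils_markup.py (hypothetical module)
    from twisted.trial import unittest

    from scrapy.utils.markup import replace_escape_chars


    class ReplaceEscapeCharsTest(unittest.TestCase):

        def test_removes_escape_chars_by_default(self):
            # \n and \t are stripped by default
            self.assertEqual(replace_escape_chars(u'a\nb\tc'), u'abc')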
Utopics-loaders-extendingrbX spiderargsrcj U spiderargsrdX experimentalrejKU experimentalrfX benchmarkingrghU benchmarkingrhXtopics-link-extractors-refrijBUtopics-link-extractors-refrjXdynamic-item-classesrkhPUdynamic-item-classesrlX.topics-request-response-ref-request-subclassesrmjU.topics-request-response-ref-request-subclassesrnXtopics-feed-uri-paramsrohbUtopics-feed-uri-paramsrpXtopics-downloader-middlewarerqhUtopics-downloader-middlewarerrXtopics-webservice-resourcesrshUtopics-webservice-resourcesrtXintro-overview-itemruhUintro-overview-itemrvXtopics-feed-format-jsonlinesrwhbUtopics-feed-format-jsonlinesrxXtopics-images-enablingryhUtopics-images-enablingrzXtopics-signalsr{joUtopics-signalsr|Xtopics-djangoitemr}jUtopics-djangoitemr~X6topics-request-response-ref-request-callback-argumentsrjU6topics-request-response-ref-request-callback-argumentsrXjson-with-large-datah}Ujson-with-large-dataXtopics-feed-storage-stdoutrhbUtopics-feed-storage-stdoutrXtopics-email-settingsrjUtopics-email-settingsrX topics-emailrjU topics-emailrXtopics-feed-format-picklerhbUtopics-feed-format-picklerXtopics-item-pipelinerh,Utopics-item-pipelinerXtopics-contributingrhUtopics-contributingrUgenindexrjUXtopics-exporters-serializersrh}Utopics-exporters-serializersrXtopics-leaks-trackrefsrjfUtopics-leaks-trackrefsrXtopics-spiders-refrj Utopics-spiders-refrXtopics-firebugrh>Utopics-firebugrXtopics-firefox-addonsrh#Utopics-firefox-addonsrXtopics-exceptionsrj'Utopics-exceptionsrXtopics-loaders-contextrh Utopics-loaders-contextrXtopics-items-fieldsrhkUtopics-items-fieldsrXtopics-feed-format-csvrhbUtopics-feed-format-csvrXajaxcrawl-middlewarerhUajaxcrawl-middlewarerX topics-jobsrhU topics-jobsrX topics-selectors-relative-xpathsrj9U topics-selectors-relative-xpathsrX-topics-request-response-ref-request-userloginrjU-topics-request-response-ref-request-userloginrXtopics-selectorsrj9Utopics-selectorsrXtopics-webservice-resources-refrhUtopics-webservice-resources-refrXtopics-firefox-livedomrh#Utopics-firefox-livedomrXtopics-project-structurerjUtopics-project-structurerXintro-install-platform-notesrh5Uintro-install-platform-notesrXtopics-api-statsrhUtopics-api-statsrXtopics-request-metarjUtopics-request-metarXtopics-practicesrhPUtopics-practicesrXtopics-api-crawlerrhUtopics-api-crawlerrXautothrottle-algorithmrhUautothrottle-algorithmrXtopics-loaders-processorsrh Utopics-loaders-processorsrXextending-scrapyrhUextending-scrapyrXtopics-request-responserjUtopics-request-responserXfaq-python-versionsrjUfaq-python-versionsrXtopics-spider-middleware-refrhtUtopics-spider-middleware-refrX topics-statsrjU topics-statsrXremoving-namespacesrj9Uremoving-namespacesrX#topics-loaders-available-processorsrh U#topics-loaders-available-processorsrX topics-itemsrhkU topics-itemsrX#topics-loaders-processors-declaringrh U#topics-loaders-processors-declaringrXtopics-shell-inspect-responserjUtopics-shell-inspect-responserXtopics-exportersrh}Utopics-exportersrXtopics-link-extractorsrjBUtopics-link-extractorsrXtopics-commands-refrjUtopics-commands-refrX"topics-selectors-nesting-selectorsrj9U"topics-selectors-nesting-selectorsrXtopics-contractsrhUtopics-contractsrX topics-leaksrjfU topics-leaksrXtopics-webservice-crawlerrhUtopics-webservice-crawlerrXtopics-leaks-without-leaksrjfUtopics-leaks-without-leaksrXtopics-settingsrhGUtopics-settingsrX topics-downloader-middleware-refrhU topics-downloader-middleware-refruUlabelsr}r(jjjXCommon Stats Collector usesjhjXThumbnail generationjhjXDummy policy (default)jjojXBuilt-in signals referencejh jX Item 
LoadersjhbjXS3jj'jXBuilt-in Exceptions referencejjjXCommand line tooljhPjXRun Scrapy from a scriptjhbjXMarshaljjjXLoggingjj]jXTelnet ConsolejhPjXAvoiding getting bannedjjjXFrequently Asked QuestionsjhbjXFTPjhjX$Filesystem storage backend (default)jjjX Scrapy shelljhbjXStoragesjhYjXArchitecture overviewjh}jXSerialization of item fieldsjhjXDownloading Item ImagesjhkjXDeclaring ItemsjjjXScrapydjjjX Log levelsj jj XResponse subclassesjj9jXNesting selectorsj hjX Signals APIjhjX(Implementing your custom Images PipelinejjjXVersioning and API Stabilityjh#jXUsing Firefox for scrapingjhbjX Feed exportsjU py-modindexUcsphinx.locale _TranslationProxy rcsphinx.locale mygettext rU Module IndexrrjjrbjjjXDebugging SpidersjjxjX ExtensionsjhjXBasic conceptsjjxjXMemory usage extensionj hj!X Web Servicej"h}j#X!Built-in Item Exporters referencej$j9j%XBuilt-in Selectors referencej&hbj'XXMLj(htj)XActivating a spider middlewarej*hbj+XLocal filesystemj,htj-XSpider Middlewarej.jxj/XWeb service extensionj0hPj1XDistributed crawlsj2hj3XScrapy Tutorialj4j0j5XExamplesj6hj7XScrapy at a glancej8jfj9X!Debugging memory leaks with Guppyj:hj;X What else?j<h5j=XInstallation guidej>hbj?XStorage backendsj@hjAX"Activating a downloader middlewarejBhjCX Broad CrawlsjDhjEXCore APIjFjTjGX Release notesjHhjIXCookiesMiddlewarejJjxjKXBuilt-in extensions referencejLhjMXDBM storage backendjNhjOXRFC2616 policyjPhGjQXBuilt-in settings referencejRjRUjjU Search PagerrjjrbjSj jTXSpidersjUhbjVXSerialization formatsjWhbjXXJSONjYhjZXRobotsTxtMiddlewarej[jj\XUbuntu packagesj]hj^XScrapy 0.22 documentationj_jxj`XTelnet console extensionjah jbX"Reusing and extending Item LoadersjejKjfXExperimental featuresjghjhX BenchmarkingjijBjjX"Built-in link extractors referencejkhPjlX Dynamic Creation of Item ClassesjmjjnXRequest subclassesjohbjpXStorage URI parametersjqhjrXDownloader MiddlewarejshjtXWeb service resourcesjuhjvX"Define the data you want to scrapejwhbjxX JSON linesjyhjzXEnabling your Images Pipelinej{joj|XSignalsj}jj~X DjangoItemjjjX-Passing additional data to callback functionsjcj jdXSpider argumentsjhbjXStandard outputjjjX Mail settingsjjjXSending e-mailjhbjXPicklejh,jX Item PipelinejhjXContributing to ScrapyjjUjjUIndexrrjjrbjh}jX&1. 
Declaring a serializer in the fieldjjfjX$Debugging memory leaks with trackrefjj jXBuilt-in spiders referencejh>jXUsing Firebug for scrapingjh#jX#Useful Firefox add-ons for scrapingjj'jX Exceptionsjh jXItem Loader ContextjhkjX Item FieldsjhbjXCSVjhjXAjaxCrawlMiddlewarejhjX!Jobs: pausing and resuming crawlsjj9jXWorking with relative XPathsjjjX:Using FormRequest.from_response() to simulate a user loginjj9jX SelectorsjhjXAvailable JSON-RPC resourcesjh#jX,Caveats with inspecting the live browser DOMjjjX$Default structure of Scrapy projectsjh5jX$Platform specific installation notesjhjXStats Collector APIjjjXRequest.meta special keysjhPjXCommon PracticesjhjX Crawler APIjhjXThrottling algorithmjh jXInput and Output processorsjhjXExtending ScrapyjjjXRequests and ResponsesjjjX)What Python versions does Scrapy support?jhtjX$Built-in spider middleware referencejjjXStats Collectionjj9jXRemoving namespacesjh jXAvailable built-in processorsjhkjXItemsjh jX%Declaring Input and Output ProcessorsjjjX4Invoking the shell from spiders to inspect responsesjh}jXItem ExportersjjBjXLink ExtractorsjjjXAvailable tool commandsjhjXSpiders ContractsjjfjXDebugging memory leaksjhjXCrawler JSON-RPC resourcejjfjXLeaks without leaksjhGjXSettingsjhjX(Built-in downloader middleware referenceuU progoptionsr}rUobjectsr}r(XcommandX startprojectjXstd:command-startprojectrXsettingXFEED_EXPORTERShbXstd:setting-FEED_EXPORTERSrXsettingXMEMUSAGE_LIMIT_MBhGXstd:setting-MEMUSAGE_LIMIT_MBrXreqmetaX bindaddressjXstd:reqmeta-bindaddressrXsettingXCONCURRENT_REQUESTShGXstd:setting-CONCURRENT_REQUESTSrXsettingX LOG_STDOUThGXstd:setting-LOG_STDOUTrXcommandXfetchjXstd:command-fetchrXsettingXDOWNLOADER_STATShGXstd:setting-DOWNLOADER_STATSrXsettingXWEBSERVICE_HOSThXstd:setting-WEBSERVICE_HOSTrXsettingXCONCURRENT_REQUESTS_PER_IPhGX&std:setting-CONCURRENT_REQUESTS_PER_IPrXcommandXversionjXstd:command-versionrXsettingX HTTPCACHE_DIRhXstd:setting-HTTPCACHE_DIRrXreqmetaXhandle_httpstatus_listhtX"std:reqmeta-handle_httpstatus_listrXsettingXSPIDER_CONTRACTShGXstd:setting-SPIDER_CONTRACTSrXsettingXHTTPCACHE_IGNORE_SCHEMEShX$std:setting-HTTPCACHE_IGNORE_SCHEMESrXsettingXTELNETCONSOLE_PORTj]Xstd:setting-TELNETCONSOLE_PORTrXsignalX spider_idlejoXstd:signal-spider_idlerXsettingXMAIL_SSLjXstd:setting-MAIL_SSLrXsettingXCLOSESPIDER_TIMEOUTjxXstd:setting-CLOSESPIDER_TIMEOUTr XsettingXURLLENGTH_LIMIThGXstd:setting-URLLENGTH_LIMITr XsettingXCONCURRENT_REQUESTS_PER_DOMAINhGX*std:setting-CONCURRENT_REQUESTS_PER_DOMAINr XsettingXSPIDER_CONTRACTS_BASEhGX!std:setting-SPIDER_CONTRACTS_BASEr XsettingXREDIRECT_MAX_METAREFRESH_DELAYhGX*std:setting-REDIRECT_MAX_METAREFRESH_DELAYr XsettingXHTTPCACHE_DBM_MODULEhX std:setting-HTTPCACHE_DBM_MODULErXsignalXresponse_downloadedjoXstd:signal-response_downloadedrXcommandXviewjXstd:command-viewrXsettingXSPIDER_MIDDLEWAREShGXstd:setting-SPIDER_MIDDLEWARESrXsettingX LOG_ENCODINGhGXstd:setting-LOG_ENCODINGrXsettingXDEPTH_PRIORITYhGXstd:setting-DEPTH_PRIORITYrXsettingXIMAGES_MIN_HEIGHThXstd:setting-IMAGES_MIN_HEIGHTrXsignalX spider_openedjoXstd:signal-spider_openedrXreqmetaX cookiejarhXstd:reqmeta-cookiejarrXsettingX MAIL_PORTjXstd:setting-MAIL_PORTrXsettingX MAIL_USERjXstd:setting-MAIL_USERrXsettingXDEPTH_STATS_VERBOSEhGXstd:setting-DEPTH_STATS_VERBOSErXsignalXresponse_receivedjoXstd:signal-response_receivedrXsettingXFEED_STORAGES_BASEhbXstd:setting-FEED_STORAGES_BASErXreqmetaX 
redirect_urlshXstd:reqmeta-redirect_urlsrXsettingXCLOSESPIDER_ERRORCOUNTjxX"std:setting-CLOSESPIDER_ERRORCOUNTrXsettingXDOWNLOAD_HANDLERShGXstd:setting-DOWNLOAD_HANDLERSrXsettingX STATS_DUMPhGXstd:setting-STATS_DUMPrXsettingXHTTPCACHE_STORAGEhXstd:setting-HTTPCACHE_STORAGEr XsettingXAJAXCRAWL_ENABLEDhXstd:setting-AJAXCRAWL_ENABLEDr!XsettingXDEFAULT_REQUEST_HEADERShGX#std:setting-DEFAULT_REQUEST_HEADERSr"XsettingX LOG_ENABLEDhGXstd:setting-LOG_ENABLEDr#XsettingXIMAGES_MIN_WIDTHhXstd:setting-IMAGES_MIN_WIDTHr$XsettingXCLOSESPIDER_ITEMCOUNTjxX!std:setting-CLOSESPIDER_ITEMCOUNTr%XsettingX IMAGES_STOREhXstd:setting-IMAGES_STOREr&XreqmetaX dont_redirecthXstd:reqmeta-dont_redirectr'XsettingX MAIL_PASSjXstd:setting-MAIL_PASSr(XcommandXlistjXstd:command-listr)XsettingXMEMDEBUG_ENABLEDhGXstd:setting-MEMDEBUG_ENABLEDr*XcommandXshelljXstd:command-shellr+XsettingXHTTPCACHE_POLICYhXstd:setting-HTTPCACHE_POLICYr,XsettingXRETRY_HTTP_CODEShXstd:setting-RETRY_HTTP_CODESr-XsettingXDOWNLOAD_DELAYhGXstd:setting-DOWNLOAD_DELAYr.XsettingX COOKIES_DEBUGhXstd:setting-COOKIES_DEBUGr/XsettingXMEMUSAGE_WARNING_MBhGXstd:setting-MEMUSAGE_WARNING_MBr0XsettingXMEMUSAGE_ENABLEDhGXstd:setting-MEMUSAGE_ENABLEDr1XsettingXIMAGES_EXPIREShXstd:setting-IMAGES_EXPIRESr2XsettingXAUTOTHROTTLE_DEBUGhXstd:setting-AUTOTHROTTLE_DEBUGr3XsettingXFEED_EXPORTERS_BASEhbXstd:setting-FEED_EXPORTERS_BASEr4XsettingX FEED_STORAGEShbXstd:setting-FEED_STORAGESr5XsettingXAWS_SECRET_ACCESS_KEYhGX!std:setting-AWS_SECRET_ACCESS_KEYr6XcommandXparsejXstd:command-parser7XsettingXRANDOMIZE_DOWNLOAD_DELAYhGX$std:setting-RANDOMIZE_DOWNLOAD_DELAYr8XsettingX RETRY_ENABLEDhXstd:setting-RETRY_ENABLEDr9XsettingXSTATSMAILER_RCPTShGXstd:setting-STATSMAILER_RCPTSr:XsettingXREDIRECT_ENABLEDhXstd:setting-REDIRECT_ENABLEDr;XsettingXAUTOTHROTTLE_ENABLEDhX std:setting-AUTOTHROTTLE_ENABLEDr<XsettingXCLOSESPIDER_PAGECOUNTjxX!std:setting-CLOSESPIDER_PAGECOUNTr=XsettingX MAIL_HOSTjXstd:setting-MAIL_HOSTr>XsettingX MAIL_FROMjXstd:setting-MAIL_FROMr?XsettingXCOOKIES_ENABLEDhXstd:setting-COOKIES_ENABLEDr@XsettingXCONCURRENT_ITEMShGXstd:setting-CONCURRENT_ITEMSrAXsettingXFEED_STORE_EMPTYhbXstd:setting-FEED_STORE_EMPTYrBXsettingXDOWNLOADER_MIDDLEWAREShGX"std:setting-DOWNLOADER_MIDDLEWARESrCXsettingX RETRY_TIMEShXstd:setting-RETRY_TIMESrDXcommandXcrawljXstd:command-crawlrEXcommandXeditjXstd:command-editrFXsignalXengine_stoppedjoXstd:signal-engine_stoppedrGXsettingXDNSCACHE_ENABLEDhGXstd:setting-DNSCACHE_ENABLEDrHXsettingXMEMUSAGE_REPORThGXstd:setting-MEMUSAGE_REPORTrIXsettingXTELNETCONSOLE_ENABLEDhGX!std:setting-TELNETCONSOLE_ENABLEDrJXsettingXWEBSERVICE_LOGFILEhXstd:setting-WEBSERVICE_LOGFILErKXsettingXREDIRECT_PRIORITY_ADJUSThGX$std:setting-REDIRECT_PRIORITY_ADJUSTrLXsignalXupdate_telnet_varsj]Xstd:signal-update_telnet_varsrMXcommandXbenchjXstd:command-benchrNXsettingX USER_AGENThGXstd:setting-USER_AGENTrOXsignalX item_droppedjoXstd:signal-item_droppedrPXsettingXHTTPCACHE_EXPIRATION_SECShX%std:setting-HTTPCACHE_EXPIRATION_SECSrQXsettingXSPIDER_MIDDLEWARES_BASEhGX#std:setting-SPIDER_MIDDLEWARES_BASErRXsettingXREFERER_ENABLEDhtXstd:setting-REFERER_ENABLEDrSXsettingX 
DEPTH_LIMIThGXstd:setting-DEPTH_LIMITrTXsettingXLOG_FILEhGXstd:setting-LOG_FILErUXsettingXDEFAULT_ITEM_CLASShGXstd:setting-DEFAULT_ITEM_CLASSrVXsettingXAUTOTHROTTLE_START_DELAYhX$std:setting-AUTOTHROTTLE_START_DELAYrWXsettingXFEED_URIhbXstd:setting-FEED_URIrXXsettingXHTTPCACHE_ENABLEDhXstd:setting-HTTPCACHE_ENABLEDrYXsettingXNEWSPIDER_MODULEhGXstd:setting-NEWSPIDER_MODULErZXsettingXDOWNLOADER_DEBUGhGXstd:setting-DOWNLOADER_DEBUGr[XsettingXMEMDEBUG_NOTIFYhGXstd:setting-MEMDEBUG_NOTIFYr\XsettingXDOWNLOAD_TIMEOUThGXstd:setting-DOWNLOAD_TIMEOUTr]XsettingXBOT_NAMEhGXstd:setting-BOT_NAMEr^XsettingXROBOTSTXT_OBEYhGXstd:setting-ROBOTSTXT_OBEYr_XcommandX genspiderjXstd:command-genspiderr`XreqmetaX dont_retryhXstd:reqmeta-dont_retryraXsignalXengine_startedjoXstd:signal-engine_startedrbXsettingXITEM_PIPELINES_BASEhGXstd:setting-ITEM_PIPELINES_BASErcXsettingXHTTPERROR_ALLOW_ALLhtXstd:setting-HTTPERROR_ALLOW_ALLrdXsettingXAWS_ACCESS_KEY_IDhGXstd:setting-AWS_ACCESS_KEY_IDreXsignalX item_scrapedjoXstd:signal-item_scrapedrfXsettingX LOG_LEVELhGXstd:setting-LOG_LEVELrgXsettingXCOMPRESSION_ENABLEDhXstd:setting-COMPRESSION_ENABLEDrhXcommandXsettingsjXstd:command-settingsriXsettingXMAIL_TLSjXstd:setting-MAIL_TLSrjXsettingXWEBSERVICE_PORThXstd:setting-WEBSERVICE_PORTrkXsettingXDOWNLOAD_HANDLERS_BASEhGX"std:setting-DOWNLOAD_HANDLERS_BASErlXsignalX spider_closedjoXstd:signal-spider_closedrmXsettingX STATS_CLASShGXstd:setting-STATS_CLASSrnXsettingXDUPEFILTER_CLASShGXstd:setting-DUPEFILTER_CLASSroXsignalX spider_errorjoXstd:signal-spider_errorrpXsettingXHTTPCACHE_IGNORE_MISSINGhX$std:setting-HTTPCACHE_IGNORE_MISSINGrqXcommandXdeployjXstd:command-deployrrXsettingXITEM_PIPELINEShGXstd:setting-ITEM_PIPELINESrsXsettingXAUTOTHROTTLE_MAX_DELAYhX"std:setting-AUTOTHROTTLE_MAX_DELAYrtXsettingXMEMUSAGE_NOTIFY_MAILhGX std:setting-MEMUSAGE_NOTIFY_MAILruXsettingXHTTPERROR_ALLOWED_CODEShtX#std:setting-HTTPERROR_ALLOWED_CODESrvXsettingXEXTENSIONS_BASEhGXstd:setting-EXTENSIONS_BASErwXsettingX DEPTH_STATShGXstd:setting-DEPTH_STATSrxXsettingXWEBSERVICE_ENABLEDhXstd:setting-WEBSERVICE_ENABLEDryXsettingXDOWNLOADER_MIDDLEWARES_BASEhGX'std:setting-DOWNLOADER_MIDDLEWARES_BASErzXsettingXjDITORhGXstd:setting-jDITORr{XcommandX runspiderjXstd:command-runspiderr|XsettingXCOMMANDS_MODULEjXstd:setting-COMMANDS_MODULEr}XsettingXREDIRECT_MAX_TIMEShGXstd:setting-REDIRECT_MAX_TIMESr~XsettingX FEED_FORMAThbXstd:setting-FEED_FORMATrXsettingX SCHEDULERhGXstd:setting-SCHEDULERrXsettingXMETAREFRESH_ENABLEDhXstd:setting-METAREFRESH_ENABLEDrXsettingX EXTENSIONShGXstd:setting-EXTENSIONSrXsettingXSPIDER_MODULEShGXstd:setting-SPIDER_MODULESrXcommandXcheckjXstd:command-checkrXsettingX TEMPLATES_DIRhGXstd:setting-TEMPLATES_DIRrXsettingXHTTPCACHE_IGNORE_HTTP_CODEShX'std:setting-HTTPCACHE_IGNORE_HTTP_CODESrXsettingX IMAGES_THUMBShXstd:setting-IMAGES_THUMBSrXsettingXTELNETCONSOLE_HOSTj]Xstd:setting-TELNETCONSOLE_HOSTruuUc}r(j}rjKuUpyr}r(j}r(X+scrapy.contrib.downloadermiddleware.cookiesrhUmodulerXAscrapy.contrib.downloadermiddleware.useragent.UserAgentMiddlewarerhXclassX>scrapy.contrib.memdebug.scrapy.contrib.memdebug.MemoryDebuggerrjxXclassXscrapy.contrib.debugrjxjX8scrapy.contrib.linkextractors.sgml.BaseSgmlLinkExtractorrjBXclassX(scrapy.statscol.StatsCollector.min_valuerhXmethodX)scrapy.contrib.downloadermiddleware.statsrhjX$scrapy.selector.Selector.__nonzero__rj9XmethodrXscrapy.exceptions.NotConfiguredrj'X 
exceptionXscrapy.http.ResponserjXclassXscrapy.signals.item_scrapedrjoXfunctionrX)scrapy.contrib.spidermiddleware.urllengthrhtjX*scrapy.selector.Selector.remove_namespacesrj9XmethodrX)scrapy.contrib.loader.processor.TakeFirstrh XclassX(scrapy.contracts.default.ScrapesContractrhXclassXMscrapy.contrib.webservice.enginestatus.scrapy.webservice.JsonResource.ws_namerhX attributerX=scrapy.contrib.spidermiddleware.urllength.UrlLengthMiddlewarerhtXclassrX"scrapy.contrib.linkextractors.sgmlrjBjXscrapy.selector.SelectorList.rerj9XmethodrXIscrapy.contrib.downloadermiddleware.DownloaderMiddleware.process_responserhXmethodX)scrapy.telnet.scrapy.telnet.TelnetConsolerjxXclassXscrapy.utils.trackref.iter_allrjfXfunctionXscrapy.spider.Spider.start_urlsrj X attributerXscrapy.item.FieldrhkXclassXscrapy.exceptions.CloseSpiderrj'X exceptionXscrapy.selector.SelectorListrj9XclassrXscrapy.contrib.memusagerjxjX2scrapy.contrib.downloadermiddleware.defaultheadersrhjX-scrapy.contrib.spiders.XMLFeedSpider.iteratorrj X attributerXscrapy.spider.Spider.namerj X attributerXscrapy.http.Response.metarjX attributeXscrapy.settings.Settings.getrhXmethodX*scrapy.statscol.StatsCollector.open_spiderrhXmethodX-scrapy.contrib.downloadermiddleware.httpproxyrhjXKscrapy.contrib.downloadermiddleware.defaultheaders.DefaultHeadersMiddlewarerhXclassX9scrapy.contrib.exporter.BaseItemExporter.fields_to_exportrh}X attributeX"scrapy.selector.SelectorList.xpathrj9XmethodrX*scrapy.contrib.loader.processor.MapComposerh XclassX$scrapy.contracts.default.UrlContractrhXclassX*scrapy.statscol.StatsCollector.clear_statsrhXmethodX,scrapy.contrib.loader.ItemLoader.replace_cssrh XmethodX(scrapy.statscol.StatsCollector.max_valuerhXmethodXAscrapy.contrib.closespider.scrapy.contrib.closespider.CloseSpiderrjxXclassXscrapy.contrib.loader.processorrh jX*scrapy.contrib.loader.ItemLoader.get_xpathrh XmethodX4scrapy.contrib.exporter.BaseItemExporter.export_itemrh}XmethodXscrapy.exceptionsrj'jXEscrapy.contrib.spidermiddleware.SpiderMiddleware.process_spider_inputrhtXmethodrX'scrapy.contrib.exporter.XmlItemExporterrh}XclassXscrapy.http.Response.replacerjXmethodX"scrapy.settings.Settings.overridesrhX attributeXscrapy.crawlerrhjX"scrapy.contrib.corestats.CoreStatsrjxXclassX$scrapy.mail.MailSender.from_settingsrjX classmethodXscrapy.contrib.statsmailerrjxjXscrapy.item.ItemrhkXclassXscrapy.spider.Spider.parserj XmethodrXAscrapy.contrib.downloadermiddleware.robotstxt.RobotsTxtMiddlewarerhXclassXscrapy.settings.SettingsrhXclassX-scrapy.contrib.downloadermiddleware.ajaxcrawlrhjXscrapy.settingsrhjXscrapy.statscol.StatsCollectorrhXclassXAscrapy.contrib.downloadermiddleware.httpcache.HttpCacheMiddlewarerhXclassXscrapy.http.TextResponserjXclassXJscrapy.contrib.downloadermiddleware.DownloaderMiddleware.process_exceptionrhXmethodX$scrapy.contrib.spiders.CSVFeedSpiderrj XclassrX*scrapy.contrib.exporter.PprintItemExporterrh}XclassXscrapy.signalsrjojX scrapy.selector.Selector.extractrj9XmethodrX9scrapy.contrib.loader.ItemLoader.default_output_processorrh X attributeX(scrapy.http.TextResponse.body_as_unicoderjXmethodX!scrapy.settings.Settings.getfloatrhXmethodXscrapy.http.Request.copyrjXmethodX(scrapy.statscol.StatsCollector.get_statsrhXmethodXscrapy.crawler.Crawler.startrhXmethodXscrapy.http.Response.headersrjX attributeXscrapy.log.ERRORrjXdataX8scrapy.contrib.debug.scrapy.contrib.debug.StackTraceDumprjxXclassX"scrapy.contrib.spiders.CrawlSpiderrj XclassrX 
scrapy.utils.trackref.object_refrjfXclassX?scrapy.contrib.downloadermiddleware.redirect.RedirectMiddlewarerhXclassX2scrapy.contrib.spiders.CrawlSpider.parse_start_urlrj XmethodrX scrapy.utils.trackref.get_oldestrjfXfunctionXscrapy.statscolrjjXscrapy.http.FormRequestrjXclassX scrapy.selector.SelectorList.cssrj9XmethodrXscrapy.http.XmlResponserjXclassXscrapy.crawler.CrawlerrhXclassX-scrapy.signalmanager.SignalManager.disconnectrhXmethodX&scrapy.contracts.Contract.post_processrhXmethodXscrapy.contrib.webservicerhjX scrapy.settings.Settings.getlistrhXmethodX1scrapy.signalmanager.SignalManager.send_catch_logrhXmethodXGscrapy.contrib.spidermiddleware.SpiderMiddleware.process_start_requestsrhtXmethodrX!scrapy.http.TextResponse.encodingrjX attributeXBscrapy.contrib.downloadermiddleware.redirect.MetaRefreshMiddlewarerhXclassX(scrapy.statscol.StatsCollector.set_statsrhXmethodXscrapy.log.WARNINGrjXdataX'scrapy.contrib.exporter.CsvItemExporterrh}XclassX*scrapy.contrib.loader.ItemLoader.load_itemrh XmethodX=scrapy.contrib.spidermiddleware.httperror.HttpErrorMiddlewarerhtXclassrXHscrapy.contrib.downloadermiddleware.DownloaderMiddleware.process_requestrhXmethodX scrapy.signals.response_receivedr joXfunctionr X%scrapy.contrib.spidermiddleware.depthr htjX.scrapy.webservice.scrapy.webservice.WebServicer jxXclassXscrapy.signalmanagerr hjX8scrapy.contrib.exporter.BaseItemExporter.start_exportingrh}XmethodX!scrapy.crawler.Crawler.extensionsrhX attributeX-scrapy.contrib.downloadermiddleware.httpcacherhjXscrapy.signals.engine_startedrjoXfunctionrXscrapy.crawler.Crawler.spidersrhX attributeX1scrapy.contrib.exporter.BaseItemExporter.encodingrh}X attributeXscrapy.selectorrj9jX scrapy.spiderrj jXscrapy.spider.Spider.logrj XmethodrX)scrapy.contrib.loader.ItemLoader.selectorrh X attributeX9scrapy.contrib.spidermiddleware.offsite.OffsiteMiddlewarerhtXclassrX&scrapy.contrib.webservice.enginestatusrhjX(scrapy.contracts.default.ReturnsContractrhXclassX scrapy.httprjjXscrapy.http.Request.methodrjX attributeXscrapy.item.Item.fieldsr hkX attributeXscrapy.selector.Selector.rer!j9Xmethodr"X4scrapy.contrib.spiders.XMLFeedSpider.process_resultsr#j Xmethodr$X8scrapy.contrib.downloadermiddleware.DownloaderMiddlewarer%hXclassX5scrapy.contrib.loader.ItemLoader.get_collected_valuesr&h XmethodX7scrapy.contrib.loader.ItemLoader.default_selector_classr'h X attributeXEscrapy.contrib.downloadermiddleware.chunked.ChunkedTransferMiddlewarer(hXclassXAscrapy.contrib.downloadermiddleware.httpproxy.HttpProxyMiddlewarer)hXclassXscrapy.exceptions.NotSupportedr*j'X exceptionXscrapy.contrib.corestatsr+jxjX*scrapy.signalmanager.SignalManager.connectr,hXmethodX*scrapy.contrib.loader.ItemLoader.add_xpathr-h XmethodX%scrapy.http.FormRequest.from_responser.jX classmethodX'scrapy.contrib.loader.processor.Composer/h XclassXFscrapy.contrib.spidermiddleware.SpiderMiddleware.process_spider_outputr0htXmethodr1X(scrapy.contrib.loader.ItemLoader.add_cssr2h XmethodX$scrapy.selector.SelectorList.extractr3j9Xmethodr4X.scrapy.contrib.spiders.CSVFeedSpider.delimiterr5j X attributer6Xscrapy.crawler.Crawler.statsr7hX attributeX#scrapy.statscol.DummyStatsCollectorr8jXclassr9Xscrapy.http.Request.urlr:jX attributeX scrapy.contrib.logstats.LogStatsr;jxXclassX(scrapy.contrib.loader.ItemLoader.get_cssr<h XmethodX1scrapy.contrib.spiders.SitemapSpider.sitemap_urlsr=j X attributer>Xscrapy.log.DEBUGr?jXdataX;scrapy.contrib.memusage.scrapy.contrib.memusage.MemoryUsager@jxXclassX8scrapy.contrib.loader.ItemLoader.default_input_processorrAh X 
attributeXscrapy.http.Request.headersrBjX attributeX.scrapy.contrib.loader.ItemLoader.replace_valuerCh XmethodXscrapy.contractsrDhjXscrapy.http.Request.metarEjX attributeXscrapy.contrib.spidermiddlewarerFhtjXHscrapy.contrib.webservice.enginestatus.scrapy.webservice.JsonRpcResourcerGhXclassrHX4scrapy.contrib.linkextractors.sgml.SgmlLinkExtractorrIjBXclassX=scrapy.contrib.downloadermiddleware.cookies.CookiesMiddlewarerJhXclassX<scrapy.contrib.spiders.SitemapSpider.sitemap_alternate_linksrKj X attributerLXscrapy.exceptions.IgnoreRequestrMj'X exceptionX-scrapy.contrib.downloadermiddleware.useragentrNhjX3scrapy.contrib.downloadermiddleware.downloadtimeoutrOhjX(scrapy.contrib.spiders.CrawlSpider.rulesrPj X attributerQX close_spiderrRh,XmethodX scrapy.crawler.Crawler.configurerShXmethodX;scrapy.contrib.webservice.enginestatus.EngineStatusResourcerThXclassrUX<scrapy.contrib.exporter.BaseItemExporter.export_empty_fieldsrVh}X attributeXscrapy.contrib.closespiderrWjxjX-scrapy.contrib.pipeline.images.ImagesPipelinerXhXclassXscrapy.contrib.linkextractorsrYjBjXscrapy.signals.engine_stoppedrZjoXfunctionr[X#scrapy.spider.Spider.start_requestsr\j Xmethodr]X8scrapy.contrib.exporter.BaseItemExporter.serialize_fieldr^h}XmethodXscrapy.signals.spider_idler_joXfunctionr`X"scrapy.signals.response_downloadedrajoXfunctionrbX scrapy.telnetrcj]jX"scrapy.signalmanager.SignalManagerrdhXclassXscrapy.utils.trackrefrejfjX$scrapy.statscol.MemoryStatsCollectorrfjXclassrgXscrapy.http.Request.bodyrhjX attributeXscrapy.http.Request.replacerijXmethodXscrapy.signals.spider_openedrjjoXfunctionrkX-scrapy.contracts.Contract.adjust_request_argsrlhXmethodXscrapy.crawler.Crawler.enginermhX attributeXscrapy.signals.item_droppedrnjoXfunctionroX#scrapy.contrib.downloadermiddlewarerphjX/scrapy.contrib.spiders.XMLFeedSpider.namespacesrqj X attributerrX9scrapy.contrib.exporter.BaseItemExporter.finish_exportingrsh}XmethodX*scrapy.contrib.loader.ItemLoader.add_valuerth XmethodXscrapy.http.Response.requestrujX attributeX9scrapy.contrib.downloadermiddleware.retry.RetryMiddlewarervhXclassX?scrapy.contrib.downloadermiddleware.httpauth.HttpAuthMiddlewarerwhXclassXscrapy.selector.Selector.cssrxj9XmethodryXscrapy.log.CRITICALrzjXdataX5scrapy.contrib.spidermiddleware.depth.DepthMiddlewarer{htXclassr|Xscrapy.http.Response.urlr}jX attributeXscrapy.contrib.exporterr~h}jXscrapy.selector.Selector.xpathrj9XmethodrX(scrapy.contrib.exporter.BaseItemExporterrh}XclassX1scrapy.contrib.webservice.crawler.CrawlerResourcerhXclassrX scrapy.mailrjjX3scrapy.contrib.loader.ItemLoader.default_item_classrh X attributeX3scrapy.contrib.spiders.SitemapSpider.sitemap_followrj X attributerX+scrapy.contrib.downloadermiddleware.chunkedrhjX3scrapy.contrib.spiders.XMLFeedSpider.adapt_responserj XmethodrX5scrapy.contrib.loader.ItemLoader.get_output_processorrh XmethodX(scrapy.statscol.StatsCollector.inc_valuerhXmethodXscrapy.exceptions.DropItemrj'X exceptionXEscrapy.contrib.webservice.enginestatus.scrapy.webservice.JsonResourcerhXclassrX+scrapy.selector.Selector.register_namespacerj9XmethodrXscrapy.contrib.loaderrh jX*scrapy.contrib.exporter.PickleItemExporterrh}XclassX(scrapy.selector.SelectorList.__nonzero__rj9XmethodrXscrapy.contracts.ContractrhXclassXscrapy.crawler.Crawler.settingsrhX attributeXscrapy.contrib.pipeline.imagesrhjXscrapy.webservicerjxjX:scrapy.signalmanager.SignalManager.send_catch_log_deferredrhXmethodX@scrapy.contrib.pipeline.images.ImagesPipeline.get_media_requestsrhXmethodX%scrapy.contrib.loader.ItemLoader.itemrh X 
attributeXscrapy.http.HtmlResponserjXclassX-scrapy.contrib.exporter.JsonLinesItemExporterrh}XclassXscrapy.http.Response.statusrjX attributeX-scrapy.contrib.downloadermiddleware.robotstxtrhjX4scrapy.contrib.loader.ItemLoader.get_input_processorrh XmethodX/scrapy.contrib.spiders.XMLFeedSpider.parse_noderj XmethodrXscrapy.spider.Spiderrj XclassrX$scrapy.contrib.spiders.SitemapSpiderrj XclassrXscrapy.contrib.webservice.statsrhjXscrapy.crawler.Crawler.signalsrhX attributeXscrapy.http.Response.flagsrjX attributeXMscrapy.contrib.downloadermiddleware.downloadtimeout.DownloadTimeoutMiddlewarerhXclassXscrapy.contrib.memdebugrjxjXscrapy.contrib.spiders.Rulerj XclassrX(scrapy.statscol.StatsCollector.get_valuerhXmethodX<scrapy.contrib.pipeline.images.ImagesPipeline.item_completedrhXmethodX process_itemrh,XmethodX scrapy.itemrhkjX(scrapy.contrib.loader.ItemLoader.contextrh X attributeX+scrapy.statscol.StatsCollector.close_spiderrhXmethodXscrapy.signals.spider_errorrjoXfunctionrX,scrapy.contrib.spiders.CSVFeedSpider.headersrj X attributerX1scrapy.contrib.loader.ItemLoader.get_output_valuerh XmethodX(scrapy.statscol.StatsCollector.set_valuerhXmethodX0scrapy.contrib.spidermiddleware.SpiderMiddlewarerhtXclassrX scrapy.settings.Settings.getboolrhXmethodX$scrapy.spider.Spider.allowed_domainsrj X attributerX scrapy.telnet.update_telnet_varsrj]XfunctionrX)scrapy.contrib.spidermiddleware.httperrorrhtjXscrapy.mail.MailSender.sendrjXmethodX2scrapy.contrib.spiders.SitemapSpider.sitemap_rulesrj X attributerX scrapy.logrjjX!scrapy.contrib.webservice.crawlerrhjXAscrapy.contrib.statsmailer.scrapy.contrib.statsmailer.StatsMailerrjxXclassX+scrapy.spider.Spider.make_requests_from_urlrj XmethodrX scrapy.contrib.loader.ItemLoaderrh XclassX open_spiderrh,XmethodX$scrapy.contrib.loader.processor.Joinrh XclassX%scrapy.utils.trackref.print_live_refsrjfXfunctionX.scrapy.contrib.spiders.CSVFeedSpider.parse_rowrj XmethodrX.scrapy.contrib.loader.ItemLoader.replace_xpathrh XmethodXscrapy.http.Response.bodyrjX attributeX,scrapy.contrib.spiders.XMLFeedSpider.itertagrj X attributerX,scrapy.contrib.downloadermiddleware.redirectrhjX'scrapy.contrib.spidermiddleware.offsiterhtjXscrapy.http.RequestrjXclassXscrapy.signals.spider_closedrjoXfunctionrX9scrapy.contrib.spidermiddleware.referer.RefererMiddlewarerhtXclassrX-scrapy.contrib.webservice.stats.StatsResourcerhXclassrX1scrapy.signalmanager.SignalManager.disconnect_allrhXmethodX1scrapy.statscol.MemoryStatsCollector.spider_statsrjX attributerXscrapy.log.INFOrjXdataXscrapy.log.startrjXfunctionX*scrapy.contrib.loader.ItemLoader.get_valuerh XmethodXscrapy.http.Response.copyrjXmethodXscrapy.contrib.logstatsrjxjX2scrapy.contrib.debug.scrapy.contrib.debug.DebuggerrjxXclassXAscrapy.contrib.downloadermiddleware.ajaxcrawl.AjaxCrawlMiddlewarerhXclassX)scrapy.contrib.downloadermiddleware.retryrhjX(scrapy.contrib.loader.processor.Identityrh XclassXSscrapy.contrib.webservice.enginestatus.scrapy.webservice.JsonRpcResource.get_targetrhXmethodrX%scrapy.contracts.Contract.pre_processrhXmethodXscrapy.selector.Selectorrj9XclassrX(scrapy.contrib.exporter.JsonItemExporterrh}XclassXscrapy.contrib.spidersrj jXscrapy.contracts.defaultrhjX$scrapy.contrib.spiders.XMLFeedSpiderrj 
XclassrX9scrapy.contrib.downloadermiddleware.stats.DownloaderStatsrhXclassXscrapy.mail.MailSenderrjXclassX,scrapy.contrib.downloadermiddleware.httpauthrhjXMscrapy.contrib.downloadermiddleware.httpcompression.HttpCompressionMiddlewarerhXclassX3scrapy.contrib.downloadermiddleware.httpcompressionrhjXscrapy.settings.Settings.getintrhXmethodXIscrapy.contrib.spidermiddleware.SpiderMiddleware.process_spider_exceptionrhtXmethodrX'scrapy.contrib.spidermiddleware.refererrhtjXscrapy.log.msgrjXfunctionuUmodulesr}r(j(hcdocutils.nodes reprunicode rXCookies Downloader Middlewarerr}rbUtj(jxjX Web servicerr}rbUtj(htjXURL Length Spider Middlewarerr}r bUtj(jxjXExtensions for debugging Scrapyr r }r bUtj(jjXEmail sending facilityr r}rbUtj(hjXrobots.txt middlewarerr}rbUtjc(j]jXThe Telnet Consolerr}rbUtj(j jX8Spiders base class, spider manager and spider middlewarerr}rbUtj(hjXStats JSON-RPC resourcerr}rbUtj(hjXSettings managerrr}rbUtj(jxjXMemory debugger extensionrr }r!bUtjD(hUUtjF(htUUtj(jjXStats Collectorsr"r#}r$bUtj(hjXCrawler JSON-RPC resourcer%r&}r'bUtj(j9jXSelector classr(r)}r*bUtj(jBjX SGMLParser-based link extractorsr+r,}r-bUtj(htjXOffsite Spider Middlewarer.r/}r0bUtj (htjXDepth Spider Middlewarer1r2}r3bUtjN(hjXUser Agent Middlewarer4r5}r6bUtj(hkjXItem and Field classesr7r8}r9bUtj(hjXBuilt-in web service resourcesr:r;}r<bUtjO(hjXDownload timeout middlewarer=r>}r?bUtj(jxjXMemory usage extensionr@rA}rBbUtj(hjX%Default Headers Downloader MiddlewarerCrD}rEbUtj(hjXHttp Proxy MiddlewarerFrG}rHbUtjY(jBjXLink extractors classesrIrJ}rKbUtj(hjXDownloader Stats MiddlewarerLrM}rNbUtj(jjXLogging facilityrOrP}rQbUtj(hjXRedirection MiddlewarerRrS}rTbUtj(jjXRequest and Response classesrUrV}rWbUtje(jfjX Track references of live objectsrXrY}rZbUtj(h jX3A collection of processors to use with Item Loadersr[r\}r]bUtj (hjXThe signal managerr^r_}r`bUtj(hjX HTTP Cache downloader middlewarerarb}rcbUtj(j'jXScrapy exceptionsrdre}rfbUtjW(jxjXClose spider extensionrgrh}ribUtjp(hUUtj(hjXThe Scrapy crawlerrjrk}rlbUtj(jxjXBasic stats loggingrmrn}robUtj(jxjXStatsMailer extensionrprq}rrbUtj(hjXEngine Status JSON resourcersrt}rubUtj(hjXRetry Middlewarervrw}rxbUtj~(h}jXItem Exportersryrz}r{bUtj(j jXCollection of generic spidersr|r}}r~bUtj(hUUtj(hUUtj(hjXChunked Transfer Middlewarerr}rbUtj(jojXSignals definitionsrr}rbUtj(hjXHTTP Auth downloader middlewarerr}rbUtj+(jxjXCore stats collectionrr}rbUtj(hjXHttp Compression Middlewarerr}rbUtj(h jXItem Loader classrr}rbUtj(htjXHTTP Error Spider Middlewarerr}rbUtj(htjXReferer Spider Middlewarerr}rbUtj(hjXImages Pipelinerr}rbUtujKuUjsr}r(j}rjKuUrstr}r(j}rjKuUcppr}r(j}rjKuuU glob_toctreesrh]RrU reread_alwaysrh]RrU doctreedirrXG/var/build/user_builds/scrapy/checkouts/0.22/docs/_build/html/.doctreesrUversioning_conditionrU citationsr}jK*UsrcdirrX1/var/build/user_builds/scrapy/checkouts/0.22/docsrUconfigrcsphinx.config Config r)r}r(U html_contextr}r(U github_userUscrapyrUnamerXScrapyrU github_repojUversionsr]r(UlatestU /en/latest/rU0.22rU /en/0.22/rU0.20U /en/0.20/rU0.18U /en/0.18/rU0.16U /en/0.16/rU0.14U /en/0.14/rU0.12U /en/0.12/rU0.10.3U /en/0.10.3/rU0.9U/en/0.9/rU0.8U/en/0.8/rU0.7U/en/0.7/reU using_themeU downloads]r(UPDFU8https://media.readthedocs.org/pdf/scrapy/0.22/scrapy.pdfrUHTMLU}hG}hP}hY}hb}hk}ht}h}}h}h}h}h}h}h}h}h}h}h}h}h}h}j}j }j}j}j'}j0}j9}jB}jK}jT}j]}jf}jo}jx}j}j}j}j}j}j}j}j}j}uUversionchangesr}r(X0.8]r(X versionaddedhM@jNXNew in version 0.8.traX0.10.3]r(X versionaddedjM9jXFormRequest.from_responseXrX.New in version 0.10.3: The formname parameter.traX0.21]r(X versionaddedhM<jNXNew in 
version 0.21.traX0.15]r((X versionaddedhKNNXNew in version 0.15.tr(X versionaddedhKNNXNew in version 0.15.tr(X versionaddedhKjNXNew in version 0.15.tr(X versionaddedrhtKjFX'SpiderMiddleware.process_start_requestsrNXNew in version 0.15.tr (X versionaddedr htM9jNXNew in version 0.15.tr eX0.17]r ((X versionaddedhKNNXNew in version 0.17.tr (X versionaddedjMNNXNew in version 0.17.tr(X versionaddedhMjNXNew in version 0.17.tr(X versionaddedjM<jjX-New in version 0.17: The formxpath parameter.treX0.11]r((X versionaddedjMNNXNew in version 0.11.tr(X versionaddedhMjNXNew in version 0.11.tr(XversionchangedhMjNXMChanged in version 0.11: Before 0.11, HTTPCACHE_DIR was used to enable cache.tr(XversionchangedhMjNXOChanged in version 0.11: Before 0.11, zero meant cached requests always expire.tr(X versionaddedjxM0jWNXNew in version 0.11.tr(X versionaddedjxM>jWNXNew in version 0.11.treX0.10]r((X versionaddedjKNNXNew in version 0.10.tr(X versionaddedhMjNXNew in version 0.10.tr(X versionaddedhMjNXNew in version 0.10.tr(X versionaddedhbKNNXNew in version 0.10.tr(X versionaddedjKNNXNew in version 0.10.treX0.13]r((X versionaddedhMjNXNew in version 0.13.tr(X versionaddedhMjNXNew in version 0.13.tr (X versionaddedhMtjNXNew in version 0.13.tr!(X versionaddedhMjNXNew in version 0.13.tr"eX0.18]r#(X versionaddedhMjNXNew in version 0.18.tr$auUtoc_num_entriesr%}r&(h K h#Kh,Kh5Kh>KhGKFhPKhYK hbKhkK htKh}K hKhK hKhK hK hKhKhK hKhK hKhKhK2jKj KjK jKj'Kj0Kj9KjBKjKKjTK/j]K jfK joK jxKjKjK jKjKjKjKjKjKjKuUimagesr'h)r((Xtopics/_images/firebug1.pngr)h]r*h>aRr+X firebug1.pngr,r-X&topics/_images/scrapy_architecture.pngr.h]r/hYaRr0Xscrapy_architecture.pngr1r2Xtopics/_images/firebug2.pngr3h]r4h>aRr5X firebug2.pngr6r7Xtopics/_images/firebug3.pngr8h]r9h>aRr:X firebug3.pngr;r<uh]r=(j1j,j;j6eRr>bUnumbered_toctreesr?h]Rr@U found_docsrAh]rB(h h#h,h5h>hGhPhYhbhkh}hhhhhhhhhhjKhhhtjj jhj'j0j9jBjjTj]jfjojxjjjjjjjjjeRrCU longtitlesrD}rE(h hh#h$h,h-h5h6h>h?hGhHhPhQhYhZhbhchkhlhthuh}h~hhhhhhhhhhhhhhhhhhhhhhhhhhjjj j jjjjj'j(j0j1j9j:jBjCjKjLjTjUj]j^jfjgjojpjxjyjjjjjjjjjjjjjjjjjjuU dependenciesrF}rG(hYh]rHj.aRrIj9h]rJX(topics/../_static/selectors-sample1.htmlrKaRrLhh]rM(X/topics/../../scrapy/contrib/webservice/stats.pyrNX6topics/../../scrapy/contrib/webservice/enginestatus.pyrOX topics/../../extras/scrapy-ws.pyrPeRrQh>h]rR(j)j8j3eRrSuUtoctree_includesrT}rUh]rV(Xintro/overviewrWX intro/installrXXintro/tutorialrYXintro/examplesrZXtopics/commandsr[X topics/itemsr\Xtopics/spidersr]Xtopics/selectorsr^Xtopics/loadersr_X topics/shellr`Xtopics/item-pipelineraXtopics/feed-exportsrbXtopics/link-extractorsrcXtopics/loggingrdX topics/statsreX topics/emailrfXtopics/telnetconsolergXtopics/webservicerhXfaqriX topics/debugrjXtopics/contractsrkXtopics/practicesrlXtopics/broad-crawlsrmXtopics/firefoxrnXtopics/firebugroX topics/leaksrpX topics/imagesrqX topics/ubunturrXtopics/scrapydrsXtopics/autothrottlertXtopics/benchmarkingruX topics/jobsrvXtopics/djangoitemrwXtopics/architecturerxXtopics/downloader-middlewareryXtopics/spider-middlewarerzXtopics/extensionsr{X topics/apir|Xtopics/request-responser}Xtopics/settingsr~Xtopics/signalsrXtopics/exceptionsrXtopics/exportersrXnewsrX contributingrX versioningrXexperimental/indexresU temp_datar}Utocsr}r(h cdocutils.nodes bullet_list r)r}r(hUh}r(h]h]h]h]h]uh]rcdocutils.nodes list_item r)r}r(hUh}r(h]h]h]h]h]uh jh]r(csphinx.addnodes compact_paragraph r)r}r(hUh}r(h]h]h]h]h]uh jh]rcdocutils.nodes reference r)r}r(hUh}r(U anchornameUUrefurih h]h]h]h]h]Uinternaluh jh]rhX Item Loadersrr}r(hhh jubah!U 
referencerubah!Ucompact_paragraphrubj)r}r(hUh}r(h]h]h]h]h]uh jh]r(j)r}r(hUh}r(h]h]h]h]h]uh jh]rj)r}r(hUh}r(h]h]h]h]h]uh jh]rj)r}r(hUh}r(U anchornameU%#using-item-loaders-to-populate-itemsUrefurih h]h]h]h]h]Uinternaluh jh]rhX$Using Item Loaders to populate itemsrr}r(hX$Using Item Loaders to populate itemsh jubah!jubah!jubah!U list_itemrubj)r}r(hUh}r(h]h]h]h]h]uh jh]rj)r}r(hUh}r(h]h]h]h]h]uh jh]rj)r}r(hUh}r(U anchornameU#input-and-output-processorsUrefurih h]h]h]h]h]Uinternaluh jh]rhXInput and Output processorsrr}r(hXInput and Output processorsh jubah!jubah!jubah!jubj)r}r(hUh}r(h]h]h]h]h]uh jh]rj)r}r(hUh}r(h]h]h]h]h]uh jh]rj)r}r(hUh}r(U anchornameU#declaring-item-loadersUrefurih h]h]h]h]h]Uinternaluh jh]rhXDeclaring Item Loadersrr}r(hXDeclaring Item Loadersh jubah!jubah!jubah!jubj)r}r(hUh}r(h]h]h]h]h]uh jh]rj)r}r(hUh}r(h]h]h]h]h]uh jh]rj)r}r(hUh}r(U anchornameU&#declaring-input-and-output-processorsUrefurih h]h]h]h]h]Uinternaluh jh]rhX%Declaring Input and Output Processorsrr}r(hX%Declaring Input and Output Processorsh jubah!jubah!jubah!jubj)r}r(hUh}r(h]h]h]h]h]uh jh]rj)r}r(hUh}r(h]h]h]h]h]uh jh]rj)r}r(hUh}r(U anchornameU#item-loader-contextUrefurih h]h]h]h]h]Uinternaluh jh]rhXItem Loader Contextrr}r(hXItem Loader Contexth jubah!jubah!jubah!jubj)r}r(hUh}r(h]h]h]h]h]uh jh]rj)r}r(hUh}r(h]h]h]h]h]uh jh]rj)r}r(hUh}r(U anchornameU#itemloader-objectsUrefurih h]h]h]h]h]Uinternaluh jh]rhXItemLoader objectsrr}r(hXItemLoader objectsh jubah!jubah!jubah!jubj)r}r(hUh}r(h]h]h]h]h]uh jh]rj)r}r(hUh}r(h]h]h]h]h]uh jh]rj)r }r (hUh}r (U anchornameU##reusing-and-extending-item-loadersUrefurih h]h]h]h]h]Uinternaluh jh]r hX"Reusing and extending Item Loadersr r}r(hX"Reusing and extending Item Loadersh j ubah!jubah!jubah!jubj)r}r(hUh}r(h]h]h]h]h]uh jh]rj)r}r(hUh}r(h]h]h]h]h]uh jh]rj)r}r(hUh}r(U anchornameX'#module-scrapy.contrib.loader.processorUrefurih h]h]h]h]h]Uinternaluh jh]rhXAvailable built-in processorsrr}r(hXAvailable built-in processorsh jubah!jubah!jubah!jubeh!U bullet_listrubeh!jubah!jubh#j)r }r!(hUh}r"(h]h]h]h]h]uh]r#j)r$}r%(hUh}r&(h]h]h]h]h]uh j h]r'(j)r(}r)(hUh}r*(h]h]h]h]h]uh j$h]r+j)r,}r-(hUh}r.(U anchornameUUrefurih#h]h]h]h]h]Uinternaluh j(h]r/hXUsing Firefox for scrapingr0r1}r2(hh+h j,ubah!jubah!jubj)r3}r4(hUh}r5(h]h]h]h]h]uh j$h]r6(j)r7}r8(hUh}r9(h]h]h]h]h]uh j3h]r:j)r;}r<(hUh}r=(h]h]h]h]h]uh j7h]r>j)r?}r@(hUh}rA(U anchornameU-#caveats-with-inspecting-the-live-browser-domUrefurih#h]h]h]h]h]Uinternaluh j;h]rBhX,Caveats with inspecting the live browser DOMrCrD}rE(hX,Caveats with inspecting the live browser DOMh j?ubah!jubah!jubah!jubj)rF}rG(hUh}rH(h]h]h]h]h]uh j3h]rI(j)rJ}rK(hUh}rL(h]h]h]h]h]uh jFh]rMj)rN}rO(hUh}rP(U anchornameU$#useful-firefox-add-ons-for-scrapingUrefurih#h]h]h]h]h]Uinternaluh jJh]rQhX#Useful Firefox add-ons for scrapingrRrS}rT(hX#Useful Firefox add-ons for scrapingh jNubah!jubah!jubj)rU}rV(hUh}rW(h]h]h]h]h]uh jFh]rX(j)rY}rZ(hUh}r[(h]h]h]h]h]uh jUh]r\j)r]}r^(hUh}r_(h]h]h]h]h]uh jYh]r`j)ra}rb(hUh}rc(U anchornameU#firebugUrefurih#h]h]h]h]h]Uinternaluh j]h]rdhXFirebugrerf}rg(hXFirebugh jaubah!jubah!jubah!jubj)rh}ri(hUh}rj(h]h]h]h]h]uh jUh]rkj)rl}rm(hUh}rn(h]h]h]h]h]uh jhh]roj)rp}rq(hUh}rr(U anchornameU#xpatherUrefurih#h]h]h]h]h]Uinternaluh jlh]rshXXPatherrtru}rv(hXXPatherh jpubah!jubah!jubah!jubj)rw}rx(hUh}ry(h]h]h]h]h]uh jUh]rzj)r{}r|(hUh}r}(h]h]h]h]h]uh jwh]r~j)r}r(hUh}r(U anchornameU#xpath-checkerUrefurih#h]h]h]h]h]Uinternaluh j{h]rhX XPath Checkerrr}r(hX XPath Checkerh jubah!jubah!jubah!jubj)r}r(hUh}r(h]h]h]h]h]uh jUh]rj)r}r(hUh}r(h]h]h]h]h]uh jh]rj)r}r(hUh}r(U anchornameU 
#tamper-dataUrefurih#h]h]h]h]h]Uinternaluh jh]rhX Tamper Datarr}r(hX Tamper Datah jubah!jubah!jubah!jubj)r}r(hUh}r(h]h]h]h]h]uh jUh]rj)r}r(hUh}r(h]h]h]h]h]uh jh]rj)r}r(hUh}r(U anchornameU #firecookieUrefurih#h]h]h]h]h]Uinternaluh jh]rhX Firecookierr}r(hX Firecookieh jubah!jubah!jubah!jubeh!jubeh!jubeh!jubeh!jubah!jubh,j)r}r(hUh}r(h]h]h]h]h]uh]rj)r}r(hUh}r(h]h]h]h]h]uh jh]r(j)r}r(hUh}r(h]h]h]h]h]uh jh]rj)r}r(hUh}r(U anchornameUUrefurih,h]h]h]h]h]Uinternaluh jh]rhX Item Pipelinerr}r(hh4h jubah!jubah!jubj)r}r(hUh}r(h]h]h]h]h]uh jh]r(j)r}r(hUh}r(h]h]h]h]h]uh jh]rj)r}r(hUh}r(h]h]h]h]h]uh jh]rj)r}r(hUh}r(U anchornameU#writing-your-own-item-pipelineUrefurih,h]h]h]h]h]Uinternaluh jh]rhXWriting your own item pipelinerr}r(hXWriting your own item pipelineh jubah!jubah!jubah!jubj)r}r(hUh}r(h]h]h]h]h]uh jh]r(j)r}r(hUh}r(h]h]h]h]h]uh jh]rj)r}r(hUh}r(U anchornameU#item-pipeline-exampleUrefurih,h]h]h]h]h]Uinternaluh jh]rhXItem pipeline examplerr}r(hXItem pipeline exampleh jubah!jubah!jubj)r}r(hUh}r(h]h]h]h]h]uh jh]r(j)r}r(hUh}r(h]h]h]h]h]uh jh]rj)r}r(hUh}r(h]h]h]h]h]uh jh]rj)r}r(hUh}r(U anchornameU3#price-validation-and-dropping-items-with-no-pricesUrefurih,h]h]h]h]h]Uinternaluh jh]rhX2Price validation and dropping items with no pricesrr}r(hX2Price validation and dropping items with no pricesh jubah!jubah!jubah!jubj)r}r(hUh}r(h]h]h]h]h]uh jh]rj)r}r(hUh}r(h]h]h]h]h]uh jh]rj)r}r(hUh}r(U anchornameU#write-items-to-a-json-fileUrefurih,h]h]h]h]h]Uinternaluh jh]rhXWrite items to a JSON filerr}r(hXWrite items to a JSON fileh jubah!jubah!jubah!jubj)r}r(hUh}r(h]h]h]h]h]uh jh]rj)r}r(hUh}r(h]h]h]h]h]uh jh]rj)r}r(hUh}r(U anchornameU#duplicates-filterUrefurih,h]h]h]h]h]Uinternaluh jh]rhXDuplicates filterrr}r (hXDuplicates filterh jubah!jubah!jubah!jubeh!jubeh!jubj)r }r (hUh}r (h]h]h]h]h]uh jh]r j)r}r(hUh}r(h]h]h]h]h]uh j h]rj)r}r(hUh}r(U anchornameU&#activating-an-item-pipeline-componentUrefurih,h]h]h]h]h]Uinternaluh jh]rhX%Activating an Item Pipeline componentrr}r(hX%Activating an Item Pipeline componenth jubah!jubah!jubah!jubeh!jubeh!jubah!jubh5j)r}r(hUh}r(h]h]h]h]h]uh]rj)r}r(hUh}r(h]h]h]h]h]uh jh]r (j)r!}r"(hUh}r#(h]h]h]h]h]uh jh]r$j)r%}r&(hUh}r'(U anchornameUUrefurih5h]h]h]h]h]Uinternaluh j!h]r(hXInstallation guider)r*}r+(hh=h j%ubah!jubah!jubj)r,}r-(hUh}r.(h]h]h]h]h]uh jh]r/(j)r0}r1(hUh}r2(h]h]h]h]h]uh j,h]r3j)r4}r5(hUh}r6(h]h]h]h]h]uh j0h]r7j)r8}r9(hUh}r:(U anchornameU#pre-requisitesUrefurih5h]h]h]h]h]Uinternaluh j4h]r;hXPre-requisitesr<r=}r>(hXPre-requisitesh j8ubah!jubah!jubah!jubj)r?}r@(hUh}rA(h]h]h]h]h]uh j,h]rBj)rC}rD(hUh}rE(h]h]h]h]h]uh j?h]rFj)rG}rH(hUh}rI(U anchornameU#installing-scrapyUrefurih5h]h]h]h]h]Uinternaluh jCh]rJhXInstalling ScrapyrKrL}rM(hXInstalling Scrapyh jGubah!jubah!jubah!jubj)rN}rO(hUh}rP(h]h]h]h]h]uh j,h]rQ(j)rR}rS(hUh}rT(h]h]h]h]h]uh jNh]rUj)rV}rW(hUh}rX(U anchornameU%#platform-specific-installation-notesUrefurih5h]h]h]h]h]Uinternaluh jRh]rYhX$Platform specific installation notesrZr[}r\(hX$Platform specific installation notesh jVubah!jubah!jubj)r]}r^(hUh}r_(h]h]h]h]h]uh jNh]r`j)ra}rb(hUh}rc(h]h]h]h]h]uh j]h]rd(j)re}rf(hUh}rg(h]h]h]h]h]uh jah]rhj)ri}rj(hUh}rk(U anchornameU#windowsUrefurih5h]h]h]h]h]Uinternaluh jeh]rlhXWindowsrmrn}ro(hXWindowsh jiubah!jubah!jubj)rp}rq(hUh}rr(h]h]h]h]h]uh jah]rsj)rt}ru(hUh}rv(h]h]h]h]h]uh jph]rwj)rx}ry(hUh}rz(h]h]h]h]h]uh jth]r{j)r|}r}(hUh}r~(U anchornameU#ubuntu-9-10-or-aboveUrefurih5h]h]h]h]h]Uinternaluh jxh]rhXUbuntu 9.10 or aboverr}r(hXUbuntu 9.10 or aboveh 
j|ubah!jubah!jubah!jubah!jubeh!jubah!jubeh!jubeh!jubeh!jubah!jubh>j)r}r(hUh}r(h]h]h]h]h]uh]rj)r}r(hUh}r(h]h]h]h]h]uh jh]r(j)r}r(hUh}r(h]h]h]h]h]uh jh]rj)r}r(hUh}r(U anchornameUUrefurih>h]h]h]h]h]Uinternaluh jh]rhXUsing Firebug for scrapingrr}r(hhFh jubah!jubah!jubj)r}r(hUh}r(h]h]h]h]h]uh jh]r(j)r}r(hUh}r(h]h]h]h]h]uh jh]rj)r}r(hUh}r(h]h]h]h]h]uh jh]rj)r}r(hUh}r(U anchornameU #introductionUrefurih>h]h]h]h]h]Uinternaluh jh]rhX Introductionrr}r(hX Introductionh jubah!jubah!jubah!jubj)r}r(hUh}r(h]h]h]h]h]uh jh]rj)r}r(hUh}r(h]h]h]h]h]uh jh]rj)r}r(hUh}r(U anchornameU#getting-links-to-followUrefurih>h]h]h]h]h]Uinternaluh jh]rhXGetting links to followrr}r(hXGetting links to followh jubah!jubah!jubah!jubj)r}r(hUh}r(h]h]h]h]h]uh jh]rj)r}r(hUh}r(h]h]h]h]h]uh jh]rj)r}r(hUh}r(U anchornameU#extracting-the-dataUrefurih>h]h]h]h]h]Uinternaluh jh]rhXExtracting the datarr}r(hXExtracting the datah jubah!jubah!jubah!jubeh!jubeh!jubah!jubhGj)r}r(hUh}r(h]h]h]h]h]uh]rj)r}r(hUh}r(h]h]h]h]h]uh jh]r(j)r}r(hUh}r(h]h]h]h]h]uh jh]rj)r}r(hUh}r(U anchornameUUrefurihGh]h]h]h]h]Uinternaluh jh]rhXSettingsrr}r(hhOh jubah!jubah!jubj)r}r(hUh}r(h]h]h]h]h]uh jh]r(j)r}r(hUh}r(h]h]h]h]h]uh jh]rj)r}r(hUh}r(h]h]h]h]h]uh jh]rj)r}r(hUh}r(U anchornameU#designating-the-settingsUrefurihGh]h]h]h]h]Uinternaluh jh]rhXDesignating the settingsrr}r(hXDesignating the settingsh jubah!jubah!jubah!jubj)r}r(hUh}r(h]h]h]h]h]uh jh]r(j)r}r(hUh}r(h]h]h]h]h]uh jh]rj)r}r(hUh}r(U anchornameU#populating-the-settingsUrefurihGh]h]h]h]h]Uinternaluh jh]rhXPopulating the settingsrr}r(hXPopulating the settingsh jubah!jubah!jubj)r}r(hUh}r(h]h]h]h]h]uh jh]r(j)r }r (hUh}r (h]h]h]h]h]uh jh]r j)r }r (hUh}r (h]h]h]h]h]uh j h]r j)r }r (hUh}r (U anchornameU#global-overridesUrefurihGh]h]h]h]h]Uinternaluh j h]r hX1. Global overridesr r }r (hX1. Global overridesh j ubah!jubah!jubah!jubj)r }r (hUh}r (h]h]h]h]h]uh jh]r j)r }r (hUh}r (h]h]h]h]h]uh j h]r j)r }r (hUh}r (U anchornameU#project-settings-moduleUrefurihGh]h]h]h]h]Uinternaluh j h]r hX2. Project settings moduler r }r (hX2. Project settings moduleh j ubah!jubah!jubah!jubj)r }r (hUh}r (h]h]h]h]h]uh jh]r! j)r" }r# (hUh}r$ (h]h]h]h]h]uh j h]r% j)r& }r' (hUh}r( (U anchornameU#default-settings-per-commandUrefurihGh]h]h]h]h]Uinternaluh j" h]r) hX3. Default settings per-commandr* r+ }r, (hX3. Default settings per-commandh j& ubah!jubah!jubah!jubj)r- }r. (hUh}r/ (h]h]h]h]h]uh jh]r0 j)r1 }r2 (hUh}r3 (h]h]h]h]h]uh j- h]r4 j)r5 }r6 (hUh}r7 (U anchornameU#default-global-settingsUrefurihGh]h]h]h]h]Uinternaluh j1 h]r8 hX4. Default global settingsr9 r: }r; (hX4. Default global settingsh j5 ubah!jubah!jubah!jubeh!jubeh!jubj)r< }r= (hUh}r> (h]h]h]h]h]uh jh]r? 
j)r@ }rA (hUh}rB (h]h]h]h]h]uh j< h]rC j)rD }rE (hUh}rF (U anchornameU#how-to-access-settingsUrefurihGh]h]h]h]h]Uinternaluh j@ h]rG hXHow to access settingsrH rI }rJ (hXHow to access settingsh jD ubah!jubah!jubah!jubj)rK }rL (hUh}rM (h]h]h]h]h]uh jh]rN j)rO }rP (hUh}rQ (h]h]h]h]h]uh jK h]rR j)rS }rT (hUh}rU (U anchornameU#rationale-for-setting-namesUrefurihGh]h]h]h]h]Uinternaluh jO h]rV hXRationale for setting namesrW rX }rY (hXRationale for setting namesh jS ubah!jubah!jubah!jubj)rZ }r[ (hUh}r\ (h]h]h]h]h]uh jh]r] (j)r^ }r_ (hUh}r` (h]h]h]h]h]uh jZ h]ra j)rb }rc (hUh}rd (U anchornameU#built-in-settings-referenceUrefurihGh]h]h]h]h]Uinternaluh j^ h]re hXBuilt-in settings referencerf rg }rh (hXBuilt-in settings referenceh jb ubah!jubah!jubj)ri }rj (hUh}rk (h]h]h]h]h]uh jZ h]rl (j)rm }rn (hUh}ro (h]h]h]h]h]uh ji h]rp j)rq }rr (hUh}rs (h]h]h]h]h]uh jm h]rt j)ru }rv (hUh}rw (U anchornameU#aws-access-key-idUrefurihGh]h]h]h]h]Uinternaluh jq h]rx hXAWS_ACCESS_KEY_IDry rz }r{ (hXAWS_ACCESS_KEY_IDh ju ubah!jubah!jubah!jubj)r| }r} (hUh}r~ (h]h]h]h]h]uh ji h]r j)r }r (hUh}r (h]h]h]h]h]uh j| h]r j)r }r (hUh}r (U anchornameU#aws-secret-access-keyUrefurihGh]h]h]h]h]Uinternaluh j h]r hXAWS_SECRET_ACCESS_KEYr r }r (hXAWS_SECRET_ACCESS_KEYh j ubah!jubah!jubah!jubj)r }r (hUh}r (h]h]h]h]h]uh ji h]r j)r }r (hUh}r (h]h]h]h]h]uh j h]r j)r }r (hUh}r (U anchornameU #bot-nameUrefurihGh]h]h]h]h]Uinternaluh j h]r hXBOT_NAMEr r }r (hXBOT_NAMEh j ubah!jubah!jubah!jubj)r }r (hUh}r (h]h]h]h]h]uh ji h]r j)r }r (hUh}r (h]h]h]h]h]uh j h]r j)r }r (hUh}r (U anchornameU#concurrent-itemsUrefurihGh]h]h]h]h]Uinternaluh j h]r hXCONCURRENT_ITEMSr r }r (hXCONCURRENT_ITEMSh j ubah!jubah!jubah!jubj)r }r (hUh}r (h]h]h]h]h]uh ji h]r j)r }r (hUh}r (h]h]h]h]h]uh j h]r j)r }r (hUh}r (U anchornameU#concurrent-requestsUrefurihGh]h]h]h]h]Uinternaluh j h]r hXCONCURRENT_REQUESTSr r }r (hXCONCURRENT_REQUESTSh j ubah!jubah!jubah!jubj)r }r (hUh}r (h]h]h]h]h]uh ji h]r j)r }r (hUh}r (h]h]h]h]h]uh j h]r j)r }r (hUh}r (U anchornameU#concurrent-requests-per-domainUrefurihGh]h]h]h]h]Uinternaluh j h]r hXCONCURRENT_REQUESTS_PER_DOMAINr r }r (hXCONCURRENT_REQUESTS_PER_DOMAINh j ubah!jubah!jubah!jubj)r }r (hUh}r (h]h]h]h]h]uh ji h]r j)r }r (hUh}r (h]h]h]h]h]uh j h]r j)r }r (hUh}r (U anchornameU#concurrent-requests-per-ipUrefurihGh]h]h]h]h]Uinternaluh j h]r hXCONCURRENT_REQUESTS_PER_IPr r }r (hXCONCURRENT_REQUESTS_PER_IPh j ubah!jubah!jubah!jubj)r }r (hUh}r (h]h]h]h]h]uh ji h]r j)r }r (hUh}r (h]h]h]h]h]uh j h]r j)r }r (hUh}r (U anchornameU#default-item-classUrefurihGh]h]h]h]h]Uinternaluh j h]r hXDEFAULT_ITEM_CLASSr r }r (hXDEFAULT_ITEM_CLASSh j ubah!jubah!jubah!jubj)r }r (hUh}r (h]h]h]h]h]uh ji h]r j)r }r (hUh}r (h]h]h]h]h]uh j h]r j)r }r (hUh}r (U anchornameU#default-request-headersUrefurihGh]h]h]h]h]Uinternaluh j h]r hXDEFAULT_REQUEST_HEADERSr r }r (hXDEFAULT_REQUEST_HEADERSh j ubah!jubah!jubah!jubj)r }r (hUh}r (h]h]h]h]h]uh ji h]r j)r }r (hUh}r (h]h]h]h]h]uh j h]r j)r }r (hUh}r (U anchornameU #depth-limitUrefurihGh]h]h]h]h]Uinternaluh j h]r hX DEPTH_LIMITr r }r (hX DEPTH_LIMITh j ubah!jubah!jubah!jubj)r }r (hUh}r (h]h]h]h]h]uh ji h]r j)r }r (hUh}r (h]h]h]h]h]uh j h]r j)r }r (hUh}r (U anchornameU#depth-priorityUrefurihGh]h]h]h]h]Uinternaluh j h]r hXDEPTH_PRIORITYr r }r (hXDEPTH_PRIORITYh j ubah!jubah!jubah!jubj)r }r (hUh}r (h]h]h]h]h]uh ji h]r j)r }r (hUh}r (h]h]h]h]h]uh j h]r j)r }r (hUh}r (U anchornameU #depth-statsUrefurihGh]h]h]h]h]Uinternaluh j h]r hX DEPTH_STATSr r }r (hX DEPTH_STATSh j ubah!jubah!jubah!jubj)r! 
}r" (hUh}r# (h]h]h]h]h]uh ji h]r$ j)r% }r& (hUh}r' (h]h]h]h]h]uh j! h]r( j)r) }r* (hUh}r+ (U anchornameU#depth-stats-verboseUrefurihGh]h]h]h]h]Uinternaluh j% h]r, hXDEPTH_STATS_VERBOSEr- r. }r/ (hXDEPTH_STATS_VERBOSEh j) ubah!jubah!jubah!jubj)r0 }r1 (hUh}r2 (h]h]h]h]h]uh ji h]r3 j)r4 }r5 (hUh}r6 (h]h]h]h]h]uh j0 h]r7 j)r8 }r9 (hUh}r: (U anchornameU#dnscache-enabledUrefurihGh]h]h]h]h]Uinternaluh j4 h]r; hXDNSCACHE_ENABLEDr< r= }r> (hXDNSCACHE_ENABLEDh j8 ubah!jubah!jubah!jubj)r? }r@ (hUh}rA (h]h]h]h]h]uh ji h]rB j)rC }rD (hUh}rE (h]h]h]h]h]uh j? h]rF j)rG }rH (hUh}rI (U anchornameU#downloader-debugUrefurihGh]h]h]h]h]Uinternaluh jC h]rJ hXDOWNLOADER_DEBUGrK rL }rM (hXDOWNLOADER_DEBUGh jG ubah!jubah!jubah!jubj)rN }rO (hUh}rP (h]h]h]h]h]uh ji h]rQ j)rR }rS (hUh}rT (h]h]h]h]h]uh jN h]rU j)rV }rW (hUh}rX (U anchornameU#downloader-middlewaresUrefurihGh]h]h]h]h]Uinternaluh jR h]rY hXDOWNLOADER_MIDDLEWARESrZ r[ }r\ (hXDOWNLOADER_MIDDLEWARESh jV ubah!jubah!jubah!jubj)r] }r^ (hUh}r_ (h]h]h]h]h]uh ji h]r` j)ra }rb (hUh}rc (h]h]h]h]h]uh j] h]rd j)re }rf (hUh}rg (U anchornameU#downloader-middlewares-baseUrefurihGh]h]h]h]h]Uinternaluh ja h]rh hXDOWNLOADER_MIDDLEWARES_BASEri rj }rk (hXDOWNLOADER_MIDDLEWARES_BASEh je ubah!jubah!jubah!jubj)rl }rm (hUh}rn (h]h]h]h]h]uh ji h]ro j)rp }rq (hUh}rr (h]h]h]h]h]uh jl h]rs j)rt }ru (hUh}rv (U anchornameU#downloader-statsUrefurihGh]h]h]h]h]Uinternaluh jp h]rw hXDOWNLOADER_STATSrx ry }rz (hXDOWNLOADER_STATSh jt ubah!jubah!jubah!jubj)r{ }r| (hUh}r} (h]h]h]h]h]uh ji h]r~ j)r }r (hUh}r (h]h]h]h]h]uh j{ h]r j)r }r (hUh}r (U anchornameU#download-delayUrefurihGh]h]h]h]h]Uinternaluh j h]r hXDOWNLOAD_DELAYr r }r (hXDOWNLOAD_DELAYh j ubah!jubah!jubah!jubj)r }r (hUh}r (h]h]h]h]h]uh ji h]r j)r }r (hUh}r (h]h]h]h]h]uh j h]r j)r }r (hUh}r (U anchornameU#download-handlersUrefurihGh]h]h]h]h]Uinternaluh j h]r hXDOWNLOAD_HANDLERSr r }r (hXDOWNLOAD_HANDLERSh j ubah!jubah!jubah!jubj)r }r (hUh}r (h]h]h]h]h]uh ji h]r j)r }r (hUh}r (h]h]h]h]h]uh j h]r j)r }r (hUh}r (U anchornameU#download-handlers-baseUrefurihGh]h]h]h]h]Uinternaluh j h]r hXDOWNLOAD_HANDLERS_BASEr r }r (hXDOWNLOAD_HANDLERS_BASEh j ubah!jubah!jubah!jubj)r }r (hUh}r (h]h]h]h]h]uh ji h]r j)r }r (hUh}r (h]h]h]h]h]uh j h]r j)r }r (hUh}r (U anchornameU#download-timeoutUrefurihGh]h]h]h]h]Uinternaluh j h]r hXDOWNLOAD_TIMEOUTr r }r (hXDOWNLOAD_TIMEOUTh j ubah!jubah!jubah!jubj)r }r (hUh}r (h]h]h]h]h]uh ji h]r j)r }r (hUh}r (h]h]h]h]h]uh j h]r j)r }r (hUh}r (U anchornameU#dupefilter-classUrefurihGh]h]h]h]h]Uinternaluh j h]r hXDUPEFILTER_CLASSr r }r (hXDUPEFILTER_CLASSh j ubah!jubah!jubah!jubj)r }r (hUh}r (h]h]h]h]h]uh ji h]r j)r }r (hUh}r (h]h]h]h]h]uh j h]r j)r }r (hUh}r (U anchornameU#editorUrefurihGh]h]h]h]h]Uinternaluh j h]r hXEDITORr r }r (hXEDITORh j ubah!jubah!jubah!jubj)r }r (hUh}r (h]h]h]h]h]uh ji h]r j)r }r (hUh}r (h]h]h]h]h]uh j h]r j)r }r (hUh}r (U anchornameU #extensionsUrefurihGh]h]h]h]h]Uinternaluh j h]r hX EXTENSIONSr r }r (hX EXTENSIONSh j ubah!jubah!jubah!jubj)r }r (hUh}r (h]h]h]h]h]uh ji h]r j)r }r (hUh}r (h]h]h]h]h]uh j h]r j)r }r (hUh}r (U anchornameU#extensions-baseUrefurihGh]h]h]h]h]Uinternaluh j h]r hXEXTENSIONS_BASEr r }r (hXEXTENSIONS_BASEh j ubah!jubah!jubah!jubj)r }r (hUh}r (h]h]h]h]h]uh ji h]r j)r }r (hUh}r (h]h]h]h]h]uh j h]r j)r }r (hUh}r (U anchornameU#item-pipelinesUrefurihGh]h]h]h]h]Uinternaluh j h]r hXITEM_PIPELINESr r }r (hXITEM_PIPELINESh j ubah!jubah!jubah!jubj)r }r (hUh}r (h]h]h]h]h]uh ji h]r j)r }r (hUh}r (h]h]h]h]h]uh j h]r j)r }r (hUh}r (U 
anchornameU#item-pipelines-baseUrefurihGh]h]h]h]h]Uinternaluh j h]r hXITEM_PIPELINES_BASEr r }r (hXITEM_PIPELINES_BASEh j ubah!jubah!jubah!jubj)r }r (hUh}r (h]h]h]h]h]uh ji h]r j)r }r (hUh}r (h]h]h]h]h]uh j h]r j)r }r (hUh}r (U anchornameU #log-enabledUrefurihGh]h]h]h]h]Uinternaluh j h]r hX LOG_ENABLEDr r }r (hX LOG_ENABLEDh j ubah!jubah!jubah!jubj)r }r! (hUh}r" (h]h]h]h]h]uh ji h]r# j)r$ }r% (hUh}r& (h]h]h]h]h]uh j h]r' j)r( }r) (hUh}r* (U anchornameU #log-encodingUrefurihGh]h]h]h]h]Uinternaluh j$ h]r+ hX LOG_ENCODINGr, r- }r. (hX LOG_ENCODINGh j( ubah!jubah!jubah!jubj)r/ }r0 (hUh}r1 (h]h]h]h]h]uh ji h]r2 j)r3 }r4 (hUh}r5 (h]h]h]h]h]uh j/ h]r6 j)r7 }r8 (hUh}r9 (U anchornameU #log-fileUrefurihGh]h]h]h]h]Uinternaluh j3 h]r: hXLOG_FILEr; r< }r= (hXLOG_FILEh j7 ubah!jubah!jubah!jubj)r> }r? (hUh}r@ (h]h]h]h]h]uh ji h]rA j)rB }rC (hUh}rD (h]h]h]h]h]uh j> h]rE j)rF }rG (hUh}rH (U anchornameU #log-levelUrefurihGh]h]h]h]h]Uinternaluh jB h]rI hX LOG_LEVELrJ rK }rL (hX LOG_LEVELh jF ubah!jubah!jubah!jubj)rM }rN (hUh}rO (h]h]h]h]h]uh ji h]rP j)rQ }rR (hUh}rS (h]h]h]h]h]uh jM h]rT j)rU }rV (hUh}rW (U anchornameU #log-stdoutUrefurihGh]h]h]h]h]Uinternaluh jQ h]rX hX LOG_STDOUTrY rZ }r[ (hX LOG_STDOUTh jU ubah!jubah!jubah!jubj)r\ }r] (hUh}r^ (h]h]h]h]h]uh ji h]r_ j)r` }ra (hUh}rb (h]h]h]h]h]uh j\ h]rc j)rd }re (hUh}rf (U anchornameU#memdebug-enabledUrefurihGh]h]h]h]h]Uinternaluh j` h]rg hXMEMDEBUG_ENABLEDrh ri }rj (hXMEMDEBUG_ENABLEDh jd ubah!jubah!jubah!jubj)rk }rl (hUh}rm (h]h]h]h]h]uh ji h]rn j)ro }rp (hUh}rq (h]h]h]h]h]uh jk h]rr j)rs }rt (hUh}ru (U anchornameU#memdebug-notifyUrefurihGh]h]h]h]h]Uinternaluh jo h]rv hXMEMDEBUG_NOTIFYrw rx }ry (hXMEMDEBUG_NOTIFYh js ubah!jubah!jubah!jubj)rz }r{ (hUh}r| (h]h]h]h]h]uh ji h]r} j)r~ }r (hUh}r (h]h]h]h]h]uh jz h]r j)r }r (hUh}r (U anchornameU#memusage-enabledUrefurihGh]h]h]h]h]Uinternaluh j~ h]r hXMEMUSAGE_ENABLEDr r }r (hXMEMUSAGE_ENABLEDh j ubah!jubah!jubah!jubj)r }r (hUh}r (h]h]h]h]h]uh ji h]r j)r }r (hUh}r (h]h]h]h]h]uh j h]r j)r }r (hUh}r (U anchornameU#memusage-limit-mbUrefurihGh]h]h]h]h]Uinternaluh j h]r hXMEMUSAGE_LIMIT_MBr r }r (hXMEMUSAGE_LIMIT_MBh j ubah!jubah!jubah!jubj)r }r (hUh}r (h]h]h]h]h]uh ji h]r j)r }r (hUh}r (h]h]h]h]h]uh j h]r j)r }r (hUh}r (U anchornameU#memusage-notify-mailUrefurihGh]h]h]h]h]Uinternaluh j h]r hXMEMUSAGE_NOTIFY_MAILr r }r (hXMEMUSAGE_NOTIFY_MAILh j ubah!jubah!jubah!jubj)r }r (hUh}r (h]h]h]h]h]uh ji h]r j)r }r (hUh}r (h]h]h]h]h]uh j h]r j)r }r (hUh}r (U anchornameU#memusage-reportUrefurihGh]h]h]h]h]Uinternaluh j h]r hXMEMUSAGE_REPORTr r }r (hXMEMUSAGE_REPORTh j ubah!jubah!jubah!jubj)r }r (hUh}r (h]h]h]h]h]uh ji h]r j)r }r (hUh}r (h]h]h]h]h]uh j h]r j)r }r (hUh}r (U anchornameU#memusage-warning-mbUrefurihGh]h]h]h]h]Uinternaluh j h]r hXMEMUSAGE_WARNING_MBr r }r (hXMEMUSAGE_WARNING_MBh j ubah!jubah!jubah!jubj)r }r (hUh}r (h]h]h]h]h]uh ji h]r j)r }r (hUh}r (h]h]h]h]h]uh j h]r j)r }r (hUh}r (U anchornameU#newspider-moduleUrefurihGh]h]h]h]h]Uinternaluh j h]r hXNEWSPIDER_MODULEr r }r (hXNEWSPIDER_MODULEh j ubah!jubah!jubah!jubj)r }r (hUh}r (h]h]h]h]h]uh ji h]r j)r }r (hUh}r (h]h]h]h]h]uh j h]r j)r }r (hUh}r (U anchornameU#randomize-download-delayUrefurihGh]h]h]h]h]Uinternaluh j h]r hXRANDOMIZE_DOWNLOAD_DELAYr r }r (hXRANDOMIZE_DOWNLOAD_DELAYh j ubah!jubah!jubah!jubj)r }r (hUh}r (h]h]h]h]h]uh ji h]r j)r }r (hUh}r (h]h]h]h]h]uh j h]r j)r }r (hUh}r (U anchornameU#redirect-max-timesUrefurihGh]h]h]h]h]Uinternaluh j h]r hXREDIRECT_MAX_TIMESr r }r (hXREDIRECT_MAX_TIMESh j ubah!jubah!jubah!jubj)r }r (hUh}r (h]h]h]h]h]uh ji h]r j)r }r 
(hUh}r (h]h]h]h]h]uh j h]r j)r }r (hUh}r (U anchornameU#redirect-max-metarefresh-delayUrefurihGh]h]h]h]h]Uinternaluh j h]r hXREDIRECT_MAX_METAREFRESH_DELAYr r }r (hXREDIRECT_MAX_METAREFRESH_DELAYh j ubah!jubah!jubah!jubj)r }r (hUh}r (h]h]h]h]h]uh ji h]r j)r }r (hUh}r (h]h]h]h]h]uh j h]r j)r }r (hUh}r (U anchornameU#redirect-priority-adjustUrefurihGh]h]h]h]h]Uinternaluh j h]r hXREDIRECT_PRIORITY_ADJUSTr r }r (hXREDIRECT_PRIORITY_ADJUSTh j ubah!jubah!jubah!jubj)r }r (hUh}r (h]h]h]h]h]uh ji h]r j)r }r (hUh}r (h]h]h]h]h]uh j h]r j)r }r (hUh}r (U anchornameU#robotstxt-obeyUrefurihGh]h]h]h]h]Uinternaluh j h]r hXROBOTSTXT_OBEYr r }r (hXROBOTSTXT_OBEYh j ubah!jubah!jubah!jubj)r }r (hUh}r! (h]h]h]h]h]uh ji h]r" j)r# }r$ (hUh}r% (h]h]h]h]h]uh j h]r& j)r' }r( (hUh}r) (U anchornameU #schedulerUrefurihGh]h]h]h]h]Uinternaluh j# h]r* hX SCHEDULERr+ r, }r- (hX SCHEDULERh j' ubah!jubah!jubah!jubj)r. }r/ (hUh}r0 (h]h]h]h]h]uh ji h]r1 j)r2 }r3 (hUh}r4 (h]h]h]h]h]uh j. h]r5 j)r6 }r7 (hUh}r8 (U anchornameU#spider-contractsUrefurihGh]h]h]h]h]Uinternaluh j2 h]r9 hXSPIDER_CONTRACTSr: r; }r< (hXSPIDER_CONTRACTSh j6 ubah!jubah!jubah!jubj)r= }r> (hUh}r? (h]h]h]h]h]uh ji h]r@ j)rA }rB (hUh}rC (h]h]h]h]h]uh j= h]rD j)rE }rF (hUh}rG (U anchornameU#spider-contracts-baseUrefurihGh]h]h]h]h]Uinternaluh jA h]rH hXSPIDER_CONTRACTS_BASErI rJ }rK (hXSPIDER_CONTRACTS_BASEh jE ubah!jubah!jubah!jubj)rL }rM (hUh}rN (h]h]h]h]h]uh ji h]rO j)rP }rQ (hUh}rR (h]h]h]h]h]uh jL h]rS j)rT }rU (hUh}rV (U anchornameU#spider-middlewaresUrefurihGh]h]h]h]h]Uinternaluh jP h]rW hXSPIDER_MIDDLEWARESrX rY }rZ (hXSPIDER_MIDDLEWARESh jT ubah!jubah!jubah!jubj)r[ }r\ (hUh}r] (h]h]h]h]h]uh ji h]r^ j)r_ }r` (hUh}ra (h]h]h]h]h]uh j[ h]rb j)rc }rd (hUh}re (U anchornameU#spider-middlewares-baseUrefurihGh]h]h]h]h]Uinternaluh j_ h]rf hXSPIDER_MIDDLEWARES_BASErg rh }ri (hXSPIDER_MIDDLEWARES_BASEh jc ubah!jubah!jubah!jubj)rj }rk (hUh}rl (h]h]h]h]h]uh ji h]rm j)rn }ro (hUh}rp (h]h]h]h]h]uh jj h]rq j)rr }rs (hUh}rt (U anchornameU#spider-modulesUrefurihGh]h]h]h]h]Uinternaluh jn h]ru hXSPIDER_MODULESrv rw }rx (hXSPIDER_MODULESh jr ubah!jubah!jubah!jubj)ry }rz (hUh}r{ (h]h]h]h]h]uh ji h]r| j)r} }r~ (hUh}r (h]h]h]h]h]uh jy h]r j)r }r (hUh}r (U anchornameU #stats-classUrefurihGh]h]h]h]h]Uinternaluh j} h]r hX STATS_CLASSr r }r (hX STATS_CLASSh j ubah!jubah!jubah!jubj)r }r (hUh}r (h]h]h]h]h]uh ji h]r j)r }r (hUh}r (h]h]h]h]h]uh j h]r j)r }r (hUh}r (U anchornameU #stats-dumpUrefurihGh]h]h]h]h]Uinternaluh j h]r hX STATS_DUMPr r }r (hX STATS_DUMPh j ubah!jubah!jubah!jubj)r }r (hUh}r (h]h]h]h]h]uh ji h]r j)r }r (hUh}r (h]h]h]h]h]uh j h]r j)r }r (hUh}r (U anchornameU#statsmailer-rcptsUrefurihGh]h]h]h]h]Uinternaluh j h]r hXSTATSMAILER_RCPTSr r }r (hXSTATSMAILER_RCPTSh j ubah!jubah!jubah!jubj)r }r (hUh}r (h]h]h]h]h]uh ji h]r j)r }r (hUh}r (h]h]h]h]h]uh j h]r j)r }r (hUh}r (U anchornameU#telnetconsole-enabledUrefurihGh]h]h]h]h]Uinternaluh j h]r hXTELNETCONSOLE_ENABLEDr r }r (hXTELNETCONSOLE_ENABLEDh j ubah!jubah!jubah!jubj)r }r (hUh}r (h]h]h]h]h]uh ji h]r j)r }r (hUh}r (h]h]h]h]h]uh j h]r j)r }r (hUh}r (U anchornameU#telnetconsole-portUrefurihGh]h]h]h]h]Uinternaluh j h]r hXTELNETCONSOLE_PORTr r }r (hXTELNETCONSOLE_PORTh j ubah!jubah!jubah!jubj)r }r (hUh}r (h]h]h]h]h]uh ji h]r j)r }r (hUh}r (h]h]h]h]h]uh j h]r j)r }r (hUh}r (U anchornameU#templates-dirUrefurihGh]h]h]h]h]Uinternaluh j h]r hX TEMPLATES_DIRr r }r (hX TEMPLATES_DIRh j ubah!jubah!jubah!jubj)r }r (hUh}r (h]h]h]h]h]uh ji h]r j)r }r (hUh}r (h]h]h]h]h]uh j h]r j)r }r (hUh}r (U 
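For illustration, a minimal project settings.py overriding a few of the settings listed above; every value is an example rather than a documented default, and the mybot.pipelines.MyPipeline path is hypothetical:

    # settings.py -- illustrative values only, not the documented defaults
    USER_AGENT = 'mybot (+http://www.example.com)'
    DOWNLOAD_DELAY = 2               # seconds of delay between requests
    DOWNLOAD_TIMEOUT = 180           # seconds before a download times out
    LOG_LEVEL = 'INFO'
    # ITEM_PIPELINES is a dict since 0.20 (see the release notes above)
    ITEM_PIPELINES = {'mybot.pipelines.MyPipeline': 300}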
Common Practices

  * Run Scrapy from a script (see the sketch below)
  * Running multiple spiders in the same process
  * Distributed crawls
  * Avoiding getting banned
  * Dynamic Creation of Item Classes
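A minimal sketch of driving a crawl from a plain Python script, assuming the 0.22-era Crawler API (Crawler, configure(), crawl(), start()) and a hypothetical MySpider class in your project:

    # run.py -- sketch only; MySpider and myproject are hypothetical
    from twisted.internet import reactor
    from scrapy.crawler import Crawler
    from scrapy import log, signals
    from scrapy.utils.project import get_project_settings

    from myproject.spiders.myspider import MySpider

    spider = MySpider()
    crawler = Crawler(get_project_settings())
    # stop the reactor once the spider finishes
    crawler.signals.connect(reactor.stop, signal=signals.spider_closed)
    crawler.configure()
    crawler.crawl(spider)
    crawler.start()
    log.start()
    reactor.run()   # the script blocks here until the crawl ends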
Architecture overview

  * Overview
  * Components
    * Scrapy Engine
    * Scheduler
    * Downloader
    * Spiders
    * Item Pipeline
    * Downloader middlewares
    * Spider middlewares
  * Data flow
  * Event-driven networking
Feed exports

  * Serialization formats
    * JSON
    * JSON lines
    * CSV
    * XML
    * Pickle
    * Marshal
  * Storages
  * Storage URI parameters
  * Storage backends
    * Local filesystem
    * FTP
    * S3
    * Standard output
  * Settings
    * FEED_URI
    * FEED_FORMAT
    * FEED_STORE_EMPTY
    * FEED_STORAGES
    * FEED_STORAGES_BASE
    * FEED_EXPORTERS
    * FEED_EXPORTERS_BASE
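As a hedged example, the settings below would write items as JSON lines to a local file; the URI is illustrative. The same effect is available on the command line through the crawl command's -o and -t options.

    # settings.py -- feed export configuration (illustrative values)
    FEED_URI = 'file:///tmp/export.jl'   # any supported storage URI
    FEED_FORMAT = 'jsonlines'            # one of the serialization formats above
    FEED_STORE_EMPTY = False             # do not export empty feeds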
Items

  * Declaring Items
  * Item Fields
  * Working with Items
    * Creating items
    * Getting field values
    * Setting field values
    * Accessing all populated values
    * Other common tasks
  * Extending Items
  * Item objects
  * Field objects
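A quick sketch of the pattern this chapter documents; the Product item and its fields are illustrative (Python 2, matching the Scrapy 0.22 era):

    from scrapy.item import Item, Field

    class Product(Item):
        # fields are declared as Field objects
        name = Field()
        price = Field()
        last_updated = Field(serializer=str)   # per-field serializer

    product = Product(name='Desktop PC', price=1000)   # creating an item
    print product['name']       # getting a field value
    product['price'] = 1200     # setting a field value
    print product.keys()        # accessing all populated values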
Spider Middleware

  * Activating a spider middleware (see the sketch below)
  * Writing your own spider middleware
  * Built-in spider middleware reference
    * DepthMiddleware
    * HttpErrorMiddleware
      * HttpErrorMiddleware settings (HTTPERROR_ALLOWED_CODES, HTTPERROR_ALLOW_ALL)
    * OffsiteMiddleware
    * RefererMiddleware
      * RefererMiddleware settings (REFERER_ENABLED)
    * UrlLengthMiddleware
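Activation happens through the SPIDER_MIDDLEWARES setting; a sketch, where the CustomSpiderMiddleware path is hypothetical:

    # settings.py -- enable a custom middleware, disable a built-in one
    SPIDER_MIDDLEWARES = {
        'myproject.middlewares.CustomSpiderMiddleware': 543,
        # assigning None disables an entry from SPIDER_MIDDLEWARES_BASE
        'scrapy.contrib.spidermiddleware.offsite.OffsiteMiddleware': None,
    }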
Item Exporters

  * Using Item Exporters
  * Serialization of item fields
    * 1. Declaring a serializer in the field
    * 2. Overriding the serialize_field() method
  * Built-in Item Exporters reference
    * BaseItemExporter
    * XmlItemExporter
    * CsvItemExporter
    * PickleItemExporter
    * PprintItemExporter
    * JsonItemExporter
    * JsonLinesItemExporter

Jobs: pausing and resuming crawls

  * Job directory
  * How to use it
  * Keeping persistent state between batches
  * Persistence gotchas
    * Cookies expiration
    * Request serialization
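A job is started and later resumed by passing the same job directory, e.g. scrapy crawl somespider -s JOBDIR=crawls/somespider-1, and state that should survive between batches belongs in the spider's state dict. A minimal sketch, with a hypothetical spider:

    from scrapy.spider import Spider

    class CountingSpider(Spider):
        # illustrative spider keeping a counter across pause/resume batches
        name = 'counting'
        start_urls = ['http://www.example.com/']

        def parse(self, response):
            # self.state is persisted in the job directory between runs
            self.state['items_count'] = self.state.get('items_count', 0) + 1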
Scrapy 0.22 documentation (index page)

  * Getting help
  * First steps
  * Basic concepts
  * Built-in services
  * Solving specific problems
  * Extending Scrapy
  * Reference
  * All the rest
Benchmarking

Scrapy Tutorial

  * Creating a project
  * Defining our Item
  * Our first Spider
    * Crawling
      * What just happened under the hood?
  * Extracting Items
    * Introduction to Selectors
    * Trying Selectors in the Shell
    * Extracting the data
  * Using our item
  * Storing the scraped data
  * Next steps
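In the spirit of the tutorial, a minimal first spider with the 0.22-era imports; the XPath and the printed output are illustrative, dmoz.org is the site the tutorial crawls:

    from scrapy.spider import Spider
    from scrapy.selector import Selector

    class DmozSpider(Spider):
        name = 'dmoz'
        allowed_domains = ['dmoz.org']
        start_urls = [
            'http://www.dmoz.org/Computers/Programming/Languages/Python/Books/',
        ]

        def parse(self, response):
            # extract link texts from the page with an XPath selector
            sel = Selector(response)
            for text in sel.xpath('//ul/li/a/text()').extract():
                print text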
Downloading Item Images

  * Using the Images Pipeline
  * Usage example
  * Enabling your Images Pipeline (see the sketch below)
  * Images Storage
    * File system storage
  * Additional features
    * Image expiration
    * Thumbnail generation
    * Filtering out small images
  * Implementing your custom Images Pipeline
  * Custom Images pipeline example
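Enabling the pipeline takes two settings; the storage path below is a placeholder:

    # settings.py -- turn on the Images Pipeline (0.22 module path)
    ITEM_PIPELINES = {'scrapy.contrib.pipeline.images.ImagesPipeline': 1}
    IMAGES_STORE = '/path/to/valid/dir'   # where downloaded images are kept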
Scrapy at a glance

  * Pick a website
  * Define the data you want to scrape
  * Write a Spider to extract the data
  * Run the spider to extract the data
  * Review scraped data
  * What else?
  * What's next?

Broad Crawls

  * Increase concurrency
  * Reduce log level
  * Disable cookies
  * Disable retries
  * Reduce download timeout
  * Disable redirects
  * Enable crawling of "Ajax Crawlable Pages"
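One hedged combination of the broad-crawl tweaks listed above, all in settings.py; the numbers are starting points to tune, not mandated values:

    # settings.py -- broad-crawl oriented settings (illustrative values)
    CONCURRENT_REQUESTS = 100    # increase concurrency
    LOG_LEVEL = 'INFO'           # reduce log level
    COOKIES_ENABLED = False      # disable cookies
    RETRY_ENABLED = False        # disable retries
    DOWNLOAD_TIMEOUT = 15        # reduce download timeout
    REDIRECT_ENABLED = False     # disable redirects
    AJAXCRAWL_ENABLED = True     # crawl "Ajax Crawlable Pages"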
Contributing to Scrapy

  * Reporting bugs
  * Writing patches
  * Submitting patches
  * Coding style
  * Scrapy Contrib
  * Documentation policies
  * Tests
    * Running tests
    * Writing tests

Core API

  * Crawler API
  * Settings API
  * Signals API
  * Stats Collector API
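As a taste of the Signals API, a sketch of an extension that reacts to spider_closed; the SpiderClosedLogger class is illustrative, not from the docs:

    from scrapy import signals

    class SpiderClosedLogger(object):
        # illustrative extension wired up through the Signals API

        @classmethod
        def from_crawler(cls, crawler):
            ext = cls()
            crawler.signals.connect(ext.spider_closed,
                                    signal=signals.spider_closed)
            return ext

        def spider_closed(self, spider):
            spider.log('spider closed: %s' % spider.name)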
Spiders Contracts

  * Custom Contracts

AutoThrottle extension

  * Design goals
  * How it works
  * Throttling algorithm
  * Settings
    * AUTOTHROTTLE_ENABLED
    * AUTOTHROTTLE_START_DELAY
    * AUTOTHROTTLE_MAX_DELAY
    * AUTOTHROTTLE_DEBUG
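Turning the extension on is a matter of settings; the values below are examples, with AUTOTHROTTLE_DEBUG logging every adjusted delay while you tune:

    # settings.py -- enable AutoThrottle (illustrative values)
    AUTOTHROTTLE_ENABLED = True
    AUTOTHROTTLE_START_DELAY = 5.0    # initial download delay, in seconds
    AUTOTHROTTLE_MAX_DELAY = 60.0     # ceiling for the adjusted delay
    AUTOTHROTTLE_DEBUG = True         # log every throttling adjustment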
Web Service

  * Web service resources
    * Available JSON-RPC resources
      * Crawler JSON-RPC resource
      * Stats Collector JSON-RPC resource
      * Spider Manager JSON-RPC resource
      * Extension Manager JSON-RPC resource
    * Available JSON resources
      * Engine status JSON resource
  * Web service settings
    * WEBSERVICE_ENABLED
    * WEBSERVICE_LOGFILE
    * WEBSERVICE_PORT
    * WEBSERVICE_HOST
    * WEBSERVICE_RESOURCES
    * WEBSERVICE_RESOURCES_BASE
  * Writing a web service resource
  * Examples of web service resources
    * StatsResource (JSON-RPC resource)
    * EngineStatusResource (JSON resource)
  * Example of web service client
    * scrapy-ws.py script
Downloader Middleware

  * Activating a downloader middleware (see the sketch below)
  * Writing your own downloader middleware
  * Built-in downloader middleware reference
    * CookiesMiddleware
      * Multiple cookie sessions per spider
      * COOKIES_ENABLED
      * COOKIES_DEBUG
    * DefaultHeadersMiddleware
    * DownloadTimeoutMiddleware
    * HttpAuthMiddleware
    * HttpCacheMiddleware
      * Dummy policy (default)
      * RFC2616 policy
      * Filesystem storage backend (default)
      * DBM storage backend
      * HTTPCache middleware settings (HTTPCACHE_ENABLED, HTTPCACHE_EXPIRATION_SECS, HTTPCACHE_DIR, HTTPCACHE_IGNORE_HTTP_CODES, HTTPCACHE_IGNORE_MISSING, HTTPCACHE_IGNORE_SCHEMES, HTTPCACHE_STORAGE, HTTPCACHE_DBM_MODULE, HTTPCACHE_POLICY)
    * HttpCompressionMiddleware
      * HttpCompressionMiddleware settings (COMPRESSION_ENABLED)
    * ChunkedTransferMiddleware
    * HttpProxyMiddleware
    * RedirectMiddleware
      * RedirectMiddleware settings (REDIRECT_ENABLED, REDIRECT_MAX_TIMES)
    * MetaRefreshMiddleware
      * MetaRefreshMiddleware settings (METAREFRESH_ENABLED, REDIRECT_MAX_METAREFRESH_DELAY)
    * RetryMiddleware
      * RetryMiddleware settings (RETRY_ENABLED, RETRY_TIMES, RETRY_HTTP_CODES)
    * RobotsTxtMiddleware
    * DownloaderStats
    * UserAgentMiddleware
    * AjaxCrawlMiddleware
      * AjaxCrawlMiddleware settings (AJAXCRAWL_ENABLED)
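A sketch tying the activation mechanism to one of the built-ins above; the CustomDownloaderMiddleware path is hypothetical and the cache values are examples:

    # settings.py -- activate a downloader middleware and the HTTP cache
    DOWNLOADER_MIDDLEWARES = {
        'myproject.middlewares.CustomDownloaderMiddleware': 543,
    }
    HTTPCACHE_ENABLED = True
    HTTPCACHE_DIR = 'httpcache'       # relative to the project data dir
    HTTPCACHE_EXPIRATION_SECS = 0     # 0 means cached pages never expire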
jQh]rXhXUserAgentMiddlewarerYrZ}r[(hXUserAgentMiddlewareh jUubah!jubah!jubah!jubj)r\}r](hUh}r^(h]h]h]h]h]uh jh]r_(j)r`}ra(hUh}rb(h]h]h]h]h]uh j\h]rcj)rd}re(hUh}rf(U anchornameX5#module-scrapy.contrib.downloadermiddleware.ajaxcrawlUrefurihh]h]h]h]h]Uinternaluh j`h]rghXAjaxCrawlMiddlewarerhri}rj(hXAjaxCrawlMiddlewareh jdubah!jubah!jubj)rk}rl(hUh}rm(h]h]h]h]h]uh j\h]rnj)ro}rp(hUh}rq(h]h]h]h]h]uh jkh]rr(j)rs}rt(hUh}ru(h]h]h]h]h]uh joh]rvj)rw}rx(hUh}ry(U anchornameU#ajaxcrawlmiddleware-settingsUrefurihh]h]h]h]h]Uinternaluh jsh]rzhXAjaxCrawlMiddleware Settingsr{r|}r}(hXAjaxCrawlMiddleware Settingsh jwubah!jubah!jubj)r~}r(hUh}r(h]h]h]h]h]uh joh]rj)r}r(hUh}r(h]h]h]h]h]uh j~h]rj)r}r(hUh}r(h]h]h]h]h]uh jh]rj)r}r(hUh}r(U anchornameU#ajaxcrawl-enabledUrefurihh]h]h]h]h]Uinternaluh jh]rhXAJAXCRAWL_ENABLEDrr}r(hXAJAXCRAWL_ENABLEDh jubah!jubah!jubah!jubah!jubeh!jubah!jubeh!jubeh!jubeh!jubeh!jubeh!jubah!jubjj)r}r(hUh}r(h]h]h]h]h]uh]rj)r}r(hUh}r(h]h]h]h]h]uh jh]r(j)r}r(hUh}r(h]h]h]h]h]uh jh]rj)r}r(hUh}r(U anchornameUUrefurijh]h]h]h]h]Uinternaluh jh]rhXDebugging Spidersrr}r(hj h jubah!jubah!jubj)r}r(hUh}r(h]h]h]h]h]uh jh]r(j)r}r(hUh}r(h]h]h]h]h]uh jh]rj)r}r(hUh}r(h]h]h]h]h]uh jh]rj)r}r(hUh}r(U anchornameU#parse-commandUrefurijh]h]h]h]h]Uinternaluh jh]rhX Parse Commandrr}r(hX Parse Commandh jubah!jubah!jubah!jubj)r}r(hUh}r(h]h]h]h]h]uh jh]rj)r}r(hUh}r(h]h]h]h]h]uh jh]rj)r}r(hUh}r(U anchornameU #scrapy-shellUrefurijh]h]h]h]h]Uinternaluh jh]rhX Scrapy Shellrr}r(hX Scrapy Shellh jubah!jubah!jubah!jubj)r}r(hUh}r(h]h]h]h]h]uh jh]rj)r}r(hUh}r(h]h]h]h]h]uh jh]rj)r}r(hUh}r(U anchornameU#open-in-browserUrefurijh]h]h]h]h]Uinternaluh jh]rhXOpen in browserrr}r(hXOpen in browserh jubah!jubah!jubah!jubj)r}r(hUh}r(h]h]h]h]h]uh jh]rj)r}r(hUh}r(h]h]h]h]h]uh jh]rj)r}r(hUh}r(U anchornameU#loggingUrefurijh]h]h]h]h]Uinternaluh jh]rhXLoggingrr}r(hXLoggingh jubah!jubah!jubah!jubeh!jubeh!jubah!jubj j)r}r(hUh}r(h]h]h]h]h]uh]rj)r}r(hUh}r(h]h]h]h]h]uh jh]r(j)r}r(hUh}r(h]h]h]h]h]uh jh]rj)r}r(hUh}r(U anchornameUUrefurij h]h]h]h]h]Uinternaluh jh]rhXSpidersrr}r(hjh jubah!jubah!jubj)r}r(hUh}r(h]h]h]h]h]uh jh]r(j)r}r(hUh}r(h]h]h]h]h]uh jh]rj)r}r(hUh}r(h]h]h]h]h]uh jh]rj)r}r(hUh}r(U anchornameU#spider-argumentsUrefurij h]h]h]h]h]Uinternaluh jh]rhXSpider argumentsrr}r (hXSpider argumentsr h jubah!jubah!jubah!jubj)r }r (hUh}r (h]h]h]h]h]uh jh]r(j)r}r(hUh}r(h]h]h]h]h]uh j h]rj)r}r(hUh}r(U anchornameU#built-in-spiders-referenceUrefurij h]h]h]h]h]Uinternaluh jh]rhXBuilt-in spiders referencerr}r(hXBuilt-in spiders referencerh jubah!jubah!jubj)r}r(hUh}r(h]h]h]h]h]uh j h]r(j)r}r (hUh}r!(h]h]h]h]h]uh jh]r"(j)r#}r$(hUh}r%(h]h]h]h]h]uh jh]r&j)r'}r((hUh}r)(U anchornameU#spiderUrefurij h]h]h]h]h]Uinternaluh j#h]r*hXSpiderr+r,}r-(hXSpiderr.h j'ubah!jubah!jubj)r/}r0(hUh}r1(h]h]h]h]h]uh jh]r2j)r3}r4(hUh}r5(h]h]h]h]h]uh j/h]r6j)r7}r8(hUh}r9(h]h]h]h]h]uh j3h]r:j)r;}r<(hUh}r=(U anchornameU#spider-exampleUrefurij h]h]h]h]h]Uinternaluh j7h]r>hXSpider exampler?r@}rA(hXSpider examplerBh j;ubah!jubah!jubah!jubah!jubeh!jubj)rC}rD(hUh}rE(h]h]h]h]h]uh jh]rF(j)rG}rH(hUh}rI(h]h]h]h]h]uh jCh]rJj)rK}rL(hUh}rM(U anchornameU #crawlspiderUrefurij h]h]h]h]h]Uinternaluh jGh]rNhX CrawlSpiderrOrP}rQ(hX CrawlSpiderrRh jKubah!jubah!jubj)rS}rT(hUh}rU(h]h]h]h]h]uh jCh]rV(j)rW}rX(hUh}rY(h]h]h]h]h]uh jSh]rZj)r[}r\(hUh}r](h]h]h]h]h]uh jWh]r^j)r_}r`(hUh}ra(U anchornameU#crawling-rulesUrefurij h]h]h]h]h]Uinternaluh j[h]rbhXCrawling rulesrcrd}re(hXCrawling rulesrfh j_ubah!jubah!jubah!jubj)rg}rh(hUh}ri(h]h]h]h]h]uh jSh]rjj)rk}rl(hUh}rm(h]h]h]h]h]uh jgh]rnj)ro}rp(hUh}rq(U 
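The cache, retry and redirect options named above are ordinary values in a project's settings.py. A minimal sketch, assuming a stock project layout; the values shown are illustrative, not recommendations:

    # settings.py (sketch; values are illustrative assumptions)
    HTTPCACHE_ENABLED = True            # cache downloaded responses on disk
    HTTPCACHE_EXPIRATION_SECS = 0       # 0 means cached pages never expire
    HTTPCACHE_DIR = 'httpcache'

    RETRY_ENABLED = True
    RETRY_TIMES = 2                     # retries on top of the first attempt
    RETRY_HTTP_CODES = [500, 502, 503, 504, 400, 408]

    REDIRECT_ENABLED = True
    REDIRECT_MAX_TIMES = 20             # give up after this many redirects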
- Debugging Spiders
  - Parse Command
  - Scrapy Shell
  - Open in browser
  - Logging

- Spiders
  - Spider arguments
  - Built-in spiders reference
    - Spider (with a spider example)
    - CrawlSpider: crawling rules, CrawlSpider example
    - XMLFeedSpider (with an example)
    - CSVFeedSpider (with an example)
    - SitemapSpider (with examples)
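As a reminder of what the CrawlSpider entries above cover, a minimal rule-based spider in the 0.22-era API might look like this; the spider name, domain and URL patterns are hypothetical:

    from scrapy.contrib.spiders import CrawlSpider, Rule
    from scrapy.contrib.linkextractors.sgml import SgmlLinkExtractor

    class ExampleSpider(CrawlSpider):
        name = 'example.com'
        allowed_domains = ['example.com']
        start_urls = ['http://www.example.com']

        rules = (
            # category links have no callback, so they are just followed
            Rule(SgmlLinkExtractor(allow=(r'category\.php',))),
            # item pages are parsed with the callback below
            Rule(SgmlLinkExtractor(allow=(r'item\.php',)), callback='parse_item'),
        )

        def parse_item(self, response):
            self.log('item page: %s' % response.url)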
- Frequently Asked Questions
  - How does Scrapy compare to BeautifulSoup or lxml?
  - What Python versions does Scrapy support?
  - Does Scrapy work with Python 3?
  - Did Scrapy "steal" X from Django?
  - Does Scrapy work with HTTP proxies?
  - How can I scrape an item with attributes in different pages?
  - Scrapy crashes with: ImportError: No module named win32api
  - How can I simulate a user login in my spider?
  - Does Scrapy crawl in breadth-first or depth-first order?
  - My Scrapy crawler has memory leaks. What can I do?
  - How can I make Scrapy consume less memory?
  - Can I use Basic HTTP Authentication in my spiders?
  - Why does Scrapy download pages in English instead of my native language?
  - Where can I find some example Scrapy projects?
  - Can I run a spider without creating a project?
  - I get "Filtered offsite request" messages. How can I fix them?
  - What is the recommended way to deploy a Scrapy crawler in production?
  - Can I use JSON for large exports?
  - Can I return (Twisted) deferreds from signal handlers?
  - What does the response status code 999 mean?
  - Can I call pdb.set_trace() from my spiders to debug them?
  - Simplest way to dump all my scraped items into a JSON/CSV/XML file?
  - What's this huge cryptic __VIEWSTATE parameter used in some forms?
  - What's the best way to parse big XML/CSV data feeds?
  - Does Scrapy manage cookies automatically?
  - How can I see the cookies being sent and received from Scrapy?
  - How can I instruct a spider to stop itself?
  - How can I prevent my Scrapy bot from getting banned?
  - Should I use spider arguments or settings to configure my spider?
  - I'm scraping an XML document and my XPath selector doesn't return any items
  - I'm getting an error: "cannot import name crawler"
- Requests and Responses
  - Request objects
    - Passing additional data to callback functions
  - Request.meta special keys: bindaddress
  - Request subclasses
    - FormRequest objects
    - Request usage examples
      - Using FormRequest to send data via HTTP POST
      - Using FormRequest.from_response() to simulate a user login
  - Response objects
  - Response subclasses: TextResponse objects, HtmlResponse objects, XmlResponse objects
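The "simulate a user login" entry refers to the FormRequest.from_response() helper, which pre-fills the form found in a page. A sketch; the site URL, form field names and credentials are hypothetical:

    from scrapy.spider import Spider
    from scrapy.http import FormRequest

    class LoginSpider(Spider):
        name = 'login-example'
        start_urls = ['http://www.example.com/users/login.php']

        def parse(self, response):
            # locate the form in the page and submit it pre-filled
            return FormRequest.from_response(
                response,
                formdata={'username': 'john', 'password': 'secret'},
                callback=self.after_login)

        def after_login(self, response):
            if 'authentication failed' in response.body:
                self.log('login failed')
                return
            # session cookies are set from here on; continue crawling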
- Exceptions
  - Built-in Exceptions reference: DropItem, CloseSpider, IgnoreRequest, NotConfigured, NotSupported

- Examples

- Selectors
  - Using selectors
    - Constructing selectors
    - Using selectors
    - Nesting selectors
    - Using selectors with regular expressions
    - Working with relative XPaths
    - Using EXSLT extensions: regular expressions, set operations
  - Built-in Selectors reference
    - SelectorList objects
    - Selector examples on HTML response
    - Selector examples on XML response
    - Removing namespaces
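For the selector entries above, the 0.22 Selector API fits in two lines; the HTML snippet is made up:

    from scrapy.selector import Selector

    sel = Selector(text='<html><body><span>good</span></body></html>')
    print sel.xpath('//span/text()').extract()   # prints [u'good']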
- Link Extractors
  - Built-in link extractors reference: SgmlLinkExtractor, BaseSgmlLinkExtractor

- Experimental features
  - Add commands using external libraries

- Release notes
  - 0.22.0 (released 2014-01-17): Enhancements, Fixes
  - 0.20.2 (released 2013-12-09)
  - 0.20.1 (released 2013-11-28)
  - 0.20.0 (released 2013-11-08): Enhancements, Bugfixes, Other, Thanks
  - 0.18.4 (released 2013-10-10)
  - 0.18.3 (released 2013-10-03)
  - 0.18.2 (released 2013-09-03)
  - 0.18.1 (released 2013-08-27)
  - 0.18.0 (released 2013-08-09)
  - 0.16.5 (released 2013-05-30)
  - 0.16.4 (released 2013-01-23)
  - 0.16.3 (released 2012-12-07)
  - 0.16.2 (released 2012-11-09)
  - 0.16.1 (released 2012-10-26)
  - 0.16.0 (released 2012-10-18)
  - 0.14.4, 0.14.3, 0.14.2, 0.14.1
  - 0.14: New features and settings; Code rearranged and removed
  - 0.12: New features and improvements; Scrapyd changes; Changes to settings; Deprecated/obsoleted functionality
  - 0.10: New features and improvements; Command-line tool changes; API changes; Changes to settings
  - 0.9: New features and improvements; API changes; Changes to default settings
  - 0.8: New features; Backwards-incompatible changes
  - 0.7

- Telnet Console
  - How to access the telnet console
  - Available variables in the telnet console
  - Telnet console usage examples: view engine status; pause, resume and stop the Scrapy engine
  - Telnet Console signals
  - Telnet settings: TELNETCONSOLE_PORT, TELNETCONSOLE_HOST
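The usage examples listed rely on the console's built-in shortcuts. A sketch of a session, assuming the default port (6023) and a running crawl:

    $ telnet localhost 6023
    >>> est()              # print a report of the current engine status
    >>> engine.pause()     # pause the running crawl
    >>> engine.unpause()   # resume it
    >>> engine.stop()      # stop the crawl and close the connection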
- Debugging memory leaks
  - Common causes of memory leaks
  - Debugging memory leaks with trackref
    - Which objects are tracked?
    - A real example
    - Too many spiders?
    - scrapy.utils.trackref module
  - Debugging memory leaks with Guppy
  - Leaks without leaks

- Signals
  - Deferred signal handlers
  - Built-in signals reference: engine_started, engine_stopped, item_scraped, item_dropped, spider_closed, spider_opened, spider_idle, spider_error, response_received, response_downloaded
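A minimal way to hook one of the built-in signals above from an extension; the class is hypothetical, while the connect() call follows the 0.22 signals API:

    from scrapy import signals

    class SpiderClosedLogger(object):
        """Hypothetical extension: log a line when any spider closes."""

        @classmethod
        def from_crawler(cls, crawler):
            ext = cls()
            crawler.signals.connect(ext.spider_closed,
                                    signal=signals.spider_closed)
            return ext

        def spider_closed(self, spider, reason):
            spider.log('spider closed (%s)' % reason)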
- Extensions
  - Extension settings
  - Loading & activating extensions
  - Available, enabled and disabled extensions
  - Disabling an extension
  - Writing your own extension (with a sample extension)
  - Built-in extensions reference
    - General purpose extensions: Log Stats, Core Stats, Web service, Telnet console, Memory usage, Memory debugger, Close spider (settings: CLOSESPIDER_TIMEOUT, CLOSESPIDER_ITEMCOUNT, CLOSESPIDER_PAGECOUNT, CLOSESPIDER_ERRORCOUNT), StatsMailer
    - Debugging extensions: Stack trace dump, Debugger

- DjangoItem
  - Using DjangoItem
  - DjangoItem caveats
  - Django settings set up

- Sending e-mail
  - Quick example
  - MailSender class reference
  - Mail settings: MAIL_FROM, MAIL_HOST, MAIL_PORT, MAIL_USER, MAIL_PASS, MAIL_TLS, MAIL_SSL
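The e-mail quick example boils down to the MailSender class; a sketch, with the recipient address and message made up:

    from scrapy.mail import MailSender

    # `settings` is assumed to be the running crawler's settings object;
    # a plain MailSender() with explicit arguments works too
    mailer = MailSender.from_settings(settings)
    mailer.send(to=['someone@example.com'],
                subject='Crawl finished',
                body='The crawl finished without errors.')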
j*h]r*j)r*}r*(hUh}r*(h]h]h]h]h]uh j|*h]r*j)r*}r*(hUh}r*(U anchornameU #mail-sslUrefurijh]h]h]h]h]Uinternaluh j*h]r*hXMAIL_SSLr*r*}r*(hXMAIL_SSLh j*ubah!jubah!jubah!jubeh!jubeh!jubeh!jubeh!jubah!jubjj)r*}r*(hUh}r*(h]h]h]h]h]uh]r*j)r*}r*(hUh}r*(h]h]h]h]h]uh j*h]r*j)r*}r*(hUh}r*(h]h]h]h]h]uh j*h]r*j)r*}r*(hUh}r*(U anchornameUUrefurijh]h]h]h]h]Uinternaluh j*h]r*hXUbuntu packagesr*r*}r*(hjh j*ubah!jubah!jubah!jubah!jubjj)r*}r*(hUh}r*(h]h]h]h]h]uh]r*j)r*}r*(hUh}r*(h]h]h]h]h]uh j*h]r*(j)r*}r*(hUh}r*(h]h]h]h]h]uh j*h]r*j)r*}r*(hUh}r*(U anchornameUUrefurijh]h]h]h]h]Uinternaluh j*h]r*hXCommand line toolr*r*}r*(hjh j*ubah!jubah!jubj)r*}r*(hUh}r*(h]h]h]h]h]uh j*h]r*(j)r*}r*(hUh}r*(h]h]h]h]h]uh j*h]r*j)r*}r*(hUh}r*(h]h]h]h]h]uh j*h]r*j)r*}r*(hUh}r*(U anchornameU%#default-structure-of-scrapy-projectsUrefurijh]h]h]h]h]Uinternaluh j*h]r*hX$Default structure of Scrapy projectsr*r*}r*(hX$Default structure of Scrapy projectsh j*ubah!jubah!jubah!jubj)r*}r*(hUh}r*(h]h]h]h]h]uh j*h]r*(j)r*}r*(hUh}r*(h]h]h]h]h]uh j*h]r*j)r*}r*(hUh}r*(U anchornameU#using-the-scrapy-toolUrefurijh]h]h]h]h]Uinternaluh j*h]r*(hX Using the r*r*}r*(hX Using the h j*ubj5)r*}r*(hX ``scrapy``h}r*(h]h]h]h]h]uh j*h]r*hXscrapyr*r*}r*(hUh j*ubah!j=ubhX toolr*r*}r*(hX toolh j*ubeh!jubah!jubj)r*}r*(hUh}r*(h]h]h]h]h]uh j*h]r*(j)r*}r*(hUh}r*(h]h]h]h]h]uh j*h]r*j)r*}r*(hUh}r*(h]h]h]h]h]uh j*h]r*j)r*}r*(hUh}r*(U anchornameU#creating-projectsUrefurijh]h]h]h]h]Uinternaluh j*h]r*hXCreating projectsr*r*}r*(hXCreating projectsh j*ubah!jubah!jubah!jubj)r*}r*(hUh}r*(h]h]h]h]h]uh j*h]r*j)r*}r*(hUh}r*(h]h]h]h]h]uh j*h]r*j)r*}r*(hUh}r*(U anchornameU#controlling-projectsUrefurijh]h]h]h]h]Uinternaluh j*h]r*hXControlling projectsr*r*}r*(hXControlling projectsh j*ubah!jubah!jubah!jubeh!jubeh!jubj)r*}r+(hUh}r+(h]h]h]h]h]uh j*h]r+(j)r+}r+(hUh}r+(h]h]h]h]h]uh j*h]r+j)r+}r+(hUh}r +(U anchornameU#available-tool-commandsUrefurijh]h]h]h]h]Uinternaluh j+h]r +hXAvailable tool commandsr +r +}r +(hXAvailable tool commandsh j+ubah!jubah!jubj)r+}r+(hUh}r+(h]h]h]h]h]uh j*h]r+(j)r+}r+(hUh}r+(h]h]h]h]h]uh j+h]r+j)r+}r+(hUh}r+(h]h]h]h]h]uh j+h]r+j)r+}r+(hUh}r+(U anchornameU #startprojectUrefurijh]h]h]h]h]Uinternaluh j+h]r+hX startprojectr+r+}r +(hX startprojecth j+ubah!jubah!jubah!jubj)r!+}r"+(hUh}r#+(h]h]h]h]h]uh j+h]r$+j)r%+}r&+(hUh}r'+(h]h]h]h]h]uh j!+h]r(+j)r)+}r*+(hUh}r++(U anchornameU #genspiderUrefurijh]h]h]h]h]Uinternaluh j%+h]r,+hX genspiderr-+r.+}r/+(hX genspiderh j)+ubah!jubah!jubah!jubj)r0+}r1+(hUh}r2+(h]h]h]h]h]uh j+h]r3+j)r4+}r5+(hUh}r6+(h]h]h]h]h]uh j0+h]r7+j)r8+}r9+(hUh}r:+(U anchornameU#crawlUrefurijh]h]h]h]h]Uinternaluh j4+h]r;+hXcrawlr<+r=+}r>+(hXcrawlh j8+ubah!jubah!jubah!jubj)r?+}r@+(hUh}rA+(h]h]h]h]h]uh j+h]rB+j)rC+}rD+(hUh}rE+(h]h]h]h]h]uh j?+h]rF+j)rG+}rH+(hUh}rI+(U anchornameU#checkUrefurijh]h]h]h]h]Uinternaluh jC+h]rJ+hXcheckrK+rL+}rM+(hXcheckh jG+ubah!jubah!jubah!jubj)rN+}rO+(hUh}rP+(h]h]h]h]h]uh j+h]rQ+j)rR+}rS+(hUh}rT+(h]h]h]h]h]uh jN+h]rU+j)rV+}rW+(hUh}rX+(U anchornameU#listUrefurijh]h]h]h]h]Uinternaluh jR+h]rY+hXlistrZ+r[+}r\+(hXlisth jV+ubah!jubah!jubah!jubj)r]+}r^+(hUh}r_+(h]h]h]h]h]uh j+h]r`+j)ra+}rb+(hUh}rc+(h]h]h]h]h]uh j]+h]rd+j)re+}rf+(hUh}rg+(U anchornameU#editUrefurijh]h]h]h]h]Uinternaluh ja+h]rh+hXeditri+rj+}rk+(hXedith je+ubah!jubah!jubah!jubj)rl+}rm+(hUh}rn+(h]h]h]h]h]uh j+h]ro+j)rp+}rq+(hUh}rr+(h]h]h]h]h]uh jl+h]rs+j)rt+}ru+(hUh}rv+(U anchornameU#fetchUrefurijh]h]h]h]h]Uinternaluh jp+h]rw+hXfetchrx+ry+}rz+(hXfetchh jt+ubah!jubah!jubah!jubj)r{+}r|+(hUh}r}+(h]h]h]h]h]uh j+h]r~+j)r+}r+(hUh}r+(h]h]h]h]h]uh j{+h]r+j)r+}r+(hUh}r+(U 
anchornameU#viewUrefurijh]h]h]h]h]Uinternaluh j+h]r+hXviewr+r+}r+(hXviewh j+ubah!jubah!jubah!jubj)r+}r+(hUh}r+(h]h]h]h]h]uh j+h]r+j)r+}r+(hUh}r+(h]h]h]h]h]uh j+h]r+j)r+}r+(hUh}r+(U anchornameU#shellUrefurijh]h]h]h]h]Uinternaluh j+h]r+hXshellr+r+}r+(hXshellh j+ubah!jubah!jubah!jubj)r+}r+(hUh}r+(h]h]h]h]h]uh j+h]r+j)r+}r+(hUh}r+(h]h]h]h]h]uh j+h]r+j)r+}r+(hUh}r+(U anchornameU#parseUrefurijh]h]h]h]h]Uinternaluh j+h]r+hXparser+r+}r+(hXparseh j+ubah!jubah!jubah!jubj)r+}r+(hUh}r+(h]h]h]h]h]uh j+h]r+j)r+}r+(hUh}r+(h]h]h]h]h]uh j+h]r+j)r+}r+(hUh}r+(U anchornameU #settingsUrefurijh]h]h]h]h]Uinternaluh j+h]r+hXsettingsr+r+}r+(hXsettingsh j+ubah!jubah!jubah!jubj)r+}r+(hUh}r+(h]h]h]h]h]uh j+h]r+j)r+}r+(hUh}r+(h]h]h]h]h]uh j+h]r+j)r+}r+(hUh}r+(U anchornameU #runspiderUrefurijh]h]h]h]h]Uinternaluh j+h]r+hX runspiderr+r+}r+(hX runspiderh j+ubah!jubah!jubah!jubj)r+}r+(hUh}r+(h]h]h]h]h]uh j+h]r+j)r+}r+(hUh}r+(h]h]h]h]h]uh j+h]r+j)r+}r+(hUh}r+(U anchornameU#versionUrefurijh]h]h]h]h]Uinternaluh j+h]r+hXversionr+r+}r+(hXversionh j+ubah!jubah!jubah!jubj)r+}r+(hUh}r+(h]h]h]h]h]uh j+h]r+j)r+}r+(hUh}r+(h]h]h]h]h]uh j+h]r+j)r+}r+(hUh}r+(U anchornameU#deployUrefurijh]h]h]h]h]Uinternaluh j+h]r+hXdeployr+r+}r+(hXdeployh j+ubah!jubah!jubah!jubj)r+}r+(hUh}r+(h]h]h]h]h]uh j+h]r+j)r+}r+(hUh}r+(h]h]h]h]h]uh j+h]r+j)r+}r+(hUh}r+(U anchornameU#benchUrefurijh]h]h]h]h]Uinternaluh j+h]r+hXbenchr+r+}r+(hXbenchh j+ubah!jubah!jubah!jubeh!jubeh!jubj)r+}r+(hUh}r+(h]h]h]h]h]uh j*h]r+(j)r+}r+(hUh}r+(h]h]h]h]h]uh j+h]r+j)r+}r+(hUh}r+(U anchornameU#custom-project-commandsUrefurijh]h]h]h]h]Uinternaluh j+h]r+hXCustom project commandsr+r,}r,(hXCustom project commandsh j+ubah!jubah!jubj)r,}r,(hUh}r,(h]h]h]h]h]uh j+h]r,j)r,}r,(hUh}r,(h]h]h]h]h]uh j,h]r ,j)r ,}r ,(hUh}r ,(h]h]h]h]h]uh j,h]r ,j)r,}r,(hUh}r,(U anchornameU#commands-moduleUrefurijh]h]h]h]h]Uinternaluh j ,h]r,hXCOMMANDS_MODULEr,r,}r,(hXCOMMANDS_MODULEh j,ubah!jubah!jubah!jubah!jubeh!jubeh!jubeh!jubah!jubjj)r,}r,(hUh}r,(h]h]h]h]h]uh]r,j)r,}r,(hUh}r,(h]h]h]h]h]uh j,h]r,(j)r,}r,(hUh}r,(h]h]h]h]h]uh j,h]r ,j)r!,}r",(hUh}r#,(U anchornameUUrefurijh]h]h]h]h]Uinternaluh j,h]r$,hX Scrapy shellr%,r&,}r',(hjh j!,ubah!jubah!jubj)r(,}r),(hUh}r*,(h]h]h]h]h]uh j,h]r+,(j)r,,}r-,(hUh}r.,(h]h]h]h]h]uh j(,h]r/,j)r0,}r1,(hUh}r2,(h]h]h]h]h]uh j,,h]r3,j)r4,}r5,(hUh}r6,(U anchornameU#launch-the-shellUrefurijh]h]h]h]h]Uinternaluh j0,h]r7,hXLaunch the shellr8,r9,}r:,(hXLaunch the shellr;,h j4,ubah!jubah!jubah!jubj)r<,}r=,(hUh}r>,(h]h]h]h]h]uh j(,h]r?,(j)r@,}rA,(hUh}rB,(h]h]h]h]h]uh j<,h]rC,j)rD,}rE,(hUh}rF,(U anchornameU#using-the-shellUrefurijh]h]h]h]h]Uinternaluh j@,h]rG,hXUsing the shellrH,rI,}rJ,(hXUsing the shellrK,h jD,ubah!jubah!jubj)rL,}rM,(hUh}rN,(h]h]h]h]h]uh j<,h]rO,(j)rP,}rQ,(hUh}rR,(h]h]h]h]h]uh jL,h]rS,j)rT,}rU,(hUh}rV,(h]h]h]h]h]uh jP,h]rW,j)rX,}rY,(hUh}rZ,(U anchornameU#available-shortcutsUrefurijh]h]h]h]h]Uinternaluh jT,h]r[,hXAvailable Shortcutsr\,r],}r^,(hXAvailable Shortcutsr_,h jX,ubah!jubah!jubah!jubj)r`,}ra,(hUh}rb,(h]h]h]h]h]uh jL,h]rc,j)rd,}re,(hUh}rf,(h]h]h]h]h]uh j`,h]rg,j)rh,}ri,(hUh}rj,(U anchornameU#available-scrapy-objectsUrefurijh]h]h]h]h]Uinternaluh jd,h]rk,hXAvailable Scrapy objectsrl,rm,}rn,(hXAvailable Scrapy objectsro,h jh,ubah!jubah!jubah!jubeh!jubeh!jubj)rp,}rq,(hUh}rr,(h]h]h]h]h]uh j(,h]rs,j)rt,}ru,(hUh}rv,(h]h]h]h]h]uh jp,h]rw,j)rx,}ry,(hUh}rz,(U anchornameU#example-of-shell-sessionUrefurijh]h]h]h]h]Uinternaluh jt,h]r{,hXExample of shell sessionr|,r},}r~,(hXExample of shell sessionr,h jx,ubah!jubah!jubah!jubj)r,}r,(hUh}r,(h]h]h]h]h]uh 
j(,h]r,j)r,}r,(hUh}r,(h]h]h]h]h]uh j,h]r,j)r,}r,(hUh}r,(U anchornameU5#invoking-the-shell-from-spiders-to-inspect-responsesUrefurijh]h]h]h]h]Uinternaluh j,h]r,hX4Invoking the shell from spiders to inspect responsesr,r,}r,(hX4Invoking the shell from spiders to inspect responsesr,h j,ubah!jubah!jubah!jubeh!jubeh!jubah!jubjj)r,}r,(hUh}r,(h]h]h]h]h]uh]r,j)r,}r,(hUh}r,(h]h]h]h]h]uh j,h]r,j)r,}r,(hUh}r,(h]h]h]h]h]uh j,h]r,j)r,}r,(hUh}r,(U anchornameUUrefurijh]h]h]h]h]Uinternaluh j,h]r,hXScrapydr,r,}r,(hjh j,ubah!jubah!jubah!jubah!jubjj)r,}r,(hUh}r,(h]h]h]h]h]uh]r,j)r,}r,(hUh}r,(h]h]h]h]h]uh j,h]r,(j)r,}r,(hUh}r,(h]h]h]h]h]uh j,h]r,j)r,}r,(hUh}r,(U anchornameUUrefurijh]h]h]h]h]Uinternaluh j,h]r,hXLoggingr,r,}r,(hjh j,ubah!jubah!jubj)r,}r,(hUh}r,(h]h]h]h]h]uh j,h]r,(j)r,}r,(hUh}r,(h]h]h]h]h]uh j,h]r,j)r,}r,(hUh}r,(h]h]h]h]h]uh j,h]r,j)r,}r,(hUh}r,(U anchornameU #log-levelsUrefurijh]h]h]h]h]Uinternaluh j,h]r,hX Log levelsr,r,}r,(hX Log levelsh j,ubah!jubah!jubah!jubj)r,}r,(hUh}r,(h]h]h]h]h]uh j,h]r,j)r,}r,(hUh}r,(h]h]h]h]h]uh j,h]r,j)r,}r,(hUh}r,(U anchornameU#how-to-set-the-log-levelUrefurijh]h]h]h]h]Uinternaluh j,h]r,hXHow to set the log levelr,r,}r,(hXHow to set the log levelh j,ubah!jubah!jubah!jubj)r,}r,(hUh}r,(h]h]h]h]h]uh j,h]r,j)r,}r,(hUh}r,(h]h]h]h]h]uh j,h]r,j)r,}r,(hUh}r,(U anchornameU#how-to-log-messagesUrefurijh]h]h]h]h]Uinternaluh j,h]r,hXHow to log messagesr,r,}r,(hXHow to log messagesh j,ubah!jubah!jubah!jubj)r,}r,(hUh}r,(h]h]h]h]h]uh j,h]r,j)r,}r,(hUh}r,(h]h]h]h]h]uh j,h]r,j)r,}r,(hUh}r,(U anchornameU#logging-from-spidersUrefurijh]h]h]h]h]Uinternaluh j,h]r,hXLogging from Spidersr,r,}r,(hXLogging from Spidersh j,ubah!jubah!jubah!jubj)r,}r,(hUh}r,(h]h]h]h]h]uh j,h]r,j)r,}r,(hUh}r,(h]h]h]h]h]uh j,h]r,j)r,}r,(hUh}r-(U anchornameX#module-scrapy.logUrefurijh]h]h]h]h]Uinternaluh j,h]r-hXscrapy.log moduler-r-}r-(hXscrapy.log moduleh j,ubah!jubah!jubah!jubj)r-}r-(hUh}r-(h]h]h]h]h]uh j,h]r-j)r -}r -(hUh}r -(h]h]h]h]h]uh j-h]r -j)r -}r-(hUh}r-(U anchornameU#logging-settingsUrefurijh]h]h]h]h]Uinternaluh j -h]r-hXLogging settingsr-r-}r-(hXLogging settingsh j -ubah!jubah!jubah!jubeh!jubeh!jubah!jubjj)r-}r-(hUh}r-(h]h]h]h]h]uh]r-j)r-}r-(hUh}r-(h]h]h]h]h]uh j-h]r-(j)r-}r-(hUh}r-(h]h]h]h]h]uh j-h]r-j)r -}r!-(hUh}r"-(U anchornameUUrefurijh]h]h]h]h]Uinternaluh j-h]r#-hXStats Collectionr$-r%-}r&-(hjh j -ubah!jubah!jubj)r'-}r(-(hUh}r)-(h]h]h]h]h]uh j-h]r*-(j)r+-}r,-(hUh}r--(h]h]h]h]h]uh j'-h]r.-j)r/-}r0-(hUh}r1-(h]h]h]h]h]uh j+-h]r2-j)r3-}r4-(hUh}r5-(U anchornameU#common-stats-collector-usesUrefurijh]h]h]h]h]Uinternaluh j/-h]r6-hXCommon Stats Collector usesr7-r8-}r9-(hXCommon Stats Collector usesr:-h j3-ubah!jubah!jubah!jubj)r;-}r<-(hUh}r=-(h]h]h]h]h]uh j'-h]r>-(j)r?-}r@-(hUh}rA-(h]h]h]h]h]uh j;-h]rB-j)rC-}rD-(hUh}rE-(U anchornameU#available-stats-collectorsUrefurijh]h]h]h]h]Uinternaluh j?-h]rF-hXAvailable Stats CollectorsrG-rH-}rI-(hXAvailable Stats CollectorsrJ-h jC-ubah!jubah!jubj)rK-}rL-(hUh}rM-(h]h]h]h]h]uh j;-h]rN-(j)rO-}rP-(hUh}rQ-(h]h]h]h]h]uh jK-h]rR-j)rS-}rT-(hUh}rU-(h]h]h]h]h]uh jO-h]rV-j)rW-}rX-(hUh}rY-(U anchornameU#memorystatscollectorUrefurijh]h]h]h]h]Uinternaluh jS-h]rZ-hXMemoryStatsCollectorr[-r\-}r]-(hXMemoryStatsCollectorr^-h jW-ubah!jubah!jubah!jubj)r_-}r`-(hUh}ra-(h]h]h]h]h]uh jK-h]rb-j)rc-}rd-(hUh}re-(h]h]h]h]h]uh j_-h]rf-j)rg-}rh-(hUh}ri-(U anchornameU#dummystatscollectorUrefurijh]h]h]h]h]Uinternaluh jc-h]rj-hXDummyStatsCollectorrk-rl-}rm-(hXDummyStatsCollectorrn-h 
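The Mail settings listed above configure scrapy.mail.MailSender. A minimal usage sketch (not part of the archive; the SMTP host and addresses are placeholder values):

    from scrapy.mail import MailSender

    # Built directly here; MailSender.from_settings(settings) reads the same
    # MAIL_* values from a project's settings instead.
    mailer = MailSender(smtphost="localhost", mailfrom="scrapy@example.com")
    mailer.send(to=["someone@example.com"], subject="Crawl finished",
                body="The spider has finished its run.")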
- Versioning and API Stability: Versioning; API Stability

[Binary residue: pickled index entries from the same environment. Recoverable API names:]

- scrapy.contrib.loader: ItemLoader (methods get_value, add_value, replace_value, get_xpath, add_xpath, replace_xpath, get_css, add_css, replace_css, load_item, get_collected_values, get_output_value, get_input_processor, get_output_processor; attributes item, context, default_item_class, default_input_processor, default_output_processor, default_selector_class, selector); scrapy.contrib.loader.processor: Identity, TakeFirst, Join, Compose, MapCompose
- Item pipeline methods: process_item(), open_spider(), close_spider()
- Core settings: AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, BOT_NAME, CONCURRENT_ITEMS, CONCURRENT_REQUESTS, CONCURRENT_REQUESTS_PER_DOMAIN, CONCURRENT_REQUESTS_PER_IP, DEFAULT_ITEM_CLASS, DEFAULT_REQUEST_HEADERS, DEPTH_LIMIT, DEPTH_PRIORITY, DEPTH_STATS, DEPTH_STATS_VERBOSE, DNSCACHE_ENABLED, DOWNLOADER_DEBUG, DOWNLOADER_MIDDLEWARES, DOWNLOADER_MIDDLEWARES_BASE, DOWNLOADER_STATS, DOWNLOAD_DELAY, DOWNLOAD_HANDLERS, DOWNLOAD_HANDLERS_BASE, DOWNLOAD_TIMEOUT, DUPEFILTER_CLASS, EDITOR, EXTENSIONS, EXTENSIONS_BASE, ITEM_PIPELINES, ITEM_PIPELINES_BASE, LOG_ENABLED, LOG_ENCODING, LOG_FILE, LOG_LEVEL, LOG_STDOUT, MEMDEBUG_ENABLED, MEMDEBUG_NOTIFY, MEMUSAGE_ENABLED, MEMUSAGE_LIMIT_MB, MEMUSAGE_NOTIFY_MAIL, MEMUSAGE_REPORT, MEMUSAGE_WARNING_MB, NEWSPIDER_MODULE, RANDOMIZE_DOWNLOAD_DELAY, REDIRECT_MAX_TIMES, REDIRECT_MAX_METAREFRESH_DELAY, REDIRECT_PRIORITY_ADJUST, ROBOTSTXT_OBEY, SCHEDULER, SPIDER_CONTRACTS, SPIDER_CONTRACTS_BASE, SPIDER_MIDDLEWARES, SPIDER_MIDDLEWARES_BASE, SPIDER_MODULES, STATS_CLASS, STATS_DUMP, STATSMAILER_RCPTS, TELNETCONSOLE_ENABLED, TELNETCONSOLE_PORT, TEMPLATES_DIR, URLLENGTH_LIMIT, USER_AGENT
- Feed export settings: FEED_URI, FEED_FORMAT, FEED_STORE_EMPTY, FEED_STORAGES, FEED_STORAGES_BASE, FEED_EXPORTERS, FEED_EXPORTERS_BASE
- scrapy.item: Item (fields attribute), Field
- scrapy.contrib.spidermiddleware: SpiderMiddleware (process_spider_input, process_spider_output, process_spider_exception, process_start_requests); DepthMiddleware; HttpErrorMiddleware (handle_httpstatus_list reqmeta, HTTPERROR_ALLOWED_CODES, HTTPERROR_ALLOW_ALL); OffsiteMiddleware; RefererMiddleware (REFERER_ENABLED); UrlLengthMiddleware
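A hedged sketch of the ItemLoader API enumerated above, modeled on the documented usage pattern (Product and the XPath are assumed names, not taken from the archive):

    from scrapy.item import Item, Field
    from scrapy.contrib.loader import ItemLoader

    class Product(Item):
        name = Field()
        last_updated = Field()

    def load_product(response):
        # Collect values with add_xpath()/add_value(), then build the
        # populated item with load_item().
        l = ItemLoader(item=Product(), response=response)
        l.add_xpath('name', '//div[@class="product_name"]/text()')
        l.add_value('last_updated', 'today')
        return l.load_item()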
- scrapy.contrib.exporter: BaseItemExporter (methods export_item, serialize_field, start_exporting, finish_exporting; attributes fields_to_export, export_empty_fields, encoding); XmlItemExporter, CsvItemExporter, PickleItemExporter, PprintItemExporter, JsonItemExporter, JsonLinesItemExporter
- Images pipeline: settings IMAGES_STORE, IMAGES_EXPIRES, IMAGES_THUMBS, IMAGES_MIN_HEIGHT, IMAGES_MIN_WIDTH; scrapy.contrib.pipeline.images.ImagesPipeline (get_media_requests, item_completed)
- scrapy.crawler: Crawler (attributes settings, signals, stats, extensions, spiders, engine; methods configure, start)
- scrapy.settings: Settings (overrides attribute; methods get, getbool, getint, getfloat, getlist)
- scrapy.signalmanager: SignalManager (connect, send_catch_log, send_catch_log_deferred, disconnect, disconnect_all)
- scrapy.statscol: StatsCollector (get_value, get_stats, set_value, set_stats, inc_value, max_value, min_value, clear_stats, open_spider, close_spider)
- AutoThrottle settings: AUTOTHROTTLE_ENABLED, AUTOTHROTTLE_START_DELAY, AUTOTHROTTLE_MAX_DELAY, AUTOTHROTTLE_DEBUG
- scrapy.contracts: Contract (adjust_request_args, pre_process, post_process); scrapy.contracts.default: UrlContract, ReturnsContract, ScrapesContract
- scrapy.contrib.webservice: CrawlerResource, StatsResource, EngineStatusResource, JsonResource (ws_name), JsonRpcResource (get_target); settings WEBSERVICE_ENABLED, WEBSERVICE_LOGFILE, WEBSERVICE_PORT, WEBSERVICE_HOST
- scrapy.contrib.downloadermiddleware: DownloaderMiddleware (process_request, process_response, process_exception); CookiesMiddleware (cookiejar reqmeta, COOKIES_ENABLED, COOKIES_DEBUG); DefaultHeadersMiddleware; DownloadTimeoutMiddleware
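A hedged sketch of the StatsCollector methods indexed above, as they might be called from extension code (the crawler argument and stat names are assumptions, not from the archive):

    def record_custom_stats(crawler):
        # 'crawler' is assumed to be a scrapy.crawler.Crawler, e.g. the one
        # an extension receives in its from_crawler() hook.
        stats = crawler.stats
        stats.set_value('custom/hostname', 'worker-1')    # overwrite a value
        stats.inc_value('custom/pages_seen')              # increment a counter
        stats.max_value('custom/max_depth', 5)            # keep the maximum seen
        return stats.get_value('custom/pages_seen', 0)    # read back, with default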
- Downloader middlewares (continued): HttpAuthMiddleware; HttpCacheMiddleware (HTTPCACHE_ENABLED, HTTPCACHE_EXPIRATION_SECS, HTTPCACHE_DIR, HTTPCACHE_IGNORE_HTTP_CODES, HTTPCACHE_IGNORE_MISSING, HTTPCACHE_IGNORE_SCHEMES, HTTPCACHE_STORAGE, HTTPCACHE_DBM_MODULE, HTTPCACHE_POLICY); HttpCompressionMiddleware (COMPRESSION_ENABLED); ChunkedTransferMiddleware; HttpProxyMiddleware; RedirectMiddleware (redirect_urls and dont_redirect reqmeta, REDIRECT_ENABLED, REDIRECT_MAX_TIMES); MetaRefreshMiddleware (METAREFRESH_ENABLED, REDIRECT_MAX_METAREFRESH_DELAY); RetryMiddleware (dont_retry reqmeta, RETRY_ENABLED, RETRY_TIMES, RETRY_HTTP_CODES); RobotsTxtMiddleware; DownloaderStats; UserAgentMiddleware; AjaxCrawlMiddleware (AJAXCRAWL_ENABLED)
- scrapy.spider: Spider (attributes name, allowed_domains, start_urls; methods start_requests, make_requests_from_url, parse, log)
- scrapy.contrib.spiders: CrawlSpider (rules, parse_start_url), Rule; XMLFeedSpider (iterator, itertag, namespaces, adapt_response, parse_node, process_results); CSVFeedSpider (delimiter, headers, parse_row); SitemapSpider (sitemap_urls, sitemap_rules, sitemap_follow, sitemap_alternate_links)
- scrapy.http: Request (url, method, headers, body, meta, copy, replace; bindaddress reqmeta); FormRequest (from_response class method); Response (url, status, headers, body, request, meta, flags, copy, replace); TextResponse (encoding, body_as_unicode); HtmlResponse; XmlResponse
- scrapy.exceptions: DropItem, CloseSpider, IgnoreRequest, NotConfigured, NotSupported
- scrapy.selector: Selector (xpath, css, extract, re, register_namespace, remove_namespaces, __nonzero__)
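A hedged sketch of FormRequest.from_response(), indexed above, submitting a login form as in the documented pattern (the URL, field names, and credentials are placeholders):

    from scrapy.http import FormRequest
    from scrapy.spider import Spider

    class LoginSpider(Spider):
        name = "login-example"
        start_urls = ["http://www.example.com/users/login.php"]

        def parse(self, response):
            # Pre-populates the request from the <form> found in the response,
            # overriding only the fields given in formdata.
            return FormRequest.from_response(
                response,
                formdata={'username': 'john', 'password': 'secret'},
                callback=self.after_login)

        def after_login(self, response):
            if "authentication failed" in response.body:
                self.log("Login failed")
                return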
- scrapy.selector (continued): SelectorList (xpath, css, extract, re, __nonzero__)
- scrapy.contrib.linkextractors.sgml: SgmlLinkExtractor, BaseSgmlLinkExtractor
- scrapy.telnet: update_telnet_vars signal and function; settings TELNETCONSOLE_PORT, TELNETCONSOLE_HOST
- scrapy.utils.trackref: object_ref, print_live_refs, get_oldest, iter_all
- scrapy.signals: engine_started, engine_stopped, item_scraped, item_dropped, spider_closed, spider_opened, spider_idle, spider_error, response_received, response_downloaded
- Built-in extensions: LogStats, CoreStats, WebService, TelnetConsole, MemoryUsage, MemoryDebugger, CloseSpider (CLOSESPIDER_TIMEOUT, CLOSESPIDER_ITEMCOUNT, CLOSESPIDER_PAGECOUNT, CLOSESPIDER_ERRORCOUNT), StatsMailer, StackTraceDump, Debugger
- scrapy.mail: MailSender (from_settings, send); settings MAIL_FROM, MAIL_HOST, MAIL_PORT, MAIL_USER, MAIL_PASS, MAIL_TLS, MAIL_SSL
- Commands index: startproject, genspider, crawl, check, list, edit, fetch, view, shell, parse, settings, runspider, version, deploy, bench; COMMANDS_MODULE
- scrapy.log: start(), msg(); levels CRITICAL, ERROR, WARNING, INFO, DEBUG
- scrapy.statscol: MemoryStatsCollector (spider_stats attribute), DummyStatsCollector

[Binary residue: the environment pickle closes with per-document build timestamps, builder configuration, and file rebuild lists; nothing further from it is recoverable.]
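A hedged sketch of the scrapy.log API indexed just above (the message text is a placeholder):

    from scrapy import log

    # Module-level logging with an explicit level; inside a spider the
    # equivalent shortcut is self.log(...), which routes through the same
    # facility.
    log.msg("This is a warning", level=log.WARNING)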
[versioning.doctree: pickled doctree of docs/versioning.rst. Recoverable structure: a "Versioning and API Stability" page with sections "Versioning" and "API Stability", a reference to odd-numbered versions for development releases, and a build warning that the "versioning" hyperlink target is not referenced.]

[faq.doctree: pickled doctree of docs/faq.rst, the "Frequently Asked Questions" page. Recoverable question titles:]

- How does Scrapy compare to BeautifulSoup or lxml?
- What Python versions does Scrapy support?
- Does Scrapy work with Python 3?
- Did Scrapy "steal" X from Django?
- Does Scrapy work with HTTP proxies?
- How can I scrape an item with attributes in different pages?
- Scrapy crashes with: ImportError: No module named win32api
- How can I simulate a user login in my spider?
- Does Scrapy crawl in breadth-first or depth-first order?
- My Scrapy crawler has memory leaks. What can I do?
- How can I make Scrapy consume less memory?
- Can I use Basic HTTP Authentication in my spiders?
- Why does Scrapy download pages in English instead of my native language?
- Where can I find some example Scrapy projects?
- Can I run a spider without creating a project?
- I get "Filtered offsite request" messages. How can I fix them?
- What is the recommended way to deploy a Scrapy crawler in production?
- Can I use JSON for large exports?
- Can I return (Twisted) deferreds from signal handlers?
- What does the response status code 999 mean?
- Can I call pdb.set_trace() from my spiders to debug them?
- Simplest way to dump all my scraped items into a JSON/CSV/XML file?
- What's this huge cryptic __VIEWSTATE parameter used in some forms?
- What's the best way to parse big XML/CSV data feeds?
- Does Scrapy manage cookies automatically?
- How can I see the cookies being sent and received from Scrapy?
- How can I instruct a spider to stop itself?
- How can I prevent my Scrapy bot from getting banned?
- Should I use spider arguments or settings to configure my spider?
- I'm scraping an XML document and my XPath selector doesn't return any items
- I'm getting an error: "cannot import name crawler"

[Recoverable answer fragments: Scrapy provides a built-in mechanism for extracting data (called selectors) but you can easily use BeautifulSoup or lxml instead, since they are just parsing libraries that can be imported and used from any Python code. You need to install pywin32 because of a Twisted bug. Whether JSON suits large exports depends on how large the output is; see the JsonItemExporter documentation. The Accept-Language header (RFC 2616, section 14.4) governs the download language. Finally: both spider arguments and settings can be used to configure a spider; there is no strict rule mandating one or the other, but settings suit parameters that, once set, don't change much, while spider arguments are meant to change more often, even on each spider run, and are sometimes required for the spider to run at all (for example, to set its start URL).]
-- it's got answers to some common questions.h5hubeubaubh)q}q(h4XMLooking for specific information? Try the :ref:`genindex` or :ref:`modindex`.qh5hh6h9h;hh=}q(hA]hB]h@]h?]hC]uhENhFhh/]qhg)q}q(h4hh5hh6h9h;hkh=}q(hA]hB]h@]h?]hC]uhEKh/]q(hZX*Looking for specific information? Try the qq}q(h4X*Looking for specific information? Try the h5hubh)q}q(h4X:ref:`genindex`qh5hh6h9h;hh=}q(UreftypeXrefhhXgenindexU refdomainXstdqh?]h@]U refexplicithA]hB]hC]hhuhEKh/]qcdocutils.nodes emphasis q)q}q(h4hh=}q(hA]hB]q(hhXstd-refqeh@]h?]hC]uh5hh/]qhZXgenindexqυq}q(h4Uh5hubah;UemphasisqubaubhZX or qӅq}q(h4X or h5hubh)q}q(h4X:ref:`modindex`qh5hh6h9h;hh=}q(UreftypeXrefhhXmodindexU refdomainXstdqh?]h@]U refexplicithA]hB]hC]hhuhEKh/]qh)q}q(h4hh=}q(hA]hB]q(hhXstd-refqeh@]h?]hC]uh5hh/]qhZXmodindexq⅁q}q(h4Uh5hubah;hubaubhZX.q}q(h4X.h5hubeubaubh)q}q(h4XbSearch for information in the `archives of the scrapy-users mailing list`_, or `post a question`_.h5hh6h9h;hh=}q(hA]hB]h@]h?]hC]uhENhFhh/]qhg)q}q(h4XbSearch for information in the `archives of the scrapy-users mailing list`_, or `post a question`_.h5hh6h9h;hkh=}q(hA]hB]h@]h?]hC]uhEKh/]q(hZXSearch for information in the qq}q(h4XSearch for information in the h5hubcdocutils.nodes reference q)q}q(h4X,`archives of the scrapy-users mailing list`_UresolvedqKh5hh;U referenceqh=}q(UnameX)archives of the scrapy-users mailing listUrefuriqX,http://groups.google.com/group/scrapy-users/qh?]h@]hA]hB]hC]uh/]qhZX)archives of the scrapy-users mailing listqq}q(h4Uh5hubaubhZX, or qq}r(h4X, or h5hubh)r}r(h4X`post a question`_hKh5hh;hh=}r(UnameXpost a questionhX,http://groups.google.com/group/scrapy-users/rh?]h@]hA]hB]hC]uh/]rhZXpost a questionrr}r(h4Uh5jubaubhZX.r }r (h4X.h5hubeubaubh)r }r (h4X-Ask a question in the `#scrapy IRC channel`_.r h5hh6h9h;hh=}r(hA]hB]h@]h?]hC]uhENhFhh/]rhg)r}r(h4j h5j h6h9h;hkh=}r(hA]hB]h@]h?]hC]uhEKh/]r(hZXAsk a question in the rr}r(h4XAsk a question in the h5jubh)r}r(h4X`#scrapy IRC channel`_hKh5jh;hh=}r(UnameX#scrapy IRC channelhXirc://irc.freenode.net/scrapyrh?]h@]hA]hB]hC]uh/]rhZX#scrapy IRC channelrr}r(h4Uh5jubaubhZX.r}r (h4X.h5jubeubaubh)r!}r"(h4X1Report bugs with Scrapy in our `issue tracker`_. h5hh6h9h;hh=}r#(hA]hB]h@]h?]hC]uhENhFhh/]r$hg)r%}r&(h4X0Report bugs with Scrapy in our `issue tracker`_.h5j!h6h9h;hkh=}r'(hA]hB]h@]h?]hC]uhEKh/]r((hZXReport bugs with Scrapy in our r)r*}r+(h4XReport bugs with Scrapy in our h5j%ubh)r,}r-(h4X`issue tracker`_hKh5j%h;hh=}r.(UnameX issue trackerhX'https://github.com/scrapy/scrapy/issuesr/h?]h@]hA]hB]hC]uh/]r0hZX issue trackerr1r2}r3(h4Uh5j,ubaubhZX.r4}r5(h4X.h5j%ubeubaubeubh1)r6}r7(h4X[.. _archives of the scrapy-users mailing list: http://groups.google.com/group/scrapy-users/U referencedr8Kh5hqh6h9h;h(hjh?]r?h$ah@]hA]hB]hC]r@h auhEKhFhh/]ubh1)rA}rB(h4X6.. _#scrapy IRC channel: irc://irc.freenode.net/scrapyj8Kh5hqh6h9h;h(j~)r?}r@(h4X]:doc:`topics/commands` Learn about the command-line tool used to manage your Scrapy project. 
h5j;h6h9h;jh=}rA(hA]hB]h@]h?]hC]uhEKEh/]rB(j)rC}rD(h4X:doc:`topics/commands`rEh5j?h6h9h;jh=}rF(hA]hB]h@]h?]hC]uhEKEh/]rGh)rH}rI(h4jEh5jCh6h9h;hh=}rJ(UreftypeXdocrKhhXtopics/commandsU refdomainUh?]h@]U refexplicithA]hB]hC]hhuhEKEh/]rLh)rM}rN(h4jEh=}rO(hA]hB]rP(hjKeh@]h?]hC]uh5jHh/]rQhZXtopics/commandsrRrS}rT(h4Uh5jMubah;hubaubaubj)rU}rV(h4Uh=}rW(hA]hB]h@]h?]hC]uh5j?h/]rXhg)rY}rZ(h4XELearn about the command-line tool used to manage your Scrapy project.r[h5jUh6h9h;hkh=}r\(hA]hB]h@]h?]hC]uhEKEh/]r]hZXELearn about the command-line tool used to manage your Scrapy project.r^r_}r`(h4j[h5jYubaubah;jubeubj~)ra}rb(h4X8:doc:`topics/items` Define the data you want to scrape. h5j;h6h9h;jh=}rc(hA]hB]h@]h?]hC]uhEKHhFhh/]rd(j)re}rf(h4X:doc:`topics/items`rgh5jah6h9h;jh=}rh(hA]hB]h@]h?]hC]uhEKHh/]rih)rj}rk(h4jgh5jeh6h9h;hh=}rl(UreftypeXdocrmhhX topics/itemsU refdomainUh?]h@]U refexplicithA]hB]hC]hhuhEKHh/]rnh)ro}rp(h4jgh=}rq(hA]hB]rr(hjmeh@]h?]hC]uh5jjh/]rshZX topics/itemsrtru}rv(h4Uh5joubah;hubaubaubj)rw}rx(h4Uh=}ry(hA]hB]h@]h?]hC]uh5jah/]rzhg)r{}r|(h4X#Define the data you want to scrape.r}h5jwh6h9h;hkh=}r~(hA]hB]h@]h?]hC]uhEKHh/]rhZX#Define the data you want to scrape.rr}r(h4j}h5j{ubaubah;jubeubj~)r}r(h4X>:doc:`topics/spiders` Write the rules to crawl your websites. h5j;h6h9h;jh=}r(hA]hB]h@]h?]hC]uhEKKhFhh/]r(j)r}r(h4X:doc:`topics/spiders`rh5jh6h9h;jh=}r(hA]hB]h@]h?]hC]uhEKKh/]rh)r}r(h4jh5jh6h9h;hh=}r(UreftypeXdocrhhXtopics/spidersU refdomainUh?]h@]U refexplicithA]hB]hC]hhuhEKKh/]rh)r}r(h4jh=}r(hA]hB]r(hjeh@]h?]hC]uh5jh/]rhZXtopics/spidersrr}r(h4Uh5jubah;hubaubaubj)r}r(h4Uh=}r(hA]hB]h@]h?]hC]uh5jh/]rhg)r}r(h4X'Write the rules to crawl your websites.rh5jh6h9h;hkh=}r(hA]hB]h@]h?]hC]uhEKKh/]rhZX'Write the rules to crawl your websites.rr}r(h4jh5jubaubah;jubeubj~)r}r(h4XE:doc:`topics/selectors` Extract the data from web pages using XPath. h5j;h6h9h;jh=}r(hA]hB]h@]h?]hC]uhEKNhFhh/]r(j)r}r(h4X:doc:`topics/selectors`rh5jh6h9h;jh=}r(hA]hB]h@]h?]hC]uhEKNh/]rh)r}r(h4jh5jh6h9h;hh=}r(UreftypeXdocrhhXtopics/selectorsU refdomainUh?]h@]U refexplicithA]hB]hC]hhuhEKNh/]rh)r}r(h4jh=}r(hA]hB]r(hjeh@]h?]hC]uh5jh/]rhZXtopics/selectorsrr}r(h4Uh5jubah;hubaubaubj)r}r(h4Uh=}r(hA]hB]h@]h?]hC]uh5jh/]rhg)r}r(h4X,Extract the data from web pages using XPath.rh5jh6h9h;hkh=}r(hA]hB]h@]h?]hC]uhEKNh/]rhZX,Extract the data from web pages using XPath.rr}r(h4jh5jubaubah;jubeubj~)r}r(h4XM:doc:`topics/shell` Test your extraction code in an interactive environment. h5j;h6h9h;jh=}r(hA]hB]h@]h?]hC]uhEKQhFhh/]r(j)r}r(h4X:doc:`topics/shell`rh5jh6h9h;jh=}r(hA]hB]h@]h?]hC]uhEKQh/]rh)r}r(h4jh5jh6h9h;hh=}r(UreftypeXdocrhhX topics/shellU refdomainUh?]h@]U refexplicithA]hB]hC]hhuhEKQh/]rh)r}r(h4jh=}r(hA]hB]r(hjeh@]h?]hC]uh5jh/]rhZX topics/shellrr}r(h4Uh5jubah;hubaubaubj)r}r(h4Uh=}r(hA]hB]h@]h?]hC]uh5jh/]rhg)r}r(h4X8Test your extraction code in an interactive environment.rh5jh6h9h;hkh=}r(hA]hB]h@]h?]hC]uhEKQh/]rhZX8Test your extraction code in an interactive environment.rr}r(h4jh5jubaubah;jubeubj~)r}r(h4XC:doc:`topics/loaders` Populate your items with the extracted data. 
h5j;h6h9h;jh=}r(hA]hB]h@]h?]hC]uhEKThFhh/]r(j)r}r(h4X:doc:`topics/loaders`rh5jh6h9h;jh=}r(hA]hB]h@]h?]hC]uhEKTh/]rh)r}r(h4jh5jh6h9h;hh=}r(UreftypeXdocrhhXtopics/loadersU refdomainUh?]h@]U refexplicithA]hB]hC]hhuhEKTh/]rh)r}r(h4jh=}r(hA]hB]r(hjeh@]h?]hC]uh5jh/]rhZXtopics/loadersrr}r(h4Uh5jubah;hubaubaubj)r}r(h4Uh=}r(hA]hB]h@]h?]hC]uh5jh/]rhg)r}r(h4X,Populate your items with the extracted data.rh5jh6h9h;hkh=}r(hA]hB]h@]h?]hC]uhEKTh/]rhZX,Populate your items with the extracted data.rr }r (h4jh5jubaubah;jubeubj~)r }r (h4XF:doc:`topics/item-pipeline` Post-process and store your scraped data. h5j;h6h9h;jh=}r (hA]hB]h@]h?]hC]uhEKWhFhh/]r(j)r}r(h4X:doc:`topics/item-pipeline`rh5j h6h9h;jh=}r(hA]hB]h@]h?]hC]uhEKWh/]rh)r}r(h4jh5jh6h9h;hh=}r(UreftypeXdocrhhXtopics/item-pipelineU refdomainUh?]h@]U refexplicithA]hB]hC]hhuhEKWh/]rh)r}r(h4jh=}r(hA]hB]r(hjeh@]h?]hC]uh5jh/]rhZXtopics/item-pipelinerr}r (h4Uh5jubah;hubaubaubj)r!}r"(h4Uh=}r#(hA]hB]h@]h?]hC]uh5j h/]r$hg)r%}r&(h4X)Post-process and store your scraped data.r'h5j!h6h9h;hkh=}r((hA]hB]h@]h?]hC]uhEKWh/]r)hZX)Post-process and store your scraped data.r*r+}r,(h4j'h5j%ubaubah;jubeubj~)r-}r.(h4XZ:doc:`topics/feed-exports` Output your scraped data using different formats and storages. h5j;h6h9h;jh=}r/(hA]hB]h@]h?]hC]uhEKZhFhh/]r0(j)r1}r2(h4X:doc:`topics/feed-exports`r3h5j-h6h9h;jh=}r4(hA]hB]h@]h?]hC]uhEKZh/]r5h)r6}r7(h4j3h5j1h6h9h;hh=}r8(UreftypeXdocr9hhXtopics/feed-exportsU refdomainUh?]h@]U refexplicithA]hB]hC]hhuhEKZh/]r:h)r;}r<(h4j3h=}r=(hA]hB]r>(hj9eh@]h?]hC]uh5j6h/]r?hZXtopics/feed-exportsr@rA}rB(h4Uh5j;ubah;hubaubaubj)rC}rD(h4Uh=}rE(hA]hB]h@]h?]hC]uh5j-h/]rFhg)rG}rH(h4X>Output your scraped data using different formats and storages.rIh5jCh6h9h;hkh=}rJ(hA]hB]h@]h?]hC]uhEKZh/]rKhZX>Output your scraped data using different formats and storages.rLrM}rN(h4jIh5jGubaubah;jubeubj~)rO}rP(h4XX:doc:`topics/link-extractors` Convenient classes to extract links to follow from pages. h5j;h6h9h;jh=}rQ(hA]hB]h@]h?]hC]uhEK]hFhh/]rR(j)rS}rT(h4X:doc:`topics/link-extractors`rUh5jOh6h9h;jh=}rV(hA]hB]h@]h?]hC]uhEK]h/]rWh)rX}rY(h4jUh5jSh6h9h;hh=}rZ(UreftypeXdocr[hhXtopics/link-extractorsU refdomainUh?]h@]U refexplicithA]hB]hC]hhuhEK]h/]r\h)r]}r^(h4jUh=}r_(hA]hB]r`(hj[eh@]h?]hC]uh5jXh/]rahZXtopics/link-extractorsrbrc}rd(h4Uh5j]ubah;hubaubaubj)re}rf(h4Uh=}rg(hA]hB]h@]h?]hC]uh5jOh/]rhhg)ri}rj(h4X9Convenient classes to extract links to follow from pages.rkh5jeh6h9h;hkh=}rl(hA]hB]h@]h?]hC]uhEK]h/]rmhZX9Convenient classes to extract links to follow from pages.rnro}rp(h4jkh5jiubaubah;jubeubeubeubhG)rq}rr(h4Uh5hHh6h9h;hLh=}rs(hA]hB]h@]h?]rthahC]ruhauhEK`hFhh/]rv(hS)rw}rx(h4XBuilt-in servicesryh5jqh6h9h;hWh=}rz(hA]hB]h@]h?]hC]uhEK`hFhh/]r{hZXBuilt-in servicesr|r}}r~(h4jyh5jwubaubjY)r}r(h4Uh5jqh6h9h;j\h=}r(hA]hB]rj_ah@]h?]hC]uhENhFhh/]rja)r}r(h4Uh5jh6h9h;jdh=}r(jfKjgh5hjhjih?]h@]hA]hB]hC]jj]r(NXtopics/loggingrrNX topics/statsrrNX topics/emailrrNXtopics/telnetconsolerrNXtopics/webservicerrejtju]r(jjjjjejwJuhEKbh/]ubaubjx)r}r(h4Uh5jqh6h9h;j{h=}r(hA]hB]h@]h?]hC]uhENhFhh/]r(j~)r}r(h4XQ:doc:`topics/logging` Understand the simple logging facility provided by Scrapy. 
h5jh6h9h;jh=}r(hA]hB]h@]h?]hC]uhEKlh/]r(j)r}r(h4X:doc:`topics/logging`rh5jh6h9h;jh=}r(hA]hB]h@]h?]hC]uhEKlh/]rh)r}r(h4jh5jh6h9h;hh=}r(UreftypeXdocrhhXtopics/loggingU refdomainUh?]h@]U refexplicithA]hB]hC]hhuhEKlh/]rh)r}r(h4jh=}r(hA]hB]r(hjeh@]h?]hC]uh5jh/]rhZXtopics/loggingrr}r(h4Uh5jubah;hubaubaubj)r}r(h4Uh=}r(hA]hB]h@]h?]hC]uh5jh/]rhg)r}r(h4X:Understand the simple logging facility provided by Scrapy.rh5jh6h9h;hkh=}r(hA]hB]h@]h?]hC]uhEKlh/]rhZX:Understand the simple logging facility provided by Scrapy.rr}r(h4jh5jubaubah;jubeubj~)r}r(h4XD:doc:`topics/stats` Collect statistics about your scraping crawler. h5jh6h9h;jh=}r(hA]hB]h@]h?]hC]uhEKohFhh/]r(j)r}r(h4X:doc:`topics/stats`rh5jh6h9h;jh=}r(hA]hB]h@]h?]hC]uhEKoh/]rh)r}r(h4jh5jh6h9h;hh=}r(UreftypeXdocrhhX topics/statsU refdomainUh?]h@]U refexplicithA]hB]hC]hhuhEKoh/]rh)r}r(h4jh=}r(hA]hB]r(hjeh@]h?]hC]uh5jh/]rhZX topics/statsrr}r(h4Uh5jubah;hubaubaubj)r}r(h4Uh=}r(hA]hB]h@]h?]hC]uh5jh/]rhg)r}r(h4X/Collect statistics about your scraping crawler.rh5jh6h9h;hkh=}r(hA]hB]h@]h?]hC]uhEKoh/]rhZX/Collect statistics about your scraping crawler.rr}r(h4jh5jubaubah;jubeubj~)r}r(h4XH:doc:`topics/email` Send email notifications when certain events occur. h5jh6h9h;jh=}r(hA]hB]h@]h?]hC]uhEKrhFhh/]r(j)r}r(h4X:doc:`topics/email`rh5jh6h9h;jh=}r(hA]hB]h@]h?]hC]uhEKrh/]rh)r}r(h4jh5jh6h9h;hh=}r(UreftypeXdocrhhX topics/emailU refdomainUh?]h@]U refexplicithA]hB]hC]hhuhEKrh/]rh)r}r(h4jh=}r(hA]hB]r(hjeh@]h?]hC]uh5jh/]rhZX topics/emailrr}r(h4Uh5jubah;hubaubaubj)r}r(h4Uh=}r(hA]hB]h@]h?]hC]uh5jh/]rhg)r}r(h4X3Send email notifications when certain events occur.rh5jh6h9h;hkh=}r(hA]hB]h@]h?]hC]uhEKrh/]rhZX3Send email notifications when certain events occur.rr}r(h4jh5jubaubah;jubeubj~)r}r(h4XW:doc:`topics/telnetconsole` Inspect a running crawler using a built-in Python console. h5jh6h9h;jh=}r(hA]hB]h@]h?]hC]uhEKuhFhh/]r(j)r}r(h4X:doc:`topics/telnetconsole`rh5jh6h9h;jh=}r(hA]hB]h@]h?]hC]uhEKuh/]rh)r}r(h4jh5jh6h9h;hh=}r(UreftypeXdocr hhXtopics/telnetconsoleU refdomainUh?]h@]U refexplicithA]hB]hC]hhuhEKuh/]r h)r }r (h4jh=}r (hA]hB]r(hj eh@]h?]hC]uh5jh/]rhZXtopics/telnetconsolerr}r(h4Uh5j ubah;hubaubaubj)r}r(h4Uh=}r(hA]hB]h@]h?]hC]uh5jh/]rhg)r}r(h4X:Inspect a running crawler using a built-in Python console.rh5jh6h9h;hkh=}r(hA]hB]h@]h?]hC]uhEKuh/]rhZX:Inspect a running crawler using a built-in Python console.rr}r(h4jh5jubaubah;jubeubj~)r}r (h4XM:doc:`topics/webservice` Monitor and control a crawler using a web service. 
Solving specific problems

faq
  Get answers to the most frequently asked questions.

topics/debug
  Learn how to debug common problems of your Scrapy spider.

topics/contracts
  Learn how to use contracts for testing your spiders.

topics/practices
  Get familiar with some common Scrapy practices.
topics/broad-crawls
  Tune Scrapy for crawling a lot of domains in parallel.

topics/firefox
  Learn how to scrape with Firefox and some useful add-ons.

topics/firebug
  Learn how to scrape efficiently using Firebug.

topics/leaks
  Learn how to find and get rid of memory leaks in your crawler.

topics/images
  Download static images associated with your scraped items.
topics/ubuntu
  Install the latest Scrapy packages easily on Ubuntu.

topics/scrapyd
  Deploy your Scrapy project in production.

topics/autothrottle
  Adjust crawl rate dynamically based on load.

topics/benchmarking
  Check how Scrapy performs on your hardware.

topics/jobs
  Learn how to pause and resume crawls for large spiders.
topics/djangoitem
  Write scraped items using Django models.

Extending Scrapy

topics/api
  Use it from extensions and middlewares to extend Scrapy functionality.

Reference

topics/commands
  Learn about the command-line tool and see all available commands.

topics/request-response
  Understand the classes used to represent HTTP requests and responses.
topics/settings
  Learn how to configure Scrapy and see all available settings.

topics/signals
  See all available signals and how to work with them.

topics/exceptions
  See all available exceptions and their meaning.

topics/exporters
  Quickly export your scraped items to a file (XML, CSV, etc).
All the rest

news
  See what has changed in recent Scrapy versions.

contributing
  Learn how to contribute to the Scrapy project.

versioning
  Understand Scrapy versioning and API stability.
experimental/index
  Learn about bleeding-edge features.
scrapy-0.22/.doctrees/experimental/index.doctree

Experimental features

This section documents experimental Scrapy features that may become stable in future releases, but whose API is not yet stable.
Use them with caution, and subscribe to the mailing lists (http://scrapy.org/community/) to get notified of any changes.

Since this section is not revised as frequently as the rest of the documentation, it may contain documentation which is outdated, incomplete or overlapping with the stable documentation (until it's properly merged). Use at your own risk.

Warning: This documentation is a work in progress. Use at your own risk.

Add commands using external libraries

You can also add Scrapy commands from an external library by adding a scrapy.commands section to the entry_points in the library's setup.py.

The following example adds a my_command command:

    from setuptools import setup, find_packages

    setup(name='scrapy-mymodule',
      entry_points={
        'scrapy.commands': [
          'my_command=my_scrapy_module.commands:MyCommand',
        ],
      },
     )
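For context, here is a minimal sketch of the command module that the entry point above points to. The module path (my_scrapy_module/commands.py) and class name are the hypothetical ones from the setup.py example; the ScrapyCommand base class and its short_desc()/run() hooks are assumed to be importable from scrapy.command in this release:

    # my_scrapy_module/commands.py -- hypothetical module from the example above
    from scrapy.command import ScrapyCommand

    class MyCommand(ScrapyCommand):

        # Set to True if the command only makes sense inside a project
        requires_project = False

        def short_desc(self):
            # One-line description shown by "scrapy -h"
            return "Example command registered through an entry point"

        def run(self, args, opts):
            # args holds positional arguments, opts the parsed options
            print "my_command was called with args: %s" % (args,)

Once the package is installed in the same environment as Scrapy, running scrapy my_command should pick the command up alongside the built-in ones.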
scrapy-0.22/.doctrees/intro/install.doctree

Installation guide

Pre-requisites

The installation steps assume that you have the following things installed:

* Python 2.7
* lxml. Most Linux distributions ship prepackaged versions of lxml. Otherwise refer to http://lxml.de/installation.html
* OpenSSL. This comes preinstalled in all operating systems except Windows (see the platform specific installation notes below).
* pip or easy_install Python package managers

Installing Scrapy

You can install Scrapy using easy_install or pip (which is the canonical way to distribute and install Python packages).

Note: Check the platform specific installation notes first.

To install using pip:

    pip install Scrapy

To install using easy_install:

    easy_install Scrapy
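Whichever installer you used, a quick sanity check (not part of the official steps) is to confirm that Scrapy and its pre-requisites import cleanly; this assumes scrapy exposes a __version__ attribute, which it does in this release:

    python -c "import scrapy; print scrapy.__version__"
    python -c "import lxml, OpenSSL, twisted"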
Platform specific installation notes

Windows

Install OpenSSL by following these steps:

1. go to the Win32 OpenSSL page (http://slproweb.com/products/Win32OpenSSL.html)
2. download Visual C++ 2008 redistributables for your Windows and architecture
3. download OpenSSL for your Windows and architecture (the regular version, not the light one)
4. add the c:\openssl-win32\bin (or similar) directory to your PATH, the same way you added python27 in the first step
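For step 4, a session-only alternative to editing PATH through the Control Panel is to extend it from the command prompt; this is a sketch, and the exact directory depends on where the OpenSSL installer put its files:

    set PATH=%PATH%;c:\openssl-win32\bin

Note that a PATH set this way lasts only for the current console session.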
Some binary packages that Scrapy depends on (like Twisted, lxml and pyOpenSSL) require a compiler to install, and fail if you don't have Visual Studio installed. You can find Windows installers for those in the following links. Make sure you respect your Python version and Windows architecture.

* pywin32: http://sourceforge.net/projects/pywin32/files/
* Twisted: http://twistedmatrix.com/trac/wiki/Downloads
* zope.interface: download the egg from the zope.interface pypi page (http://pypi.python.org/pypi/zope.interface) and install it by running easy_install file.egg
* lxml: http://pypi.python.org/pypi/lxml/
* pyOpenSSL: https://launchpad.net/pyopenssl

Finally, this page contains many precompiled Python binary libraries, which may come in handy to fulfill Scrapy dependencies:

    http://www.lfd.uci.edu/~gohlke/pythonlibs/

Ubuntu 9.10 or above

Don't use the python-scrapy package provided by Ubuntu; it is typically too old and slow to catch up with the latest Scrapy.
Instead, use the official Ubuntu packages (see topics/ubuntu), which already solve all dependencies for you and are continuously updated with the latest bug fixes.

scrapy-0.22/.doctrees/intro/tutorial.doctree
Scrapy Tutorial

In this tutorial, we'll assume that Scrapy is already installed on your system. If that's not the case, see the Installation guide.

We are going to use the Open directory project (dmoz, http://www.dmoz.org/) as our example domain to scrape.
This tutorial will walk you through these tasks:

1. Creating a new Scrapy project
2. Defining the Items you will extract
3. Writing a spider to crawl a site and extract Items
4. Writing an Item Pipeline to store the extracted Items

Scrapy is written in Python (http://www.python.org). If you're new to the language you might want to start by getting an idea of what the language is like, to get the most out of Scrapy. If you're already familiar with other languages, and want to learn Python quickly, we recommend Learn Python The Hard Way (http://learnpythonthehardway.org/book/). If you're new to programming and want to start with Python, take a look at this list of Python resources for non-programmers (http://wiki.python.org/moin/BeginnersGuide/NonProgrammers).
Creating a project

Before you start scraping, you will have to set up a new Scrapy project. Enter a directory where you'd like to store your code, then run:

    scrapy startproject tutorial

This will create a tutorial directory with the following contents:

    tutorial/
        scrapy.cfg
        tutorial/
            __init__.py
            items.py
            pipelines.py
            settings.py
            spiders/
                __init__.py
                ...

These are basically:

* scrapy.cfg: the project configuration file
* tutorial/: the project's python module, you'll later import your code from here.
here.hCjubeubaubh)r}r(hBX0``tutorial/items.py``: the project's items file.rhCjhDhGhIhhK}r(hO]hP]hN]hM]hQ]uhSNhThh=]rhl)r}r(hBjhCjhDhGhIhohK}r(hO]hP]hN]hM]hQ]uhSK;h=]r(j~)r}r(hBX``tutorial/items.py``hK}r(hO]hP]hN]hM]hQ]uhCjh=]rhhXtutorial/items.pyrr}r(hBUhCjubahIjubhhX: the project's items file.rr}r(hBX: the project's items file.hCjubeubaubh)r}r(hBX8``tutorial/pipelines.py``: the project's pipelines file.rhCjhDhGhIhhK}r(hO]hP]hN]hM]hQ]uhSNhThh=]rhl)r}r(hBjhCjhDhGhIhohK}r(hO]hP]hN]hM]hQ]uhSKh=]r(j~)r}r(hBX``tutorial/spiders/``hK}r(hO]hP]hN]hM]hQ]uhCjh=]r hhXtutorial/spiders/r r }r (hBUhCjubahIjubhhX2: a directory where you'll later put your spiders.r r}r(hBX2: a directory where you'll later put your spiders.hCjubeubaubeubeubhU)r}r(hBUhChVhDhGhIhZhK}r(hO]hP]hN]hM]rh/ahQ]rhauhSKAhThh=]r(ha)r}r(hBXDefining our ItemrhCjhDhGhIhehK}r(hO]hP]hN]hM]hQ]uhSKAhThh=]rhhXDefining our Itemrr}r(hBjhCjubaubhl)r}r(hBX`Items` are containers that will be loaded with the scraped data; they work like simple python dicts but provide additional protecting against populating undeclared fields, to prevent typos.hCjhDhGhIhohK}r (hO]hP]hN]hM]hQ]uhSKChThh=]r!(cdocutils.nodes title_reference r")r#}r$(hBX`Items`hK}r%(hO]hP]hN]hM]hQ]uhCjh=]r&hhXItemsr'r(}r)(hBUhCj#ubahIUtitle_referencer*ubhhX are containers that will be loaded with the scraped data; they work like simple python dicts but provide additional protecting against populating undeclared fields, to prevent typos.r+r,}r-(hBX are containers that will be loaded with the scraped data; they work like simple python dicts but provide additional protecting against populating undeclared fields, to prevent typos.hCjubeubhl)r.}r/(hBXThey are declared by creating an :class:`scrapy.item.Item` class and defining its attributes as :class:`scrapy.item.Field` objects, like you will in an ORM (don't worry if you're not familiar with ORMs, you will see that this is an easy task).hCjhDhGhIhohK}r0(hO]hP]hN]hM]hQ]uhSKGhThh=]r1(hhX!They are declared by creating an r2r3}r4(hBX!They are declared by creating an hCj.ubhu)r5}r6(hBX:class:`scrapy.item.Item`r7hCj.hDhGhIhyhK}r8(UreftypeXclassh{h|Xscrapy.item.ItemU refdomainXpyr9hM]hN]U refexplicithO]hP]hQ]h~hUpy:classr:NU py:moduler;NuhSKGh=]r<j~)r=}r>(hBj7hK}r?(hO]hP]r@(hj9Xpy-classrAehN]hM]hQ]uhCj5h=]rBhhXscrapy.item.ItemrCrD}rE(hBUhCj=ubahIjubaubhhX& class and defining its attributes as rFrG}rH(hBX& class and defining its attributes as hCj.ubhu)rI}rJ(hBX:class:`scrapy.item.Field`rKhCj.hDhGhIhyhK}rL(UreftypeXclassh{h|Xscrapy.item.FieldU refdomainXpyrMhM]hN]U refexplicithO]hP]hQ]h~hj:Nj;NuhSKGh=]rNj~)rO}rP(hBjKhK}rQ(hO]hP]rR(hjMXpy-classrSehN]hM]hQ]uhCjIh=]rThhXscrapy.item.FieldrUrV}rW(hBUhCjOubahIjubaubhhXy objects, like you will in an ORM (don't worry if you're not familiar with ORMs, you will see that this is an easy task).rXrY}rZ(hBXy objects, like you will in an ORM (don't worry if you're not familiar with ORMs, you will see that this is an easy task).hCj.ubeubhl)r[}r\(hBX8We begin by modeling the item that we will use to hold the sites data obtained from dmoz.org, as we want to capture the name, url and description of the sites, we define fields for each of these three attributes. To do that, we edit items.py, found in the ``tutorial`` directory. Our Item class looks like this::hCjhDhGhIhohK}r](hO]hP]hN]hM]hQ]uhSKLhThh=]r^(hhXWe begin by modeling the item that we will use to hold the sites data obtained from dmoz.org, as we want to capture the name, url and description of the sites, we define fields for each of these three attributes. 
To do that, we edit items.py, found in the tutorial directory. Our Item class looks like this:

    from scrapy.item import Item, Field

    class DmozItem(Item):
        title = Field()
        link = Field()
        desc = Field()

This may seem complicated at first, but defining the item allows you to use other handy components of Scrapy that need to know what your item looks like.
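To see the dict-like behaviour (and the protection against undeclared fields) in action, here is a sketch of an interpreter session with the DmozItem defined above; the exact KeyError message may vary between Scrapy versions:

    >>> from tutorial.items import DmozItem
    >>> item = DmozItem(title='Example title')
    >>> item['title']
    'Example title'
    >>> item['link'] = 'http://www.example.com/'
    >>> item['nonexistent'] = 'typo'
    Traceback (most recent call last):
        ...
    KeyError: 'DmozItem does not support field: nonexistent'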
  It must be unique, that is, you can't set the same name for different
  Spiders.

* :attr:`~scrapy.spider.Spider.start_urls`: a list of URLs where the Spider
  will begin to crawl from. So, the first pages downloaded will be those
  listed here. The subsequent URLs will be generated successively from data
  contained in the start URLs.

* :meth:`~scrapy.spider.Spider.parse`: a method of the spider, which will be
  called with the downloaded :class:`~scrapy.http.Response` object of each
  start URL.
  The response is passed to the method as the first and only argument.

  This method is responsible for parsing the response data and extracting
  scraped data (as scraped items) and more URLs to follow.

  The :meth:`~scrapy.spider.Spider.parse` method is in charge of processing
  the response and returning scraped data (as :class:`~scrapy.item.Item`
  objects) and more URLs to follow (as :class:`~scrapy.http.Request`
  objects).

This is the code for our first Spider; save it in a file named
``dmoz_spider.py`` under the ``tutorial/spiders`` directory::

    from scrapy.spider import Spider

    class DmozSpider(Spider):
        name = "dmoz"
        allowed_domains = ["dmoz.org"]
        start_urls = [
            "http://www.dmoz.org/Computers/Programming/Languages/Python/Books/",
            "http://www.dmoz.org/Computers/Programming/Languages/Python/Resources/"
        ]

        def parse(self, response):
            filename = response.url.split("/")[-2]
            open(filename, 'wb').write(response.body)

Crawling
--------

To put our spider to work, go to the project's top level directory and run::

    scrapy crawl dmoz

This runs the spider we just wrote, producing output similar to::

    2008-08-20 03:51:13-0300 [scrapy] INFO: Started project: dmoz
    2008-08-20 03:51:13-0300 [tutorial] INFO: Enabled extensions: ...
    2008-08-20 03:51:13-0300 [tutorial] INFO: Enabled downloader middlewares: ...
    2008-08-20 03:51:13-0300 [tutorial] INFO: Enabled spider middlewares: ...
    2008-08-20 03:51:13-0300 [tutorial] INFO: Enabled item pipelines: ...
    2008-08-20 03:51:14-0300 [dmoz] INFO: Spider opened
    2008-08-20 03:51:14-0300 [dmoz] DEBUG: Crawled <http://www.dmoz.org/Computers/Programming/Languages/Python/Resources/> (referer: <None>)
    2008-08-20 03:51:14-0300 [dmoz] DEBUG: Crawled <http://www.dmoz.org/Computers/Programming/Languages/Python/Books/> (referer: <None>)
    2008-08-20 03:51:14-0300 [dmoz] INFO: Spider closed (finished)
Pay attention to the lines containing ``[dmoz]``, which correspond to our
spider. You can see a log line for each URL defined in ``start_urls``.
Because these URLs are the starting ones, they have no referrers, which is
shown at the end of the log line, where it says ``(referer: <None>)``.

But more interesting, as our ``parse`` method instructs, two files have been
created: *Books* and *Resources*, with the content of both URLs.

What just happened under the hood?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Scrapy creates :class:`scrapy.http.Request` objects for each URL in the
``start_urls`` attribute
of the Spider, and assigns them the ``parse`` method of the spider as their
callback function.

These Requests are scheduled, then executed, and
:class:`scrapy.http.Response` objects are returned and then fed back to the
spider, through the :meth:`~scrapy.spider.Spider.parse` method.
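In other words, what Scrapy does for each start URL is roughly equivalent to
the following sketch (illustrative only, written as if inside the spider; the
real scheduling machinery is more involved)::

    from scrapy.http import Request

    # build one Request per start URL, with the spider's parse method as
    # the callback; the downloader later turns each Request into a
    # Response and hands it to that callback
    requests = [Request(url, callback=self.parse) for url in self.start_urls]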
Extracting Items
----------------

Introduction to Selectors
~~~~~~~~~~~~~~~~~~~~~~~~~

There are several ways to extract data from web pages. Scrapy uses a
mechanism based on `XPath`_ or `CSS`_ expressions called :ref:`Scrapy
Selectors <topics-selectors>`. For more information about selectors and other
extraction mechanisms see the :ref:`Selectors documentation
<topics-selectors>`.

.. _XPath: http://www.w3.org/TR/xpath
.. _CSS: http://www.w3.org/TR/selectors

Here are some examples of XPath expressions and their meanings:

* ``/html/head/title``: selects the ``<title>`` element, inside the
  ``<head>`` element of an HTML document

* ``/html/head/title/text()``: selects the text inside the aforementioned
  ``<title>`` element.

* ``//td``: selects all the ``<td>`` elements

* ``//div[@class="mine"]``: selects all ``div`` elements which contain an
  attribute ``class="mine"``
These are just a couple of simple examples of what you can do with XPath, but
XPath expressions are indeed much more powerful. To learn more about XPath we
recommend `this XPath tutorial <http://www.w3schools.com/XPath/default.asp>`_.

For working with XPaths, Scrapy provides a :class:`~scrapy.selector.Selector`
class; it is instantiated with an :class:`~scrapy.http.HtmlResponse` or
:class:`~scrapy.http.XmlResponse` object as its first argument.

You can see selectors as objects that represent nodes in the document
structure. So, the first instantiated selectors are associated with the root
node, or the entire document.

Selectors have four basic methods (click on the method to see the complete
API documentation):
* :meth:`~scrapy.selector.Selector.xpath`: returns a list of selectors, each
  of them representing the nodes selected by the XPath expression given as
  argument.

* :meth:`~scrapy.selector.Selector.css`: returns a list of selectors, each
  of them representing the nodes selected by the CSS expression given as
  argument.
* :meth:`~scrapy.selector.Selector.extract`: returns a unicode string with
  the selected data.

* :meth:`~scrapy.selector.Selector.re`: returns a list of unicode strings
  extracted by applying the regular expression given as argument.
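As a small standalone sketch of those four methods (a ``Selector`` can also
be built from raw text through its ``text`` argument; the HTML snippet here
is made up)::

    from scrapy.selector import Selector

    html = "<ul><li><a href='http://example.com/'>Example: site</a></li></ul>"
    sel = Selector(text=html)

    sel.xpath('//li/a')                      # list of selectors, one per <a> node
    sel.css('li a')                          # the same nodes, selected via CSS
    sel.xpath('//li/a/text()').extract()     # [u'Example: site']
    sel.xpath('//li/a/text()').re('(\w+):')  # [u'Example']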
Trying Selectors in the Shell
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To illustrate the use of Selectors we're going to use the built-in
:ref:`Scrapy shell <topics-shell>`, which requires IPython (an extended
Python console) to be installed on your system.

To start a shell, you must go to the project's top level directory and run::

    scrapy shell "http://www.dmoz.org/Computers/Programming/Languages/Python/Books/"

.. note::

   Remember to always enclose URLs in quotes when running the Scrapy shell
   from the command line; otherwise, URLs containing arguments (i.e. the
   ``&`` character) will not work.

This is what the shell looks like::

    [ ... Scrapy log here ... ]

    [s] Available Scrapy objects:
    [s] 2010-08-19 21:45:59-0300 [default] INFO: Spider closed (finished)
    [s]   sel        <Selector (http://www.dmoz.org/Computers/Programming/Languages/Python/Books/) xpath=None>
    [s]   item       Item()
    [s]   request    <GET http://www.dmoz.org/Computers/Programming/Languages/Python/Books/>
    [s]   response   <200 http://www.dmoz.org/Computers/Programming/Languages/Python/Books/>
    [s]   spider     <Spider 'default' at 0x1b6c2d0>
    [s] Useful shortcuts:
    [s]   shelp()              Print this help
    [s]   fetch(req_or_url)    Fetch a new request or URL and update shell objects
    [s]   view(response)       View response in a browser

    In [1]:

After the shell loads, you will have the response fetched in a local
``response`` variable, so if you type ``response.body`` you will see the body
of the response, or you can type ``response.headers`` to see its headers.

The shell also pre-instantiates a selector for this response in the variable
``sel``; the selector automatically chooses the best parsing rules (XML vs
HTML) based on the response's type.
So let's try it::

    In [1]: sel.xpath('//title')
    Out[1]: [<Selector (title) xpath=//title>]

    In [2]: sel.xpath('//title').extract()
    Out[2]: [u'<title>Open Directory - Computers: Programming: Languages: Python: Books</title>']

    In [3]: sel.xpath('//title/text()')
    Out[3]: [<Selector (text) xpath=//title/text()>]

    In [4]: sel.xpath('//title/text()').extract()
    Out[4]: [u'Open Directory - Computers: Programming: Languages: Python: Books']

    In [5]: sel.xpath('//title/text()').re('(\w+):')
    Out[5]: [u'Computers', u'Programming', u'Languages', u'Python']

Extracting the data
~~~~~~~~~~~~~~~~~~~

Now, let's try to extract some real information from those pages.

You could type ``response.body`` in the console, and inspect the source code
to figure out the XPaths you need to use. However, inspecting the raw HTML
code there could become a very tedious task. To make it easier, you can use
some Firefox extensions like Firebug. For more information see
:ref:`topics-firebug` and :ref:`topics-firefox`.

After inspecting the page source, you'll find that the sites' information is
inside a ``<ul>`` element, in fact the *second* ``<ul>`` element.
So we can select each ``<li>`` element belonging to the sites list with this
code::

    sel.xpath('//ul/li')

And from them, the sites' descriptions::

    sel.xpath('//ul/li/text()').extract()

The sites' titles::

    sel.xpath('//ul/li/a/text()').extract()

And the sites' links::

    sel.xpath('//ul/li/a/@href').extract()

As we said before, each ``.xpath()`` call returns a list of selectors, so we
can concatenate further ``.xpath()`` calls to dig deeper into a node.
We are going to use that property here, so::

    sites = sel.xpath('//ul/li')
    for site in sites:
        title = site.xpath('a/text()').extract()
        link = site.xpath('a/@href').extract()
        desc = site.xpath('text()').extract()
        print title, link, desc

.. note::

   For a more detailed description of using nested selectors, see
   :ref:`topics-selectors-nesting-selectors` and
   :ref:`topics-selectors-relative-xpaths` in the :ref:`topics-selectors`
   documentation.
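One detail worth spelling out about such nested calls (a small sketch,
reusing the ``site`` variable from the loop above): XPath expressions passed
to a nested selector are evaluated relative to that node, unless you make
them absolute::

    site.xpath('a/text()')     # text of <a> children of this <li> only
    site.xpath('.//a/text()')  # text of any <a> descendant of this <li>
    site.xpath('//a/text()')   # careful: absolute path -- every <a> in the whole page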
Let's add this code to our spider::

    from scrapy.spider import Spider
    from scrapy.selector import Selector

    class DmozSpider(Spider):
        name = "dmoz"
        allowed_domains = ["dmoz.org"]
        start_urls = [
            "http://www.dmoz.org/Computers/Programming/Languages/Python/Books/",
            "http://www.dmoz.org/Computers/Programming/Languages/Python/Resources/"
        ]

        def parse(self, response):
            sel = Selector(response)
            sites = sel.xpath('//ul/li')
            for site in sites:
                title = site.xpath('a/text()').extract()
                link = site.xpath('a/@href').extract()
                desc = site.xpath('text()').extract()
                print title, link, desc

Notice we import our Selector class from ``scrapy.selector`` and instantiate
a new Selector object. We can now specify our XPaths just as we did in the
shell. Now try crawling the dmoz.org domain again and you'll see sites being
printed in your output; run::

    scrapy crawl dmoz

Using our item
--------------

:class:`~scrapy.item.Item` objects are custom Python dicts; you can access
the values of their fields (attributes of the class we defined earlier) using
the standard dict syntax, like::

    >>> item = DmozItem()
    >>> item['title'] = 'Example title'
    >>> item['title']
    'Example title'

Spiders are expected to return their scraped data inside
:class:`~scrapy.item.Item` objects.
So, in order to return the data we've scraped so far, the final code for our
Spider would be like this::

    from scrapy.spider import Spider
    from scrapy.selector import Selector

    from tutorial.items import DmozItem

    class DmozSpider(Spider):
        name = "dmoz"
        allowed_domains = ["dmoz.org"]
        start_urls = [
            "http://www.dmoz.org/Computers/Programming/Languages/Python/Books/",
            "http://www.dmoz.org/Computers/Programming/Languages/Python/Resources/"
        ]

        def parse(self, response):
            sel = Selector(response)
            sites = sel.xpath('//ul/li')
            items = []
            for site in sites:
                item = DmozItem()
                item['title'] = site.xpath('a/text()').extract()
                item['link'] = site.xpath('a/@href').extract()
                item['desc'] = site.xpath('text()').extract()
                items.append(item)
            return items

.. note::

   You can find a fully-functional variant of this spider in the dirbot_
   project available at https://github.com/scrapy/dirbot

Now doing a crawl on the dmoz.org domain yields ``DmozItem`` objects::

    [dmoz] DEBUG: Scraped from <200 http://www.dmoz.org/Computers/Programming/Languages/Python/Books/>
         {'desc': [u' - By David Mertz; Addison Wesley. Book in progress, full text, ASCII format. Asks for feedback. [author website, Gnosis Software, Inc.]\n'],
          'link': [u'http://gnosis.cx/TPiP/'],
          'title': [u'Text Processing in Python']}
    [dmoz] DEBUG: Scraped from <200 http://www.dmoz.org/Computers/Programming/Languages/Python/Books/>
         {'desc': [u' - By Sean McGrath; Prentice Hall PTR, 2000, ISBN 0130211192, has CD-ROM. Methods to build XML applications fast, Python tutorial, DOM and SAX, new Pyxie open source XML processing library. [Prentice Hall PTR]\n'],
          'link': [u'http://www.informit.com/store/product.aspx?isbn=0130211192'],
          'title': [u'XML Processing with Python']}
Storing the scraped data
========================

The simplest way to store the scraped data is by using the :ref:`Feed
exports <topics-feed-exports>`, with the following command::

    scrapy crawl dmoz -o items.json -t json

That will generate an ``items.json`` file containing all scraped items,
serialized in `JSON`_.

In small projects (like the one in this tutorial), that should be enough.
However, if you want to perform more complex things with the scraped items,
you can write an :ref:`Item Pipeline <topics-item-pipeline>`. As with Items,
a placeholder file for Item Pipelines has been set up for you when the
project is created, in ``tutorial/pipelines.py``. You don't need to implement
any item pipeline, though, if you just want to store the scraped items.
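For illustration, a minimal pipeline could look like the following sketch
(the ``DropEmptyTitlePipeline`` name and behaviour are made up for this
example and are not part of the tutorial project)::

    from scrapy.exceptions import DropItem

    class DropEmptyTitlePipeline(object):
        """Discard items whose 'title' field was never populated."""

        def process_item(self, item, spider):
            if not item.get('title'):
                raise DropItem("missing title in %s" % item)
            return item

It would then be enabled by adding it to the ``ITEM_PIPELINES`` setting in
``tutorial/settings.py``, e.g.
``ITEM_PIPELINES = {'tutorial.pipelines.DropEmptyTitlePipeline': 300}``.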
Next steps
==========

This tutorial covers only the basics of Scrapy, but there are a lot of other
features not mentioned here. Check the :ref:`topics-whatelse` section in the
:ref:`intro-overview` chapter for a quick overview of the most important
ones.

Then, we recommend you continue by playing with an example project (see
:ref:`intro-examples`), and then continue with the section
:ref:`section-basics`.

.. _JSON: http://en.wikipedia.org/wiki/JSON
.. _dirbot: https://github.com/scrapy/dirbot
.. _intro-overview:

Scrapy at a glance
==================

Scrapy is an application framework for crawling web sites and extracting
structured data which can be used for a wide range of useful applications,
like data mining, information processing or historical archival.

Even though Scrapy was originally designed for `screen scraping`_ (more
precisely, `web scraping`_), it can also be used to extract data using APIs
(such as `Amazon Associates Web Services`_) or as a general purpose web
crawler.

.. _screen scraping: http://en.wikipedia.org/wiki/Screen_scraping
.. _web scraping: http://en.wikipedia.org/wiki/Web_scraping
.. _Amazon Associates Web Services: http://aws.amazon.com/associates/

The purpose of this document is to introduce you to the concepts behind
Scrapy so you can get an idea of how it works and decide if Scrapy is what
you need.

When you're ready to start a project, you can :ref:`start with the tutorial
<intro-tutorial>`.

Pick a website
--------------
doesn't provide any API or mechanism to access that info programmatically. Scrapy can help you extract that information.qh?hh@hChEhlhG}q(hK]hL]hJ]hI]hM]uhOKhPhh9]qhdXSo you need to extract some information from a website, but the website doesn't provide any API or mechanism to access that info programmatically. Scrapy can help you extract that information.qڅq}q(h>hh?hubaubhh)q}q(h>XzLet's say we want to extract the URL, name, description and size of all torrent files added today in the `Mininova`_ site.h?hh@hChEhlhG}q(hK]hL]hJ]hI]hM]uhOKhPhh9]q(hdXiLet's say we want to extract the URL, name, description and size of all torrent files added today in the qᅁq}q(h>XiLet's say we want to extract the URL, name, description and size of all torrent files added today in the h?hubhy)q}q(h>X `Mininova`_h|Kh?hhEh}hG}q(UnameXMininovahXhttp://www.mininova.orgqhI]hJ]hK]hL]hM]uh9]qhdXMininovaq酁q}q(h>Uh?hubaubhdX site.q셁q}q(h>X site.h?hubeubhh)q}q(h>X?The list of all torrents added today can be found on this page:qh?hh@hChEhlhG}q(hK]hL]hJ]hI]hM]uhOK hPhh9]qhdX?The list of all torrents added today can be found on this page:qq}q(h>hh?hubaubcdocutils.nodes block_quote q)q}q(h>Uh?hh@hChEU block_quoteqhG}q(hK]hL]hJ]hI]hM]uhONhPhh9]qhh)q}q(h>Xhttp://www.mininova.org/todayqh?hh@hChEhlhG}r(hK]hL]hJ]hI]hM]uhOK"h9]rhy)r}r(h>hhG}r(UrefurihhI]hJ]hK]hL]hM]uh?hh9]rhdXhttp://www.mininova.org/todayrr}r(h>Uh?jubahEh}ubaubaubh;)r }r (h>X.. _intro-overview-item:h?hh@hChEhFhG}r (hI]hJ]hK]hL]hM]hNh3uhOK$hPhh9]ubeubhQ)r }r (h>Uh?hRh@hChT}rhj shEhVhG}r(hK]hL]hJ]hI]r(h4h3ehM]r(hheuhOK'hPhhZ}rh3j sh9]r(h])r}r(h>X"Define the data you want to scraperh?j h@hChEhahG}r(hK]hL]hJ]hI]hM]uhOK'hPhh9]rhdX"Define the data you want to scraperr}r(h>jh?jubaubhh)r}r(h>XThe first thing is to define the data we want to scrape. In Scrapy, this is done through :ref:`Scrapy Items ` (Torrent files, in this case).h?j h@hChEhlhG}r(hK]hL]hJ]hI]hM]uhOK)hPhh9]r(hdXYThe first thing is to define the data we want to scrape. In Scrapy, this is done through r r!}r"(h>XYThe first thing is to define the data we want to scrape. 
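Items behave like dictionaries restricted to their declared fields. As a
quick illustration (the values below are made up, not scraped), an item can
be populated and read back like this::

    >>> torrent = TorrentItem(url='http://www.mininova.org/tor/2676093')
    >>> torrent['name'] = ['Some torrent name']   # fields work like dict keys
    >>> torrent['url']
    'http://www.mininova.org/tor/2676093'
    >>> dict(torrent)   # items convert cleanly to plain dicts
    {'url': 'http://www.mininova.org/tor/2676093', 'name': ['Some torrent name']}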
Write a Spider to extract the data
----------------------------------

The next thing is to write a Spider which defines the start URL
(http://www.mininova.org/today), the rules for following links and the rules
for extracting the data from pages.

If we take a look at that page content we'll see that all torrent URLs are
like ``http://www.mininova.org/tor/NUMBER``, where ``NUMBER`` is an integer.
We'll use that to construct the regular expression for the links to follow:
``/tor/\d+``.
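As a quick check (this snippet is not part of the original example), the
pattern can be exercised with Python's ``re`` module before handing it to a
link extractor; the sample URLs follow the format described above::

    import re

    # The same pattern the spider's link extractor will use below.
    link_re = re.compile(r'/tor/\d+')

    assert link_re.search('http://www.mininova.org/tor/2676093')    # matches
    assert link_re.search('http://www.mininova.org/today') is None  # no match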
We'll use `XPath`_ for selecting the data to extract from the web page HTML
source. Let's take one of those torrent pages:

    http://www.mininova.org/tor/2676093

And look at the page HTML source to construct the XPath to select the data we
want, which is: torrent name, description and size.

By looking at the page HTML source we can see that the file name is contained
inside a ``<h1>`` tag::

    <h1>Darwin - The Evolution Of An Exhibition</h1>

An XPath expression to extract the name could be::

    //h1/text()
And the description is contained inside a ``<div>`` tag with
``id="description"``::

    <h2>Description:</h2>

    <div id="description">
    Short documentary made for Plymouth City Museum and Art Gallery regarding
    the setup of an exhibit about Charles Darwin in conjunction with the 200th
    anniversary of his birth.

    ...

An XPath expression to select the description could be::

    //div[@id='description']

Finally, the file size is contained in the second ``<p>`` tag inside the
``<div>`` tag with ``id="specifications"``::

    <div id="specifications">

    <p>
    <strong>Category:</strong>
    Movies &gt; Documentary
    </p>

    <p>
    <strong>Total size:</strong>
    150.62&nbsp;megabyte</p>

An XPath expression to select the file size could be::

    //div[@id='specifications']/p[2]/text()[2]

For more information about XPath see the `XPath reference`_.
Finally, here's the spider code::

    from scrapy.contrib.spiders import CrawlSpider, Rule
    from scrapy.contrib.linkextractors.sgml import SgmlLinkExtractor
    from scrapy.selector import Selector

    class MininovaSpider(CrawlSpider):

        name = 'mininova'
        allowed_domains = ['mininova.org']
        start_urls = ['http://www.mininova.org/today']
        rules = [Rule(SgmlLinkExtractor(allow=['/tor/\d+']), 'parse_torrent')]

        def parse_torrent(self, response):
            sel = Selector(response)
            torrent = TorrentItem()  # the Item class defined above
            torrent['url'] = response.url
            torrent['name'] = sel.xpath("//h1/text()").extract()
            torrent['description'] = sel.xpath("//div[@id='description']").extract()
            torrent['size'] = sel.xpath("//div[@id='specifications']/p[2]/text()[2]").extract()
            return torrent

The ``TorrentItem`` class is :ref:`defined above <intro-overview-item>`.
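If you want to try those XPath expressions interactively while developing the
spider, Scrapy's shell (mentioned later on this page) is handy. The sketch
below assumes the shell's ``sel`` selector variable, as exposed in this
Scrapy version::

    # Started with: scrapy shell http://www.mininova.org/tor/2676093
    # `sel` is a Selector bound to the downloaded response.
    sel.xpath("//h1/text()").extract()                        # torrent name
    sel.xpath("//div[@id='description']").extract()           # description
    sel.xpath("//div[@id='specifications']/p[2]/text()[2]").extract()  # size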
Run the spider to extract the data
----------------------------------

Finally, we'll run the spider to crawl the site and output a file
``scraped_data.json`` with the scraped data in JSON format::

    scrapy crawl mininova -o scraped_data.json -t json

This uses :ref:`feed exports <topics-feed-exports>` to generate the JSON file.
You can easily change the export format (XML or CSV, for example) or the
storage backend (FTP or `Amazon S3`_, for example).

You can also write an :ref:`item pipeline <topics-item-pipeline>` to store the
items in a database very easily.
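As an illustration of that last point (this sketch is not from the original
page), a minimal pipeline could write each item to a SQLite database; it
would be enabled by adding its class path to the ``ITEM_PIPELINES`` setting::

    import sqlite3

    class StoreTorrentPipeline(object):
        """Sketch: persist scraped torrents to a local SQLite file."""

        def __init__(self):
            self.conn = sqlite3.connect('torrents.db')
            self.conn.execute(
                "CREATE TABLE IF NOT EXISTS torrent (url TEXT, name TEXT)")

        def process_item(self, item, spider):
            # Called once for every item the spider returns. Field values
            # are lists here (see below), so join the name for storage.
            self.conn.execute("INSERT INTO torrent VALUES (?, ?)",
                              (item['url'], u"".join(item['name'])))
            self.conn.commit()
            return item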
Review scraped data
-------------------

If you check the ``scraped_data.json`` file after the process finishes, you'll
see the scraped items there::

    [{"url": "http://www.mininova.org/tor/2676093",
      "name": ["Darwin - The Evolution Of An Exhibition"],
      "description": ["Short documentary made for Plymouth ..."],
      "size": ["150.62 megabyte"]},
     # ... other items ...
    ]

You'll notice that all field values (except for the ``url``, which was
assigned directly) are actually lists. This is because the :ref:`selectors
<topics-selectors>` return lists. You may want to store single values, or
perform some additional parsing/cleansing of the values. That's what
:ref:`Item Loaders <topics-loaders>` are for.
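For example, a loader with a ``TakeFirst`` output processor collapses those
lists into single values. The following sketch uses this version's
``scrapy.contrib.loader`` module; treat it as illustrative rather than a
drop-in replacement for the spider above::

    from scrapy.contrib.loader import ItemLoader
    from scrapy.contrib.loader.processor import TakeFirst

    class TorrentLoader(ItemLoader):
        # Keep the first extracted value of each field instead of a list.
        default_output_processor = TakeFirst()

    # A parse_torrent callback rewritten to use the loader:
    def parse_torrent(self, response):
        loader = TorrentLoader(item=TorrentItem(), response=response)
        loader.add_value('url', response.url)
        loader.add_xpath('name', "//h1/text()")
        loader.add_xpath('description', "//div[@id='description']")
        loader.add_xpath('size', "//div[@id='specifications']/p[2]/text()[2]")
        return loader.load_item()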
.. _topics-whatelse:

What else?
----------

You've seen how to extract and store items from a website using Scrapy, but
this is just the surface. Scrapy provides a lot of powerful features for
making scraping easy and efficient, such as:

* Built-in support for :ref:`selecting and extracting <topics-selectors>` data
  from HTML and XML sources

* Built-in support for cleaning and sanitizing the scraped data using a
  collection of reusable filters (called :ref:`Item Loaders <topics-loaders>`)
  shared between all the spiders.

* Built-in support for :ref:`generating feed exports <topics-feed-exports>` in
  multiple formats (JSON, CSV, XML) and storing them in multiple backends
  (FTP, S3, local filesystem)

* A media pipeline for :ref:`automatically downloading images <topics-images>`
  (or any other media) associated with the scraped items

* Support for :ref:`extending Scrapy <extending-scrapy>` by plugging your own
  functionality using :ref:`signals <topics-signals>` and a well-defined API
  (middlewares, :ref:`extensions <topics-extensions>`, and :ref:`pipelines
  <topics-item-pipeline>`).

* Wide range of built-in middlewares and extensions for:

  * cookies and session handling
  * HTTP compression
  * HTTP authentication
  * HTTP cache
  * user-agent spoofing
  * robots.txt
  * crawl depth restriction
  * and more

* Robust encoding support and auto-detection, for dealing with foreign,
  non-standard and broken encoding declarations.

* Support for creating spiders based on pre-defined templates, to speed up
  spider creation and make their code more consistent on large projects. See
  the :command:`genspider` command for more details.

* Extensible :ref:`stats collection <topics-stats>` for multiple spider
  metrics, useful for monitoring the performance of your spiders and detecting
  when they get broken

* An :ref:`interactive shell console <topics-shell>` for trying XPaths, very
  useful for writing and debugging your spiders

* A :ref:`system service <topics-scrapyd>` designed to ease the deployment and
  running of your spiders in production.

* A built-in :ref:`web service <topics-webservice>` for monitoring and
  controlling your bot

* A :ref:`Telnet console <topics-telnetconsole>` for hooking into a Python
  console running inside your Scrapy process, to introspect and debug your
  crawler

* A :ref:`logging <topics-logging>` facility that you can hook on to for
  catching errors during the scraping process.

* Support for crawling based on URLs discovered through `Sitemaps`_

* A caching DNS resolver

What's next?
------------

The next obvious steps are for you to `download Scrapy`_, read :ref:`the
tutorial <intro-tutorial>` and join `the community`_. Thanks for your
interest!

.. _download Scrapy: http://scrapy.org/download/
.. _the community: http://scrapy.org/community/
.. _screen scraping: http://en.wikipedia.org/wiki/Screen_scraping
.. _web scraping: http://en.wikipedia.org/wiki/Web_scraping
.. _Amazon Associates Web Services: http://aws.amazon.com/associates/
.. _Mininova: http://www.mininova.org
.. _XPath: http://www.w3.org/TR/xpath
.. _XPath reference: http://www.w3.org/TR/xpath
.. _Amazon S3: http://aws.amazon.com/s3/
.. _Sitemaps: http://www.sitemaps.org

.. _intro-examples:

Examples
========

The best way to learn is with examples, and Scrapy is no exception.
For this reason, there is an example Scrapy project named dirbot_, which you
can use to play and learn more about Scrapy. It contains the dmoz spider
described in the tutorial.

This dirbot_ project is available at: https://github.com/scrapy/dirbot

It contains a README file with a detailed description of the project contents.

If you're familiar with git, you can check out the code. Otherwise you can
download a tarball or zip file of the project by clicking on `Downloads`_.

The `scrapy tag on Snipplr`_ is used for sharing code snippets such as
spiders, middlewares, extensions, or scripts. Feel free (and encouraged!) to
share any code there.

.. _dirbot: https://github.com/scrapy/dirbot
.. _Downloads: https://github.com/scrapy/dirbot/archives/master
.. _scrapy tag on Snipplr: http://snipplr.com/all/tags/scrapy/
.. _topics-settings:

Settings
========

The Scrapy settings allow you to customize the behaviour of all Scrapy
components, including the core, extensions, pipelines and spiders themselves.

The infrastructure of the settings provides a global namespace of key-value
mappings that the code can use to pull configuration values from.
The settings can be populated through different mechanisms, which are
described below.

The settings are also the mechanism for selecting the currently active Scrapy
project (in case you have many).

For a list of available built-in settings see: :ref:`topics-settings-ref`.

Designating the settings
------------------------

When you use Scrapy, you have to tell it which settings you're using. You can
do this by using an environment variable, ``SCRAPY_SETTINGS_MODULE``.

The value of ``SCRAPY_SETTINGS_MODULE`` should be in Python path syntax, e.g.
``myproject.settings``. Note that the settings module should be on the Python
`import search path`_.

.. _import search path: http://docs.python.org/2/tutorial/modules.html#the-module-search-path
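For illustration (this snippet is not from the original page), a wrapper
script could designate the settings module through the environment before any
Scrapy machinery is imported; ``myproject.settings`` is a placeholder for
your own project's module path::

    import os

    # Tell Scrapy which settings module to use; 'myproject.settings'
    # stands in for your own project's settings module.
    os.environ.setdefault('SCRAPY_SETTINGS_MODULE', 'myproject.settings')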
Populating the settings
-----------------------

Settings can be populated using different mechanisms, each of which has a
different precedence. Here is the list of them, in decreasing order of
precedence:

 1. Global overrides (most precedence)
 2. Project settings module
 3. Default settings per-command
 4. Default global settings (less precedence)

These mechanisms are described in more detail below.

1. Global overrides
~~~~~~~~~~~~~~~~~~~

Global overrides are the ones that take most precedence, and are usually
populated by command-line options. You can also override one (or more)
settings from the command line using the ``-s`` (or ``--set``) command line
option.

For more information see the :attr:`~scrapy.settings.Settings.overrides`
Settings attribute.

Example::

    scrapy crawl myspider -s LOG_FILE=scrapy.log

2. Project settings module
~~~~~~~~~~~~~~~~~~~~~~~~~~

The project settings module is the standard configuration file for your
Scrapy project. It's where most of your custom settings will be populated,
for example: ``myproject.settings``.
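As a sketch (the values are illustrative, not defaults), a project settings
module typically looks like this; note that ``ITEM_PIPELINES`` is a dict
mapping pipeline class paths to order values in this version::

    # myproject/settings.py -- illustrative values only
    BOT_NAME = 'myproject'

    SPIDER_MODULES = ['myproject.spiders']
    NEWSPIDER_MODULE = 'myproject.spiders'

    ITEM_PIPELINES = {
        'myproject.pipelines.SomePipeline': 300,  # hypothetical pipeline
    }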
3. Default settings per-command
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Each :doc:`Scrapy tool </topics/commands>` command can have its own default
settings, which override the global default settings. Those custom command
settings are specified in the ``default_settings`` attribute of the command
class.
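A sketch of that mechanism (the command and its settings are hypothetical,
and the ``scrapy.command`` import path is an assumption for this version)::

    from scrapy.command import ScrapyCommand

    class Command(ScrapyCommand):
        # Per-command defaults: they override the global defaults, but are
        # themselves overridden by the project settings module and by -s
        # command line overrides.
        default_settings = {'LOG_ENABLED': False}

        def run(self, args, opts):
            pass  # command logic goes here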
4. Default global settings

The global defaults are located in the ``scrapy.settings.default_settings`` module and documented in the :ref:`topics-settings-ref` section.

How to access settings

Settings can be accessed through the :attr:`scrapy.crawler.Crawler.settings` attribute of the Crawler that is passed to the ``from_crawler`` method in extensions and middlewares::

    class MyExtension(object):

        @classmethod
        def from_crawler(cls, crawler):
            settings = crawler.settings
            if settings['LOG_ENABLED']:
                print "log is enabled!"

In other words, settings can be accessed like a dict, but it's usually preferred to extract the setting in the format you need it, to avoid type errors. In order to do that you'll have to use one of the methods provided by the :class:`~scrapy.settings.Settings` API.
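For instance, a minimal sketch using the typed helpers of the :class:`~scrapy.settings.Settings` API (the ``getbool``, ``getint`` and ``getlist`` methods; the extension itself is hypothetical)::

    class MyTypedExtension(object):

        def __init__(self, log_enabled, concurrency, notify):
            self.log_enabled = log_enabled
            self.concurrency = concurrency
            self.notify = notify

        @classmethod
        def from_crawler(cls, crawler):
            settings = crawler.settings
            # Typed accessors coerce raw values, so a string like '0'
            # or 'False' does not accidentally evaluate as truthy.
            return cls(
                settings.getbool('LOG_ENABLED'),
                settings.getint('CONCURRENT_REQUESTS', 16),
                settings.getlist('MEMDEBUG_NOTIFY'),
            )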
Rationale for setting names

Setting names are usually prefixed with the component that they configure. For example, proper setting names for a fictional robots.txt extension would be ``ROBOTSTXT_ENABLED``, ``ROBOTSTXT_OBEY``, ``ROBOTSTXT_CACHEDIR``, etc.

Built-in settings reference

Here's a list of all available Scrapy settings, in alphabetical order, along with their default values and the scope where they apply.

The scope, where available, shows where the setting is being used, if it's tied to any particular component. In that case the module of that component will be shown, typically an extension, middleware or pipeline. It also means that the component must be enabled in order for the setting to have any effect.

AWS_ACCESS_KEY_ID

Default: ``None``

The AWS access key used by code that requires access to `Amazon Web services`_, such as the :ref:`S3 feed storage backend <topics-feed-storage-s3>`.

AWS_SECRET_ACCESS_KEY

Default: ``None``

The AWS secret key used by code that requires access to `Amazon Web services`_, such as the :ref:`S3 feed storage backend <topics-feed-storage-s3>`.
BOT_NAME

Default: ``'scrapybot'``

The name of the bot implemented by this Scrapy project (also known as the project name). This will be used to construct the User-Agent by default, and also for logging.

It's automatically populated with your project name when you create your project with the :command:`startproject` command.

CONCURRENT_ITEMS

Default: ``100``

Maximum number of concurrent items (per response) to process in parallel in the Item Processor (also known as the :ref:`Item Pipeline <topics-item-pipeline>`).

CONCURRENT_REQUESTS

Default: ``16``

The maximum number of concurrent (i.e. simultaneous) requests that will be performed by the Scrapy downloader.

CONCURRENT_REQUESTS_PER_DOMAIN

Default: ``8``

The maximum number of concurrent (i.e. simultaneous) requests that will be performed to any single domain.
CONCURRENT_REQUESTS_PER_IP

Default: ``0``

The maximum number of concurrent (i.e. simultaneous) requests that will be performed to any single IP. If non-zero, the :setting:`CONCURRENT_REQUESTS_PER_DOMAIN` setting is ignored, and this one is used instead. In other words, concurrency limits will be applied per IP, not per domain.

This setting also affects :setting:`DOWNLOAD_DELAY`: if :setting:`CONCURRENT_REQUESTS_PER_IP` is non-zero, download delay is enforced per IP, not per domain.
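For example, when many target domains resolve to the same host, it can make sense to cap parallelism per address rather than per domain. An illustrative ``settings.py`` fragment (the values are made up)::

    # settings.py
    CONCURRENT_REQUESTS = 32
    # Non-zero: the per-domain limit below is ignored and the limit
    # (and any download delay) applies per IP instead.
    CONCURRENT_REQUESTS_PER_IP = 2
    CONCURRENT_REQUESTS_PER_DOMAIN = 8  # ignored while PER_IP is non-zero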
DEFAULT_ITEM_CLASS

Default: ``'scrapy.item.Item'``

The default class that will be used for instantiating items in :ref:`the Scrapy shell <topics-shell>`.

DEFAULT_REQUEST_HEADERS

Default::

    {
        'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
        'Accept-Language': 'en',
    }

The default headers used for Scrapy HTTP Requests. They're populated in the :class:`~scrapy.contrib.downloadermiddleware.defaultheaders.DefaultHeadersMiddleware`.

DEPTH_LIMIT

Default: ``0``

The maximum depth that will be allowed to crawl for any site. If zero, no limit will be imposed.
DEPTH_PRIORITY

Default: ``0``

An integer that is used to adjust the request priority based on its depth.

If zero, no priority adjustment is made from depth.

DEPTH_STATS

Default: ``True``

Whether to collect maximum depth stats.

DEPTH_STATS_VERBOSE

Default: ``False``

Whether to collect verbose depth stats. If this is enabled, the number of requests for each depth is collected in the stats.
DNSCACHE_ENABLED

Default: ``True``

Whether to enable DNS in-memory cache.

DOWNLOADER_DEBUG

Default: ``False``

Whether to enable the Downloader debugging mode.

DOWNLOADER_MIDDLEWARES

Default: ``{}``

A dict containing the downloader middlewares enabled in your project, and their orders. For more info see :ref:`topics-downloader-middleware-setting`.
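As a sketch of how this dict is typically used, a project might slot in its own middleware and disable a built-in one. The ``myproject.middlewares.CustomProxyMiddleware`` path is hypothetical, and assigning ``None`` to disable a base middleware follows the downloader-middleware documentation::

    DOWNLOADER_MIDDLEWARES = {
        # Hypothetical project middleware, ordered between the built-ins.
        'myproject.middlewares.CustomProxyMiddleware': 543,
        # None disables a middleware that DOWNLOADER_MIDDLEWARES_BASE enables.
        'scrapy.contrib.downloadermiddleware.useragent.UserAgentMiddleware': None,
    }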
DOWNLOADER_MIDDLEWARES_BASE

Default::

    {
        'scrapy.contrib.downloadermiddleware.robotstxt.RobotsTxtMiddleware': 100,
        'scrapy.contrib.downloadermiddleware.httpauth.HttpAuthMiddleware': 300,
        'scrapy.contrib.downloadermiddleware.downloadtimeout.DownloadTimeoutMiddleware': 350,
        'scrapy.contrib.downloadermiddleware.useragent.UserAgentMiddleware': 400,
        'scrapy.contrib.downloadermiddleware.retry.RetryMiddleware': 500,
        'scrapy.contrib.downloadermiddleware.defaultheaders.DefaultHeadersMiddleware': 550,
        'scrapy.contrib.downloadermiddleware.redirect.MetaRefreshMiddleware': 580,
        'scrapy.contrib.downloadermiddleware.httpcompression.HttpCompressionMiddleware': 590,
        'scrapy.contrib.downloadermiddleware.redirect.RedirectMiddleware': 600,
        'scrapy.contrib.downloadermiddleware.cookies.CookiesMiddleware': 700,
        'scrapy.contrib.downloadermiddleware.httpproxy.HttpProxyMiddleware': 750,
        'scrapy.contrib.downloadermiddleware.chunked.ChunkedTransferMiddleware': 830,
        'scrapy.contrib.downloadermiddleware.stats.DownloaderStats': 850,
        'scrapy.contrib.downloadermiddleware.httpcache.HttpCacheMiddleware': 900,
    }

A dict containing the downloader middlewares enabled by default in Scrapy. You should never modify this setting in your project; modify :setting:`DOWNLOADER_MIDDLEWARES` instead. For more info see :ref:`topics-downloader-middleware-setting`.
DOWNLOADER_STATS

Default: ``True``

Whether to enable downloader stats collection.

DOWNLOAD_DELAY

Default: ``0``

The amount of time (in secs) that the downloader should wait before downloading consecutive pages from the same website. This can be used to throttle the crawling speed to avoid hitting servers too hard. Decimal numbers are supported. Example::

    DOWNLOAD_DELAY = 0.25    # 250 ms of delay

This setting is also affected by the :setting:`RANDOMIZE_DOWNLOAD_DELAY` setting (which is enabled by default). By default, Scrapy doesn't wait a fixed amount of time between requests, but uses a random interval between 0.5 and 1.5 * :setting:`DOWNLOAD_DELAY`.

When :setting:`CONCURRENT_REQUESTS_PER_IP` is non-zero, delays are enforced per IP address instead of per domain.

You can also change this setting per spider by setting the ``download_delay`` spider attribute, as in the sketch below.
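A per-spider override might look like the following minimal sketch (the spider, its name and its URL are placeholders; the ``scrapy.spider.Spider`` base class is assumed)::

    from scrapy.spider import Spider

    class PoliteSpider(Spider):
        name = 'polite'
        start_urls = ['http://www.example.com']
        # Overrides the project-wide DOWNLOAD_DELAY for this spider only.
        download_delay = 2.5

        def parse(self, response):
            pass  # extraction logic would go here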
DOWNLOAD_HANDLERS

Default: ``{}``

A dict containing the request downloader handlers enabled in your project. See `DOWNLOAD_HANDLERS_BASE` for example format.

DOWNLOAD_HANDLERS_BASE

Default::

    {
        'file': 'scrapy.core.downloader.handlers.file.FileDownloadHandler',
        'http': 'scrapy.core.downloader.handlers.http.HttpDownloadHandler',
        'https': 'scrapy.core.downloader.handlers.http.HttpDownloadHandler',
        's3': 'scrapy.core.downloader.handlers.s3.S3DownloadHandler',
    }

A dict containing the request download handlers enabled by default in Scrapy. You should never modify this setting in your project; modify :setting:`DOWNLOAD_HANDLERS` instead.
DOWNLOAD_TIMEOUT

Default: ``180``

The amount of time (in secs) that the downloader will wait before timing out.

DUPEFILTER_CLASS

Default: ``'scrapy.dupefilter.RFPDupeFilter'``

The class used to detect and filter duplicate requests.

The default (``RFPDupeFilter``) filters based on request fingerprint using the ``scrapy.utils.request.request_fingerprint`` function.
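As a sketch, a project could point this setting at its own filter, for example one that deduplicates on the raw URL instead of the full fingerprint. The module path is hypothetical and the ``request_seen`` hook is assumed from the ``RFPDupeFilter`` interface::

    # myproject/dupefilters.py (hypothetical module)
    from scrapy.dupefilter import RFPDupeFilter

    class SeenURLFilter(RFPDupeFilter):
        """Drop any request whose exact URL was already seen."""

        def __init__(self, path=None):
            self.urls_seen = set()
            RFPDupeFilter.__init__(self, path)

        def request_seen(self, request):
            if request.url in self.urls_seen:
                return True
            self.urls_seen.add(request.url)
            return False

Enabling it is then a one-line change in ``settings.py``::

    DUPEFILTER_CLASS = 'myproject.dupefilters.SeenURLFilter'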
EDITOR

Default: `depends on the environment`

The editor to use for editing spiders with the :command:`edit` command. It defaults to the ``EDITOR`` environment variable, if set. Otherwise, it defaults to ``vi`` (on Unix systems) or the IDLE editor (on Windows).

EXTENSIONS

Default: ``{}``

A dict containing the extensions enabled in your project, and their orders.
EXTENSIONS_BASE

Default::

    {
        'scrapy.contrib.corestats.CoreStats': 0,
        'scrapy.webservice.WebService': 0,
        'scrapy.telnet.TelnetConsole': 0,
        'scrapy.contrib.memusage.MemoryUsage': 0,
        'scrapy.contrib.memdebug.MemoryDebugger': 0,
        'scrapy.contrib.closespider.CloseSpider': 0,
        'scrapy.contrib.feedexport.FeedExporter': 0,
        'scrapy.contrib.logstats.LogStats': 0,
        'scrapy.contrib.spiderstate.SpiderState': 0,
        'scrapy.contrib.throttle.AutoThrottle': 0,
    }

The list of available extensions. Keep in mind that some of them need to be enabled through a setting. By default, this setting contains all stable built-in extensions.

For more information see the :ref:`extensions user guide <topics-extensions>` and the :ref:`list of available extensions <topics-extensions-ref>`.

ITEM_PIPELINES

Default: ``{}``

A dict containing the item pipelines to use, and their orders. The dict is empty by default. Order values are arbitrary, but it's customary to define them in the 0-1000 range.

Lists are supported in :setting:`ITEM_PIPELINES` for backwards compatibility, but they are deprecated.

Example::

    ITEM_PIPELINES = {
        'mybot.pipeline.validate.ValidateMyItem': 300,
        'mybot.pipeline.validate.StoreMyItem': 800,
    }

ITEM_PIPELINES_BASE

Default: ``{}``

A dict containing the pipelines enabled by default in Scrapy. You should never modify this setting in your project; modify :setting:`ITEM_PIPELINES` instead.
LOG_STDOUT

Default: ``False``

If ``True``, all standard output (and error) of your process will be redirected to the log. For example if you ``print 'hello'`` it will appear in the Scrapy log.

MEMDEBUG_ENABLED

Default: ``False``

Whether to enable memory debugging.

MEMDEBUG_NOTIFY

Default: ``[]``

When memory debugging is enabled a memory report will be sent to the specified addresses if this setting is not empty, otherwise the report will be written to the log.

Example::

    MEMDEBUG_NOTIFY = ['user@example.com']
MEMUSAGE_ENABLED

Default: ``False``

Scope: ``scrapy.contrib.memusage``

Whether to enable the memory usage extension, which will shut down the Scrapy process when it exceeds a memory limit, and also notify by email when that happens.

See :ref:`topics-extensions-ref-memusage`.

MEMUSAGE_LIMIT_MB

Default: ``0``

Scope: ``scrapy.contrib.memusage``

The maximum amount of memory to allow (in megabytes) before shutting down Scrapy (if MEMUSAGE_ENABLED is True). If zero, no check will be performed.

See :ref:`topics-extensions-ref-memusage`.
MEMUSAGE_NOTIFY_MAIL

Default: ``False``

Scope: ``scrapy.contrib.memusage``

A list of emails to notify if the memory limit has been reached.

Example::

    MEMUSAGE_NOTIFY_MAIL = ['user@example.com']

See :ref:`topics-extensions-ref-memusage`.

MEMUSAGE_REPORT

Default: ``False``

Scope: ``scrapy.contrib.memusage``

Whether to send a memory usage report after each spider has been closed.

See :ref:`topics-extensions-ref-memusage`.

MEMUSAGE_WARNING_MB

Default: ``0``

Scope: ``scrapy.contrib.memusage``

The maximum amount of memory to allow (in megabytes) before sending a warning email notifying about it. If zero, no warning will be produced.
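Taken together, the memory-usage settings above might be configured like this; the values are illustrative only::

    # settings.py
    MEMUSAGE_ENABLED = True             # turn the extension on
    MEMUSAGE_LIMIT_MB = 2048            # hard limit: shut down above 2 GB
    MEMUSAGE_WARNING_MB = 1536          # warn by email above 1.5 GB
    MEMUSAGE_NOTIFY_MAIL = ['ops@example.com']
    MEMUSAGE_REPORT = True              # report after each spider closes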
NEWSPIDER_MODULE

Default: ``''``

The module where new spiders are created with the :command:`genspider` command.

Example::

    NEWSPIDER_MODULE = 'mybot.spiders_dev'
RANDOMIZE_DOWNLOAD_DELAY

Default: ``True``

If enabled, Scrapy will wait a random amount of time (between 0.5 and 1.5 * :setting:`DOWNLOAD_DELAY`) while fetching requests from the same website. For example, with ``DOWNLOAD_DELAY = 2`` each wait falls between 1.0 and 3.0 seconds.

This randomization decreases the chance of the crawler being detected (and subsequently blocked) by sites which analyze requests looking for statistically significant similarities in the time between their requests.

The randomization policy is the same used by the `wget`_ ``--random-wait`` option.

If :setting:`DOWNLOAD_DELAY` is zero (the default) this option has no effect.

.. _wget: http://www.gnu.org/software/wget/manual/wget.html
REDIRECT_MAX_TIMES

Default: ``20``

Defines the maximum number of times a request can be redirected. After this maximum, the request's response is returned as is. We used the Firefox default value for the same task.

REDIRECT_MAX_METAREFRESH_DELAY

Default: ``100``

Some sites use meta-refresh for redirecting to a session-expired page, so we restrict automatic redirection to a maximum delay (in seconds).

REDIRECT_PRIORITY_ADJUST

Default: ``+2``

Adjust redirect request priority relative to the original request. A negative priority adjustment means more priority.
SCHEDULER

Default: 'scrapy.core.scheduler.Scheduler'

The scheduler to use for crawling.

SPIDER_CONTRACTS

Default: {}

A dict containing the scrapy contracts enabled in your project, used for testing spiders. For more info see Spiders Contracts.

SPIDER_CONTRACTS_BASE

Default:

    {
        'scrapy.contracts.default.UrlContract': 1,
        'scrapy.contracts.default.ReturnsContract': 2,
        'scrapy.contracts.default.ScrapesContract': 3,
    }

A dict containing the scrapy contracts enabled by default in Scrapy. You should never modify this setting in your project; modify SPIDER_CONTRACTS instead. For more info see Spiders Contracts.
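To show how a project would extend the defaults rather than replace them, here is a sketch; the module path and contract class myproject.contracts.HasHeaderContract are hypothetical:

    # settings.py -- the contract class and its path are hypothetical
    SPIDER_CONTRACTS = {
        'myproject.contracts.HasHeaderContract': 10,
    }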
SPIDER_MIDDLEWARES

Default: {}

A dict containing the spider middlewares enabled in your project, and their orders. For more info see Activating a spider middleware.

SPIDER_MIDDLEWARES_BASE

Default:

    {
        'scrapy.contrib.spidermiddleware.httperror.HttpErrorMiddleware': 50,
        'scrapy.contrib.spidermiddleware.offsite.OffsiteMiddleware': 500,
        'scrapy.contrib.spidermiddleware.referer.RefererMiddleware': 700,
        'scrapy.contrib.spidermiddleware.urllength.UrlLengthMiddleware': 800,
        'scrapy.contrib.spidermiddleware.depth.DepthMiddleware': 900,
    }

A dict containing the spider middlewares enabled by default in Scrapy. You should never modify this setting in your project; modify SPIDER_MIDDLEWARES instead. For more info see Activating a spider middleware.
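A sketch of the usual workflow, enabling a project middleware and disabling a built-in one; the path myproject.middlewares.CustomSpiderMiddleware is hypothetical:

    # settings.py
    SPIDER_MIDDLEWARES = {
        'myproject.middlewares.CustomSpiderMiddleware': 543,  # hypothetical
        # assigning None disables a middleware from SPIDER_MIDDLEWARES_BASE
        'scrapy.contrib.spidermiddleware.offsite.OffsiteMiddleware': None,
    }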
SPIDER_MODULES

Default: []

A list of modules where Scrapy will look for spiders.

Example:

    SPIDER_MODULES = ['mybot.spiders_prod', 'mybot.spiders_dev']

STATS_CLASS

Default: 'scrapy.statscol.MemoryStatsCollector'

The class to use for collecting stats, which must implement the Stats Collector API.

STATS_DUMP

Default: True

Dump the Scrapy stats (to the Scrapy log) once the spider finishes.

For more info see: Stats Collection.
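To make the stats machinery concrete, here is a minimal sketch of an extension that bumps a custom counter through the configured stats collector; the extension name and the stats key are hypothetical:

    from scrapy import signals

    class ItemCounterExtension(object):
        """Hypothetical extension: counts scraped items in the stats collector."""

        def __init__(self, stats):
            self.stats = stats

        @classmethod
        def from_crawler(cls, crawler):
            ext = cls(crawler.stats)
            crawler.signals.connect(ext.item_scraped, signal=signals.item_scraped)
            return ext

        def item_scraped(self, item, spider):
            # shows up in the final stats dump when STATS_DUMP is enabled
            self.stats.inc_value('custom/items_seen')

    # enabled via:  EXTENSIONS = {'myproject.extensions.ItemCounterExtension': 500}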
STATSMAILER_RCPTS

Default: [] (empty list)

Send Scrapy stats after spiders finish scraping. See StatsMailer for more info.

TELNETCONSOLE_ENABLED

Default: True

A boolean which specifies if the telnet console will be enabled (provided its extension is also enabled).

TELNETCONSOLE_PORT

Default: [6023, 6073]

The port range to use for the telnet console. If set to None or 0, a dynamically assigned port is used. For more info see Telnet Console.
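A short settings.py sketch of the two telnet settings (the chosen values are illustrative):

    # settings.py
    TELNETCONSOLE_ENABLED = False  # switch the console off entirely
    # TELNETCONSOLE_PORT = None    # or keep it on and let Scrapy pick a port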
TEMPLATES_DIR

Default: templates dir inside scrapy module

The directory where to look for templates when creating new projects with the startproject command.

URLLENGTH_LIMIT

Default: 2083

Scope: contrib.spidermiddleware.urllength

The maximum URL length to allow for crawled URLs.
For more information about the default value for this setting see: http://www.boutell.com/newfaq/misc/urllength.html

USER_AGENT

Default: "Scrapy/VERSION (+http://scrapy.org)"

The default User-Agent to use when crawling, unless overridden.
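A typical override, sketched in settings.py; the bot name and URL are placeholders:

    # settings.py -- placeholder identity string
    USER_AGENT = 'mybot/1.0 (+http://www.example.com/bot)'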
Exceptions

Built-in Exceptions reference

Here's a list of all exceptions included in Scrapy and their usage.

DropItem

exception scrapy.exceptions.DropItem

The exception that must be raised by item pipeline stages to stop processing an Item. For more info see Item Pipeline.

CloseSpider

exception scrapy.exceptions.CloseSpider(reason='cancelled')

This exception can be raised from a spider callback to request the spider to be closed/stopped. Supported arguments:

Parameters:

- reason (str) – the reason for closing

For example:

    def parse_page(self, response):
        if 'Bandwidth exceeded' in response.body:
            raise CloseSpider('bandwidth_exceeded')

IgnoreRequest

exception scrapy.exceptions.IgnoreRequest

This exception can be raised by the Scheduler or any downloader middleware to indicate that the request should be ignored.

NotConfigured

exception scrapy.exceptions.NotConfigured

This exception can be raised by some components to indicate that they will remain disabled. Those components include:

- Extensions
- Item pipelines
- Downloader middlewares
- Spider middlewares

The exception must be raised in the component constructor.

NotSupported

exception scrapy.exceptions.NotSupported

This exception is raised to indicate an unsupported feature.

Link Extractors

LinkExtractors are objects whose only purpose is to extract links from web pages (scrapy.http.Response objects) which will be eventually followed.

There are two Link Extractors available in Scrapy by default, but you can create your own custom Link Extractors to suit your needs by implementing a simple interface.

The only public method that every LinkExtractor has is extract_links, which receives a Response object and returns a list of scrapy.link.Link objects. Link Extractors are meant to be instantiated once and their extract_links method called several times with different responses, to extract links to follow.
Link extractors are used in the CrawlSpider class (available in Scrapy), through a set of rules, but you can also use them in your spiders, even if you don't subclass from CrawlSpider, as their purpose is very simple: to extract links.
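For instance, a sketch of standalone use inside a plain spider; the spider name, start URL and allow pattern are hypothetical:

    from scrapy.spider import Spider
    from scrapy.http import Request
    from scrapy.contrib.linkextractors.sgml import SgmlLinkExtractor

    class CategorySpider(Spider):
        """Hypothetical spider following category links without CrawlSpider."""
        name = 'categories'
        start_urls = ['http://www.example.com/']

        # instantiated once, as recommended above
        link_extractor = SgmlLinkExtractor(allow=(r'/category/',))

        def parse(self, response):
            # extract_links returns scrapy.link.Link objects
            for link in self.link_extractor.extract_links(response):
                yield Request(link.url, callback=self.parse)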
Built-in link extractors reference

SgmlLinkExtractor

class scrapy.contrib.linkextractors.sgml.SgmlLinkExtractor(allow=(), deny=(), allow_domains=(), deny_domains=(), deny_extensions=None, restrict_xpaths=(), tags=('a', 'area'), attrs=('href',), canonicalize=True, unique=True, process_value=None)

The SgmlLinkExtractor extends the base BaseSgmlLinkExtractor by providing additional filters that you can specify to extract links, including regular expression patterns that the links must match to be extracted.
All those filters are configured through these constructor parameters:

Parameters:

- allow (a regular expression, or list of) – a single regular expression (or list of regular expressions) that the (absolute) urls must match in order to be extracted. If not given (or empty), it will match all links.
- deny (a regular expression, or list of) – a single regular expression (or list of regular expressions) that the (absolute) urls must match in order to be excluded (ie. not extracted). It has precedence over the allow parameter. If not given (or empty) it won't exclude any links.
- allow_domains (str or list) – a single value or a list of strings containing domains which will be considered for extracting the links
- deny_domains (str or list) – a single value or a list of strings containing domains which won't be considered for extracting the links
- deny_extensions (list) – a list of extensions that should be ignored when extracting links. If not given, it will default to the IGNORED_EXTENSIONS list defined in the scrapy.linkextractor module (https://github.com/scrapy/scrapy/blob/master/scrapy/linkextractor.py).
- restrict_xpaths (str or list) – an XPath (or list of XPaths) which defines regions inside the response where links should be extracted from. If given, only the text selected by those XPaths will be scanned for links. See examples below.
- tags (str or list) – a tag or a list of tags to consider when extracting links. Defaults to ('a', 'area').
- attrs (list) – a list of attributes which should be considered when looking for links to extract (only for those tags specified in the tags parameter). Defaults to ('href',).
- canonicalize (boolean) – canonicalize each extracted url (using scrapy.utils.url.canonicalize_url). Defaults to True.
- unique (boolean) – whether duplicate filtering should be applied to extracted links.
- process_value (callable) – see the process_value argument of the BaseSgmlLinkExtractor class constructor.
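As promised above, a sketch showing several filters in combination; the URL patterns and the XPath are hypothetical:

    from scrapy.contrib.linkextractors.sgml import SgmlLinkExtractor

    lx = SgmlLinkExtractor(
        allow=(r'item\.php\?id=\d+',),              # hypothetical URL pattern
        deny=(r'/print/',),                         # skip printer-friendly pages
        restrict_xpaths=('//div[@id="content"]',),  # only scan the content pane
    )
    links = lx.extract_links(response)  # assumes a Response object in scope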
BaseSgmlLinkExtractor

class scrapy.contrib.linkextractors.sgml.BaseSgmlLinkExtractor(tag="a", attr="href", unique=False, process_value=None)

The purpose of this Link Extractor is only to serve as a base class for the SgmlLinkExtractor. You should use that one instead.
The constructor arguments are:

Parameters:

- tag (str or callable) – either a string (with the name of a tag) or a function that receives a tag name and returns True if links should be extracted from that tag, or False if they shouldn't. Defaults to 'a'.
- attr (str or callable) – either a string (with the name of a tag attribute), or a function that receives an attribute name and returns True if links should be extracted from it, or False if they shouldn't. Defaults to 'href'.
- unique (boolean) – a boolean that specifies if duplicate filtering should be applied to links extracted.
- process_value (callable) – a function which receives each value extracted from the tags and attributes scanned and can modify the value and return a new one, or return None to ignore the link altogether. If not given, process_value defaults to lambda x: x.

For example, to extract links from this code:

    <a href="javascript:goToPage('../other/page.html'); return false">Link text</a>

You can use the following function in process_value:

    import re

    def process_value(value):
        m = re.search("javascript:goToPage\('(.*?)'", value)
        if m:
            return m.group(1)
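Tying it together, a sketch that plugs the function into an extractor; each candidate href passes through process_value before becoming a link, and a None return drops the candidate:

    lx = SgmlLinkExtractor(process_value=process_value)
    links = lx.extract_links(response)  # assumes a Response object in scope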
scrapy.linkextractor: https://github.com/scrapy/scrapy/blob/master/scrapy/linkextractor.py

scrapy-0.22/.doctrees/topics/commands.doctree Command line tool
Command line tool

New in version 0.10.

Scrapy is controlled through the scrapy command-line tool, to be referred here as the "Scrapy tool" to differentiate it from its sub-commands, which we just call "commands" or "Scrapy commands".

The Scrapy tool provides several commands, for multiple purposes, and each one accepts a different set of arguments and options.
Default structure of Scrapy projects

Before delving into the command-line tool and its sub-commands, let's first understand the directory structure of a Scrapy project.

Even though it can be modified, all Scrapy projects have the same file structure by default, similar to this:

    scrapy.cfg
    myproject/
        __init__.py
        items.py
        pipelines.py
        settings.py
        spiders/
            __init__.py
            spider1.py
            spider2.py
            ...

The directory where the scrapy.cfg file resides is known as the project root directory. That file contains the name of the python module that defines the project settings.
Here is an example:

    [settings]
    default = myproject.settings

Using the scrapy tool

You can start by running the Scrapy tool with no arguments and it will print some usage help and the available commands:

    Scrapy X.Y - no active project

    Usage:
      scrapy <command> [options] [args]

    Available commands:
      crawl         Run a spider
      fetch         Fetch a URL using the Scrapy downloader
    [...]

The first line will print the currently active project, if you're inside a Scrapy project. In this case, it was run from outside a project.
If run from inside a project it would have printed something like this:

    Scrapy X.Y - project: myproject

    Usage:
      scrapy <command> [options] [args]

    [...]

Creating projects

The first thing you typically do with the scrapy tool is create your Scrapy project:

    scrapy startproject myproject

That will create a Scrapy project under the myproject directory.

Next, you go inside the new project directory:

    cd myproject

And you're ready to use the scrapy command to manage and control your project from there.

Controlling projects

You use the scrapy tool from inside your projects to control and manage them.

For example, to create a new spider:

    scrapy genspider mydomain mydomain.com

Some Scrapy commands (like crawl) must be run from inside a Scrapy project.
See the commands reference below for more information on which commands must be run from inside projects, and which do not.

Also keep in mind that some commands may have slightly different behaviours when run from inside projects. For example, the fetch command will use spider-overridden behaviours (such as the user_agent attribute to override the user agent) if the URL being fetched is associated with some specific spider. This is intentional, as the fetch command is meant to be used to check how spiders are downloading pages.

Available tool commands

This section contains a list of the available built-in commands with a description and some usage examples.
Remember you can always get more info about each command by running:

    scrapy <command> -h

And you can see all available commands with:

    scrapy -h

There are two kinds of commands: those that only work from inside a Scrapy project (project-specific commands) and those that also work without an active Scrapy project (global commands), though they may behave slightly differently when run from inside a project (as they would use the project-overridden settings).

Global commands:

- startproject
- settings
- runspider
- shell
- fetch
- view
- version
Project-only commands:

- crawl
- check
- list
- edit
- parse
- genspider
- deploy
- bench
startproject

- Syntax: scrapy startproject <project_name>
- Requires project: no

Creates a new Scrapy project named project_name, under the project_name directory.

Usage example:

    $ scrapy startproject myproject

genspider

- Syntax: scrapy genspider [-t template] <name> <domain>
- Requires project: yes

Create a new spider in the current project.

This is just a convenient shortcut command for creating spiders based on pre-defined templates, but certainly not the only way to create spiders.
You can just create the spider source code files yourself, instead of using this command.

Usage example:

    $ scrapy genspider -l
    Available templates:
      basic
      crawl
      csvfeed
      xmlfeed

    $ scrapy genspider -d basic
    from scrapy.spider import Spider

    class $classname(Spider):
        name = "$name"
        allowed_domains = ["$domain"]
        start_urls = (
            'http://www.$domain/',
            )

        def parse(self, response):
            pass

    $ scrapy genspider -t basic example example.com
    Created spider 'example' using template 'basic' in module:
      mybot.spiders.example

crawl

- Syntax: scrapy crawl <spider>
- Requires project: yes

Start crawling a spider.

Usage examples:

    $ scrapy crawl myspider
    [ ... myspider starts crawling ... ]
check

- Syntax: scrapy check [-l] <spider>
- Requires project: yes

Run contract checks.

Usage examples:

    $ scrapy check -l
    first_spider
      * parse
      * parse_item
    second_spider
      * parse
      * parse_item

    $ scrapy check
    [FAILED] first_spider:parse_item
    >>> 'RetailPricex' field is missing

    [FAILED] first_spider:parse
    >>> Returned 92 requests, expected 0..4
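The checks come from the contracts declared in the docstrings of your spider callbacks (see the Spiders Contracts topic). A minimal sketch of a callback carrying contracts (@url, @returns and @scrapes are the built-in contract annotations; the URL, counts and field names here are illustrative):

    def parse_item(self, response):
        """ Contracts evaluated by ``scrapy check``:

        @url http://www.example.com/some/page.html
        @returns items 1 16
        @returns requests 0 0
        @scrapes name price
        """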
list

- Syntax: scrapy list
- Requires project: yes

List all available spiders in the current project. The output is one spider per line.

Usage example:

    $ scrapy list
    spider1
    spider2

edit

- Syntax: scrapy edit <spider>
- Requires project: yes

Edit the given spider using the editor defined in the EDITOR setting.

This command is provided only as a convenient shortcut for the most common case; the developer is of course free to choose any tool or IDE to write and debug spiders.

Usage example:

    $ scrapy edit spider1

fetch

- Syntax: scrapy fetch <url>
- Requires project: no

Downloads the given URL using the Scrapy downloader and writes the contents to standard output.
The interesting thing about this command is that it fetches the page the way the spider would download it. For example, if the spider has a USER_AGENT attribute which overrides the User Agent, it will use that one.

So this command can be used to "see" how your spider would fetch a certain page.

If used outside a project, no particular per-spider behaviour would be applied and it will just use the default Scrapy downloader settings.

Usage examples:

    $ scrapy fetch --nolog http://www.example.com/some/page.html
    [ ... html content here ... ]

    $ scrapy fetch --nolog --headers http://www.example.com/
    {'Accept-Ranges': ['bytes'],
     'Age': ['1263   '],
     'Connection': ['close     '],
     'Content-Length': ['596'],
     'Content-Type': ['text/html; charset=UTF-8'],
     'Date': ['Wed, 18 Aug 2010 23:59:46 GMT'],
     'Etag': ['"573c1-254-48c9c87349680"'],
     'Last-Modified': ['Fri, 30 Jul 2010 15:30:18 GMT'],
     'Server': ['Apache/2.2.3 (CentOS)']}

view

- Syntax: scrapy view <url>
- Requires project: no

Opens the given URL in a browser, as your Scrapy spider would "see" it. Sometimes spiders see pages differently from regular users, so this can be used to check what the spider "sees" and confirm it's what you expect.

Usage example:

    $ scrapy view http://www.example.com/some/page.html
    [ ... browser starts ... ]

shell

- Syntax: scrapy shell [url]
- Requires project: no

Starts the Scrapy shell for the given URL (if given), or empty if no URL is given.
See the Scrapy shell topic for more info.

Usage example:

    $ scrapy shell http://www.example.com/some/page.html
    [ ... scrapy shell starts ... ]

parse

- Syntax: scrapy parse <url> [options]
- Requires project: yes

Fetches the given URL and parses it with the spider that handles it, using the method passed with the --callback option, or parse if not given.

Supported options:

- --callback or -c: spider method to use as callback for parsing the response
- --rules or -r: use CrawlSpider rules to discover the callback (i.e. spider method) to use for parsing the response
- --noitems: don't show scraped items
- --nolinks: don't show extracted links
- --depth or -d: depth level for which the requests should be followed recursively (default: 1)
- --verbose or -v: display information for each depth level

Usage example:
    $ scrapy parse http://www.example.com/ -c parse_item
    [ ... scrapy log lines crawling example.com spider ... ]

    >>> STATUS DEPTH LEVEL 1 <<<
    # Scraped Items  ------------------------------------------------------------
    [{'name': u'Example item',
     'category': u'Furniture',
     'length': u'12 cm'}]

    # Requests  -----------------------------------------------------------------
    []

settings

- Syntax: scrapy settings [options]
- Requires project: no

Get the value of a Scrapy setting.

If used inside a project it'll show the project setting value, otherwise it'll show the default Scrapy value for that setting.

Example usage:

    $ scrapy settings --get BOT_NAME
    scrapybot
    $ scrapy settings --get DOWNLOAD_DELAY
    0

runspider

- Syntax: scrapy runspider <spider_file.py>
- Requires project: no

Run a spider self-contained in a Python file, without having to create a project.

Example usage:

    $ scrapy runspider myspider.py
    [ ... spider starts crawling ... ]
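For reference, a minimal sketch of such a self-contained spider file (it mirrors the basic genspider template shown earlier; the spider name, URL and log line are illustrative):

    # myspider.py
    from scrapy.spider import Spider

    class MySpider(Spider):
        name = "myspider"
        start_urls = ["http://www.example.com/"]

        def parse(self, response):
            # a real spider would yield items or further requests here
            self.log("visited: %s" % response.url)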
version

- Syntax: scrapy version [-v]
- Requires project: no

Prints the Scrapy version. If used with -v it also prints Python, Twisted and Platform info, which is useful for bug reports.

deploy

New in version 0.11.

- Syntax: scrapy deploy [ <target:project> | -l | -L <target> ]
- Requires project: yes

Deploy the project into a Scrapyd server.
See Deploying your project (http://scrapyd.readthedocs.org/en/latest/#deploying-your-project).

bench

New in version 0.17.

- Syntax: scrapy bench
- Requires project: no

Run a quick benchmark test. See Benchmarking for more info.

Custom project commands

You can also add your custom project commands by using the COMMANDS_MODULE setting. See the Scrapy commands in scrapy/commands (https://github.com/scrapy/scrapy/blob/master/scrapy/commands) for examples on how to implement your commands.
.. _scrapy/commands: https://github.com/scrapy/scrapy/blob/master/scrapy/commands
.. _Deploying your project: http://scrapyd.readthedocs.org/en/latest/#deploying-your-project

COMMANDS_MODULE
---------------

Default: ``''`` (empty string)

A module to use for looking up custom Scrapy commands. This is used to add
custom commands for your Scrapy project.

Example::

    COMMANDS_MODULE = 'mybot.commands'
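As a minimal sketch of what such a module can contain (the ``hello`` command
name and module layout are made up for illustration; ``ScrapyCommand`` is the
base class the built-in commands inherit from in this release)::

    # mybot/commands/hello.py -- hypothetical module; with the setting
    # above, this becomes available as "scrapy hello"
    from scrapy.command import ScrapyCommand

    class Command(ScrapyCommand):

        requires_project = True

        def short_desc(self):
            return "Print a greeting (example custom command)"

        def run(self, args, opts):
            print "Hello from a custom Scrapy command"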
.. _topics-leaks:

Debugging memory leaks
======================

In Scrapy, objects such as Requests, Responses and Items have a finite
lifetime: they are created, used for a while, and finally destroyed.

From all those objects, the Request is probably the one with the longest
lifetime, as it stays waiting in the Scheduler queue until it's time to
process it.
For more info see :ref:`topics-architecture`.

As these Scrapy objects have a (rather long) lifetime, there is always the
risk of accumulating them in memory without releasing them properly and thus
causing what is known as a "memory leak".

To help debugging memory leaks, Scrapy provides a built-in mechanism for
tracking object references called :ref:`trackref <topics-leaks-trackrefs>`,
and you can also use a third-party library called :ref:`Guppy
<topics-leaks-guppy>` for more advanced memory debugging (see below for more
info). Both mechanisms must be used from the :ref:`Telnet Console
<topics-telnetconsole>`.

Common causes of memory leaks
-----------------------------

It happens quite often (sometimes by accident, sometimes on purpose) that the
Scrapy developer passes objects referenced in Requests (for example, using
the :attr:`~scrapy.http.Request.meta` attribute or the request callback
function), which effectively binds the lifetime of those referenced objects
to the lifetime of the Request.
This is, by far, the most common cause of memory leaks in Scrapy projects,
and a quite difficult one to debug for newcomers.

In big projects, the spiders are typically written by different people and
some of those spiders could be "leaking" and thus affecting the rest of the
other (well-written) spiders when they get to run concurrently, which, in
turn, affects the whole crawling process.

At the same time, it's hard to avoid the reasons that cause these leaks
without restricting the power of the framework, so we have decided not to
restrict the functionality but provide useful tools for debugging these
leaks, which quite often consist of an answer to the question: *which spider
is leaking?*.

The leak could also come from a custom middleware, pipeline or extension that
you have written, if you are not releasing the (previously allocated)
resources properly.
For example, if you're allocating resources on :signal:`spider_opened` but
not releasing them on :signal:`spider_closed`.
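A sketch of the pattern to follow (the extension and the log file below are
made up; the ``from_crawler`` signal hookup is the standard way extensions
connect to signals): pair every allocation with a matching release::

    from scrapy import signals

    class SpiderLogExtension(object):
        """Hypothetical extension: opens one file per spider and,
        crucially, closes it again when the spider closes."""

        @classmethod
        def from_crawler(cls, crawler):
            ext = cls()
            crawler.signals.connect(ext.spider_opened, signal=signals.spider_opened)
            crawler.signals.connect(ext.spider_closed, signal=signals.spider_closed)
            return ext

        def spider_opened(self, spider):
            # allocate: one open file per spider
            self.logfile = open('/tmp/%s.log' % spider.name, 'w')

        def spider_closed(self, spider):
            # release: without this, every finished spider leaks a file handle
            self.logfile.close()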
.. _topics-leaks-trackrefs:

Debugging memory leaks with ``trackref``
----------------------------------------

``trackref`` is a module provided by Scrapy to debug the most common cases of
memory leaks. It basically tracks the references to all live Requests,
Responses, Item and Selector objects.

You can enter the telnet console and inspect how many objects (of the classes
mentioned above) are currently alive using the ``prefs()`` function, which is
an alias to the :func:`~scrapy.utils.trackref.print_live_refs` function::

    telnet localhost 6023

    >>> prefs()
    Live References

    ExampleSpider                       1   oldest: 15s ago
    HtmlResponse                       10   oldest: 1s ago
    Selector                            2   oldest: 0s ago
    FormRequest                       878   oldest: 7s ago

As you can see, that report also shows the "age" of the oldest object in each
class.

If you do have leaks, chances are you can figure out which spider is leaking
by looking at the oldest request or response. You can get the oldest object
of each class using the :func:`~scrapy.utils.trackref.get_oldest`
function (from the telnet console), as shown in the example below.

Which objects are tracked?
~~~~~~~~~~~~~~~~~~~~~~~~~~

The objects tracked by ``trackrefs`` are all from these classes (and all
their subclasses):

* ``scrapy.http.Request``
* ``scrapy.http.Response``
* ``scrapy.item.Item``
* ``scrapy.selector.Selector``
* ``scrapy.spider.Spider``

A real example
~~~~~~~~~~~~~~

Let's see a concrete example of a hypothetical case of memory leaks.

Suppose we have some spider with a line similar to this one::

    return Request("http://www.somenastyspider.com/product.php?pid=%d" % product_id,
                   callback=self.parse, meta={'referer': response})

That line is passing a response reference inside a request, which effectively
ties the response lifetime to the request's one, and that would definitely
cause memory leaks.
Let's see how we can discover which one is the nasty spider (without knowing
it a priori, of course) by using the ``trackref`` tool.

After the crawler has been running for a few minutes and we notice its memory
usage has grown a lot, we can enter its telnet console and check the live
references::

    >>> prefs()
    Live References

    SomenastySpider                     1   oldest: 15s ago
    HtmlResponse                     3890   oldest: 265s ago
    Selector                            2   oldest: 0s ago
    Request                          3878   oldest: 250s ago

The fact that there are so many live responses (and that they're so old) is
definitely suspicious, as responses should have a relatively short lifetime
compared to Requests. So let's check the oldest response::

    >>> from scrapy.utils.trackref import get_oldest
    >>> r = get_oldest('HtmlResponse')
    >>> r.url
    'http://www.somenastyspider.com/product.php?pid=123'

There it is. By looking at the URL of the oldest response we can see it
belongs to the ``somenastyspider.com`` spider.
We can now go and check the code of that spider to discover the nasty line
that is generating the leaks (passing response references inside requests).

If you want to iterate over all objects, instead of getting the oldest one,
you can use the :func:`iter_all` function::

    >>> from scrapy.utils.trackref import iter_all
    >>> [r.url for r in iter_all('HtmlResponse')]
    ['http://www.somenastyspider.com/product.php?pid=123',
     'http://www.somenastyspider.com/product.php?pid=584',
    ...

Too many spiders?
~~~~~~~~~~~~~~~~~

If your project has too many spiders, the output of ``prefs()`` can be
difficult to read. For this reason, that function has an ``ignore`` argument
which can be used to ignore a particular class (and all its subclasses).
For example, using::

    >>> from scrapy.spider import Spider
    >>> prefs(ignore=Spider)

won't show any live references to spiders.

scrapy.utils.trackref module
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. module:: scrapy.utils.trackref

Here are the functions available in the :mod:`~scrapy.utils.trackref` module.

.. class:: object_ref

   Inherit from this class (instead of object) if you want to track live
   instances with the ``trackref`` module.
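For instance (the helper class below is made up), tracking your own objects
is just a matter of changing the base class; live instances then show up in
the ``prefs()`` report alongside the built-in types::

    from scrapy.utils.trackref import object_ref

    class ProductParser(object_ref):
        """Hypothetical helper whose live instances we want to count."""

        def parse_price(self, text):
            return float(text.strip('$'))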
.. function:: print_live_refs(class_name, ignore=NoneType)

   Print a report of live references, grouped by class name.

   :param ignore: if given, all objects from the specified class (or tuple
       of classes) will be ignored.
   :type ignore: class or classes tuple

.. function:: get_oldest(class_name)

   Return the oldest object alive with the given class name, or ``None`` if
   none is found.
   Use :func:`print_live_refs` first to get a list of all tracked live
   objects per class name.

.. function:: iter_all(class_name)

   Return an iterator over all objects alive with the given class name, or
   ``None`` if none is found. Use :func:`print_live_refs` first to get a
   list of all tracked live objects per class name.

.. _topics-leaks-guppy:

Debugging memory leaks with Guppy
---------------------------------

``trackref`` provides a very convenient mechanism for tracking down memory
leaks, but it only keeps track of the objects that are more likely to cause
memory leaks (Requests, Responses, Items, and Selectors).
However, there are other cases where the memory leaks could come from other
(more or less obscure) objects. If this is your case, and you can't find your
leaks using ``trackref``, you still have another resource: the `Guppy
library`_.

.. _Guppy library: http://pypi.python.org/pypi/guppy

If you use `setuptools`_, you can install Guppy with the following command::

    easy_install guppy

.. _setuptools: http://pypi.python.org/pypi/setuptools

The telnet console also comes with a built-in shortcut (``hpy``) for
accessing Guppy heap objects. Here's an example to view all Python objects
available in the heap using Guppy::

    >>> x = hpy.heap()
    >>> x.bytype
    Partition of a set of 297033 objects. Total size = 52587824 bytes.
     Index  Count   %     Size   % Cumulative  % Type
         0  22307   8 16423880  31  16423880  31 dict
         1 122285  41 12441544  24  28865424  55 str
         2  68346  23  5966696  11  34832120  66 tuple
         3    227   0  5836528  11  40668648  77 unicode
         4   2461   1  2222272   4  42890920  82 type
         5  16870   6  2024400   4  44915320  85 function
         6  13949   5  1673880   3  46589200  89 types.CodeType
         7  13422   5  1653104   3  48242304  92 list
         8   3735   1  1173680   2  49415984  94 _sre.SRE_Pattern
         9   1209   0   456936   1  49872920  95 scrapy.http.headers.Headers
    <1676 more rows. Type e.g. '_.more' to view.>

You can see that most space is used by dicts.
Then, if you want to see from which attribute those dicts are referenced,
you could do::

    >>> x.bytype[0].byvia
    Partition of a set of 22307 objects. Total size = 16423880 bytes.
     Index  Count   %     Size   % Cumulative  % Referred Via:
         0  10982  49  9416336  57   9416336  57 '.__dict__'
         1   1820   8  2681504  16  12097840  74 '.__dict__', '.func_globals'
         2   3097  14  1122904   7  13220744  80
         3    990   4   277200   2  13497944  82 "['cookies']"
         4    987   4   276360   2  13774304  84 "['cache']"
         5    985   4   275800   2  14050104  86 "['meta']"
         6    897   4   251160   2  14301264  87 '[2]'
         7      1   0   196888   1  14498152  88 "['moduleDict']", "['modules']"
         8    672   3   188160   1  14686312  89 "['cb_kwargs']"
         9     27   0   155016   1  14841328  90 '[1]'
    <333 more rows. Type e.g. '_.more' to view.>

As you can see, the Guppy module is very powerful but also requires some deep
knowledge about Python internals. For more info about Guppy, refer to the
`Guppy documentation`_.

.. _Guppy documentation: http://guppy-pe.sourceforge.net/

.. _topics-leaks-without-leaks:

Leaks without leaks
-------------------

Sometimes, you may notice that the memory usage of your Scrapy process will
only increase, but never decrease. Unfortunately, this could happen even
though neither Scrapy nor your project are leaking memory. This is due to a
(not so well) known problem of Python, which may not return released memory
to the operating system in some cases.
For more information on this issue see:

* `Python Memory Management <http://evanjones.ca/python-memory.html>`_
* `Python Memory Management Part 2 <http://evanjones.ca/python-memory-part2.html>`_
* `Python Memory Management Part 3 <http://evanjones.ca/python-memory-part3.html>`_

The improvements proposed by Evan Jones, which are detailed in `this paper`_,
got merged in Python 2.5, but this only reduces the problem; it doesn't fix
it completely. To quote the paper:

    *Unfortunately, this patch can only free an arena if there are no more
    objects allocated in it anymore. This means that fragmentation is a large
    issue. An application could have many megabytes of free memory, scattered
    throughout all the arenas, but it will be unable to free any of it. This
    is a problem experienced by all memory allocators. The only way to solve
    it is to move to a compacting garbage collector, which is able to move
    objects in memory.
    This would require significant changes to the Python interpreter.*

This problem will be fixed in future Scrapy releases, where we plan to adopt
a new process model and run spiders in a pool of recyclable sub-processes.

.. _this paper: http://evanjones.ca/memoryallocator/
.. _topics-email:

Sending e-mail
==============

.. module:: scrapy.mail

Although Python makes sending e-mails relatively easy via the `smtplib`_
library, Scrapy provides its own facility for sending e-mails which is very
easy to use and is implemented using `Twisted non-blocking IO`_, to avoid
interfering with the non-blocking IO of the crawler.
Quick example

There are two ways to instantiate the mail sender. You can instantiate it using the standard constructor:

    from scrapy.mail import MailSender
    mailer = MailSender()

Or you can instantiate it passing a Scrapy settings object, which will respect the settings:

    mailer = MailSender.from_settings(settings)

And here is how to use it to send an e-mail (without attachments):

    mailer.send(to=["someone@example.com"], subject="Some subject",
                body="Some body", cc=["another@example.com"])

MailSender class reference

MailSender is the preferred class to use for sending emails from Scrapy, as it uses Twisted non-blocking IO, like the rest of the framework.
class scrapy.mail.MailSender(smtphost=None, mailfrom=None, smtpuser=None, smtppass=None, smtpport=None)

    Parameters:

    - smtphost (str): the SMTP host to use for sending the emails. If omitted, the MAIL_HOST setting will be used.
    - mailfrom (str): the address used to send emails (in the From: header). If omitted, the MAIL_FROM setting will be used.
    - smtpuser: the SMTP user. If omitted, the MAIL_USER setting will be used. If not given, no SMTP authentication will be performed.
    - smtppass (str): the SMTP pass for authentication.
    - smtpport (int): the SMTP port to connect to.
    - smtptls (boolean): enforce using SMTP STARTTLS.
    - smtpssl (boolean): enforce using a secure SSL connection.

    classmethod from_settings(settings)

        Instantiate using a Scrapy settings object, which will respect these Scrapy settings (see "Mail settings" below).

        Parameters: settings (scrapy.settings.Settings object)

    send(to, subject, body, cc=None, attachs=())

        Send email to the given recipients.

        Parameters:

        - to (list): the e-mail recipients
        - subject (str): the subject of the e-mail
        - cc (list): the e-mails to CC
        - body (str): the e-mail body
        - attachs (iterable): an iterable of tuples (attach_name, mimetype, file_object), where attach_name is a string with the name that will appear on the e-mail's attachment, mimetype is the mimetype of the attachment, and file_object is a readable file object with the contents of the attachment
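To make the attachs tuple format concrete, here is a small sketch of sending a message with one attachment, reusing the mailer from the quick example above; the file name and mimetype are illustrative values, not part of the documented API:

    # A sketch of send() with an attachment. 'report.pdf' and its
    # mimetype are made-up values; any readable file object works as
    # the third tuple element.
    report = open('report.pdf', 'rb')
    mailer.send(
        to=["someone@example.com"],
        subject="Some subject",
        body="Some body",
        attachs=[("report.pdf", "application/pdf", report)],
    )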
Mail settings

These settings define the default constructor values of the MailSender class, and can be used to configure e-mail notifications in your project without writing any code (for those extensions and code that uses MailSender).

MAIL_FROM

    Default: 'scrapy@localhost'

    Sender email to use (From: header) for sending emails.

MAIL_HOST

    Default: 'localhost'

    SMTP host to use for sending emails.

MAIL_PORT

    Default: 25

    SMTP port to use for sending emails.

MAIL_USER

    Default: None

    User to use for SMTP authentication. If disabled, no SMTP authentication will be performed.

MAIL_PASS

    Default: None

    Password to use for SMTP authentication, along with MAIL_USER.

MAIL_TLS

    Default: False

    Enforce using STARTTLS. STARTTLS is a way to take an existing insecure connection, and upgrade it to a secure connection using SSL/TLS.

MAIL_SSL

    Default: False

    Enforce connecting using an SSL encrypted connection.
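As an illustration, these defaults can be set from a project's settings.py so that extensions and MailSender.from_settings() pick them up; the host and credentials below are placeholder values:

    # settings.py -- placeholder values, shown only to illustrate the settings
    MAIL_HOST = 'smtp.example.com'   # assumption: your SMTP server
    MAIL_PORT = 25
    MAIL_FROM = 'scrapy@example.com'
    MAIL_USER = 'someuser'           # leave as None to skip SMTP auth
    MAIL_PASS = 'somepassword'
    MAIL_TLS = True                  # upgrade the connection with STARTTLS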
resourcesqNXwebservice_enabledqNXavailable json-rpc resourcesqNX-scrapy.contrib.webservice.stats.StatsResourceqXtopics-webservice-resourcesqXHscrapy.contrib.webservice.enginestatus.scrapy.webservice.JsonRpcResourceqXscrapy-ws.py scriptqNX$enginestatusresource (json resource)qNXwriting a web service resourceqNX1scrapy.contrib.webservice.crawler.CrawlerResourceq Xtopics-webservice-crawlerq!Xtwisted web guideq"Xwebservice_portq#NXwebservice_resources_baseq$NXtwisted.web.resource.resourceq%X json-rpc 2.0q&Xwebservice_resourcesq'NXSscrapy.contrib.webservice.enginestatus.scrapy.webservice.JsonRpcResource.get_targetq(Xwebservice_hostq)NuUsubstitution_defsq*}q+Uparse_messagesq,]q-Ucurrent_sourceq.NU decorationq/NUautofootnote_startq0KUnameidsq1}q2(hUweb-service-resourcesq3hU spider-manager-json-rpc-resourceq4hU web-serviceq5h Uexample-of-web-service-clientq6h Uwebservice-logfileq7h Utopics-webservice-resources-refq8h Utopics-webserviceq9h Uweb-service-settingsq:hhhU#extension-manager-json-rpc-resourceq;hUengine-status-json-resourceqhU!stats-collector-json-rpc-resourceq?hUavailable-json-resourcesq@hhhU!examples-of-web-service-resourcesqAhUwebservice-enabledqBhUavailable-json-rpc-resourcesqChhhUtopics-webservice-resourcesqDhhhUscrapy-ws-py-scriptqEhU"enginestatusresource-json-resourceqFhUwriting-a-web-service-resourceqGh h h!Utopics-webservice-crawlerqHh"Utwisted-web-guideqIh#Uwebservice-portqJh$Uwebservice-resources-baseqKh%Utwisted-web-resource-resourceqLh&U json-rpc-2-0qMh'Uwebservice-resourcesqNh(h(h)Uwebservice-hostqOuUchildrenqP]qQ(cdocutils.nodes target qR)qS}qT(U rawsourceqUX.. _topics-webservice:UparentqVhUsourceqWcdocutils.nodes reprunicode qXXG/var/build/user_builds/scrapy/checkouts/0.22/docs/topics/webservice.rstqYqZ}q[bUtagnameq\Utargetq]U attributesq^}q_(Uidsq`]Ubackrefsqa]Udupnamesqb]Uclassesqc]Unamesqd]Urefidqeh9uUlineqfKUdocumentqghhP]ubcdocutils.nodes section qh)qi}qj(hUUhVhhWhZUexpect_referenced_by_nameqk}qlh hSsh\Usectionqmh^}qn(hb]hc]ha]h`]qo(h5h9ehd]qp(hh euhfKhghUexpect_referenced_by_idqq}qrh9hSshP]qs(cdocutils.nodes title qt)qu}qv(hUX Web ServiceqwhVhihWhZh\Utitleqxh^}qy(hb]hc]ha]h`]hd]uhfKhghhP]qzcdocutils.nodes Text q{X Web Serviceq|q}}q~(hUhwhVhuubaubcdocutils.nodes paragraph q)q}q(hUXScrapy comes with a built-in web service for monitoring and controlling a running crawler. The service exposes most resources using the `JSON-RPC 2.0`_ protocol, but there are also other (read-only) resources which just output JSON data.hVhihWhZh\U paragraphqh^}q(hb]hc]ha]h`]hd]uhfKhghhP]q(h{XScrapy comes with a built-in web service for monitoring and controlling a running crawler. The service exposes most resources using the qq}q(hUXScrapy comes with a built-in web service for monitoring and controlling a running crawler. The service exposes most resources using the hVhubcdocutils.nodes reference q)q}q(hUX`JSON-RPC 2.0`_UresolvedqKhVhh\U referenceqh^}q(UnameX JSON-RPC 2.0UrefuriqXhttp://www.jsonrpc.org/qh`]ha]hb]hc]hd]uhP]qh{X JSON-RPC 2.0qq}q(hUUhVhubaubh{XV protocol, but there are also other (read-only) resources which just output JSON data.qq}q(hUXV protocol, but there are also other (read-only) resources which just output JSON data.hVhubeubh)q}q(hUXProvides an extensible web service for managing a Scrapy process. It's enabled by the :setting:`WEBSERVICE_ENABLED` setting. 
The web server will listen in the port specified in :setting:`WEBSERVICE_PORT`, and will log to the file specified in :setting:`WEBSERVICE_LOGFILE`.hVhihWhZh\hh^}q(hb]hc]ha]h`]hd]uhfK hghhP]q(h{XVProvides an extensible web service for managing a Scrapy process. It's enabled by the qq}q(hUXVProvides an extensible web service for managing a Scrapy process. It's enabled by the hVhubcsphinx.addnodes pending_xref q)q}q(hUX:setting:`WEBSERVICE_ENABLED`qhVhhWhZh\U pending_xrefqh^}q(UreftypeXsettingUrefwarnqU reftargetqXWEBSERVICE_ENABLEDU refdomainXstdqh`]ha]U refexplicithb]hc]hd]UrefdocqXtopics/webservicequhfK hP]qcdocutils.nodes literal q)q}q(hUhh^}q(hb]hc]q(UxrefqhX std-settingqeha]h`]hd]uhVhhP]qh{XWEBSERVICE_ENABLEDqq}q(hUUhVhubah\Uliteralqubaubh{X> setting. The web server will listen in the port specified in qq}q(hUX> setting. The web server will listen in the port specified in hVhubh)q}q(hUX:setting:`WEBSERVICE_PORT`qhVhhWhZh\hh^}q(UreftypeXsettinghhXWEBSERVICE_PORTU refdomainXstdqh`]ha]U refexplicithb]hc]hd]hhuhfK hP]qh)q}q(hUhh^}q(hb]hc]q(hhX std-settingqeha]h`]hd]uhVhhP]qh{XWEBSERVICE_PORTqŅq}q(hUUhVhubah\hubaubh{X(, and will log to the file specified in qȅq}q(hUX(, and will log to the file specified in hVhubh)q}q(hUX:setting:`WEBSERVICE_LOGFILE`qhVhhWhZh\hh^}q(UreftypeXsettinghhXWEBSERVICE_LOGFILEU refdomainXstdqh`]ha]U refexplicithb]hc]hd]hhuhfK hP]qh)q}q(hUhh^}q(hb]hc]q(hhX std-settingqeha]h`]hd]uhVhhP]qh{XWEBSERVICE_LOGFILEqׅq}q(hUUhVhubah\hubaubh{X.q}q(hUX.hVhubeubh)q}q(hUXThe web service is a :ref:`built-in Scrapy extension ` which comes enabled by default, but you can also disable it if you're running tight on memory.hVhihWhZh\hh^}q(hb]hc]ha]h`]hd]uhfKhghhP]q(h{XThe web service is a qq}q(hUXThe web service is a hVhubh)q}q(hUX8:ref:`built-in Scrapy extension `qhVhhWhZh\hh^}q(UreftypeXrefhhXtopics-extensions-refU refdomainXstdqh`]ha]U refexplicithb]hc]hd]hhuhfKhP]qcdocutils.nodes emphasis q)q}q(hUhh^}q(hb]hc]q(hhXstd-refqeha]h`]hd]uhVhhP]qh{Xbuilt-in Scrapy extensionqq}q(hUUhVhubah\Uemphasisqubaubh{X_ which comes enabled by default, but you can also disable it if you're running tight on memory.qq}q(hUX_ which comes enabled by default, but you can also disable it if you're running tight on memory.hVhubeubhR)q}q(hUX .. _topics-webservice-resources:hVhihWhZh\h]h^}q(h`]ha]hb]hc]hd]hehDuhfKhghhP]ubhh)q}q(hUUhVhihWhZhk}qhhsh\hmh^}q(hb]hc]ha]h`]q(h3hDehd]q(hheuhfKhghhq}rhDhshP]r(ht)r}r(hUXWeb service resourcesrhVhhWhZh\hxh^}r(hb]hc]ha]h`]hd]uhfKhghhP]rh{XWeb service resourcesrr}r (hUjhVjubaubh)r }r (hUXThe web service contains several resources, defined in the :setting:`WEBSERVICE_RESOURCES` setting. Each resource provides a different functionality. See :ref:`topics-webservice-resources-ref` for a list of resources available by default.hVhhWhZh\hh^}r (hb]hc]ha]h`]hd]uhfKhghhP]r (h{X;The web service contains several resources, defined in the rr}r(hUX;The web service contains several resources, defined in the hVj ubh)r}r(hUX:setting:`WEBSERVICE_RESOURCES`rhVj hWhZh\hh^}r(UreftypeXsettinghhXWEBSERVICE_RESOURCESU refdomainXstdrh`]ha]U refexplicithb]hc]hd]hhuhfKhP]rh)r}r(hUjh^}r(hb]hc]r(hjX std-settingreha]h`]hd]uhVjhP]rh{XWEBSERVICE_RESOURCESrr}r(hUUhVjubah\hubaubh{X@ setting. Each resource provides a different functionality. See r r!}r"(hUX@ setting. Each resource provides a different functionality. 
See hVj ubh)r#}r$(hUX&:ref:`topics-webservice-resources-ref`r%hVj hWhZh\hh^}r&(UreftypeXrefhhXtopics-webservice-resources-refU refdomainXstdr'h`]ha]U refexplicithb]hc]hd]hhuhfKhP]r(h)r)}r*(hUj%h^}r+(hb]hc]r,(hj'Xstd-refr-eha]h`]hd]uhVj#hP]r.h{Xtopics-webservice-resources-refr/r0}r1(hUUhVj)ubah\hubaubh{X. for a list of resources available by default.r2r3}r4(hUX. for a list of resources available by default.hVj ubeubh)r5}r6(hUXwAlthough you can implement your own resources using any protocol, there are two kinds of resources bundled with Scrapy:r7hVhhWhZh\hh^}r8(hb]hc]ha]h`]hd]uhfKhghhP]r9h{XwAlthough you can implement your own resources using any protocol, there are two kinds of resources bundled with Scrapy:r:r;}r<(hUj7hVj5ubaubcdocutils.nodes bullet_list r=)r>}r?(hUUhVhhWhZh\U bullet_listr@h^}rA(UbulletrBX*h`]ha]hb]hc]hd]uhfK"hghhP]rC(cdocutils.nodes list_item rD)rE}rF(hUXESimple JSON resources - which are read-only and just output JSON datarGhVj>hWhZh\U list_itemrHh^}rI(hb]hc]ha]h`]hd]uhfNhghhP]rJh)rK}rL(hUjGhVjEhWhZh\hh^}rM(hb]hc]ha]h`]hd]uhfK"hP]rNh{XESimple JSON resources - which are read-only and just output JSON datarOrP}rQ(hUjGhVjKubaubaubjD)rR}rS(hUXnJSON-RPC resources - which provide direct access to certain Scrapy objects using the `JSON-RPC 2.0`_ protocol hVj>hWhZh\jHh^}rT(hb]hc]ha]h`]hd]uhfNhghhP]rUh)rV}rW(hUXmJSON-RPC resources - which provide direct access to certain Scrapy objects using the `JSON-RPC 2.0`_ protocolhVjRhWhZh\hh^}rX(hb]hc]ha]h`]hd]uhfK#hP]rY(h{XUJSON-RPC resources - which provide direct access to certain Scrapy objects using the rZr[}r\(hUXUJSON-RPC resources - which provide direct access to certain Scrapy objects using the hVjVubh)r]}r^(hUX`JSON-RPC 2.0`_hKhVjVh\hh^}r_(UnameX JSON-RPC 2.0hhh`]ha]hb]hc]hd]uhP]r`h{X JSON-RPC 2.0rarb}rc(hUUhVj]ubaubh{X protocolrdre}rf(hUX protocolhVjVubeubaubeubhR)rg}rh(hUUhVhhWhZh\h]h^}ri(hb]h`]rjX module-scrapy.contrib.webservicerkaha]Uismodhc]hd]uhfNhghhP]ubcsphinx.addnodes index rl)rm}rn(hUUhVhhWhZh\Uindexroh^}rp(h`]ha]hb]hc]hd]Uentries]rq(UsinglerrX"scrapy.contrib.webservice (module)X module-scrapy.contrib.webserviceUtrsauhfNhghhP]ubhR)rt}ru(hUX$.. _topics-webservice-resources-ref:hVhhWhZh\h]h^}rv(h`]ha]hb]hc]hd]heh8uhfK)hghhP]ubhh)rw}rx(hUUhVhhWhZhk}ryh jtsh\hmh^}rz(hb]hc]ha]h`]r{(hCh8ehd]r|(hh euhfK,hghhq}r}h8jtshP]r~(ht)r}r(hUXAvailable JSON-RPC resourcesrhVjwhWhZh\hxh^}r(hb]hc]ha]h`]hd]uhfK,hghhP]rh{XAvailable JSON-RPC resourcesrr}r(hUjhVjubaubh)r}r(hUX@These are the JSON-RPC resources available by default in Scrapy:rhVjwhWhZh\hh^}r(hb]hc]ha]h`]hd]uhfK.hghhP]rh{X@These are the JSON-RPC resources available by default in Scrapy:rr}r(hUjhVjubaubhR)r}r(hUX.. 
Crawler JSON-RPC resource

class scrapy.contrib.webservice.crawler.CrawlerResource

    Provides access to the main Crawler object that controls the Scrapy process.

    Available by default at: http://localhost:6080/crawler

Stats Collector JSON-RPC resource

class scrapy.contrib.webservice.stats.StatsResource

    Provides access to the Stats Collector used by the crawler.

    Available by default at: http://localhost:6080/stats
Spider Manager JSON-RPC resource

You can access the spider manager JSON-RPC resource through the Crawler JSON-RPC resource, at: http://localhost:6080/crawler/spiders

Extension Manager JSON-RPC resource

You can access the extension manager JSON-RPC resource through the Crawler JSON-RPC resource, at: http://localhost:6080/crawler/extensions

Available JSON resources

These are the JSON resources available by default:

Engine status JSON resource

class scrapy.contrib.webservice.enginestatus.EngineStatusResource

    Provides access to engine status metrics.

    Available by default at: http://localhost:6080/enginestatus
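Since this is a plain (read-only) JSON resource, it can be fetched with a simple HTTP GET. A sketch in the same Python 2 idiom as the client script below, assuming the default port:

    import json, urllib

    # Fetch the read-only engine status JSON resource and decode it.
    status = json.loads(urllib.urlopen('http://localhost:6080/enginestatus').read())
    print(status)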
enabled).r]hVj=hWhZh\hh^}r^(hb]hc]ha]h`]hd]uhfKthghhP]r_h{XfA boolean which specifies if the web service will be enabled (provided its extension is also enabled).r`ra}rb(hUj]hVj[ubaubjl)rc}rd(hUUhVj=hWhZh\joh^}re(h`]ha]hb]hc]hd]Uentries]rf(XpairXWEBSERVICE_LOGFILE; settingXstd:setting-WEBSERVICE_LOGFILErgUtrhauhfKxhghhP]ubhR)ri}rj(hUUhVj=hWhZh\h]h^}rk(h`]ha]hb]hc]hd]hejguhfKxhghhP]ubeubhh)rl}rm(hUUhVjhWhZhk}h\hmh^}rn(hb]hc]ha]h`]ro(h7jgehd]rph auhfKzhghhq}rqjgjishP]rr(ht)rs}rt(hUXWEBSERVICE_LOGFILEruhVjlhWhZh\hxh^}rv(hb]hc]ha]h`]hd]uhfKzhghhP]rwh{XWEBSERVICE_LOGFILErxry}rz(hUjuhVjsubaubh)r{}r|(hUXDefault: ``None``r}hVjlhWhZh\hh^}r~(hb]hc]ha]h`]hd]uhfK|hghhP]r(h{X Default: rr}r(hUX Default: hVj{ubh)r}r(hUX``None``h^}r(hb]hc]ha]h`]hd]uhVj{hP]rh{XNonerr}r(hUUhVjubah\hubeubh)r}r(hUXuA file to use for logging HTTP requests made to the web service. If unset web the log is sent to standard scrapy log.rhVjlhWhZh\hh^}r(hb]hc]ha]h`]hd]uhfK~hghhP]rh{XuA file to use for logging HTTP requests made to the web service. If unset web the log is sent to standard scrapy log.rr}r(hUjhVjubaubjl)r}r(hUUhVjlhWhZh\joh^}r(h`]ha]hb]hc]hd]Uentries]r(XpairXWEBSERVICE_PORT; settingXstd:setting-WEBSERVICE_PORTrUtrauhfKhghhP]ubhR)r}r(hUUhVjlhWhZh\h]h^}r(h`]ha]hb]hc]hd]hejuhfKhghhP]ubeubhh)r}r(hUUhVjhWhZhk}h\hmh^}r(hb]hc]ha]h`]r(hJjehd]rh#auhfKhghhq}rjjshP]r(ht)r}r(hUXWEBSERVICE_PORTrhVjhWhZh\hxh^}r(hb]hc]ha]h`]hd]uhfKhghhP]rh{XWEBSERVICE_PORTrr}r(hUjhVjubaubh)r}r(hUXDefault: ``[6080, 7030]``rhVjhWhZh\hh^}r(hb]hc]ha]h`]hd]uhfKhghhP]r(h{X Default: rr}r(hUX Default: hVjubh)r}r(hUX``[6080, 7030]``h^}r(hb]hc]ha]h`]hd]uhVjhP]rh{X [6080, 7030]rr}r(hUUhVjubah\hubeubh)r}r(hUXlThe port range to use for the web service. If set to ``None`` or ``0``, a dynamically assigned port is used.hVjhWhZh\hh^}r(hb]hc]ha]h`]hd]uhfKhghhP]r(h{X5The port range to use for the web service. If set to rr}r(hUX5The port range to use for the web service. If set to hVjubh)r}r(hUX``None``h^}r(hb]hc]ha]h`]hd]uhVjhP]rh{XNonerr}r(hUUhVjubah\hubh{X or rr}r(hUX or hVjubh)r}r(hUX``0``h^}r(hb]hc]ha]h`]hd]uhVjhP]rh{X0r}r(hUUhVjubah\hubh{X&, a dynamically assigned port is used.rr}r(hUX&, a dynamically assigned port is used.hVjubeubjl)r}r(hUUhVjhWhZh\joh^}r(h`]ha]hb]hc]hd]Uentries]r(XpairXWEBSERVICE_HOST; settingXstd:setting-WEBSERVICE_HOSTrUtrauhfKhghhP]ubhR)r}r(hUUhVjhWhZh\h]h^}r(h`]ha]hb]hc]hd]hejuhfKhghhP]ubeubhh)r}r(hUUhVjhWhZhk}h\hmh^}r(hb]hc]ha]h`]r(hOjehd]rh)auhfKhghhq}rjjshP]r(ht)r}r(hUXWEBSERVICE_HOSTrhVjhWhZh\hxh^}r(hb]hc]ha]h`]hd]uhfKhghhP]rh{XWEBSERVICE_HOSTrr}r(hUjhVjubaubh)r}r(hUXDefault: ``'0.0.0.0'``rhVjhWhZh\hh^}r(hb]hc]ha]h`]hd]uhfKhghhP]r(h{X Default: rr}r(hUX Default: hVjubh)r}r(hUX ``'0.0.0.0'``h^}r(hb]hc]ha]h`]hd]uhVjhP]rh{X '0.0.0.0'rr}r(hUUhVjubah\hubeubh)r}r(hUX.The interface the web service should listen onrhVjhWhZh\hh^}r(hb]hc]ha]h`]hd]uhfKhghhP]rh{X.The interface the web service should listen onrr}r(hUjhVjubaubeubhh)r}r(hUUhVjhWhZh\hmh^}r(hb]hc]ha]h`]rhNahd]rh'auhfKhghhP]r(ht)r}r (hUXWEBSERVICE_RESOURCESr hVjhWhZh\hxh^}r (hb]hc]ha]h`]hd]uhfKhghhP]r h{XWEBSERVICE_RESOURCESr r}r(hUj hVjubaubh)r}r(hUXDefault: ``{}``rhVjhWhZh\hh^}r(hb]hc]ha]h`]hd]uhfKhghhP]r(h{X Default: rr}r(hUX Default: hVjubh)r}r(hUX``{}``h^}r(hb]hc]ha]h`]hd]uhVjhP]rh{X{}rr}r(hUUhVjubah\hubeubh)r}r (hUXThe list of web service resources enabled for your project. See :ref:`topics-webservice-resources`. 
    The list of web service resources enabled for your project. See "Web service resources" above. These are added to the ones available by default in Scrapy, defined in the WEBSERVICE_RESOURCES_BASE setting.

WEBSERVICE_RESOURCES_BASE

    Default:

        {
            'scrapy.contrib.webservice.crawler.CrawlerResource': 1,
            'scrapy.contrib.webservice.enginestatus.EngineStatusResource': 1,
            'scrapy.contrib.webservice.stats.StatsResource': 1,
        }

    The list of web service resources available by default in Scrapy. You shouldn't change this setting in your project; change WEBSERVICE_RESOURCES instead. If you want to disable some resource, set its value to None in WEBSERVICE_RESOURCES, as in the sketch below.
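For instance, a settings.py sketch that enables one custom resource and disables a default one; the myproject.webservice.MyResource dotted path is a hypothetical placeholder:

    # settings.py -- illustrative only
    WEBSERVICE_RESOURCES = {
        # Enable a custom resource (hypothetical dotted path).
        'myproject.webservice.MyResource': 1,
        # Disable a resource that WEBSERVICE_RESOURCES_BASE enables by default.
        'scrapy.contrib.webservice.enginestatus.EngineStatusResource': None,
    }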
Writing a web service resource

Web service resources are implemented using the Twisted Web API. See this Twisted Web guide (http://jcalderone.livejournal.com/50562.html) for more information on Twisted web and Twisted web resources.

To write a web service resource you should subclass the JsonResource or JsonRpcResource classes and implement the render_GET method.

class scrapy.webservice.JsonResource

    A subclass of twisted.web.resource.Resource (http://twistedmatrix.com/documents/10.0.0/api/twisted.web.resource.Resource.html) that implements a JSON web service resource. See:

    ws_name

        The name by which the Scrapy web service will know this resource, and also the path where this resource will listen. For example, assuming the Scrapy web service is listening on http://localhost:6080/ and the ws_name is 'resource1', the URL for that resource will be:

            http://localhost:6080/resource1/

class scrapy.webservice.JsonRpcResource(crawler, target=None)

    This is a subclass of JsonResource for implementing JSON-RPC resources. JSON-RPC resources wrap Python (Scrapy) objects around a JSON-RPC API. The resource wrapped must be returned by the get_target() method, which returns the target passed in the constructor by default.

    get_target()

        Return the object wrapped by this JSON-RPC resource. By default, it returns the object passed on the constructor.
By default, it returns the object passed on the constructor.rr}r (hUjhVjubaubaubeubeubeubeubhh)r!}r"(hUUhVhihWhZh\hmh^}r#(hb]hc]ha]h`]r$hAahd]r%hauhfKhghhP]r&(ht)r'}r((hUX!Examples of web service resourcesr)hVj!hWhZh\hxh^}r*(hb]hc]ha]h`]hd]uhfKhghhP]r+h{X!Examples of web service resourcesr,r-}r.(hUj)hVj'ubaubhh)r/}r0(hUUhVj!hWhZh\hmh^}r1(hb]hc]ha]h`]r2h=ahd]r3hauhfKhghhP]r4(ht)r5}r6(hUX!StatsResource (JSON-RPC resource)r7hVj/hWhZh\hxh^}r8(hb]hc]ha]h`]hd]uhfKhghhP]r9h{X!StatsResource (JSON-RPC resource)r:r;}r<(hUj7hVj5ubaubj`)r=}r>(hUXfrom scrapy.webservice import JsonRpcResource class StatsResource(JsonRpcResource): ws_name = 'stats' def __init__(self, crawler): JsonRpcResource.__init__(self, crawler, crawler.stats) hVj/hWhZh\jch^}r?(hb]jejfh`]ha]UsourceXO/var/build/user_builds/scrapy/checkouts/0.22/scrapy/contrib/webservice/stats.pyhc]hd]uhfKhghhP]r@h{Xfrom scrapy.webservice import JsonRpcResource class StatsResource(JsonRpcResource): ws_name = 'stats' def __init__(self, crawler): JsonRpcResource.__init__(self, crawler, crawler.stats) rArB}rC(hUUhVj=ubaubeubhh)rD}rE(hUUhVj!hWhZh\hmh^}rF(hb]hc]ha]h`]rGhFahd]rHhauhfKhghhP]rI(ht)rJ}rK(hUX$EngineStatusResource (JSON resource)rLhVjDhWhZh\hxh^}rM(hb]hc]ha]h`]hd]uhfKhghhP]rNh{X$EngineStatusResource (JSON resource)rOrP}rQ(hUjLhVjJubaubj`)rR}rS(hUXfrom scrapy.webservice import JsonResource from scrapy.utils.engine import get_engine_status class EngineStatusResource(JsonResource): ws_name = 'enginestatus' def __init__(self, crawler, spider_name=None): JsonResource.__init__(self, crawler) self._spider_name = spider_name self.isLeaf = spider_name is not None def render_GET(self, txrequest): status = get_engine_status(self.crawler.engine) if self._spider_name is None: return status for sp, st in status['spiders'].items(): if sp.name == self._spider_name: return st def getChild(self, name, txrequest): return EngineStatusResource(name, self.crawler) hVjDhWhZh\jch^}rT(hb]jejfh`]ha]UsourceXV/var/build/user_builds/scrapy/checkouts/0.22/scrapy/contrib/webservice/enginestatus.pyhc]hd]uhfKhghhP]rUh{Xfrom scrapy.webservice import JsonResource from scrapy.utils.engine import get_engine_status class EngineStatusResource(JsonResource): ws_name = 'enginestatus' def __init__(self, crawler, spider_name=None): JsonResource.__init__(self, crawler) self._spider_name = spider_name self.isLeaf = spider_name is not None def render_GET(self, txrequest): status = get_engine_status(self.crawler.engine) if self._spider_name is None: return status for sp, st in status['spiders'].items(): if sp.name == self._spider_name: return st def getChild(self, name, txrequest): return EngineStatusResource(name, self.crawler) rVrW}rX(hUUhVjRubaubeubeubhh)rY}rZ(hUUhVhihWhZh\hmh^}r[(hb]hc]ha]h`]r\h6ahd]r]h auhfKhghhP]r^(ht)r_}r`(hUXExample of web service clientrahVjYhWhZh\hxh^}rb(hb]hc]ha]h`]hd]uhfKhghhP]rch{XExample of web service clientrdre}rf(hUjahVj_ubaubhh)rg}rh(hUUhVjYhWhZh\hmh^}ri(hb]hc]ha]h`]rjhEahd]rkhauhfKhghhP]rl(ht)rm}rn(hUXscrapy-ws.py scriptrohVjghWhZh\hxh^}rp(hb]hc]ha]h`]hd]uhfKhghhP]rqh{Xscrapy-ws.py scriptrrrs}rt(hUjohVjmubaubj`)ru}rv(hUX~#!/usr/bin/env python """ Example script to control a Scrapy server using its JSON-RPC web service. It only provides a reduced functionality as its main purpose is to illustrate how to write a web service client. Feel free to improve or write you own. Also, keep in mind that the JSON-RPC API is not stable. The recommended way for controlling a Scrapy server is through the execution queue (see the "queue" command). 
""" from __future__ import print_function import sys, optparse, urllib, json from urlparse import urljoin from scrapy.utils.jsonrpc import jsonrpc_client_call, JsonRpcError def get_commands(): return { 'help': cmd_help, 'stop': cmd_stop, 'list-available': cmd_list_available, 'list-running': cmd_list_running, 'list-resources': cmd_list_resources, 'get-global-stats': cmd_get_global_stats, 'get-spider-stats': cmd_get_spider_stats, } def cmd_help(args, opts): """help - list available commands""" print("Available commands:") for _, func in sorted(get_commands().items()): print(" ", func.__doc__) def cmd_stop(args, opts): """stop - stop a running spider""" jsonrpc_call(opts, 'crawler/engine', 'close_spider', args[0]) def cmd_list_running(args, opts): """list-running - list running spiders""" for x in json_get(opts, 'crawler/engine/open_spiders'): print(x) def cmd_list_available(args, opts): """list-available - list name of available spiders""" for x in jsonrpc_call(opts, 'crawler/spiders', 'list'): print(x) def cmd_list_resources(args, opts): """list-resources - list available web service resources""" for x in json_get(opts, '')['resources']: print(x) def cmd_get_spider_stats(args, opts): """get-spider-stats - get stats of a running spider""" stats = jsonrpc_call(opts, 'stats', 'get_stats', args[0]) for name, value in stats.items(): print("%-40s %s" % (name, value)) def cmd_get_global_stats(args, opts): """get-global-stats - get global stats""" stats = jsonrpc_call(opts, 'stats', 'get_stats') for name, value in stats.items(): print("%-40s %s" % (name, value)) def get_wsurl(opts, path): return urljoin("http://%s:%s/"% (opts.host, opts.port), path) def jsonrpc_call(opts, path, method, *args, **kwargs): url = get_wsurl(opts, path) return jsonrpc_client_call(url, method, *args, **kwargs) def json_get(opts, path): url = get_wsurl(opts, path) return json.loads(urllib.urlopen(url).read()) def parse_opts(): usage = "%prog [options] [arg] ..." description = "Scrapy web service control script. Use '%prog help' " \ "to see the list of available commands." op = optparse.OptionParser(usage=usage, description=description) op.add_option("-H", dest="host", default="localhost", \ help="Scrapy host to connect to") op.add_option("-P", dest="port", type="int", default=6080, \ help="Scrapy port to connect to") opts, args = op.parse_args() if not args: op.print_help() sys.exit(2) cmdname, cmdargs, opts = args[0], args[1:], opts commands = get_commands() if cmdname not in commands: sys.stderr.write("Unknown command: %s\n\n" % cmdname) cmd_help(None, None) sys.exit(1) return commands[cmdname], cmdargs, opts def main(): cmd, args, opts = parse_opts() try: cmd(args, opts) except IndexError: print(cmd.__doc__) except JsonRpcError as e: print(str(e)) if e.data: print("Server Traceback below:") print(e.data) if __name__ == '__main__': main() hVjghWhZh\jch^}rw(hb]jejfh`]ha]UsourceX@/var/build/user_builds/scrapy/checkouts/0.22/extras/scrapy-ws.pyhc]hd]uhfKhghhP]rxh{X~#!/usr/bin/env python """ Example script to control a Scrapy server using its JSON-RPC web service. It only provides a reduced functionality as its main purpose is to illustrate how to write a web service client. Feel free to improve or write you own. Also, keep in mind that the JSON-RPC API is not stable. The recommended way for controlling a Scrapy server is through the execution queue (see the "queue" command). 
""" from __future__ import print_function import sys, optparse, urllib, json from urlparse import urljoin from scrapy.utils.jsonrpc import jsonrpc_client_call, JsonRpcError def get_commands(): return { 'help': cmd_help, 'stop': cmd_stop, 'list-available': cmd_list_available, 'list-running': cmd_list_running, 'list-resources': cmd_list_resources, 'get-global-stats': cmd_get_global_stats, 'get-spider-stats': cmd_get_spider_stats, } def cmd_help(args, opts): """help - list available commands""" print("Available commands:") for _, func in sorted(get_commands().items()): print(" ", func.__doc__) def cmd_stop(args, opts): """stop - stop a running spider""" jsonrpc_call(opts, 'crawler/engine', 'close_spider', args[0]) def cmd_list_running(args, opts): """list-running - list running spiders""" for x in json_get(opts, 'crawler/engine/open_spiders'): print(x) def cmd_list_available(args, opts): """list-available - list name of available spiders""" for x in jsonrpc_call(opts, 'crawler/spiders', 'list'): print(x) def cmd_list_resources(args, opts): """list-resources - list available web service resources""" for x in json_get(opts, '')['resources']: print(x) def cmd_get_spider_stats(args, opts): """get-spider-stats - get stats of a running spider""" stats = jsonrpc_call(opts, 'stats', 'get_stats', args[0]) for name, value in stats.items(): print("%-40s %s" % (name, value)) def cmd_get_global_stats(args, opts): """get-global-stats - get global stats""" stats = jsonrpc_call(opts, 'stats', 'get_stats') for name, value in stats.items(): print("%-40s %s" % (name, value)) def get_wsurl(opts, path): return urljoin("http://%s:%s/"% (opts.host, opts.port), path) def jsonrpc_call(opts, path, method, *args, **kwargs): url = get_wsurl(opts, path) return jsonrpc_client_call(url, method, *args, **kwargs) def json_get(opts, path): url = get_wsurl(opts, path) return json.loads(urllib.urlopen(url).read()) def parse_opts(): usage = "%prog [options] [arg] ..." description = "Scrapy web service control script. Use '%prog help' " \ "to see the list of available commands." op = optparse.OptionParser(usage=usage, description=description) op.add_option("-H", dest="host", default="localhost", \ help="Scrapy host to connect to") op.add_option("-P", dest="port", type="int", default=6080, \ help="Scrapy port to connect to") opts, args = op.parse_args() if not args: op.print_help() sys.exit(2) cmdname, cmdargs, opts = args[0], args[1:], opts commands = get_commands() if cmdname not in commands: sys.stderr.write("Unknown command: %s\n\n" % cmdname) cmd_help(None, None) sys.exit(1) return commands[cmdname], cmdargs, opts def main(): cmd, args, opts = parse_opts() try: cmd(args, opts) except IndexError: print(cmd.__doc__) except JsonRpcError as e: print(str(e)) if e.data: print("Server Traceback below:") print(e.data) if __name__ == '__main__': main() ryrz}r{(hUUhVjuubaubhR)r|}r}(hUXC.. _Twisted Web guide: http://jcalderone.livejournal.com/50562.htmlU referencedr~KhVjghWhZh\h]h^}r(hjh`]rhIaha]hb]hc]hd]rh"auhfKhghhP]ubhR)r}r(hUX).. _JSON-RPC 2.0: http://www.jsonrpc.org/j~KhVjghWhZh\h]h^}r(hhh`]rhMaha]hb]hc]hd]rh&auhfKhghhP]ubhR)r}r(hUXs.. 
scrapy-0.22/.doctrees/topics/exporters.doctree (source: docs/topics/exporters.rst)

Item Exporters
==============

Once you have scraped your items, you often want to persist or export those
items, to use the data in some other application. That is, after all, the
whole purpose of the scraping process. For this purpose Scrapy provides a
collection of Item Exporters for different output formats, such as XML, CSV
or JSON.

Using Item Exporters
--------------------

In order to use an Item Exporter, you must instantiate it with its required
arguments. Each Item Exporter requires different arguments, so check each
exporter's documentation in the Built-in Item Exporters reference below.
After you have instantiated your exporter, you have to:

1. call the method ``start_exporting()`` in order to signal the beginning of
   the exporting process

2. call the ``export_item()`` method for each item you want to export

3. and finally call ``finish_exporting()`` to signal the end of the exporting
   process

Here you can see an Item Pipeline which uses an Item Exporter to export
scraped items to different files, one per spider (a sketch of such a pipeline
follows).
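A minimal sketch of such a pipeline, assuming an ``XmlExportPipeline`` class name and a ``%s_products.xml`` output pattern (both illustrative, not recovered from this page)::

    from scrapy import signals
    from scrapy.contrib.exporter import XmlItemExporter

    class XmlExportPipeline(object):

        def __init__(self):
            self.files = {}

        @classmethod
        def from_crawler(cls, crawler):
            pipeline = cls()
            crawler.signals.connect(pipeline.spider_opened, signals.spider_opened)
            crawler.signals.connect(pipeline.spider_closed, signals.spider_closed)
            return pipeline

        def spider_opened(self, spider):
            # one output file per spider, opened when the spider starts
            file = open('%s_products.xml' % spider.name, 'w+b')
            self.files[spider] = file
            self.exporter = XmlItemExporter(file)
            self.exporter.start_exporting()

        def spider_closed(self, spider):
            self.exporter.finish_exporting()
            file = self.files.pop(spider)
            file.close()

        def process_item(self, item, spider):
            self.exporter.export_item(item)
            return item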
If no serializer is found, it returns the value unchanged except for ``unicode`` values which are encoded to ``str`` using the encoding declared in the :attr:`encoding` attribute.h1j,h9h(h0X/By default, this method looks for a serializer h1j8ubh)r?}r@(h0X@:ref:`declared in the item field `rAh1j8h9h)r}r(h0Xstart_exporting()h1j h9h}r?(h0Uh1jh9h)rH}rI(h0Xfinish_exporting()h1jCh9h)r}r(h0Xfields_to_exportrh1jzh9h)r}r(h0Xexport_empty_fieldsrh1jh9h)r}r(h0Xencodingrh1jh9h)r4}r5(h0XJXmlItemExporter(file, item_element='item', root_element='items', **kwargs)h1j.h9hhaXclass r?r@}rA(h0Uh1j;ubaubjT)rB}rC(h0Xscrapy.contrib.exporter.h1j4h9h Color TV 1200 DVD player 200 h1jph9h Color TV 1200 DVD player 200 rr}r(h0Uh1jubaubhm)r}r(h0XUnless overridden in the :meth:`serialize_field` method, multi-valued fields are exported by serializing each value inside a ```` element. This is for convenience, as multi-valued fields are very common.h1jph9h``hE}r9(hI]hJ]hH]hG]hK]uh1jhP]r:haXr;r<}r=(h0Uh1j7ubahChubhaXJ element. This is for convenience, as multi-valued fields are very common.r>r?}r@(h0XJ element. This is for convenience, as multi-valued fields are very common.h1jubeubhm)rA}rB(h0XFor example, the item::h1jph9h John Doe 23 h1jph9h John Doe 23 rZr[}r\(h0Uh1jVubaubeubeubeubh2)r]}r^(h0Uh1h5h9h)rw}rx(h0XPCsvItemExporter(file, include_headers_line=True, join_multivalued=',', **kwargs)h1jqh9h(UreftypeXattrhhX!BaseItemExporter.fields_to_exportU refdomainXpyr?hG]hH]U refexplicithI]hJ]hK]hhhj|hhuhNMhP]r@h)rA}rB(h0j=hE}rC(hI]hJ]rD(hj?Xpy-attrrEehH]hG]hK]uh1j;hP]rFhaX!BaseItemExporter.fields_to_exportrGrH}rI(h0Uh1jAubahChubaubhaX# or the first exported item fields.rJrK}rL(h0X# or the first exported item fields.h1jubehChqubahCjubj)rM}rN(h0UhE}rO(hI]hJ]hH]hG]hK]uh1jhP]rPhm)rQ}rR(h0UhE}rS(hI]hJ]hH]hG]hK]uh1jMhP]rT(j)rU}rV(h0Xjoin_multivaluedhE}rW(hI]hJ]hH]hG]hK]uh1jQhP]rXhaXjoin_multivaluedrYrZ}r[(h0Uh1jUubahCjubhaX -- r\r]}r^(h0Uh1jQubhaXPThe char (or chars) that will be used for joining multi-valued fields, if found.r_r`}ra(h0XPThe char (or chars) that will be used for joining multi-valued fields, if found.rbh1jQubehChqubahCjubehCjubahCjubehCjubaubhm)rc}rd(h0XThe additional keyword arguments of this constructor are passed to the :class:`BaseItemExporter` constructor, and the leftover arguments to the `csv.writer`_ constructor, so you can use any `csv.writer` constructor argument to customize this exporter.h1jh9h)r}r(h0X.PickleItemExporter(file, protocol=0, **kwargs)h1jh9h(h0XprotocolhE}r?(hI]hJ]hH]hG]hK]uh1j9hP]r@haXprotocolrArB}rC(h0Uh1j=ubahCjubhaX (rDrE}rF(h0Uh1j9ubh)rG}rH(h0UhE}rI(UreftypejU reftargetXintrJU refdomainjhG]hH]U refexplicithI]hJ]hK]uh1j9hP]rKh)rL}rM(h0jJhE}rN(hI]hJ]hH]hG]hK]uh1jGhP]rOhaXintrPrQ}rR(h0Uh1jLubahChubahChubhaX)rS}rT(h0Uh1j9ubhaX -- rUrV}rW(h0Uh1j9ubhaXThe pickle protocol to use.rXrY}rZ(h0XThe pickle protocol to use.r[h1j9ubehChqubahCjubehCjubahCjubehCjubaubhm)r\}r](h0XBFor more information, refer to the `pickle module documentation`_.h1jh9h)r}r(h0X"PprintItemExporter(file, **kwargs)h1jh9h Ujsonitemexporterr? ahK]r@ h auhNMWhOhhP]rA (hZ)rB }rC (h0XJsonItemExporterrD h1j; h9h)rT }rU (h0X JsonItemExporter(file, **kwargs)h1jO h9h}r hj shCUwarningr hE}r (hI]hJ]hH]hG]r j ahK]r hauhNNhOhhW}r j j shP]r hm)r }r (h0XJSON is very simple and flexible serialization format, but it doesn't scale well for large amounts of data since incremental (aka. stream-mode) parsing is not well supported (if at all) among JSON parsers (on any language), and most of them just parse the entire object in memory. 
If you want the power and simplicity of JSON with a more stream-friendly format, consider using :class:`JsonLinesItemExporter` instead, or splitting the output in multiple chunks.h1j h9h)r+ }r, (h0X%JsonLinesItemExporter(file, **kwargs)h1j& h9h }r? (h0Uh1j9 ubaubj])r@ }rA (h0j0 h1j+ h9h }r? Uindirect_targetsr@ ]rA UsettingsrB (cdocutils.frontend Values rC orD }rE (Ufootnote_backlinksrF KUrecord_dependenciesrG NU rfc_base_urlrH Uhttp://tools.ietf.org/html/rI U tracebackrJ Upep_referencesrK NUstrip_commentsrL NU toc_backlinksrM UentryrN U language_coderO UenrP U datestamprQ NU report_levelrR KU _destinationrS NU halt_levelrT KU strip_classesrU Nh^NUerror_encoding_error_handlerrV UbackslashreplacerW UdebugrX NUembed_stylesheetrY Uoutput_encoding_error_handlerrZ Ustrictr[ U sectnum_xformr\ KUdump_transformsr] NU docinfo_xformr^ KUwarning_streamr_ NUpep_file_url_templater` Upep-%04dra Uexit_status_levelrb KUconfigrc NUstrict_visitorrd NUcloak_email_addressesre Utrim_footnote_reference_spacerf Uenvrg NUdump_pseudo_xmlrh NUexpose_internalsri NUsectsubtitle_xformrj U source_linkrk NUrfc_referencesrl NUoutput_encodingrm Uutf-8rn U source_urlro NUinput_encodingrp U utf-8-sigrq U_disable_configrr NU id_prefixrs UU tab_widthrt KUerror_encodingru UUTF-8rv U_sourcerw UF/var/build/user_builds/scrapy/checkouts/0.22/docs/topics/exporters.rstrx Ugettext_compactry U generatorrz NUdump_internalsr{ NU smart_quotesr| U pep_base_urlr} Uhttp://www.python.org/dev/peps/r~ Usyntax_highlightr Ulongr Uinput_encoding_error_handlerr j[ Uauto_id_prefixr Uidr Udoctitle_xformr Ustrip_elements_with_classesr NU _config_filesr ]Ufile_insertion_enabledr U raw_enabledr KU dump_settingsr NubUsymbol_footnote_startr KUidsr }r (hUh7jEj@jjjjhhjh5jh5jjj h3j j j j hj+ hjjzjuhjhjh$jwhjj j jtjuhjhjhj4hTh@)r }r (h0Uh1h7h9hbUtagnameq?Utargetq@U attributesqA}qB(UidsqC]UbackrefsqD]UdupnamesqE]UclassesqF]UnamesqG]UrefidqHh+uUlineqIKUdocumentqJhh3]ubcdocutils.nodes section qK)qL}qM(h8Uh9hh:h=Uexpect_referenced_by_nameqN}qOhh6sh?UsectionqPhA}qQ(hE]hF]hD]hC]qR(h!h+ehG]qS(hheuhIKhJhUexpect_referenced_by_idqT}qUh+h6sh3]qV(cdocutils.nodes title qW)qX}qY(h8XCommon PracticesqZh9hLh:h=h?Utitleq[hA}q\(hE]hF]hD]hC]hG]uhIKhJhh3]q]cdocutils.nodes Text q^XCommon Practicesq_q`}qa(h8hZh9hXubaubcdocutils.nodes paragraph qb)qc}qd(h8XThis section documents common practices when using Scrapy. These are things that cover many topics and don't often fall into any other specific section.qeh9hLh:h=h?U paragraphqfhA}qg(hE]hF]hD]hC]hG]uhIKhJhh3]qhh^XThis section documents common practices when using Scrapy. These are things that cover many topics and don't often fall into any other specific section.qiqj}qk(h8heh9hcubaubh5)ql}qm(h8X.. 
scrapy-0.22/.doctrees/topics/practices.doctree (source: docs/topics/practices.rst)

Common Practices
================

This section documents common practices when using Scrapy. These are things
that cover many topics and don't often fall into any other specific section.

Run Scrapy from a script
------------------------

You can use the :ref:`API <topics-api>` to run Scrapy from a script, instead
of the typical way of running Scrapy via ``scrapy crawl``.

Remember that Scrapy is built on top of the Twisted asynchronous networking
library, so you need to run it inside the Twisted reactor.

Note that you will also have to shut down the Twisted reactor yourself after
the spider is finished. This can be achieved by connecting a handler to the
``signals.spider_closed`` signal.
What follows is a working example of how to do that, using the `testspiders`_
project as an example::

    from twisted.internet import reactor
    from scrapy.crawler import Crawler
    from scrapy import log, signals
    from testspiders.spiders.followall import FollowAllSpider
    from scrapy.utils.project import get_project_settings

    spider = FollowAllSpider(domain='scrapinghub.com')
    settings = get_project_settings()
    crawler = Crawler(settings)
    crawler.signals.connect(reactor.stop, signal=signals.spider_closed)
    crawler.configure()
    crawler.crawl(spider)
    crawler.start()
    log.start()
    reactor.run()  # the script will block here until the spider_closed signal is sent

.. seealso:: `Twisted Reactor Overview`_.
Running multiple spiders in the same process
--------------------------------------------

By default, Scrapy runs a single spider per process when you run ``scrapy
crawl``. However, Scrapy supports running multiple spiders per process using
the :ref:`internal API <topics-api>`.

Here is an example, using the `testspiders`_ project::

    from twisted.internet import reactor
    from scrapy.crawler import Crawler
    from scrapy import log
    from testspiders.spiders.followall import FollowAllSpider
    from scrapy.utils.project import get_project_settings

    def setup_crawler(domain):
        spider = FollowAllSpider(domain=domain)
        settings = get_project_settings()
        crawler = Crawler(settings)
        crawler.configure()
        crawler.crawl(spider)
        crawler.start()

    # note: unlike the previous example, nothing stops the reactor here; to
    # exit cleanly, connect a spider_closed handler that stops it once all
    # spiders have finished
    for domain in ['scrapinghub.com', 'insophia.com']:
        setup_crawler(domain)
    log.start()
    reactor.run()

.. seealso:: :ref:`run-from-script`.
Distributed crawls
------------------

Scrapy doesn't provide any built-in facility for running crawls in a
distributed (multi-server) manner. However, there are some ways to distribute
crawls, which vary depending on how you plan to distribute them.

If you have many spiders, the obvious way to distribute the load is to set up
many Scrapyd instances and distribute spider runs among those.

If you instead want to run a single (big) spider through many machines, what
you usually do is partition the urls to crawl and send them to each separate
spider. Here is a concrete example:

First, you prepare the list of urls to crawl and put them into separate
files/urls::

    http://somedomain.com/urls-to-crawl/spider1/part1.list
    http://somedomain.com/urls-to-crawl/spider1/part2.list
    http://somedomain.com/urls-to-crawl/spider1/part3.list

Then you fire a spider run on 3 different Scrapyd servers. The spider would
receive a (spider) argument ``part`` with the number of the partition to
crawl::

    curl http://scrapy1.mycompany.com:6800/schedule.json -d project=myproject -d spider=spider1 -d part=1
    curl http://scrapy2.mycompany.com:6800/schedule.json -d project=myproject -d spider=spider1 -d part=2
    curl http://scrapy3.mycompany.com:6800/schedule.json -d project=myproject -d spider=spider1 -d part=3
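On the spider side, the ``part`` argument can be used to pick the right partition file. A minimal sketch, assuming a one-URL-per-line ``.list`` format (the spider name and URL pattern reuse the example above; the rest is illustrative)::

    from scrapy.spider import Spider
    from scrapy.http import Request

    class PartitionedSpider(Spider):
        name = 'spider1'

        def start_requests(self):
            # 'part' arrives as a spider argument (-d part=N above) and is
            # set as an attribute on the spider instance
            url = ('http://somedomain.com/urls-to-crawl/spider1/part%s.list'
                   % self.part)
            yield Request(url, callback=self.parse_url_list)

        def parse_url_list(self, response):
            # one URL per line (an assumption about the .list file format)
            for line in response.body.splitlines():
                line = line.strip()
                if line:
                    yield Request(line, callback=self.parse)

        def parse(self, response):
            pass  # your parsing code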
Avoiding getting banned
-----------------------

Some websites implement certain measures to prevent bots from crawling them,
with varying degrees of sophistication. Getting around those measures can be
difficult and tricky, and may sometimes require special infrastructure.
Please consider contacting `commercial support`_ if in doubt.

Here are some tips to keep in mind when dealing with these kinds of sites (a
settings sketch follows the list):

* rotate your user agent from a pool of well-known ones from browsers (google
  around to get a list of them)
* disable cookies (see :setting:`COOKIES_ENABLED`) as some sites may use
  cookies to spot bot behaviour
* use download delays (2 or higher). See the :setting:`DOWNLOAD_DELAY`
  setting.
* if possible, use `Google cache`_ to fetch pages, instead of hitting the
  sites directly
* use a pool of rotating IPs. For example, the free `Tor project`_ or paid
  services like `ProxyMesh`_
* use a highly distributed downloader that circumvents bans internally, so
  you can just focus on parsing clean pages. One example of such downloaders
  is `Crawlera`_
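The first three tips map directly onto project settings. A minimal sketch for ``settings.py`` (the values are illustrative, not recommendations from this page)::

    # settings.py
    COOKIES_ENABLED = False  # some sites use cookies to spot bot behaviour
    DOWNLOAD_DELAY = 2       # seconds between requests to the same website

    # USER_AGENT sets a single fixed value; rotating it per request requires
    # a custom downloader middleware
    USER_AGENT = 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36'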
If you are still unable to prevent your bot getting banned, consider
contacting `commercial support`_.

.. _Tor project: https://www.torproject.org/
.. _commercial support: http://scrapy.org/support/
.. _ProxyMesh: http://proxymesh.com/
.. _Google cache: http://www.googleguide.com/cached_pages.html
.. _testspiders: https://github.com/scrapinghub/testspiders
.. _Twisted Reactor Overview: http://twistedmatrix.com/documents/current/core/howto/reactor-basics.html
.. _Crawlera: http://crawlera.com
Dynamic Creation of Item Classes
--------------------------------

For applications in which the structure of the item class is to be determined
by user input, or other changing conditions, you can dynamically create item
classes instead of manually coding them.

::

    from scrapy.item import DictItem, Field

    def create_item_class(class_name, field_list):
        field_dict = {}
        for field_name in field_list:
            field_dict[field_name] = Field()

        # note: type() takes a tuple of base classes; the original snippet
        # passed DictItem bare, which raises a TypeError
        return type(class_name, (DictItem,), field_dict)
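A quick usage sketch, assuming the factory above (the class and field names are illustrative)::

    ProductItem = create_item_class('ProductItem', ['name', 'price'])

    item = ProductItem(name='Color TV', price='1200')
    item['price'] = '1100'  # behaves like any statically declared Item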
scrapy-0.22/.doctrees/topics/shell.doctree (source: docs/topics/shell.rst)

Scrapy shell
============

The Scrapy shell is an interactive shell where you can try and debug your
scraping code very quickly, without having to run the spider. It's meant to
be used for testing data extraction code, but you can actually use it for
testing any kind of code, as it is also a regular Python shell.

The shell is used for testing XPath or CSS expressions and seeing how they
work and what data they extract from the web pages you're trying to scrape.
It allows you to interactively test your expressions while you're writing
your spider, without having to run the spider to test every change.
Once you get familiarized with the Scrapy shell, you'll see that it's an
invaluable tool for developing and debugging your spiders.

If you have `IPython`_ installed, the Scrapy shell will use it (instead of
the standard Python console). The `IPython`_ console is much more powerful
and provides smart auto-completion and colorized output, among other things.

We highly recommend you install `IPython`_, especially if you're working on
Unix systems (where `IPython`_ excels). See the `IPython installation guide`_
for more info.

.. _IPython: http://ipython.org/
.. _IPython installation guide: http://ipython.org/install.html
Launch the shell
----------------

To launch the Scrapy shell you can use the :command:`shell` command like
this::

    scrapy shell <url>

Where ``<url>`` is the URL you want to scrape.

Using the shell
---------------

The Scrapy shell is just a regular Python console (or `IPython`_ console, if
you have it available) which provides some additional shortcut functions for
convenience.

Available Shortcuts
~~~~~~~~~~~~~~~~~~~

* ``shelp()`` - print a help with the list of available objects and shortcuts

* ``fetch(request_or_url)`` - fetch a new response from the given request or
  URL and update all related objects accordingly.

* ``view(response)`` - open the given response in your local web browser, for
  inspection. This will add a `<base> tag`_ to the response body in order for
  external links (such as images and style sheets) to display properly. Note,
  however, that this will create a temporary file on your computer, which
  won't be removed automatically.

.. _<base> tag: https://developer.mozilla.org/en-US/docs/Web/HTML/Element/base
Available Scrapy objects
~~~~~~~~~~~~~~~~~~~~~~~~

The Scrapy shell automatically creates some convenient objects from the
downloaded page, like the :class:`~scrapy.http.Response` object and the
:class:`~scrapy.selector.Selector` objects (for both HTML and XML content).

Those objects are:

* ``spider`` - the Spider which is known to handle the URL, or a
  :class:`~scrapy.spider.Spider` object if there is no spider found for the
  current URL

* ``request`` - a :class:`~scrapy.http.Request` object of the last fetched
  page. You can modify this request using
  :meth:`~scrapy.http.Request.replace` or fetch a new request (without
  leaving the shell) using the ``fetch`` shortcut (see the snippet after this
  list).

* ``response`` - a :class:`~scrapy.http.Response` object containing the last
  fetched page

* ``sel`` - a :class:`~scrapy.selector.Selector` object constructed with the
  last response fetched

* ``settings`` - the current :ref:`Scrapy settings <topics-settings>`
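For instance, a brief, hypothetical exchange that modifies the last request with ``replace()`` and re-fetches it (the URL is a placeholder)::

    >>> request = request.replace(method='POST')
    >>> fetch(request)
    >>> fetch('http://example.com/some-other-page')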
Example of shell session
------------------------

Here's an example of a typical shell session where we start by scraping the
http://scrapy.org page, and then proceed to scrape the http://slashdot.org
page. Finally, we modify the (Slashdot) request method to POST and re-fetch
it, getting an HTTP 405 (method not allowed) error. We end the session by
typing Ctrl-D (in Unix systems) or Ctrl-Z in Windows.
You can modify this request using r r!}r"(h,XD object of the last fetched page. You can modify this request using h-jubh)r#}r$(h,X$:meth:`~scrapy.http.Request.replace`r%h-jh.h1h3hh5}r&(UreftypeXmethh܉hXscrapy.http.Request.replaceU refdomainXpyr'h7]h8]U refexplicith9]h:]h;]hhjNjNuh=KNh']r(h)r)}r*(h,j%h5}r+(h9]h:]r,(hj'Xpy-methr-eh8]h7]h;]uh-j#h']r.hRX replace()r/r0}r1(h,Uh-j)ubah3hubaubhRX> or fetch a new request (without leaving the shell) using the r2r3}r4(h,X> or fetch a new request (without leaving the shell) using the h-jubh)r5}r6(h,X ``fetch``h5}r7(h9]h:]h8]h7]h;]uh-jh']r8hRXfetchr9r:}r;(h,Uh-j5ubah3hubhRX shortcut.r<r=}r>(h,X shortcut.h-jubeubah3jZubjG)r?}r@(h,XX``response`` - a :class:`~scrapy.http.Response` object containing the last fetched page h5}rA(h9]h:]h8]h7]h;]uh-jh']rBhV)rC}rD(h,XW``response`` - a :class:`~scrapy.http.Response` object containing the last fetched pageh-j?h.h1h3hZh5}rE(h9]h:]h8]h7]h;]uh=KSh']rF(h)rG}rH(h,X ``response``h5}rI(h9]h:]h8]h7]h;]uh-jCh']rJhRXresponserKrL}rM(h,Uh-jGubah3hubhRX - a rNrO}rP(h,X - a h-jCubh)rQ}rR(h,X:class:`~scrapy.http.Response`rSh-jCh.h1h3hh5}rT(UreftypeXclassh܉hXscrapy.http.ResponseU refdomainXpyrUh7]h8]U refexplicith9]h:]h;]hhjNjNuh=KSh']rVh)rW}rX(h,jSh5}rY(h9]h:]rZ(hjUXpy-classr[eh8]h7]h;]uh-jQh']r\hRXResponser]r^}r_(h,Uh-jWubah3hubaubhRX( object containing the last fetched pager`ra}rb(h,X( object containing the last fetched pageh-jCubeubah3jZubjG)rc}rd(h,Xa``sel`` - a :class:`~scrapy.selector.Selector` object constructed with the last response fetched h5}re(h9]h:]h8]h7]h;]uh-jh']rfhV)rg}rh(h,X```sel`` - a :class:`~scrapy.selector.Selector` object constructed with the last response fetchedh-jch.h1h3hZh5}ri(h9]h:]h8]h7]h;]uh=KVh']rj(h)rk}rl(h,X``sel``h5}rm(h9]h:]h8]h7]h;]uh-jgh']rnhRXselrorp}rq(h,Uh-jkubah3hubhRX - a rrrs}rt(h,X - a h-jgubh)ru}rv(h,X":class:`~scrapy.selector.Selector`rwh-jgh.h1h3hh5}rx(UreftypeXclassh܉hXscrapy.selector.SelectorU refdomainXpyryh7]h8]U refexplicith9]h:]h;]hhjNjNuh=KVh']rzh)r{}r|(h,jwh5}r}(h9]h:]r~(hjyXpy-classreh8]h7]h;]uh-juh']rhRXSelectorrr}r(h,Uh-j{ubah3hubaubhRX2 object constructed with the last response fetchedrr}r(h,X2 object constructed with the last response fetchedh-jgubeubah3jZubjG)r}r(h,XD``settings`` - the current :ref:`Scrapy settings ` h5}r(h9]h:]h8]h7]h;]uh-jh']rhV)r}r(h,XC``settings`` - the current :ref:`Scrapy settings `rh-jh.h1h3hZh5}r(h9]h:]h8]h7]h;]uh=KYh']r(h)r}r(h,X ``settings``h5}r(h9]h:]h8]h7]h;]uh-jh']rhRXsettingsrr}r(h,Uh-jubah3hubhRX - the current rr}r(h,X - the current h-jubh)r}r(h,X(:ref:`Scrapy settings `rh-jh.h1h3hh5}r(UreftypeXrefh܈hXtopics-settingsU refdomainXstdrh7]h8]U refexplicith9]h:]h;]hhuh=KYh']rcdocutils.nodes emphasis r)r}r(h,jh5}r(h9]h:]r(hjXstd-refreh8]h7]h;]uh-jh']rhRXScrapy settingsrr}r(h,Uh-jubah3Uemphasisrubaubeubah3jZubeh3jubaubeubeubh?)r}r(h,Uh-h@h.h1h3hDh5}r(h9]h:]h8]h7]rhah;]rhauh=K\h>hh']r(hK)r}r(h,XExample of shell sessionrh-jh.h1h3hOh5}r(h9]h:]h8]h7]h;]uh=K\h>hh']rhRXExample of shell sessionrr}r(h,jh-jubaubhV)r}r(h,X`Here's an example of a typical shell session where we start by scraping the http://scrapy.org page, and then proceed to scrape the http://slashdot.org page. Finally, we modify the (Slashdot) request method to POST and re-fetch it getting a HTTP 405 (method not allowed) error. 
After that, we can start playing with the objects::

    >>> sel.xpath("//h2/text()").extract()[0]
    u'Welcome to Scrapy'

    >>> fetch("http://slashdot.org")
    [s] Available Scrapy objects:
    [s]   sel        <Selector ...>
    [s]   item       JobItem()
    [s]   request    <GET http://slashdot.org>
    [s]   response   <200 http://slashdot.org>
    [s]   settings   <Settings ...>
    [s]   spider     <Spider ...>
    [s] Useful shortcuts:
    [s]   shelp()           Shell help (print this help)
    [s]   fetch(req_or_url) Fetch request (or URL) and update local objects
    [s]   view(response)    View response in a browser

    >>> sel.xpath("//h2/text()").extract()
    [u'News for nerds, stuff that matters']

    >>> request = request.replace(method="POST")

    >>> fetch(request)
    2009-04-03 00:57:39-0300 [default] ERROR: Downloading <...> from <...>: 405 Method Not Allowed

    >>>
    [s]   fetch(req_or_url) Fetch a new request or URL and update objects
    [s]   view(response)    View response in a browser

    >>>

After that, we can start playing with the objects::

    >>> sel.xpath("//h2/text()").extract()[0]
    u'Welcome to Scrapy'

    >>> fetch("http://slashdot.org")
    [s] Available Scrapy objects:
    [s]   sel
    [s]   item       JobItem()
    [s]   request
    [s]   response   <200 http://slashdot.org>
    [s]   settings
    [s]   spider
    [s] Useful shortcuts:
    [s]   shelp()           Shell help (print this help)
    [s]   fetch(req_or_url) Fetch request (or URL) and update local objects
    [s]   view(response)    View response in a browser

    >>> sel.xpath("//h2/text()").extract()
    [u'News for nerds, stuff that matters']

    >>> request = request.replace(method="POST")

    >>> fetch(request)
    2009-04-03 00:57:39-0300 [default] ERROR: Downloading from : 405 Method Not Allowed

    >>>

.. _topics-shell-inspect-response:

Invoking the shell from spiders to inspect responses
====================================================

Sometimes you want to inspect the responses that are being processed at a
certain point of your spider, if only to check that the response you expect
is getting there.

This can be achieved by using the ``scrapy.shell.inspect_response`` function.

Here's an example of how you would call it from your spider::

    class MySpider(Spider):
        ...
        def parse(self, response):
            if response.url == 'http://www.example.com/products.php':
                from scrapy.shell import inspect_response
                inspect_response(response)

            # ... your parsing code ...

When you run the spider, you will get something similar to this::

    2009-08-27 19:15:25-0300 [example.com] DEBUG: Crawled (referer: )
    2009-08-27 19:15:26-0300 [example.com] DEBUG: Crawled (referer: )
    [s] Available objects
    [s]   sel
    ...

    >>> response.url
    'http://www.example.com/products.php'

Then, you can check if the extraction code is working::

    >>> sel.xpath('//h1')
    []

Nope, it doesn't. So you can open the response in your web browser and see
if it's the response you were expecting::

    >>> view(response)
    >>>

Finally you hit Ctrl-D (or Ctrl-Z in Windows) to exit the shell and resume
the crawling::

    >>> ^D
    2009-08-27 19:15:25-0300 [example.com] DEBUG: Crawled (referer: )
    2009-08-27 19:15:25-0300 [example.com] DEBUG: Crawled (referer: )
    # ...

Note that you can't use the ``fetch`` shortcut here since the Scrapy engine
is blocked by the shell.
However, after you leave the shell, the spider will continue crawling where
it stopped, as shown above.
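As a small extension of the pattern above, here is a sketch (illustrative
only, not from the original document; the ``DEBUG_URLS`` set and the spider
details are hypothetical) of gating ``inspect_response`` behind a list of
URLs you care about, so the crawl only pauses on pages you actually want to
inspect::

    from scrapy.spider import Spider
    from scrapy.shell import inspect_response

    # Hypothetical list of pages worth a closer look
    DEBUG_URLS = set([
        'http://www.example.com/products.php',
    ])

    class MySpider(Spider):
        name = 'example.com'

        def parse(self, response):
            # Drop into the interactive shell only for the pages under
            # study; every other response is parsed without interruption.
            if response.url in DEBUG_URLS:
                inspect_response(response)
            # ... your parsing code ...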
.. _topics-djangoitem:

DjangoItem
==========

:class:`DjangoItem` is a class of item that gets its fields definition from
a Django model; you simply create a :class:`DjangoItem` and specify what
Django model it relates to.

Besides getting the model fields defined on your item, :class:`DjangoItem`
provides a method to create and populate a Django model instance with the
item data.

Using DjangoItem
----------------

:class:`DjangoItem` works much like ModelForms in Django: you create a
subclass and define its ``django_model`` attribute to be a valid Django
model.
With this you will get an item with a field for each Django model field.

In addition, you can define fields that aren't present in the model, and
even override fields that are present in the model, by defining them in the
item.

Let's see some examples:

Creating a Django model for the examples::

    from django.db import models

    class Person(models.Model):
        name = models.CharField(max_length=255)
        age = models.IntegerField()

Defining a basic :class:`DjangoItem`::

    from scrapy.contrib.djangoitem import DjangoItem

    class PersonItem(DjangoItem):
        django_model = Person

:class:`DjangoItem` works just like :class:`~scrapy.item.Item`::

    >>> p = PersonItem()
    >>> p['name'] = 'John'
    >>> p['age'] = '22'

To obtain the Django model from the item, we call the extra method
:meth:`~DjangoItem.save` of the :class:`DjangoItem`::
    >>> person = p.save()
    >>> person.name
    'John'
    >>> person.age
    '22'
    >>> person.id
    1

The model is already saved when we call :meth:`~DjangoItem.save`; we can
prevent this by calling it with ``commit=False``, which makes
:meth:`~DjangoItem.save` return an unsaved model::

    >>> person = p.save(commit=False)
    >>> person.name
    'John'
    >>> person.age
    '22'
    >>> person.id
    None

As said before, we can add other fields to the item::

    class PersonItem(DjangoItem):
        django_model = Person
        sex = Field()

::

    >>> p = PersonItem()
    >>> p['name'] = 'John'
    >>> p['age'] = '22'
    >>> p['sex'] = 'M'

.. note:: Fields added to the item won't be taken into account when doing a
   :meth:`~DjangoItem.save`.

And we can override the fields of the model with our own::

    class PersonItem(DjangoItem):
        django_model = Person
        name = Field(default='No Name')
This is useful to provide properties to the field, like a default or any
other property that your project uses.

DjangoItem caveats
------------------

DjangoItem is a rather convenient way to integrate Scrapy projects with
Django models, but bear in mind that the Django ORM may not scale well if
you scrape a lot of items (i.e. millions) with Scrapy. This is because a
relational backend is often not a good choice for a write-intensive
application (such as a web crawler), especially if the database is highly
normalized and has many indices.

Django settings set up
----------------------

To use the Django models outside the Django application you need to set up
the ``DJANGO_SETTINGS_MODULE`` environment variable and, in most cases,
modify the ``PYTHONPATH`` environment variable to be able to import the
settings module.

There are many ways to do this depending on your use case and preferences;
one of the simplest is detailed below.

Suppose your Django project is named ``mysite``, is located in the path
``/home/projects/mysite`` and you have created an app ``myapp`` with the
model ``Person``.
That means your directory structure is something like this::

    /home/projects/mysite
    ├── manage.py
    ├── myapp
    │   ├── __init__.py
    │   ├── models.py
    │   ├── tests.py
    │   └── views.py
    └── mysite
        ├── __init__.py
        ├── settings.py
        ├── urls.py
        └── wsgi.py

Then you need to add ``/home/projects/mysite`` to the ``PYTHONPATH``
environment variable and set the ``DJANGO_SETTINGS_MODULE`` environment
variable to ``mysite.settings``. That can be done in your Scrapy project's
settings file by adding the lines below::

    import sys
    sys.path.append('/home/projects/mysite')

    import os
    os.environ['DJANGO_SETTINGS_MODULE'] = 'mysite.settings'

Notice that we modify the ``sys.path`` variable instead of the
``PYTHONPATH`` environment variable, as we are already within the python
runtime.
If everything is right, you should be able to start the ``scrapy shell``
command and import the model ``Person``
(i.e. ``from myapp.models import Person``).
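To round the set-up off, here is a minimal item-pipeline sketch (illustrative
only, not part of the original document; the pipeline class name is
hypothetical) that persists every ``PersonItem`` defined above through the
Django ORM as the spider produces items, assuming the settings configuration
just described::

    class DjangoSavePipeline(object):
        """Hypothetical pipeline: saves each scraped item via Django ORM."""

        def process_item(self, item, spider):
            # DjangoItem.save() creates and populates the Django model
            # instance behind the item, committing it to the database.
            item.save()
            return item

Remember that such a pipeline must also be enabled through the
``ITEM_PIPELINES`` setting for Scrapy to use it.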
.. _topics-contracts:

Spiders Contracts
=================

.. versionadded:: 0.15

.. note:: This is a new feature (introduced in Scrapy 0.15) and may be
   subject to minor functionality/API updates. Check the
   :ref:`release notes <news>` to be notified of updates.

Testing spiders can get particularly annoying and, while nothing prevents
you from writing unit tests, the task gets cumbersome quickly. Scrapy offers
an integrated way of testing your spiders by the means of contracts.

This allows you to test each callback of your spider by hardcoding a sample
url and checking various constraints for how the callback processes the
response. Each contract is prefixed with an ``@`` and included in the
docstring.
See the following example::

    def parse(self, response):
        """ This function parses a sample response. Some contracts are
        mingled with this docstring.

        @url http://www.amazon.com/s?field-keywords=selfish+gene
        @returns items 1 16
        @returns requests 0 0
        @scrapes Title Author Year Price
        """

This callback is tested using three built-in contracts:

.. module:: scrapy.contracts.default
.. class:: UrlContract

    This contract (``@url``) sets the sample url used when checking other
    contract conditions for this spider. This contract is mandatory. All
    callbacks lacking this contract are ignored when running the checks::

        @url url

.. class:: ReturnsContract

    This contract (``@returns``) sets lower and upper bounds for the items
    and requests returned by the spider. The upper bound is optional::

        @returns item(s)|request(s) [min [max]]

.. class:: ScrapesContract

    This contract (``@scrapes``) checks that all the items returned by the
    callback have the specified fields::

        @scrapes field_1 field_2 ...

Use the :command:`check` command to run the contract checks.
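For example, the checks for a single spider (here the hypothetical name
``myspider``) can be run with::

    scrapy check myspider

Omitting the spider name runs the checks for every spider in the project.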
Custom Contracts
----------------

If you find you need more power than the built-in scrapy contracts, you can
create and load your own contracts in the project by using the
:setting:`SPIDER_CONTRACTS` setting::

    SPIDER_CONTRACTS = {
        'myproject.contracts.ResponseCheck': 10,
        'myproject.contracts.ItemValidate': 10,
    }

Each contract must inherit from :class:`scrapy.contracts.Contract` and can
override three methods:

.. module:: scrapy.contracts

.. class:: Contract(method, *args)
    :param method: callback function to which the contract is associated
    :type method: function

    :param args: list of arguments passed into the docstring (whitespace
        separated)
    :type args: list

    .. method:: adjust_request_args(args)

        This receives a ``dict`` as an argument containing default arguments
        for the :class:`~scrapy.http.Request` object.
        Must return the same or a modified version of it.

    .. method:: pre_process(response)

        This allows hooking in various checks on the response received from
        the sample request, before it's passed to the callback.

    .. method:: post_process(output)

        This allows processing the output of the callback. Iterators are
        converted to lists before being passed to this hook.

Here is a demo contract which checks the presence of a custom header in the
response received.
Raise :class:`scrapy.exceptions.ContractFail` in order to get the failures
pretty printed::

    from scrapy.contracts import Contract
    from scrapy.exceptions import ContractFail

    class HasHeaderContract(Contract):
        """ Demo contract which checks the presence of a custom header
        @has_header X-CustomHeader
        """
        name = 'has_header'

        def pre_process(self, response):
            for header in self.args:
                if header not in response.headers:
                    raise ContractFail('X-CustomHeader not present')
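For completeness, here is a sketch of contracts exercising the other two
hooks. These are illustrative only (the contract names ``post`` and
``max_items`` are hypothetical, not shipped with Scrapy):
``adjust_request_args`` tweaks how the sample request is built, while
``post_process`` validates what the callback returned::

    from scrapy.contracts import Contract
    from scrapy.exceptions import ContractFail

    class PostContract(Contract):
        """ Forces the sample request to be sent as a POST
        @post
        """
        name = 'post'

        def adjust_request_args(self, args):
            # args is the dict of default Request keyword arguments;
            # return it (modified or not) to control how the request is built
            args['method'] = 'POST'
            return args

    class MaxItemsContract(Contract):
        """ Fails when the callback returns too many objects
        @max_items 10
        """
        name = 'max_items'

        def post_process(self, output):
            # output is the callback's output, already converted to a list
            limit = int(self.args[0])
            if len(output) > limit:
                raise ContractFail(
                    'returned %d objects, expected at most %d'
                    % (len(output), limit))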
.. _topics-firebug:

Using Firebug for scraping
==========================

Introduction
------------

In this example, we'll show how to use `Firebug`_ to scrape data from the
`Google Directory`_, which contains the same data as the `Open Directory
Project`_ used in the :ref:`tutorial <intro-tutorial>` but with a different
face.

.. _Firebug: http://getfirebug.com
.. _Google Directory: http://directory.google.com/
.. _Open Directory Project: http://www.dmoz.org

Firebug comes with a very useful feature called `Inspect Element`_ which
allows you to inspect the HTML code of the different page elements just by
hovering your mouse over them.
Otherwise you would have to search for the tags manually through the HTML
body, which can be a very tedious task.

.. _Inspect Element: http://www.youtube.com/watch?v=-pT_pDe54aA

In the following screenshot you can see the `Inspect Element`_ tool in
action.

.. image:: _images/firebug1.png
   :width: 913
   :height: 600
   :alt: Inspecting elements with Firebug

At first sight, we can see that the directory is divided into categories,
which are also divided into subcategories.

However, it seems that there are more subcategories than the ones being
shown on this page, so we'll keep looking:

.. image:: _images/firebug2.png
   :width: 819
   :height: 629
   :alt: Inspecting elements with Firebug

As expected, the subcategories contain links to other subcategories, and
also links to actual websites, which is the purpose of the directory.

Getting links to follow
-----------------------

With the help of Firebug, we'll take a look at some page containing links
to websites (say http://directory.google.com/Top/Arts/Awards/) and find out
how we can extract those links using :ref:`Selectors <topics-selectors>`.
We'll also use the :ref:`Scrapy shell <topics-shell>` to test those XPaths
and make sure they work as we expect.
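To follow along, the page under study can be loaded into the shell directly
(a sketch mirroring the shell invocation shown earlier in this document)::

    scrapy shell 'http://directory.google.com/Top/Arts/Awards/'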
[Screenshot: Inspecting elements with Firebug (_images/firebug3.png, 965×751)]

As you can see, the page markup is not very descriptive: the elements don't contain id, class or any attribute that clearly identifies them, so we'll use the ranking bars as a reference point to select the data to extract when we construct our XPaths.

After using Firebug, we can see that each link is inside a td tag, which is itself inside a tr tag that also contains the link's ranking bar (in another td).

So we can select the ranking bar, then find its parent (the tr), and then finally, the link's td (which contains the data we want to scrape).

This results in the following XPath:

    //td[descendant::a[contains(@href, "#pagerank")]]/following-sibling::td//a

It's important to use the Scrapy shell to test these complex XPath expressions and make sure they work as expected.
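For instance, you can try the expression interactively (a minimal sketch: sel is the Selector object the Scrapy shell provides, and the actual output depends on the live page):

    scrapy shell "http://directory.google.com/Top/Arts/Awards/"
    >>> sel.xpath('//td[descendant::a[contains(@href, "#pagerank")]]'
    ...           '/following-sibling::td//a/@href').extract()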
"#pagerank")]]/following-sibling::td//arr}r(h(Uh)jubaubhX)r}r(h(XIt's important to use the :ref:`Scrapy shell ` to test these complex XPath expressions and make sure they work as expected.h)jh*h-h/h[h1}r(h5]h6]h4]h3]h7]uh9Kh:hh#]r(hNXIt's important to use the rr}r(h(XIt's important to use the h)jubhp)r}r(h(X":ref:`Scrapy shell `rh)jh*h-h/hth1}r(UreftypeXrefhvhwX topics-shellU refdomainXstdrh3]h4]U refexplicith5]h6]h7]hyhzuh9Kh#]rh|)r}r(h(jh1}r(h5]h6]r(hjXstd-refreh4]h3]h7]uh)jh#]rhNX Scrapy shellrr}r(h(Uh)jubah/hubaubhNXM to test these complex XPath expressions and make sure they work as expected.rr}r(h(XM to test these complex XPath expressions and make sure they work as expected.h)jubeubhX)r}r(h(XBasically, that expression will look for the ranking bar's ``td`` element, and then select any ``td`` element who has a descendant ``a`` element whose ``href`` attribute contains the string ``#pagerank``"h)jh*h-h/h[h1}r(h5]h6]h4]h3]h7]uh9Kh:hh#]r(hNX;Basically, that expression will look for the ranking bar's rr}r(h(X;Basically, that expression will look for the ranking bar's h)jubj)r}r(h(X``td``h1}r(h5]h6]h4]h3]h7]uh)jh#]rhNXtdrr}r(h(Uh)jubah/jubhNX element, and then select any rr}r(h(X element, and then select any h)jubj)r}r(h(X``td``h1}r(h5]h6]h4]h3]h7]uh)jh#]rhNXtdrr}r(h(Uh)jubah/jubhNX element who has a descendant rr}r(h(X element who has a descendant h)jubj)r}r(h(X``a``h1}r(h5]h6]h4]h3]h7]uh)jh#]rhNXar}r(h(Uh)jubah/jubhNX element whose rr}r(h(X element whose h)jubj)r}r(h(X``href``h1}r(h5]h6]h4]h3]h7]uh)jh#]rhNXhrefrr}r(h(Uh)jubah/jubhNX attribute contains the string rr}r(h(X attribute contains the string h)jubj)r}r(h(X ``#pagerank``h1}r(h5]h6]h4]h3]h7]uh)jh#]rhNX #pagerankrr}r(h(Uh)jubah/jubhNX"r}r (h(X"h)jubeubhX)r }r (h(XOf course, this is not the only XPath, and maybe not the simpler one to select that data. Another approach could be, for example, to find any ``font`` tags that have that grey colour of the links,h)jh*h-h/h[h1}r (h5]h6]h4]h3]h7]uh9Kh:hh#]r (hNXOf course, this is not the only XPath, and maybe not the simpler one to select that data. Another approach could be, for example, to find any rr}r(h(XOf course, this is not the only XPath, and maybe not the simpler one to select that data. Another approach could be, for example, to find any h)j ubj)r}r(h(X``font``h1}r(h5]h6]h4]h3]h7]uh)j h#]rhNXfontrr}r(h(Uh)jubah/jubhNX. tags that have that grey colour of the links,rr}r(h(X. 
tags that have that grey colour of the links,h)j ubeubhX)r}r(h(X7Finally, we can write our ``parse_category()`` method::rh)jh*h-h/h[h1}r(h5]h6]h4]h3]h7]uh9Kh:hh#]r(hNXFinally, we can write our r r!}r"(h(XFinally, we can write our h)jubj)r#}r$(h(X``parse_category()``h1}r%(h5]h6]h4]h3]h7]uh)jh#]r&hNXparse_category()r'r(}r)(h(Uh)j#ubah/jubhNX method:r*r+}r,(h(X method:h)jubeubj)r-}r.(h(Xdef parse_category(self, response): sel = Selector(response) # The path to website links in directory page links = sel.xpath('//td[descendant::a[contains(@href, "#pagerank")]]/following-sibling::td/font') for link in links: item = DirectoryItem() item['name'] = link.xpath('a/text()').extract() item['url'] = link.xpath('a/@href').extract() item['description'] = link.xpath('font[2]/text()').extract() yield itemh)jh*h-h/jh1}r/(jjh3]h4]h5]h6]h7]uh9Kh:hh#]r0hNXdef parse_category(self, response): sel = Selector(response) # The path to website links in directory page links = sel.xpath('//td[descendant::a[contains(@href, "#pagerank")]]/following-sibling::td/font') for link in links: item = DirectoryItem() item['name'] = link.xpath('a/text()').extract() item['url'] = link.xpath('a/@href').extract() item['description'] = link.xpath('font[2]/text()').extract() yield itemr1r2}r3(h(Uh)j-ubaubhX)r4}r5(h(XBe aware that you may find some elements which appear in Firebug but not in the original HTML, such as the typical case of ```` elements.h)jh*h-h/h[h1}r6(h5]h6]h4]h3]h7]uh9Kh:hh#]r7(hNX{Be aware that you may find some elements which appear in Firebug but not in the original HTML, such as the typical case of r8r9}r:(h(X{Be aware that you may find some elements which appear in Firebug but not in the original HTML, such as the typical case of h)j4ubj)r;}r<(h(X ````h1}r=(h5]h6]h4]h3]h7]uh)j4h#]r>hNXr?r@}rA(h(Uh)j;ubah/jubhNX elements.rBrC}rD(h(X elements.h)j4ubeubhX)rE}rF(h(XRor tags which Therefer in page HTML sources may on Firebug inspects the live DOMrGh)jh*h-h/h[h1}rH(h5]h6]h4]h3]h7]uh9Kh:hh#]rIhNXRor tags which Therefer in page HTML sources may on Firebug inspects the live DOMrJrK}rL(h(jGh)jEubaubh%)rM}rN(h(Xr.. 
scrapy-0.22/.doctrees/topics/spiders.doctree Spiders — Scrapy 0.22.0 documentation

Spiders

Spiders are classes which define how a certain site (or a group of sites) will be scraped, including how to perform the crawl (i.e. follow links) and how to extract structured data from their pages (i.e. scraping items). In other words, Spiders are the place where you define the custom behaviour for crawling and parsing pages for a particular site (or, in some cases, a group of sites).
For spiders, the scraping cycle goes through something like this:

1. You start by generating the initial Requests to crawl the first URLs, and specify a callback function to be called with the response downloaded from those requests.

   The first requests to perform are obtained by calling the start_requests() method, which (by default) generates a Request for each URL specified in start_urls, with the parse() method as the callback function for those Requests.

2. In the callback function, you parse the response (web page) and return either Item objects, Request objects, or an iterable of both. Those Requests will also contain a callback (maybe the same one) and will then be downloaded by Scrapy, and their responses handled by the specified callback.

3. In callback functions, you parse the page contents, typically using Selectors (but you can also use BeautifulSoup, lxml or whatever mechanism you prefer) and generate items with the parsed data.

4. Finally, the items returned from the spider will typically be persisted to a database (in some Item Pipeline) or written to a file using Feed exports.
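Putting those four steps together, a minimal spider could look like the following sketch (the site, the XPaths and the MyItem class are hypothetical placeholders):

    import urlparse

    from scrapy.http import Request
    from scrapy.selector import Selector
    from scrapy.spider import Spider

    from myproject.items import MyItem  # hypothetical Item with a 'title' field

    class MySpider(Spider):
        name = 'example.com'
        start_urls = ['http://www.example.com/']  # 1. initial Requests

        def parse(self, response):  # 2. callback for each downloaded response
            sel = Selector(response)  # 3. parse the page with Selectors
            for title in sel.xpath('//h1/text()').extract():
                yield MyItem(title=title)  # 4. items go to pipelines/exports
            for href in sel.xpath('//a/@href').extract():
                # more Requests, handled by the same callback (back to step 2)
                yield Request(urlparse.urljoin(response.url, href),
                              callback=self.parse)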
Even though this cycle applies (more or less) to any kind of spider, there are different kinds of default spiders bundled into Scrapy for different purposes. We will talk about those types here.

Spider arguments

Spiders can receive arguments that modify their behaviour. Some common uses for spider arguments are to define the start URLs or to restrict the crawl to certain sections of the site, but they can be used to configure any functionality of the spider.
Spider arguments are passed through the crawl command using the -a option. For example:

    scrapy crawl myspider -a category=electronics

Spiders receive arguments in their constructors:

    class MySpider(Spider):
        name = 'myspider'

        def __init__(self, category=None, *args, **kwargs):
            super(MySpider, self).__init__(*args, **kwargs)
            self.start_urls = ['http://www.example.com/categories/%s' % category]
            # ...

Spider arguments can also be passed through the Scrapyd schedule.json API. See the Scrapyd documentation (http://scrapyd.readthedocs.org/).
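With Scrapyd, extra parameters to schedule.json are forwarded to the spider as arguments. A sketch, assuming a Scrapyd server listening on the default localhost:6800 and a deployed project named myproject (both hypothetical here):

    curl http://localhost:6800/schedule.json \
         -d project=myproject \
         -d spider=myspider \
         -d category=electronics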
Built-in spiders reference

Scrapy comes with some useful generic spiders that you can subclass your spiders from. Their aim is to provide convenient functionality for a few common scraping cases, like following all links on a site based on certain rules, crawling from Sitemaps (http://www.sitemaps.org), or parsing an XML/CSV feed.

For the examples used in the following spiders, we'll assume you have a project with a TestItem declared in a myproject.items module:

    from scrapy.item import Item, Field

    class TestItem(Item):
        id = Field()
        name = Field()
        description = Field()

Spider

class scrapy.spider.Spider

This is the simplest spider, and the one from which every other spider must inherit (either the ones that come bundled with Scrapy, or the ones that you write yourself). It doesn't provide any special functionality. It just requests the given start_urls/start_requests, and calls the spider's parse method for each of the resulting responses.
name

A string which defines the name for this spider. The spider name is how the spider is located (and instantiated) by Scrapy, so it must be unique. However, nothing prevents you from instantiating more than one instance of the same spider. This is the most important spider attribute and it's required.

If the spider scrapes a single domain, a common practice is to name the spider after the domain, with or without the TLD (http://en.wikipedia.org/wiki/Top-level_domain). So, for example, a spider that crawls mywebsite.com would often be called mywebsite.

allowed_domains

An optional list of strings containing domains that this spider is allowed to crawl. Requests for URLs not belonging to the domain names specified in this list won't be followed if OffsiteMiddleware is enabled.
start_urls

A list of URLs where the spider will begin to crawl from, when no particular URLs are specified. So, the first pages downloaded will be those listed here. The subsequent URLs will be generated successively from data contained in the start URLs.

start_requests()

This method must return an iterable with the first Requests to crawl for this spider.

This is the method called by Scrapy when the spider is opened for scraping when no particular URLs are specified. If particular URLs are specified, make_requests_from_url() is used instead to create the Requests. This method is also called only once by Scrapy, so it's safe to implement it as a generator.
The default implementation uses make_requests_from_url() to generate Requests for each url in start_urls.

If you want to change the Requests used to start scraping a domain, this is the method to override. For example, if you need to start by logging in using a POST request, you could do:

    def start_requests(self):
        return [FormRequest("http://www.example.com/login",
                            formdata={'user': 'john', 'pass': 'secret'},
                            callback=self.logged_in)]

    def logged_in(self, response):
        # here you would extract links to follow and return Requests for
        # each of them, with another callback
        pass

make_requests_from_url(url)

A method that receives a URL and returns a Request object (or a list of Request objects) to scrape.
This method is used to construct the initial requests in the start_requests() method, and is typically used to convert URLs to requests.

Unless overridden, this method returns Requests with the parse() method as their callback function, and with the dont_filter parameter enabled (see the Request class for more info).
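For example, you could override it to tag every initial request with extra metadata while keeping the default behaviour (a sketch; the from_start_urls meta key is a hypothetical illustration):

    from scrapy.http import Request
    from scrapy.spider import Spider

    class MySpider(Spider):
        name = 'example.com'
        start_urls = ['http://www.example.com/']

        def make_requests_from_url(self, url):
            # Preserve the default behaviour (dont_filter enabled, parse as
            # callback) while marking each initial request.
            return Request(url, dont_filter=True, callback=self.parse,
                           meta={'from_start_urls': True})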
parse(response)

This is the default callback used by Scrapy to process downloaded responses, when their requests don't specify a callback.

The parse method is in charge of processing the response and returning scraped data and/or more URLs to follow. Other Request callbacks have the same requirements as the Spider class.

This method, as well as any other Request callback, must return an iterable of Request and/or Item objects.

Parameters: response (Response) – the response to parse

log(message, [level, component])

Log a message using the scrapy.log.msg() function, automatically populating the spider argument with the name of this spider. For more information see Logging.
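For instance, from inside a callback (a minimal sketch; log.INFO is one of the levels defined in scrapy.log):

    from scrapy import log

    def parse(self, response):
        self.log('A response arrived from %s' % response.url, level=log.INFO)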
Spider example

Let's see an example:

    from scrapy import log  # This module is useful for printing out debug information
    from scrapy.spider import Spider

    class MySpider(Spider):
        name = 'example.com'
        allowed_domains = ['example.com']
        start_urls = [
            'http://www.example.com/1.html',
            'http://www.example.com/2.html',
            'http://www.example.com/3.html',
        ]

        def parse(self, response):
            self.log('A response from %s just arrived!' % response.url)
Another example returning multiple Requests and Items from a single callback:

    from scrapy.selector import Selector
    from scrapy.spider import Spider
    from scrapy.http import Request
    from myproject.items import MyItem

    class MySpider(Spider):
        name = 'example.com'
        allowed_domains = ['example.com']
        start_urls = [
            'http://www.example.com/1.html',
            'http://www.example.com/2.html',
            'http://www.example.com/3.html',
        ]

        def parse(self, response):
            sel = Selector(response)
            for h3 in sel.xpath('//h3').extract():
                yield MyItem(title=h3)

            for url in sel.xpath('//a/@href').extract():
                yield Request(url, callback=self.parse)

CrawlSpider

class scrapy.contrib.spiders.CrawlSpider

This is the most commonly used spider for crawling regular websites, as it provides a convenient mechanism for following links by defining a set of rules. It may not be the best suited for your particular web sites or project, but it's generic enough for several cases, so you can start from it and override it as needed for more custom functionality, or just implement your own spider.
Apart from the attributes inherited from Spider (that you must specify), this class supports a new attribute:

rules

A list of one (or more) Rule objects. Each Rule defines a certain behaviour for crawling the site. Rule objects are described below. If multiple rules match the same link, the first one will be used, according to the order they're defined in this attribute.
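A typical rules declaration looks like the sketch below (SgmlLinkExtractor was the stock link extractor at the time; the URL patterns are illustrative):

    from scrapy.contrib.spiders import CrawlSpider, Rule
    from scrapy.contrib.linkextractors.sgml import SgmlLinkExtractor

    class MySpider(CrawlSpider):
        name = 'example.com'
        start_urls = ['http://www.example.com']

        rules = (
            # Extract and follow links matching category.php (no callback
            # given, so follow defaults to True).
            Rule(SgmlLinkExtractor(allow=('category\.php', ))),
            # Extract links matching item.php and parse them with parse_item.
            Rule(SgmlLinkExtractor(allow=('item\.php', )), callback='parse_item'),
        )

        def parse_item(self, response):
            self.log('Hi, this is an item page! %s' % response.url)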
This spider also exposes an overrideable method:

parse_start_url(response)

This method is called for the start_urls responses. It allows parsing the initial responses, and must return either an Item object, a Request object, or an iterable containing any of them.

Crawling rules

class scrapy.contrib.spiders.Rule(link_extractor, callback=None, cb_kwargs=None, follow=None, process_links=None, process_request=None)

link_extractor is a Link Extractor object which defines how links will be extracted from each crawled page.

callback is a callable or a string (in which case a method from the spider object with that name will be used) to be called for each link extracted with the specified link_extractor. This callback receives a response as its first argument and must return a list containing Item and/or Request objects (or any subclass of them).
    Warning: When writing crawl spider rules, avoid using parse as the callback, since CrawlSpider uses the parse method itself to implement its logic. So if you override the parse method, the crawl spider will no longer work.

    cb_kwargs is a dict containing the keyword arguments to be passed to the callback function.

    follow is a boolean which specifies whether links should be followed from each response extracted with this rule.
    If callback is None, follow defaults to True; otherwise it defaults to False.

    process_links is a callable, or a string (in which case a method from the spider object with that name will be used), which will be called for each list of links extracted from each response using the specified link_extractor. This is mainly used for filtering purposes.
    process_request is a callable, or a string (in which case a method from the spider object with that name will be used), which will be called with every request extracted by this rule, and must return a request or None (to filter out the request).

CrawlSpider example

Let's now take a look at an example CrawlSpider with rules::

    from scrapy.contrib.spiders import CrawlSpider, Rule
    from scrapy.contrib.linkextractors.sgml import SgmlLinkExtractor
    from scrapy.selector import Selector
    from myproject.items import TestItem

    class MySpider(CrawlSpider):
        name = 'example.com'
        allowed_domains = ['example.com']
        start_urls = ['http://www.example.com']

        rules = (
            # Extract links matching 'category.php' (but not matching 'subsection.php')
            # and follow links from them (since no callback means follow=True by default).
            Rule(SgmlLinkExtractor(allow=('category\.php', ), deny=('subsection\.php', ))),

            # Extract links matching 'item.php' and parse them with the spider's parse_item method.
            Rule(SgmlLinkExtractor(allow=('item\.php', )), callback='parse_item'),
        )

        def parse_item(self, response):
            self.log('Hi, this is an item page! %s' % response.url)

            sel = Selector(response)
            # TestItem is assumed to declare the id, name and description fields;
            # a bare scrapy.item.Item has no fields to assign to.
            item = TestItem()
            item['id'] = sel.xpath('//td[@id="item_id"]/text()').re(r'ID: (\d+)')
            item['name'] = sel.xpath('//td[@id="item_name"]/text()').extract()
            item['description'] = sel.xpath('//td[@id="item_description"]/text()').extract()
            return item

This spider would start crawling example.com's home page, collecting category links and item links, and parsing the latter with the parse_item method. For each item response, some data will be extracted from the HTML using XPath, and an Item will be filled with it.
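The process_links and process_request hooks from the Rule signature can be layered onto rules like these. A minimal sketch, assuming a hypothetical spider method that drops links to an archive section::

    from scrapy.contrib.spiders import CrawlSpider, Rule
    from scrapy.contrib.linkextractors.sgml import SgmlLinkExtractor

    class MyFilteringSpider(CrawlSpider):
        name = 'example.com'
        start_urls = ['http://www.example.com']

        rules = (
            Rule(SgmlLinkExtractor(allow=('item\.php', )),
                 callback='parse_item', process_links='drop_archive_links'),
        )

        def drop_archive_links(self, links):
            # Receives the list of links extracted by the rule; returning a
            # filtered list keeps only the links that should be crawled.
            return [link for link in links if '/archive/' not in link.url]

        def parse_item(self, response):
            pass # ... scrape item here ...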
XMLFeedSpider

class scrapy.contrib.spiders.XMLFeedSpider

    XMLFeedSpider is designed for parsing XML feeds by iterating through them by a certain node name. The iterator can be chosen from: iternodes, xml, and html. It's recommended to use the iternodes iterator for performance reasons, since the xml and html iterators generate the whole DOM at once in order to parse it. However, using html as the iterator may be useful when parsing XML with bad markup.
    To set the iterator and the tag name, you must define the following class attributes:

iterator

    A string which defines the iterator to use. It can be either:

    - 'iternodes' - a fast iterator based on regular expressions
    - 'html' - an iterator which uses Selector. Keep in mind this uses DOM parsing and must load all the DOM in memory, which could be a problem for big feeds
    - 'xml' - an iterator which uses Selector. Keep in mind this uses DOM parsing and must load all the DOM in memory, which could be a problem for big feeds

    It defaults to 'iternodes'.
itertag

    A string with the name of the node (or element) to iterate in. Example::

        itertag = 'product'

namespaces

    A list of (prefix, uri) tuples which define the namespaces available in the document that will be processed with this spider.
    The prefix and uri will be used to automatically register namespaces using the register_namespace() method.

    You can then specify nodes with namespaces in the itertag attribute. Example::

        class YourSpider(XMLFeedSpider):

            namespaces = [('n', 'http://www.sitemaps.org/schemas/sitemap/0.9')]
            itertag = 'n:url'
            # ...

Apart from these new attributes, this spider also has the following overrideable methods:

adapt_response(response)

    A method that receives the response as soon as it arrives from the spider middleware, before the spider starts parsing it. It can be used to modify the response body before parsing it. This method receives a response and also returns a response (it could be the same or another one).
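    For example, a minimal sketch that cleans up the body before parsing (the marker string removed here is hypothetical)::

        from scrapy.contrib.spiders import XMLFeedSpider

        class MySpider(XMLFeedSpider):
            name = 'example.com'
            start_urls = ['http://www.example.com/feed.xml']
            itertag = 'item'

            def adapt_response(self, response):
                # Return a new response whose body has the bogus marker removed;
                # response.replace() builds a copy with the given attributes changed.
                body = response.body.replace('<!--bogus marker-->', '')
                return response.replace(body=body)

            def parse_node(self, response, node):
                pass # ... build and return an item here ...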
parse_node(response, selector)

    This method is called for the nodes matching the provided tag name (itertag). It receives the response and a Selector for each node. Overriding this method is mandatory; otherwise, your spider won't work. This method must return either an Item object, a Request object, or an iterable containing any of them.
process_results(response, results)

    This method is called for each result (item or request) returned by the spider, and it's intended to perform any last-minute processing required before returning the results to the framework core, for example setting the item IDs. It receives a list of results and the response which originated those results. It must return a list of results (Items or Requests).
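    A minimal sketch of that hook, tagging each scraped item with the URL it came from (the source_url field is an assumption, not part of the API)::

        from scrapy.contrib.spiders import XMLFeedSpider
        from scrapy.item import Item

        class MySpider(XMLFeedSpider):
            name = 'example.com'
            start_urls = ['http://www.example.com/feed.xml']
            itertag = 'item'

            def process_results(self, response, results):
                # results may mix items and requests; only tag the items.
                for result in results:
                    if isinstance(result, Item):
                        result['source_url'] = response.url
                return results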
XMLFeedSpider example

These spiders are pretty easy to use; let's have a look at one example::

    from scrapy import log
    from scrapy.contrib.spiders import XMLFeedSpider
    from myproject.items import TestItem

    class MySpider(XMLFeedSpider):
        name = 'example.com'
        allowed_domains = ['example.com']
        start_urls = ['http://www.example.com/feed.xml']
        iterator = 'iternodes' # This is actually unnecessary, since it's the default value
        itertag = 'item'

        def parse_node(self, response, node):
            log.msg('Hi, this is a <%s> node!: %s' % (self.itertag, ''.join(node.extract())))

            item = TestItem()
            item['id'] = node.xpath('@id').extract()
            item['name'] = node.xpath('name').extract()
            item['description'] = node.xpath('description').extract()
            return item

Basically, what we did up there was to create a spider that downloads a feed from the given start_urls, iterates through each of its item tags, prints them out, and stores some data in an Item.
CSVFeedSpider

class scrapy.contrib.spiders.CSVFeedSpider

    This spider is very similar to the XMLFeedSpider, except that it iterates over rows instead of nodes. The method that gets called on each iteration is parse_row().

delimiter

    A string with the separator character for each field in the CSV file. Defaults to ',' (comma).

headers

    A list of the column names in the CSV feed, which will be used to extract fields from it.
parse_row(response, row)

    Receives a response and a dict (representing each row) with a key for each provided (or detected) header of the CSV file. This spider also gives the opportunity to override the adapt_response and process_results methods for pre- and post-processing purposes.

CSVFeedSpider example

Let's see an example similar to the previous one, but using a CSVFeedSpider::

    from scrapy import log
    from scrapy.contrib.spiders import CSVFeedSpider
    from myproject.items import TestItem

    class MySpider(CSVFeedSpider):
        name = 'example.com'
        allowed_domains = ['example.com']
        start_urls = ['http://www.example.com/feed.csv']
        delimiter = ';'
        headers = ['id', 'name', 'description']

        def parse_row(self, response, row):
            log.msg('Hi, this is a row!: %r' % row)

            item = TestItem()
            item['id'] = row['id']
            item['name'] = row['name']
            item['description'] = row['description']
            return item
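Since CSVFeedSpider inherits adapt_response and process_results, feed quirks can be handled the same way as with XMLFeedSpider. A minimal sketch, assuming a feed whose body starts with a UTF-8 byte order mark that would otherwise stick to the first header::

    from scrapy.contrib.spiders import CSVFeedSpider

    class MySpider(CSVFeedSpider):
        name = 'example.com'
        start_urls = ['http://www.example.com/feed.csv']
        headers = ['id', 'name', 'description']

        def adapt_response(self, response):
            # Strip the BOM, if present, before the rows are parsed.
            if response.body.startswith('\xef\xbb\xbf'):
                return response.replace(body=response.body[3:])
            return response

        def parse_row(self, response, row):
            pass # ... build and return an item from the row dict here ...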
SitemapSpider

class scrapy.contrib.spiders.SitemapSpider

    SitemapSpider allows you to crawl a site by discovering the URLs using Sitemaps_.

    It supports nested sitemaps and discovering sitemap urls from `robots.txt`_.

sitemap_urls

    A list of urls pointing to the sitemaps whose urls you want to crawl.

    You can also point to a `robots.txt`_ and it will be parsed to extract sitemap urls from it.
}r" (h\XA list of tuples h]j ubh)r# }r$ (h\X``(regex, callback)``he}r% (hi]hj]hh]hg]hk]uh]j hW]r& hX(regex, callback)r' r( }r) (h\Uh]j# ubahchubhX where:r* r+ }r, (h\X where:h]j ubeubj )r- }r. (h\Uh]j h^hahcj| he}r/ (j X*hg]hh]hi]hj]hk]uhmMhnhhW]r0 (h)r1 }r2 (h\X``regex`` is a regular expression to match urls extracted from sitemaps. ``regex`` can be either a str or a compiled regex object. h]j- h^hahchhe}r3 (hi]hj]hh]hg]hk]uhmNhnhhW]r4 h)r5 }r6 (h\X``regex`` is a regular expression to match urls extracted from sitemaps. ``regex`` can be either a str or a compiled regex object.h]j1 h^hahchhe}r7 (hi]hj]hh]hg]hk]uhmMhW]r8 (h)r9 }r: (h\X ``regex``he}r; (hi]hj]hh]hg]hk]uh]j5 hW]r< hXregexr= r> }r? (h\Uh]j9 ubahchubhX@ is a regular expression to match urls extracted from sitemaps. r@ rA }rB (h\X@ is a regular expression to match urls extracted from sitemaps. h]j5 ubh)rC }rD (h\X ``regex``he}rE (hi]hj]hh]hg]hk]uh]j5 hW]rF hXregexrG rH }rI (h\Uh]jC ubahchubhX0 can be either a str or a compiled regex object.rJ rK }rL (h\X0 can be either a str or a compiled regex object.h]j5 ubeubaubh)rM }rN (h\Xcallback is the callback to use for processing the urls that match the regular expression. ``callback`` can be a string (indicating the name of a spider method) or a callable. h]j- h^hahchhe}rO (hi]hj]hh]hg]hk]uhmNhnhhW]rP h)rQ }rR (h\Xcallback is the callback to use for processing the urls that match the regular expression. ``callback`` can be a string (indicating the name of a spider method) or a callable.h]jM h^hahchhe}rS (hi]hj]hh]hg]hk]uhmMhW]rT (hX[callback is the callback to use for processing the urls that match the regular expression. rU rV }rW (h\X[callback is the callback to use for processing the urls that match the regular expression. h]jQ ubh)rX }rY (h\X ``callback``he}rZ (hi]hj]hh]hg]hk]uh]jQ hW]r[ hXcallbackr\ r] }r^ (h\Uh]jX ubahchubhXH can be a string (indicating the name of a spider method) or a callable.r_ r` }ra (h\XH can be a string (indicating the name of a spider method) or a callable.h]jQ ubeubaubeubh)rb }rc (h\X For example::rd h]j h^hahchhe}re (hi]hj]hh]hg]hk]uhmM"hnhhW]rf hX For example:rg rh }ri (h\X For example:h]jb ubaubj)rj }rk (h\X0sitemap_rules = [('/product/', 'parse_product')]h]j h^hahcjhe}rl (jjhg]hh]hi]hj]hk]uhmM$hnhhW]rm hX0sitemap_rules = [('/product/', 'parse_product')]rn ro }rp (h\Uh]jj ubaubh)rq }rr (h\XMRules are applied in order, and only the first one that matches will be used.rs h]j h^hahchhe}rt (hi]hj]hh]hg]hk]uhmM&hnhhW]ru hXMRules are applied in order, and only the first one that matches will be used.rv rw }rx (h\js h]jq ubaubh)ry }rz (h\XeIf you omit this attribute, all urls found in sitemaps will be processed with the ``parse`` callback.h]j h^hahchhe}r{ (hi]hj]hh]hg]hk]uhmM)hnhhW]r| (hXRIf you omit this attribute, all urls found in sitemaps will be processed with the r} r~ }r (h\XRIf you omit this attribute, all urls found in sitemaps will be processed with the h]jy ubh)r }r (h\X ``parse``he}r (hi]hj]hh]hg]hk]uh]jy hW]r hXparser r }r (h\Uh]j ubahchubhX callback.r r }r (h\X callback.h]jy ubeubeubeubjQ)r }r (h\Uh]j h^hahcjThe}r (hg]hh]hi]hj]hk]Uentries]r (jWX?sitemap_follow (scrapy.contrib.spiders.SitemapSpider attribute)h&Utr auhmNhnhhW]ubjl)r }r (h\Uh]j h^hahcjohe}r (jqjrXpyhg]hh]hi]hj]hk]jsX attributer juj uhmNhnhhW]r (jw)r }r (h\Xsitemap_followr h]j h^hahcjzhe}r (hg]r h&aj}jhh]hi]hj]hk]r h&ajXSitemapSpider.sitemap_followjj juhmM3hnhhW]r j)r }r (h\j h]j h^hahcjhe}r (hi]hj]hh]hg]hk]uhmM3hnhhW]r hXsitemap_followr r }r (h\Uh]j ubaubaubj)r }r (h\Uh]j 
sitemap_follow

    A list of regexes of sitemaps that should be followed. This is only for sites that use `Sitemap index files`_ that point to other sitemap files.

    By default, all sitemaps are followed.

sitemap_alternate_links

    Specifies if alternate links for one url should be followed. These are links for the same website in another language passed within the same url block. For example::

        <url>
            <loc>http://example.com/</loc>
            <xhtml:link rel="alternate" hreflang="de" href="http://example.com/de"/>
        </url>

    With sitemap_alternate_links set, this would retrieve both URLs.
    With sitemap_alternate_links disabled, only http://example.com/ would be retrieved.

    Default is sitemap_alternate_links disabled.

SitemapSpider examples

Simplest example: process all urls discovered through sitemaps using the parse callback::

    from scrapy.contrib.spiders import SitemapSpider

    class MySpider(SitemapSpider):
        sitemap_urls = ['http://www.example.com/sitemap.xml']

        def parse(self, response):
            pass # ... scrape item here ...

Process some urls with a certain callback and other urls with a different callback::

    from scrapy.contrib.spiders import SitemapSpider

    class MySpider(SitemapSpider):
        sitemap_urls = ['http://www.example.com/sitemap.xml']
        sitemap_rules = [
            ('/product/', 'parse_product'),
            ('/category/', 'parse_category'),
        ]

        def parse_product(self, response):
            pass # ... scrape product ...

        def parse_category(self, response):
            pass # ... scrape category ...
Follow sitemaps defined in the `robots.txt`_ file and only follow sitemaps whose url contains /sitemap_shop::

    from scrapy.contrib.spiders import SitemapSpider

    class MySpider(SitemapSpider):
        sitemap_urls = ['http://www.example.com/robots.txt']
        sitemap_rules = [
            ('/shop/', 'parse_shop'),
        ]
        sitemap_follow = ['/sitemap_shops']

        def parse_shop(self, response):
            pass # ... scrape shop here ...

Combine SitemapSpider with other sources of urls::

    from scrapy.contrib.spiders import SitemapSpider
    from scrapy.http import Request

    class MySpider(SitemapSpider):
        sitemap_urls = ['http://www.example.com/robots.txt']
        sitemap_rules = [
            ('/shop/', 'parse_shop'),
        ]
        other_urls = ['http://www.example.com/about']

        def start_requests(self):
            requests = list(super(MySpider, self).start_requests())
            requests += [Request(x, callback=self.parse_other) for x in self.other_urls]
            return requests

        def parse_shop(self, response):
            pass # ... scrape shop here ...

        def parse_other(self, response):
            pass # ... scrape other here ...

.. _Sitemaps: http://www.sitemaps.org
.. _Sitemap index files: http://www.sitemaps.org/protocol.php#index
.. _robots.txt: http://www.robotstxt.org/
Even though Items can be populated using their own dictionary-like API, the Item Loaders provide a much more convenient API for populating them from a scraping process, by automating some common tasks like parsing the raw extracted data before assigning it.hNhahOhRhTU paragraphqhV}q(hZ]h[]hY]hX]h\]uh^K h_hhH]q(htXCItem Loaders provide a convenient mechanism for populating scraped qq}q(hMXCItem Loaders provide a convenient mechanism for populating scraped hNhubcsphinx.addnodes pending_xref q)q}q(hMX:ref:`Items `qhNhhOhRhTU pending_xrefqhV}q(UreftypeXrefUrefwarnqU reftargetqX topics-itemsU refdomainXstdqhX]hY]U refexplicithZ]h[]h\]UrefdocqXtopics/loadersquh^K hH]qcdocutils.nodes emphasis q)q}q(hMhhV}q(hZ]h[]q(UxrefqhXstd-refqehY]hX]h\]uhNhhH]qhtXItemsqq}q(hMUhNhubahTUemphasisqubaubhtX. Even though Items can be populated using their own dictionary-like API, the Item Loaders provide a much more convenient API for populating them from a scraping process, by automating some common tasks like parsing the raw extracted data before assigning it.qq}q(hMX. Even though Items can be populated using their own dictionary-like API, the Item Loaders provide a much more convenient API for populating them from a scraping process, by automating some common tasks like parsing the raw extracted data before assigning it.hNhubeubh)q}q(hMXIn other words, :ref:`Items ` provide the *container* of scraped data, while Item Loaders provide the mechanism for *populating* that container.hNhahOhRhThhV}q(hZ]h[]hY]hX]h\]uh^Kh_hhH]q(htXIn other words, qq}q(hMXIn other words, hNhubh)q}q(hMX:ref:`Items `qhNhhOhRhThhV}q(UreftypeXrefhhX topics-itemsU refdomainXstdqhX]hY]U refexplicithZ]h[]h\]hhuh^KhH]qh)q}q(hMhhV}q(hZ]h[]q(hhXstd-refqehY]hX]h\]uhNhhH]qhtXItemsqq}q(hMUhNhubahThubaubhtX provide the qq}q(hMX provide the hNhubh)q}q(hMX *container*hV}q(hZ]h[]hY]hX]h\]uhNhhH]qhtX containerqq}q(hMUhNhubahThubhtX? of scraped data, while Item Loaders provide the mechanism for qąq}q(hMX? of scraped data, while Item Loaders provide the mechanism for hNhubh)q}q(hMX *populating*hV}q(hZ]h[]hY]hX]h\]uhNhhH]qhtX populatingq˅q}q(hMUhNhubahThubhtX that container.q΅q}q(hMX that container.hNhubeubh)q}q(hMXItem Loaders are designed to provide a flexible, efficient and easy mechanism for extending and overriding different field parsing rules, either by spider, or by source format (HTML, XML, etc) without becoming a nightmare to maintain.qhNhahOhRhThhV}q(hZ]h[]hY]hX]h\]uh^Kh_hhH]qhtXItem Loaders are designed to provide a flexible, efficient and easy mechanism for extending and overriding different field parsing rules, either by spider, or by source format (HTML, XML, etc) without becoming a nightmare to maintain.qօq}q(hMhhNhubaubh`)q}q(hMUhNhahOhRhThehV}q(hZ]h[]hY]hX]qh?ah\]qhauh^Kh_hhH]q(hm)q}q(hMX$Using Item Loaders to populate itemsqhNhhOhRhThqhV}q(hZ]h[]hY]hX]h\]uh^Kh_hhH]qhtX$Using Item Loaders to populate itemsq䅁q}q(hMhhNhubaubh)q}q(hMX:To use an Item Loader, you must first instantiate it. You can either instantiate it with an dict-like object (e.g. Item or dict) or without one, in which case an Item is automatically instantiated in the Item Loader constructor using the Item class specified in the :attr:`ItemLoader.default_item_class` attribute.hNhhOhRhThhV}q(hZ]h[]hY]hX]h\]uh^Kh_hhH]q(htX To use an Item Loader, you must first instantiate it. You can either instantiate it with an dict-like object (e.g. 
Then, you start collecting values into the Item Loader, typically using Selectors. You can add more than one value to the same item field; the Item Loader will know how to "join" those values later using a proper processing function.

Here is a typical Item Loader usage in a Spider, using the Product item declared in the Items chapter:

    from scrapy.contrib.loader import ItemLoader
    from myproject.items import Product

    def parse(self, response):
        l = ItemLoader(item=Product(), response=response)
        l.add_xpath('name', '//div[@class="product_name"]')
        l.add_xpath('name', '//div[@class="product_title"]')
        l.add_xpath('price', '//p[@id="price"]')
        l.add_css('stock', 'p#stock')
        l.add_value('last_updated', 'today')  # you can also use literal values
        return l.load_item()

By quickly looking at that code, we can see the name field is being extracted from two different XPath locations in the page:

1. //div[@class="product_name"]
2. //div[@class="product_title"]

In other words, data is being collected by extracting it from two XPath locations, using the add_xpath() method. This is the data that will be assigned to the name field later. Similar calls collect the price and stock fields, while the last_updated field is populated directly with a literal value.

Finally, when all data is collected, the ItemLoader.load_item() method is called, which actually returns the item populated with the data previously extracted and collected with the add_xpath(), add_css(), and add_value() calls.

Input and Output processors

An Item Loader contains one input processor and one output processor for each (item) field. The input processor processes the extracted data as soon as it's received (through the add_xpath(), add_css() or add_value() methods), and the result of the input processor is collected and kept inside the ItemLoader. After collecting all data, the ItemLoader.load_item() method is called to populate and get the populated Item object. That's when the output processor is called with the data previously collected (and processed using the input processor). The result of the output processor is the final value that gets assigned to the item.
Let's see an example to illustrate how the input and output processors are called for a particular field (the same applies for any other field):

    l = ItemLoader(Product(), some_selector)
    l.add_xpath('name', xpath1)  # (1)
    l.add_xpath('name', xpath2)  # (2)
    l.add_css('name', css)       # (3)
    l.add_value('name', 'test')  # (4)
    return l.load_item()         # (5)

So what happens is:

1. Data from xpath1 is extracted, and passed through the input processor of the name field. The result of the input processor is collected and kept in the Item Loader (but not yet assigned to the item).
2. Data from xpath2 is extracted, and passed through the same input processor used in (1). The result of the input processor is appended to the data collected in (1) (if any).
3. This case is similar to the previous ones, except that the data is extracted from the css CSS selector, and passed through the same input processor used in (1) and (2). The result of the input processor is appended to the data collected in (1) and (2) (if any).
4. This case is also similar to the previous ones, except that the value to be collected is assigned directly, instead of being extracted from an XPath expression or a CSS selector. However, the value is still passed through the input processors. In this case, since the value is not iterable, it is converted to an iterable of a single element before passing it to the input processor, because input processors always receive iterables.
5. The data collected in steps (1), (2), (3) and (4) is passed through the output processor of the name field. The result of the output processor is the value assigned to the name field in the item.

It's worth noticing that processors are just callable objects, which are called with the data to be parsed, and return a parsed value. So you can use any function as an input or output processor. The only requirement is that it must accept one (and only one) positional argument, which will be an iterable.

Note: Both input and output processors must receive an iterable as their first argument. The output of those functions can be anything. The result of input processors will be appended to an internal list (in the Loader) containing the collected values (for that field). The result of the output processors is the value that will be finally assigned to the item.
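As a quick illustration of the "any callable works" point, here is a minimal sketch (the field and function names are made up, not from the docs); plain functions are wrapped in the built-in MapCompose/Compose processors, which also keeps them from being picked up as methods of the loader class:

    from scrapy.contrib.loader import ItemLoader
    from scrapy.contrib.loader.processor import Compose, MapCompose

    def strip_currency(value):
        # applied to each collected value by MapCompose
        return value.replace(u'$', u'').strip()

    class PriceLoader(ItemLoader):
        price_in = MapCompose(strip_currency)          # input: clean each value
        price_out = Compose(lambda values: values[0])  # output: keep the first one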
The other thing you need to keep in mind is that the values returned by input processors are collected internally (in lists) and then passed to output processors to populate the fields.

Last, but not least, Scrapy comes with some commonly used processors built in for convenience.

Declaring Item Loaders

Item Loaders are declared like Items, by using a class definition syntax. Here is an example:

    from scrapy.contrib.loader import ItemLoader
    from scrapy.contrib.loader.processor import TakeFirst, MapCompose, Join

    class ProductLoader(ItemLoader):

        default_output_processor = TakeFirst()

        name_in = MapCompose(unicode.title)
        name_out = Join()

        price_in = MapCompose(unicode.strip)

        # ...

As you can see, input processors are declared using the _in suffix, while output processors are declared using the _out suffix.
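To make the effect of these declarations concrete, here is an illustrative sketch (the values are made up, and the Product item from the earlier example is assumed) of what the loader above does with collected data:

    loader = ProductLoader(item=Product())
    loader.add_value('name', [u'plasma tv', u'led tv'])
    # name_in (MapCompose(unicode.title)) title-cases each value as it is collected;
    # name_out (Join()) joins the collected values when the item is loaded:
    item = loader.load_item()
    # item['name'] == u'Plasma Tv Led Tv'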
You can also declare default input/output processors using the ItemLoader.default_input_processor and ItemLoader.default_output_processor attributes.

Declaring Input and Output Processors

As seen in the previous section, input and output processors can be declared in the Item Loader definition, and it's very common to declare input processors this way. However, there is one more place where you can specify the input and output processors to use: in the Item Field metadata. Here is an example:
    from scrapy.item import Item, Field
    from scrapy.contrib.loader.processor import MapCompose, Join, TakeFirst
    from scrapy.utils.markup import remove_entities
    from myproject.utils import filter_prices

    class Product(Item):
        name = Field(
            input_processor=MapCompose(remove_entities),
            output_processor=Join(),
        )
        price = Field(
            default=0,
            input_processor=MapCompose(remove_entities, filter_prices),
            output_processor=TakeFirst(),
        )

The precedence order, for both input and output processors, is as follows:

1. Item Loader field-specific attributes: field_in and field_out (highest precedence)
2. Field metadata (the input_processor and output_processor keys)
3. Item Loader defaults: ItemLoader.default_input_processor and ItemLoader.default_output_processor (lowest precedence)
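A small sketch (with hypothetical declarations) of how these rules resolve for a single field: given all three kinds of declaration, the field-specific loader attribute is the one actually used.

    from scrapy.item import Item, Field
    from scrapy.contrib.loader import ItemLoader
    from scrapy.contrib.loader.processor import Identity, Join, TakeFirst

    class Entry(Item):
        title = Field(output_processor=Identity())  # precedence 2

    class EntryLoader(ItemLoader):
        default_item_class = Entry
        default_output_processor = TakeFirst()  # precedence 3 (ignored here)
        title_out = Join()                      # precedence 1: this one wins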
See also: Reusing and extending Item Loaders.

Item Loader Context

There are several ways to modify the Item Loader context values:

1. By modifying the currently active Item Loader context (the context attribute):

       loader = ItemLoader(product)
       loader.context['unit'] = 'cm'

2. On Item Loader instantiation (the keyword arguments of the Item Loader constructor are stored in the Item Loader context):

       loader = ItemLoader(product, unit='cm')

3. On Item Loader declaration, for those input/output processors that support being instantiated with an Item Loader context. MapCompose is one of them:

       class ProductLoader(ItemLoader):
           length_out = MapCompose(parse_length, unit='cm')
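A processor takes part in this mechanism by declaring a loader_context argument, in which case the Item Loader calls it with the currently active context. A minimal sketch of the parse_length function referenced above (its parsing logic is purely illustrative):

    def parse_length(text, loader_context):
        # a processor that declares a loader_context argument is called
        # with the currently active Item Loader context
        unit = loader_context.get('unit', 'm')
        value = text.split()[0]  # illustrative parsing only
        return u'%s %s' % (value, unit)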
ItemLoader objects

class scrapy.contrib.loader.ItemLoader([item, selector, response], **kwargs)

    Return a new Item Loader for populating the given Item. If no item is given, one is instantiated automatically using the class in default_item_class.
    When instantiated with a selector or a response parameter, the ItemLoader class provides convenient mechanisms for extracting data from web pages using selectors.

    Parameters:
    - item (Item object) -- The item instance to populate using subsequent calls to add_xpath(), add_css(), or add_value().
    - selector (Selector object) -- The selector to extract data from, when using the add_xpath() (resp. add_css()) or replace_xpath() (resp. replace_css()) method.
    - response (Response object) -- The response used to construct the selector using the default_selector_class, unless the selector argument is given, in which case this argument is ignored.

    The item, selector, response and the remaining keyword arguments are assigned to the Loader context (accessible through the context attribute).
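    For illustration, a sketch (assuming the Product item and the response object from the spider example earlier) of building the same loader from a response or from an explicit selector:

        from scrapy.selector import Selector
        from scrapy.contrib.loader import ItemLoader

        # from a response: the selector is built internally
        # via default_selector_class
        loader = ItemLoader(item=Product(), response=response)

        # from an explicit selector: a response passed alongside
        # would be ignored
        sel = Selector(text=u'<p id="price">the price is $1200</p>')
        loader = ItemLoader(item=Product(), selector=sel)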
    ItemLoader instances have the following methods:

get_value(value, *processors, **kwargs)

    Process the given value by the given processors and keyword arguments.

    Available keyword arguments:

    - re (str or compiled regex) -- a regular expression to use for extracting data from the given value (using the extract_regex() method), applied before the processors

    Examples:

        >>> from scrapy.contrib.loader.processor import TakeFirst
        >>> loader.get_value(u'name: foo', TakeFirst(), unicode.upper, re='name: (.+)')
        'FOO'
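    Since re is documented as "str or compiled regex", a pre-compiled pattern should behave identically; a sketch with the same data as above:

        import re

        pattern = re.compile(r'name: (.+)')
        loader.get_value(u'name: foo', TakeFirst(), unicode.upper, re=pattern)
        # -> 'FOO'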
add_value(field_name, value, *processors, **kwargs)

    Process and then add the given value for the given field.

    The value is first passed through get_value() by giving the processors and kwargs, and then passed through the field input processor, and its result is appended to the data collected for that field. If the field already contains collected data, the new data is added.

    The given field_name can be None, in which case values for multiple fields may be added. In that case, the processed value should be a dict with field names mapped to values.

    Examples:

        loader.add_value('name', u'Color TV')
        loader.add_value('colours', [u'white', u'blue'])
        loader.add_value('length', u'100')
        loader.add_value('name', u'name: foo', TakeFirst(), re='name: (.+)')
        loader.add_value(None, {'name': u'foo', 'sex': u'male'})

replace_value(field_name, value)

    Similar to add_value() but replaces the collected data with the new value instead of adding it.

get_xpath(xpath, *processors, **kwargs)
    Similar to ItemLoader.get_value() but receives an XPath instead of a value, which is used to extract a list of unicode strings from the selector associated with this ItemLoader.

    Parameters:
    - xpath (str) -- the XPath to extract data from
    - re (str or compiled regex) -- a regular expression to use for extracting data from the selected XPath region

    Examples:
        # HTML snippet: <p class="product-name">Color TV</p>
        loader.get_xpath('//p[@class="product-name"]')

        # HTML snippet: <p id="price">the price is $1200</p>
        loader.get_xpath('//p[@id="price"]', TakeFirst(), re='the price is (.*)')
add_xpath(field_name, xpath, *processors, **kwargs)

    Similar to ItemLoader.add_value() but receives an XPath instead of a value, which is used to extract a list of unicode strings from the selector associated with this ItemLoader.

    See get_xpath() for kwargs.

    Parameters:
    - xpath (str) -- the XPath to extract data from

    Examples:

              Color TV

              loader.add_xpath('name', '//p[@class="product-name"]') # HTML snippet:

              the price is $1200

              loader.add_xpath('price', '//p[@id="price"]', re='the price is (.*)')hNj hOhRhTj]hV}r (j_j`hX]hY]hZ]h[]h\]uh^Mmh_hhH]r htX# HTML snippet:

              Color TV

              loader.add_xpath('name', '//p[@class="product-name"]') # HTML snippet:

              the price is $1200

              loader.add_xpath('price', '//p[@id="price"]', re='the price is (.*)')r r }r (hMUhNj ubaubeubeubhx)r }r (hMUhNj hOhRhTh{hV}r (hX]hY]hZ]h[]h\]Uentries]r (h~X9replace_xpath() (scrapy.contrib.loader.ItemLoader method)hUtr auh^Nh_hhH]ubj)r }r (hMUhNj hOhRhTjhV}r (jjXpyhX]hY]hZ]h[]h\]jXmethodr jj uh^Nh_hhH]r (j)r }r (hMX7replace_xpath(field_name, xpath, *processors, **kwargs)hNj hOhRhTjhV}r (hX]r hajhhY]hZ]h[]h\]r! hajXItemLoader.replace_xpathjjjuh^Mvh_hhH]r" (j)r# }r$ (hMX replace_xpathhNj hOhRhTjhV}r% (hZ]h[]hY]hX]h\]uh^Mvh_hhH]r& htX replace_xpathr' r( }r) (hMUhNj# ubaubj)r* }r+ (hMUhNj hOhRhTjhV}r, (hZ]h[]hY]hX]h\]uh^Mvh_hhH]r- (j)r. }r/ (hMX field_namehV}r0 (hZ]h[]hY]hX]h\]uhNj* hH]r1 htX field_namer2 r3 }r4 (hMUhNj. ubahTjubj)r5 }r6 (hMXxpathhV}r7 (hZ]h[]hY]hX]h\]uhNj* hH]r8 htXxpathr9 r: }r; (hMUhNj5 ubahTjubj)r< }r= (hMX *processorshV}r> (hZ]h[]hY]hX]h\]uhNj* hH]r? htX *processorsr@ rA }rB (hMUhNj< ubahTjubj)rC }rD (hMX**kwargshV}rE (hZ]h[]hY]hX]h\]uhNj* hH]rF htX**kwargsrG rH }rI (hMUhNjC ubahTjubeubeubj )rJ }rK (hMUhNj hOhRhTjhV}rL (hZ]h[]hY]hX]h\]uh^Mvh_hhH]rM h)rN }rO (hMXNSimilar to :meth:`add_xpath` but replaces collected data instead of adding it.hNjJ hOhRhThhV}rP (hZ]h[]hY]hX]h\]uh^Mth_hhH]rQ (htX Similar to rR rS }rT (hMX Similar to hNjN ubh)rU }rV (hMX:meth:`add_xpath`rW hNjN hOhRhThhV}rX (UreftypeXmethhhX add_xpathU refdomainXpyrY hX]hY]U refexplicithZ]h[]h\]hhhjhhuh^MthH]rZ h)r[ }r\ (hMjW hV}r] (hZ]h[]r^ (hjY Xpy-methr_ ehY]hX]h\]uhNjU hH]r` htX add_xpath()ra rb }rc (hMUhNj[ ubahTjubaubhtX2 but replaces collected data instead of adding it.rd re }rf (hMX2 but replaces collected data instead of adding it.hNjN ubeubaubeubhx)rg }rh (hMUhNj hOhRhTh{hV}ri (hX]hY]hZ]h[]h\]Uentries]rj (h~X3get_css() (scrapy.contrib.loader.ItemLoader method)hUtrk auh^Nh_hhH]ubj)rl }rm (hMUhNj hOhRhTjhV}rn (jjXpyro hX]hY]hZ]h[]h\]jXmethodrp jjp uh^Nh_hhH]rq (j)rr }rs (hMX#get_css(css, *processors, **kwargs)hNjl hOhRhTjhV}rt (hX]ru hajhhY]hZ]h[]h\]rv hajXItemLoader.get_cssjjjuh^Mh_hhH]rw (j)rx }ry (hMXget_csshNjr hOhRhTjhV}rz (hZ]h[]hY]hX]h\]uh^Mh_hhH]r{ htXget_cssr| r} }r~ (hMUhNjx ubaubj)r }r (hMUhNjr hOhRhTjhV}r (hZ]h[]hY]hX]h\]uh^Mh_hhH]r (j)r }r (hMXcsshV}r (hZ]h[]hY]hX]h\]uhNj hH]r htXcssr r }r (hMUhNj ubahTjubj)r }r (hMX *processorshV}r (hZ]h[]hY]hX]h\]uhNj hH]r htX *processorsr r }r (hMUhNj ubahTjubj)r }r (hMX**kwargshV}r (hZ]h[]hY]hX]h\]uhNj hH]r htX**kwargsr r }r (hMUhNj ubahTjubeubeubj )r }r (hMUhNjl hOhRhTjhV}r (hZ]h[]hY]hX]h\]uh^Mh_hhH]r (h)r }r (hMXSimilar to :meth:`ItemLoader.get_value` but receives a CSS selector instead of a value, which is used to extract a list of unicode strings from the selector associated with this :class:`ItemLoader`.hNj hOhRhThhV}r (hZ]h[]hY]hX]h\]uh^Myh_hhH]r (htX Similar to r r }r (hMX Similar to hNj ubh)r }r (hMX:meth:`ItemLoader.get_value`r hNj hOhRhThhV}r (UreftypeXmethhhXItemLoader.get_valueU refdomainXpyr hX]hY]U refexplicithZ]h[]h\]hhhjhhuh^MyhH]r h)r }r (hMj hV}r (hZ]h[]r (hj Xpy-methr ehY]hX]h\]uhNj hH]r htXItemLoader.get_value()r r }r (hMUhNj ubahTjubaubhtX but receives a CSS selector instead of a value, which is used to extract a list of unicode strings from the selector associated with this r r }r (hMX but receives a CSS selector instead of a value, which is used to extract a list of unicode strings from the selector associated with this hNj ubh)r }r (hMX:class:`ItemLoader`r hNj hOhRhThhV}r (UreftypeXclasshhX ItemLoaderU refdomainXpyr hX]hY]U refexplicithZ]h[]h\]hhhjhhuh^MyhH]r h)r }r (hMj hV}r (hZ]h[]r (hj Xpy-classr ehY]hX]h\]uhNj 
replace_xpath(field_name, xpath, *processors, **kwargs)

   Similar to :meth:`add_xpath` but replaces collected data instead of
   adding it.

get_css(css, *processors, **kwargs)

   Similar to :meth:`ItemLoader.get_value` but receives a CSS selector
   instead of a value, which is used to extract a list of unicode strings
   from the selector associated with this :class:`ItemLoader`.

   Parameters:
      * css (str) -- the CSS selector to extract data from
      * re (str or compiled regex) -- a regular expression to use for
        extracting data from the selected CSS region

   Examples::

      # HTML snippet: <p class="product-name">Color TV</p>
      loader.get_css('p.product-name')

      # HTML snippet: <p id="price">the price is $1200</p>
      loader.get_css('p#price', TakeFirst(), re='the price is (.*)')
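To make the distinction between the ``add_*`` and ``replace_*`` method
families concrete, here is a small sketch (not taken from the original
reference; it relies only on the collected-values behaviour described for
:meth:`load_item` below)::

    loader.add_value('name', u'Color TV')
    loader.add_value('name', u'Plasma TV')
    # values accumulate per field: [u'Color TV', u'Plasma TV']

    loader.replace_value('name', u'LED TV')
    # replace_* discards previously collected data: [u'LED TV']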

add_css(field_name, css, *processors, **kwargs)

   Similar to :meth:`ItemLoader.add_value` but receives a CSS selector
   instead of a value, which is used to extract a list of unicode strings
   from the selector associated with this :class:`ItemLoader`.

   See :meth:`get_css` for ``kwargs``.

   Parameters:
      css (str) -- the CSS selector to extract data from

   Examples::

      # HTML snippet: <p class="product-name">Color TV</p>
      loader.add_css('name', 'p.product-name')

      # HTML snippet: <p id="price">the price is $1200</p>
      loader.add_css('price', 'p#price', re='the price is (.*)')

replace_css(field_name, css, *processors, **kwargs)

   Similar to :meth:`add_css` but replaces collected data instead of
   adding it.

load_item()

   Populate the item with the data collected so far, and return it. The
   data collected is first passed through the :ref:`output processors
   <topics-loaders-processors>` to get the final value to assign to each
   item field.

get_collected_values(field_name)

   Return the collected values for the given field.

get_output_value(field_name)

   Return the collected values parsed using the output processor, for the
   given field. This method doesn't populate or modify the item at all.

get_input_processor(field_name)

   Return the input processor for the given field.

get_output_processor(field_name)

   Return the output processor for the given field.

:class:`ItemLoader` instances have the following attributes:

item

   The :class:`~scrapy.item.Item` object being parsed by this Item Loader.

context

   The currently active :ref:`Context <topics-loaders-context>` of this
   Item Loader.

default_item_class

   An Item class (or factory), used to instantiate items when not given in
   the constructor.

default_input_processor

   The default input processor to use for those fields which don't specify
   one.

default_output_processor

   The default output processor to use for those fields which don't specify
   one.

default_selector_class

   The class used to construct the :attr:`selector` of this
   :class:`ItemLoader`, if only a response is given in the constructor. If a
   selector is given in the constructor this attribute is ignored. This
   attribute is sometimes overridden in subclasses.

selector

   The :class:`~scrapy.selector.Selector` object to extract data from. It's
   either the selector given in the constructor or one created from the
   response given in the constructor using the
   :attr:`default_selector_class`. This attribute is meant to be read-only.
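To tie the methods above together, here is a brief end-to-end sketch (not
part of the original reference; ``Product`` is the hypothetical item used in
the examples on this page, with ``name`` and ``price`` fields)::

    from scrapy.contrib.loader import ItemLoader
    from myproject.items import Product   # hypothetical item class

    def parse(self, response):             # a typical spider callback
        # the loader's selector is built from the response using
        # default_selector_class
        loader = ItemLoader(item=Product(), response=response)
        loader.add_xpath('name', '//p[@class="product-name"]')
        loader.add_css('price', 'p#price', re='the price is (.*)')
        loader.add_value('name', u'fallback name')  # values accumulate per field
        # output processors run over the collected values here
        return loader.load_item()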
.. _topics-loaders-extending:

Reusing and extending Item Loaders

As your project grows bigger and acquires more and more spiders, maintenance
becomes a fundamental problem, especially when you have to deal with many
different parsing rules for each spider, with a lot of exceptions, but also
want to reuse the common processors.

Item Loaders are designed to ease the maintenance burden of parsing rules
without losing flexibility while, at the same time, providing a convenient
mechanism for extending and overriding them. For this reason Item Loaders
support traditional Python class inheritance for dealing with differences of
specific spiders (or groups of spiders).

Suppose, for example, that some particular site encloses their product names
in three dashes (e.g. ``---Plasma TV---``) and you don't want to end up
scraping those dashes in the final product names.

Here's how you can remove those dashes by reusing and extending the default
Product Item Loader (``ProductLoader``)::

    from scrapy.contrib.loader.processor import MapCompose
    from myproject.ItemLoaders import ProductLoader

    def strip_dashes(x):
        return x.strip('-')

    class SiteSpecificLoader(ProductLoader):
        name_in = MapCompose(strip_dashes, ProductLoader.name_in)

Another case where extending Item Loaders can be very helpful is when you
have multiple source formats, for example XML and HTML. In the XML version
you may want to remove ``CDATA`` occurrences. Here's an example of how to do
it::

    from scrapy.contrib.loader.processor import MapCompose
    from myproject.ItemLoaders import ProductLoader
    from myproject.utils.xml import remove_cdata

    class XmlProductLoader(ProductLoader):
        name_in = MapCompose(remove_cdata, ProductLoader.name_in)

And that's how you typically extend input processors.

As for output processors, it is more common to declare them in the field
metadata, as they usually depend only on the field and not on each specific
site parsing rule (as input processors do); see the sketch below. See also:
:ref:`topics-loaders-processors-declaring`.

There are many other possible ways to extend, inherit and override your Item
Loaders, and different Item Loaders hierarchies may fit better for different
projects. Scrapy only provides the mechanism; it doesn't impose any specific
organization of your Loaders collection -- that's up to you and your
project's needs.
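As a sketch of declaring output processors in the field metadata (the
``Product`` item is hypothetical; ``input_processor``/``output_processor``
are the ``Field`` metadata keys used for declaring processors)::

    from scrapy.item import Item, Field
    from scrapy.contrib.loader.processor import Join, MapCompose, TakeFirst

    class Product(Item):
        # processors declared in the field metadata are shared by every
        # loader (HTML, XML, site-specific subclasses alike)
        name = Field(
            input_processor=MapCompose(unicode.strip),
            output_processor=Join(),
        )
        price = Field(output_processor=TakeFirst())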
.. _topics-loaders-available-processors:

Available built-in processors

Even though you can use any callable function as input and output
processors, Scrapy provides some commonly used processors, which are
described below. Some of them, like the :class:`MapCompose` (which is
typically used as input processor), compose the output of several functions
executed in order to produce the final parsed value.

Here is a list of all built-in processors:

class scrapy.contrib.loader.processor.Identity

   The simplest processor, which doesn't do anything. It returns the
   original values unchanged. It doesn't receive any constructor arguments,
   nor does it accept Loader contexts.

   Example::

      >>> from scrapy.contrib.loader.processor import Identity
      >>> proc = Identity()
      >>> proc(['one', 'two', 'three'])
      ['one', 'two', 'three']

class scrapy.contrib.loader.processor.TakeFirst

   Returns the first non-null/non-empty value from the values received, so
   it's typically used as an output processor for single-valued fields. It
   doesn't receive any constructor arguments, nor does it accept Loader
   contexts.

   Example::

      >>> from scrapy.contrib.loader.processor import TakeFirst
      >>> proc = TakeFirst()
      >>> proc(['', 'one', 'two', 'three'])
      'one'

class scrapy.contrib.loader.processor.Join(separator=u' ')

   Returns the values joined with the separator given in the constructor,
   which defaults to ``u' '``. It doesn't accept Loader contexts.

   When using the default separator, this processor is equivalent to the
   function ``u' '.join``.

   Examples::

      >>> from scrapy.contrib.loader.processor import Join
      >>> proc = Join()
      >>> proc(['one', 'two', 'three'])
      u'one two three'
      >>> proc = Join('<br>')
      >>> proc(['one', 'two', 'three'])
      u'one<br>two<br>three'

class scrapy.contrib.loader.processor.Compose(*functions, **default_loader_context)

   A processor which is constructed from the composition of the given
   functions. This means that each input value of this processor is passed
   to the first function, the result of that function is passed to the
   second function, and so on, until the last function returns the output
   value of this processor.

   By default, processing stops on a ``None`` value. This behaviour can be
   changed by passing the keyword argument ``stop_on_none=False``.

   Example::

      >>> from scrapy.contrib.loader.processor import Compose
      >>> proc = Compose(lambda v: v[0], str.upper)
      >>> proc(['hello', 'world'])
      'HELLO'

   Each function can optionally receive a ``loader_context`` parameter. For
   those which do, this processor will pass the currently active
   :ref:`Loader context <topics-loaders-context>` through that parameter.

   The keyword arguments passed in the constructor are used as the default
   Loader context values passed to each function call.
   However, the final Loader context values passed to functions are
   overridden with the currently active Loader context, accessible through
   the :attr:`ItemLoader.context` attribute.

class scrapy.contrib.loader.processor.MapCompose(*functions, **default_loader_context)

   A processor which is constructed from the composition of the given
   functions, similar to the :class:`Compose` processor. The difference with
   this processor is the way internal results are passed among functions,
   which is as follows:

   The input value of this processor is *iterated* and each element is
   passed to the first function, and the result of that function (for each
   element) is concatenated to construct a new iterable, which is then
   passed to the second function, and so on, until the last function is
   applied to each value of the list of values collected so far. The output
   values of the last function are concatenated together to produce the
   output of this processor.

   Each particular function can return a value or a list of values, which is
   flattened with the list of values returned by the same function applied
   to the other input values. The functions can also return ``None``, in
   which case the output of that function is ignored for further processing
   over the chain.

   This processor provides a convenient way to compose functions that only
   work with single values (instead of iterables). For this reason the
   :class:`MapCompose` processor is typically used as input processor, since
   data is often extracted using the
   :meth:`~scrapy.selector.Selector.extract` method of :ref:`selectors
   <topics-selectors>`, which returns a list of unicode strings.

   The example below should clarify how it works::

      >>> def filter_world(x):
      ...     return None if x == 'world' else x
      ...
      >>> from scrapy.contrib.loader.processor import MapCompose
      >>> proc = MapCompose(filter_world, unicode.upper)
      >>> proc([u'hello', u'world', u'this', u'is', u'scrapy'])
      [u'HELLO', u'THIS', u'IS', u'SCRAPY']

   As with the Compose processor, functions can receive Loader contexts, and
   constructor keyword arguments are used as default context values. See the
   :class:`Compose` processor for more info.
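Since the ``loader_context`` mechanism described for :class:`Compose` and
:class:`MapCompose` has no example of its own above, here is a minimal
hedged sketch (``unit`` is a made-up context key, not a Scrapy name)::

    from scrapy.contrib.loader.processor import MapCompose

    def to_meters(value, loader_context):
        # a function that declares a loader_context parameter receives
        # the currently active Loader context
        if loader_context.get('unit') == 'cm':
            return float(value) / 100.0
        return float(value)

    # keyword arguments given to the constructor become default
    # context values...
    proc = MapCompose(to_meters, unit='cm')
    proc([u'180'])   # -> [1.8]
    # ...but the active ItemLoader context, when present, takes precedence.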
scrapy-0.22/.doctrees/topics/broad-crawls.doctree

.. _topics-broad-crawls:

Broad Crawls

[Sections: Increase concurrency, Reduce log level, Disable cookies, Disable
retries, Reduce download timeout, Disable redirects, Enable crawling of
"Ajax Crawlable Pages"]

Disable cookies

Disable cookies unless you *really* need them. Cookies are often not needed
when doing broad crawls (search engine crawlers ignore them), and they
improve performance by saving some CPU cycles and reducing the memory
footprint of your Scrapy crawler.

To disable cookies use::

    COOKIES_ENABLED = False

Enable crawling of "Ajax Crawlable Pages"

Scrapy handles (1) automatically; to handle (2) enable
:ref:`AjaxCrawlMiddleware <ajaxcrawl-middleware>`::

    AJAXCRAWL_ENABLED = True

When doing broad crawls it's common to crawl a lot of "index" web pages;
AjaxCrawlMiddleware helps to crawl them correctly. It is turned OFF by
default because it has some performance overhead, and enabling it for
focused crawls doesn't make much sense.
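Taken together, the sections listed above amount to a handful of settings.
As a hedged sketch only -- the setting names are standard Scrapy settings,
but the particular values here are illustrative assumptions, not taken from
this page::

    # settings.py -- broad-crawl tuning sketch
    CONCURRENT_REQUESTS = 100     # increase concurrency (value illustrative)
    LOG_LEVEL = 'INFO'            # reduce log level
    COOKIES_ENABLED = False       # disable cookies (shown above)
    RETRY_ENABLED = False         # disable retries
    DOWNLOAD_TIMEOUT = 15         # reduce download timeout (value illustrative)
    REDIRECT_ENABLED = False      # disable redirects
    AJAXCRAWL_ENABLED = True      # enable crawling of "ajax crawlable pages"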
scrapy-0.22/.doctrees/topics/stats.doctree Stats Collection — Scrapy 0.22.0 documentation

Stats Collection

Scrapy provides a convenient facility for collecting stats in the form of key/values, where values are often counters. The facility is called the Stats Collector, and can be accessed through the stats attribute of the Crawler API, as illustrated by the examples in the Common Stats Collector uses section below.

However, the Stats Collector is always available, so you can always import it in your module and use its API (to increment or set new stat keys), regardless of whether the stats collection is enabled or not. If it's disabled, the API will still work but it won't collect anything. This is aimed at simplifying stats collector usage: you should spend no more than one line of code for collecting stats in your spider, Scrapy extension, or whatever code you're using the Stats Collector from.

Another feature of the Stats Collector is that it's very efficient when enabled, and extremely efficient (almost unnoticeable) when disabled.

The Stats Collector keeps a stats table per open spider, which is automatically opened when the spider is opened and closed when the spider is closed.

Common Stats Collector uses

Access the stats collector through the stats attribute. Here is an example of an extension that accesses stats:

    class ExtensionThatAccessStats(object):

        def __init__(self, stats):
            self.stats = stats

        @classmethod
        def from_crawler(cls, crawler):
            return cls(crawler.stats)

Set stat value:

    stats.set_value('hostname', socket.gethostname())

Increment stat value:

    stats.inc_value('pages_crawled')

Set stat value only if greater than previous:

    stats.max_value('max_items_scraped', value)

Set stat value only if lower than previous:

    stats.min_value('min_free_memory_percent', value)

Get stat value:

    >>> stats.get_value('pages_crawled')
    8

Get all stats:

    >>> stats.get_stats()
    {'pages_crawled': 1238, 'start_time': datetime.datetime(2009, 7, 14, 21, 47, 28, 977139)}
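Building on the extension above, stats calls are usually driven from signal handlers. The sketch below is hypothetical (the class name and stat key are made up), but from_crawler, crawler.stats and crawler.signals are the documented access points:

    from scrapy import signals

    class ItemCountExtension(object):
        """Hypothetical extension that counts scraped items via the stats API."""

        def __init__(self, stats):
            self.stats = stats

        @classmethod
        def from_crawler(cls, crawler):
            ext = cls(crawler.stats)
            # call ext.item_scraped every time an item is scraped
            crawler.signals.connect(ext.item_scraped, signal=signals.item_scraped)
            return ext

        def item_scraped(self, item, spider):
            self.stats.inc_value('custom/items_counted')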
Available Stats Collectors

Besides the basic StatsCollector, there are other Stats Collectors available in Scrapy which extend the basic Stats Collector. You can select which Stats Collector to use through the STATS_CLASS setting. The default Stats Collector used is the MemoryStatsCollector.

MemoryStatsCollector

class scrapy.statscol.MemoryStatsCollector

    A simple stats collector that keeps the stats of the last scraping run (for each spider) in memory, after they're closed. The stats can be accessed through the spider_stats attribute, which is a dict keyed by spider domain name.

    This is the default Stats Collector used in Scrapy.

    spider_stats

        A dict of dicts (keyed by spider name) containing the stats of the last scraping run for each spider.

DummyStatsCollector

class scrapy.statscol.DummyStatsCollector

    A Stats collector which does nothing but is very efficient (because it does nothing). This stats collector can be set via the STATS_CLASS setting, to disable stats collection in order to improve performance. However, the performance penalty of stats collection is usually marginal compared to other Scrapy workloads like parsing pages.
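For example, to switch stats collection off entirely, you could point the STATS_CLASS setting at the dummy collector in your project's settings.py; a one-line sketch:

    STATS_CLASS = 'scrapy.statscol.DummyStatsCollector'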
scrapy-0.22/.doctrees/topics/autothrottle.doctree AutoThrottle extension — Scrapy 0.22.0 documentation

AutoThrottle extension

This is an extension for automatically throttling crawling speed based on the load of both the Scrapy server and the website you are crawling.

Design goals

1. Be nicer to sites instead of using the default download delay of zero.
2. Automatically adjust Scrapy to the optimum crawling speed, so the user doesn't have to tune the download delays and concurrent requests to find the optimum one; the user only needs to specify the maximum concurrent requests allowed, and the extension does the rest.

How it works

In Scrapy, the download latency is measured as the time elapsed between establishing the TCP connection and receiving the HTTP headers.

Note that these latencies are very hard to measure accurately in a cooperative multitasking environment because Scrapy may be busy processing a spider callback, for example, and unable to attend to downloads. However, these latencies should still give a reasonable estimate of how busy Scrapy (and ultimately, the server) is, and this extension builds on that premise.

Throttling algorithm

The algorithm adjusts download delays and concurrency based on the following rules:

1. Spiders always start with one concurrent request and a download delay of AUTOTHROTTLE_START_DELAY.
2. When a response is received, the download delay is adjusted to the average of the previous download delay and the latency of the response.

Note: The AutoThrottle extension honours the standard Scrapy settings for concurrency and delay. This means that it will never set a download delay lower than DOWNLOAD_DELAY or a concurrency higher than CONCURRENT_REQUESTS_PER_DOMAIN (or CONCURRENT_REQUESTS_PER_IP, depending on which one you use).
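In code form, the adjustment rule can be sketched as follows. This is a paraphrase of the two rules above, not the extension's actual implementation; the clamping reflects the note about DOWNLOAD_DELAY and AUTOTHROTTLE_MAX_DELAY:

    def next_download_delay(prev_delay, latency, min_delay, max_delay):
        # rule 2: average the previous delay with the observed latency
        delay = (prev_delay + latency) / 2.0
        # honour the configured floor (DOWNLOAD_DELAY) and
        # ceiling (AUTOTHROTTLE_MAX_DELAY)
        return max(min_delay, min(delay, max_delay))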
This means that it will never set a download delay lower than h)hubh)r}r(h(X:setting:`DOWNLOAD_DELAY`rh)hh*h-h/hh1}r(UreftypeXsettingh׉hXDOWNLOAD_DELAYU refdomainXstdrh6]h5]U refexplicith3]h4]h8]hhuh:K)h#]rh)r}r (h(jh1}r (h3]h4]r (hjX std-settingr eh5]h6]h8]uh)jh#]r hDXDOWNLOAD_DELAYrr}r(h(Uh)jubah/hubaubhDX or a concurrency higher than rr}r(h(X or a concurrency higher than h)hubh)r}r(h(X):setting:`CONCURRENT_REQUESTS_PER_DOMAIN`rh)hh*h-h/hh1}r(UreftypeXsettingh׉hXCONCURRENT_REQUESTS_PER_DOMAINU refdomainXstdrh6]h5]U refexplicith3]h4]h8]hhuh:K)h#]rh)r}r(h(jh1}r(h3]h4]r(hjX std-settingreh5]h6]h8]uh)jh#]rhDXCONCURRENT_REQUESTS_PER_DOMAINr r!}r"(h(Uh)jubah/hubaubhDX (or r#r$}r%(h(X (or h)hubh)r&}r'(h(X%:setting:`CONCURRENT_REQUESTS_PER_IP`r(h)hh*h-h/hh1}r)(UreftypeXsettingh׉hXCONCURRENT_REQUESTS_PER_IPU refdomainXstdr*h6]h5]U refexplicith3]h4]h8]hhuh:K)h#]r+h)r,}r-(h(j(h1}r.(h3]h4]r/(hj*X std-settingr0eh5]h6]h8]uh)j&h#]r1hDXCONCURRENT_REQUESTS_PER_IPr2r3}r4(h(Uh)j,ubah/hubaubhDX", depending on which one you use).r5r6}r7(h(X", depending on which one you use).h)hubeubaubeubh%)r8}r9(h(Uh)h&h*h-h/h0h1}r:(h3]h4]h5]h6]r;hah8]r<hauh:K0h;hh#]r=(h=)r>}r?(h(XSettingsr@h)j8h*h-h/hAh1}rA(h3]h4]h5]h6]h8]uh:K0h;hh#]rBhDXSettingsrCrD}rE(h(j@h)j>ubaubhH)rF}rG(h(X<The settings used to control the AutoThrottle extension are:rHh)j8h*h-h/hLh1}rI(h3]h4]h5]h6]h8]uh:K2h;hh#]rJhDX<The settings used to control the AutoThrottle extension are:rKrL}rM(h(jHh)jFubaubcdocutils.nodes bullet_list rN)rO}rP(h(Uh)j8h*h-h/U bullet_listrQh1}rR(UbulletrSX*h6]h5]h3]h4]h8]uh:K4h;hh#]rT(hj)rU}rV(h(X:setting:`AUTOTHROTTLE_ENABLED`rWh)jOh*h-h/hnh1}rX(h3]h4]h5]h6]h8]uh:Nh;hh#]rYhH)rZ}r[(h(jWh)jUh*h-h/hLh1}r\(h3]h4]h5]h6]h8]uh:K4h#]r]h)r^}r_(h(jWh)jZh*h-h/hh1}r`(UreftypeXsettingh׉hXAUTOTHROTTLE_ENABLEDU refdomainXstdrah6]h5]U refexplicith3]h4]h8]hhuh:K4h#]rbh)rc}rd(h(jWh1}re(h3]h4]rf(hjaX std-settingrgeh5]h6]h8]uh)j^h#]rhhDXAUTOTHROTTLE_ENABLEDrirj}rk(h(Uh)jcubah/hubaubaubaubhj)rl}rm(h(X#:setting:`AUTOTHROTTLE_START_DELAY`rnh)jOh*h-h/hnh1}ro(h3]h4]h5]h6]h8]uh:Nh;hh#]rphH)rq}rr(h(jnh)jlh*h-h/hLh1}rs(h3]h4]h5]h6]h8]uh:K5h#]rth)ru}rv(h(jnh)jqh*h-h/hh1}rw(UreftypeXsettingh׉hXAUTOTHROTTLE_START_DELAYU refdomainXstdrxh6]h5]U refexplicith3]h4]h8]hhuh:K5h#]ryh)rz}r{(h(jnh1}r|(h3]h4]r}(hjxX std-settingr~eh5]h6]h8]uh)juh#]rhDXAUTOTHROTTLE_START_DELAYrr}r(h(Uh)jzubah/hubaubaubaubhj)r}r(h(X!:setting:`AUTOTHROTTLE_MAX_DELAY`rh)jOh*h-h/hnh1}r(h3]h4]h5]h6]h8]uh:Nh;hh#]rhH)r}r(h(jh)jh*h-h/hLh1}r(h3]h4]h5]h6]h8]uh:K6h#]rh)r}r(h(jh)jh*h-h/hh1}r(UreftypeXsettingh׉hXAUTOTHROTTLE_MAX_DELAYU refdomainXstdrh6]h5]U refexplicith3]h4]h8]hhuh:K6h#]rh)r}r(h(jh1}r(h3]h4]r(hjX std-settingreh5]h6]h8]uh)jh#]rhDXAUTOTHROTTLE_MAX_DELAYrr}r(h(Uh)jubah/hubaubaubaubhj)r}r(h(X:setting:`AUTOTHROTTLE_DEBUG`rh)jOh*h-h/hnh1}r(h3]h4]h5]h6]h8]uh:Nh;hh#]rhH)r}r(h(jh)jh*h-h/hLh1}r(h3]h4]h5]h6]h8]uh:K7h#]rh)r}r(h(jh)jh*h-h/hh1}r(UreftypeXsettingh׉hXAUTOTHROTTLE_DEBUGU refdomainXstdrh6]h5]U refexplicith3]h4]h8]hhuh:K7h#]rh)r}r(h(jh1}r(h3]h4]r(hjX std-settingreh5]h6]h8]uh)jh#]rhDXAUTOTHROTTLE_DEBUGrr}r(h(Uh)jubah/hubaubaubaubhj)r}r(h(X):setting:`CONCURRENT_REQUESTS_PER_DOMAIN`rh)jOh*h-h/hnh1}r(h3]h4]h5]h6]h8]uh:Nh;hh#]rhH)r}r(h(jh)jh*h-h/hLh1}r(h3]h4]h5]h6]h8]uh:K8h#]rh)r}r(h(jh)jh*h-h/hh1}r(UreftypeXsettingh׉hXCONCURRENT_REQUESTS_PER_DOMAINU refdomainXstdrh6]h5]U refexplicith3]h4]h8]hhuh:K8h#]rh)r}r(h(jh1}r(h3]h4]r(hjX 
std-settingreh5]h6]h8]uh)jh#]rhDXCONCURRENT_REQUESTS_PER_DOMAINrr}r(h(Uh)jubah/hubaubaubaubhj)r}r(h(X%:setting:`CONCURRENT_REQUESTS_PER_IP`rh)jOh*h-h/hnh1}r(h3]h4]h5]h6]h8]uh:Nh;hh#]rhH)r}r(h(jh)jh*h-h/hLh1}r(h3]h4]h5]h6]h8]uh:K9h#]rh)r}r(h(jh)jh*h-h/hh1}r(UreftypeXsettingh׉hXCONCURRENT_REQUESTS_PER_IPU refdomainXstdrh6]h5]U refexplicith3]h4]h8]hhuh:K9h#]rh)r}r(h(jh1}r(h3]h4]r(hjX std-settingreh5]h6]h8]uh)jh#]rhDXCONCURRENT_REQUESTS_PER_IPrr}r(h(Uh)jubah/hubaubaubaubhj)r}r(h(X:setting:`DOWNLOAD_DELAY` h)jOh*h-h/hnh1}r(h3]h4]h5]h6]h8]uh:Nh;hh#]rhH)r}r(h(X:setting:`DOWNLOAD_DELAY`rh)jh*h-h/hLh1}r(h3]h4]h5]h6]h8]uh:K:h#]rh)r}r(h(jh)jh*h-h/hh1}r(UreftypeXsettingh׉hXDOWNLOAD_DELAYU refdomainXstdrh6]h5]U refexplicith3]h4]h8]hhuh:K:h#]rh)r}r(h(jh1}r(h3]h4]r(hjX std-settingreh5]h6]h8]uh)jh#]rhDXDOWNLOAD_DELAYrr}r(h(Uh)jubah/hubaubaubaubeubhH)r}r(h(X7For more information see :ref:`autothrottle-algorithm`.rh)j8h*h-h/hLh1}r(h3]h4]h5]h6]h8]uh:KhDX#Enables the AutoThrottle extension.r?r@}rA(h(j<h)j:ubaubj)rB}rC(h(Uh)jh*h-h/jh1}rD(h6]h5]h3]h4]h8]Uentries]rE(XpairX!AUTOTHROTTLE_START_DELAY; settingX$std:setting-AUTOTHROTTLE_START_DELAYrFUtrGauh:KHh;hh#]ubh)rH}rI(h(Uh)jh*h-h/hh1}rJ(h6]h5]h3]h4]h8]hjFuh:KHh;hh#]ubeubh%)rK}rL(h(Uh)j8h*h-h}h/h0h1}rM(h3]h4]h5]h6]rN(hjFeh8]rOh auh:KJh;hh}rPjFjHsh#]rQ(h=)rR}rS(h(XAUTOTHROTTLE_START_DELAYrTh)jKh*h-h/hAh1}rU(h3]h4]h5]h6]h8]uh:KJh;hh#]rVhDXAUTOTHROTTLE_START_DELAYrWrX}rY(h(jTh)jRubaubhH)rZ}r[(h(XDefault: ``5.0``r\h)jKh*h-h/hLh1}r](h3]h4]h5]h6]h8]uh:KLh;hh#]r^(hDX Default: r_r`}ra(h(X Default: h)jZubh)rb}rc(h(X``5.0``h1}rd(h3]h4]h5]h6]h8]uh)jZh#]rehDX5.0rfrg}rh(h(Uh)jbubah/hubeubhH)ri}rj(h(X(The initial download delay (in seconds).rkh)jKh*h-h/hLh1}rl(h3]h4]h5]h6]h8]uh:KNh;hh#]rmhDX(The initial download delay (in seconds).rnro}rp(h(jkh)jiubaubj)rq}rr(h(Uh)jKh*h-h/jh1}rs(h6]h5]h3]h4]h8]Uentries]rt(XpairXAUTOTHROTTLE_MAX_DELAY; settingX"std:setting-AUTOTHROTTLE_MAX_DELAYruUtrvauh:KQh;hh#]ubh)rw}rx(h(Uh)jKh*h-h/hh1}ry(h6]h5]h3]h4]h8]hjuuh:KQh;hh#]ubeubh%)rz}r{(h(Uh)j8h*h-h}h/h0h1}r|(h3]h4]h5]h6]r}(hjueh8]r~hauh:KSh;hh}rjujwsh#]r(h=)r}r(h(XAUTOTHROTTLE_MAX_DELAYrh)jzh*h-h/hAh1}r(h3]h4]h5]h6]h8]uh:KSh;hh#]rhDXAUTOTHROTTLE_MAX_DELAYrr}r(h(jh)jubaubhH)r}r(h(XDefault: ``60.0``rh)jzh*h-h/hLh1}r(h3]h4]h5]h6]h8]uh:KUh;hh#]r(hDX Default: rr}r(h(X Default: h)jubh)r}r(h(X``60.0``h1}r(h3]h4]h5]h6]h8]uh)jh#]rhDX60.0rr}r(h(Uh)jubah/hubeubhH)r}r(h(XLThe maximum download delay (in seconds) to be set in case of high latencies.rh)jzh*h-h/hLh1}r(h3]h4]h5]h6]h8]uh:KWh;hh#]rhDXLThe maximum download delay (in seconds) to be set in case of high latencies.rr}r(h(jh)jubaubj)r}r(h(Uh)jzh*h-h/jh1}r(h6]h5]h3]h4]h8]Uentries]r(XpairXAUTOTHROTTLE_DEBUG; settingXstd:setting-AUTOTHROTTLE_DEBUGrUtrauh:KZh;hh#]ubh)r}r(h(Uh)jzh*h-h/hh1}r(h6]h5]h3]h4]h8]hjuh:KZh;hh#]ubeubh%)r}r(h(Uh)j8h*h-h}h/h0h1}r(h3]h4]h5]h6]r(h"jeh8]rhauh:K\h;hh}rjjsh#]r(h=)r}r(h(XAUTOTHROTTLE_DEBUGrh)jh*h-h/hAh1}r(h3]h4]h5]h6]h8]uh:K\h;hh#]rhDXAUTOTHROTTLE_DEBUGrr}r(h(jh)jubaubhH)r}r(h(XDefault: ``False``rh)jh*h-h/hLh1}r(h3]h4]h5]h6]h8]uh:K^h;hh#]r(hDX Default: rr}r(h(X Default: h)jubh)r}r(h(X ``False``h1}r(h3]h4]h5]h6]h8]uh)jh#]rhDXFalserr}r(h(Uh)jubah/hubeubhH)r}r(h(XEnable AutoThrottle debug mode which will display stats on every response received, so you can see how the throttling parameters are being adjusted in real time.rh)jh*h-h/hLh1}r(h3]h4]h5]h6]h8]uh:K`h;hh#]rhDXEnable AutoThrottle debug mode which will display stats on every response received, so you can see how the throttling parameters are being adjusted in real 
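A hypothetical settings.py fragment that enables the extension, with the documented defaults made explicit:

    AUTOTHROTTLE_ENABLED = True
    AUTOTHROTTLE_START_DELAY = 5.0   # initial download delay, in seconds
    AUTOTHROTTLE_MAX_DELAY = 60.0    # cap applied in case of high latencies
    AUTOTHROTTLE_DEBUG = True        # log throttling stats for every response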
scrapy-0.22/.doctrees/topics/items.doctree Items — Scrapy 0.22.0 documentation

Items

The main goal in scraping is to extract structured data from unstructured sources, typically, web pages. Scrapy provides the Item class for this purpose.

Item objects are simple containers used to collect the scraped data. They provide a dictionary-like API (http://docs.python.org/library/stdtypes.html#dict) with a convenient syntax for declaring their available fields.

Declaring Items

Items are declared using a simple class definition syntax and Field objects. Here is an example:

    from scrapy.item import Item, Field

    class Product(Item):
        name = Field()
        price = Field()
        stock = Field()
        last_updated = Field(serializer=str)
_topics-items-declaring:h@hShAhDhFhGhH}q(hJ]hK]hL]hM]hN]hOh5uhPKhQhh:]ubhR)q}q(h?Uh@hShAhDhU}qhhshFhWhH}q(hL]hM]hK]hJ]q(h9h5ehN]q(hheuhPKhQhh\}qh5hsh:]q(h_)q}q(h?XDeclaring Itemsqh@hhAhDhFhchH}q(hL]hM]hK]hJ]hN]uhPKhQhh:]qhfXDeclaring ItemsqՅq}q(h?hh@hubaubhr)q}q(h?XjItems are declared using a simple class definition syntax and :class:`Field` objects. Here is an example::h@hhAhDhFhuhH}q(hL]hM]hK]hJ]hN]uhPKhQhh:]q(hfX>Items are declared using a simple class definition syntax and q܅q}q(h?X>Items are declared using a simple class definition syntax and h@hubh{)q}q(h?X:class:`Field`qh@hhAhDhFhhH}q(UreftypeXclasshhXFieldU refdomainXpyqhJ]hK]U refexplicithL]hM]hN]hhhNhhuhPKh:]qh)q}q(h?hhH}q(hL]hM]q(hhXpy-classqehK]hJ]hN]uh@hh:]qhfXFieldq녁q}q(h?Uh@hubahFhubaubhfX objects. Here is an example:qq}q(h?X objects. Here is an example:h@hubeubcdocutils.nodes literal_block q)q}q(h?Xfrom scrapy.item import Item, Field class Product(Item): name = Field() price = Field() stock = Field() last_updated = Field(serializer=str)h@hhAhDhFU literal_blockqhH}q(U xml:spaceqUpreserveqhJ]hK]hL]hM]hN]uhPKhQhh:]qhfXfrom scrapy.item import Item, Field class Product(Item): name = Field() price = Field() stock = Field() last_updated = Field(serializer=str)qq}q(h?Uh@hubaubcdocutils.nodes note q)q}q(h?XThose familiar with `Django`_ will notice that Scrapy Items are declared similar to `Django Models`_, except that Scrapy Items are much simpler as there is no concept of different field types.h@hhAhDhFUnoteqhH}r(hL]hM]hK]hJ]hN]uhPNhQhh:]rhr)r}r(h?XThose familiar with `Django`_ will notice that Scrapy Items are declared similar to `Django Models`_, except that Scrapy Items are much simpler as there is no concept of different field types.h@hhAhDhFhuhH}r(hL]hM]hK]hJ]hN]uhPK$h:]r(hfXThose familiar with rr}r(h?XThose familiar with h@jubh)r }r (h?X `Django`_hKh@jhFhhH}r (UnameXDjangohXhttp://www.djangoproject.com/r hJ]hK]hL]hM]hN]uh:]r hfXDjangorr}r(h?Uh@j ubaubhfX7 will notice that Scrapy Items are declared similar to rr}r(h?X7 will notice that Scrapy Items are declared similar to h@jubh)r}r(h?X`Django Models`_hKh@jhFhhH}r(UnameX Django ModelshX6http://docs.djangoproject.com/en/dev/topics/db/models/rhJ]hK]hL]hM]hN]uh:]rhfX Django Modelsrr}r(h?Uh@jubaubhfX\, except that Scrapy Items are much simpler as there is no concept of different field types.rr}r(h?X\, except that Scrapy Items are much simpler as there is no concept of different field types.h@jubeubaubh<)r}r (h?X).. _Django: http://www.djangoproject.com/hKh@hhAhDhFhGhH}r!(hj hJ]r"h8ahK]hL]hM]hN]r#hauhPK(hQhh:]ubh<)r$}r%(h?XI.. _Django Models: http://docs.djangoproject.com/en/dev/topics/db/models/hKh@hhAhDhFhGhH}r&(hjhJ]r'h0ahK]hL]hM]hN]r(hauhPK)hQhh:]ubh<)r)}r*(h?X.. _topics-items-fields:h@hhAhDhFhGhH}r+(hJ]hK]hL]hM]hN]hOh7uhPK+hQhh:]ubeubhR)r,}r-(h?Uh@hShAhDhU}r.hj)shFhWhH}r/(hL]hM]hK]hJ]r0(h/h7ehN]r1(hheuhPK.hQhh\}r2h7j)sh:]r3(h_)r4}r5(h?X Item Fieldsr6h@j,hAhDhFhchH}r7(hL]hM]hK]hJ]hN]uhPK.hQhh:]r8hfX Item Fieldsr9r:}r;(h?j6h@j4ubaubhr)r<}r=(h?X:class:`Field` objects are used to specify metadata for each field. For example, the serializer function for the ``last_updated`` field illustrated in the example above.h@j,hAhDhFhuhH}r>(hL]hM]hK]hJ]hN]uhPK0hQhh:]r?(h{)r@}rA(h?X:class:`Field`rBh@j<hAhDhFhhH}rC(UreftypeXclasshhXFieldU refdomainXpyrDhJ]hK]U refexplicithL]hM]hN]hhhNhhuhPK0h:]rEh)rF}rG(h?jBhH}rH(hL]hM]rI(hjDXpy-classrJehK]hJ]hN]uh@j@h:]rKhfXFieldrLrM}rN(h?Uh@jFubahFhubaubhfXc objects are used to specify metadata for each field. 
For example, the serializer function for the rOrP}rQ(h?Xc objects are used to specify metadata for each field. For example, the serializer function for the h@j<ubh)rR}rS(h?X``last_updated``hH}rT(hL]hM]hK]hJ]hN]uh@j<h:]rUhfX last_updatedrVrW}rX(h?Uh@jRubahFhubhfX( field illustrated in the example above.rYrZ}r[(h?X( field illustrated in the example above.h@j<ubeubhr)r\}r](h?XYou can specify any kind of metadata for each field. There is no restriction on the values accepted by :class:`Field` objects. For this same reason, there isn't a reference list of all available metadata keys. Each key defined in :class:`Field` objects could be used by a different components, and only those components know about it. You can also define and use any other :class:`Field` key in your project too, for your own needs. The main goal of :class:`Field` objects is to provide a way to define all field metadata in one place. Typically, those components whose behaviour depends on each field use certain field keys to configure that behaviour. You must refer to their documentation to see which metadata keys are used by each component.h@j,hAhDhFhuhH}r^(hL]hM]hK]hJ]hN]uhPK4hQhh:]r_(hfXgYou can specify any kind of metadata for each field. There is no restriction on the values accepted by r`ra}rb(h?XgYou can specify any kind of metadata for each field. There is no restriction on the values accepted by h@j\ubh{)rc}rd(h?X:class:`Field`reh@j\hAhDhFhhH}rf(UreftypeXclasshhXFieldU refdomainXpyrghJ]hK]U refexplicithL]hM]hN]hhhNhhuhPK4h:]rhh)ri}rj(h?jehH}rk(hL]hM]rl(hjgXpy-classrmehK]hJ]hN]uh@jch:]rnhfXFieldrorp}rq(h?Uh@jiubahFhubaubhfXq objects. For this same reason, there isn't a reference list of all available metadata keys. Each key defined in rrrs}rt(h?Xq objects. For this same reason, there isn't a reference list of all available metadata keys. Each key defined in h@j\ubh{)ru}rv(h?X:class:`Field`rwh@j\hAhDhFhhH}rx(UreftypeXclasshhXFieldU refdomainXpyryhJ]hK]U refexplicithL]hM]hN]hhhNhhuhPK4h:]rzh)r{}r|(h?jwhH}r}(hL]hM]r~(hjyXpy-classrehK]hJ]hN]uh@juh:]rhfXFieldrr}r(h?Uh@j{ubahFhubaubhfX objects could be used by a different components, and only those components know about it. You can also define and use any other rr}r(h?X objects could be used by a different components, and only those components know about it. You can also define and use any other h@j\ubh{)r}r(h?X:class:`Field`rh@j\hAhDhFhhH}r(UreftypeXclasshhXFieldU refdomainXpyrhJ]hK]U refexplicithL]hM]hN]hhhNhhuhPK4h:]rh)r}r(h?jhH}r(hL]hM]r(hjXpy-classrehK]hJ]hN]uh@jh:]rhfXFieldrr}r(h?Uh@jubahFhubaubhfX? key in your project too, for your own needs. The main goal of rr}r(h?X? key in your project too, for your own needs. The main goal of h@j\ubh{)r}r(h?X:class:`Field`rh@j\hAhDhFhhH}r(UreftypeXclasshhXFieldU refdomainXpyrhJ]hK]U refexplicithL]hM]hN]hhhNhhuhPK4h:]rh)r}r(h?jhH}r(hL]hM]r(hjXpy-classrehK]hJ]hN]uh@jh:]rhfXFieldrr}r(h?Uh@jubahFhubaubhfX objects is to provide a way to define all field metadata in one place. Typically, those components whose behaviour depends on each field use certain field keys to configure that behaviour. You must refer to their documentation to see which metadata keys are used by each component.rr}r(h?X objects is to provide a way to define all field metadata in one place. Typically, those components whose behaviour depends on each field use certain field keys to configure that behaviour. 
You must refer to their documentation to see which metadata keys are used by each component.h@j\ubeubhr)r}r(h?XIt's important to note that the :class:`Field` objects used to declare the item do not stay assigned as class attributes. Instead, they can be accessed through the :attr:`Item.fields` attribute.h@j,hAhDhFhuhH}r(hL]hM]hK]hJ]hN]uhPK?hQhh:]r(hfX It's important to note that the rr}r(h?X It's important to note that the h@jubh{)r}r(h?X:class:`Field`rh@jhAhDhFhhH}r(UreftypeXclasshhXFieldU refdomainXpyrhJ]hK]U refexplicithL]hM]hN]hhhNhhuhPK?h:]rh)r}r(h?jhH}r(hL]hM]r(hjXpy-classrehK]hJ]hN]uh@jh:]rhfXFieldrr}r(h?Uh@jubahFhubaubhfXv objects used to declare the item do not stay assigned as class attributes. Instead, they can be accessed through the rr}r(h?Xv objects used to declare the item do not stay assigned as class attributes. Instead, they can be accessed through the h@jubh{)r}r(h?X:attr:`Item.fields`rh@jhAhDhFhhH}r(UreftypeXattrhhX Item.fieldsU refdomainXpyrhJ]hK]U refexplicithL]hM]hN]hhhNhhuhPK?h:]rh)r}r(h?jhH}r(hL]hM]r(hjXpy-attrrehK]hJ]hN]uh@jh:]rhfX Item.fieldsrr}r(h?Uh@jubahFhubaubhfX attribute.rr}r(h?X attribute.h@jubeubhr)r}r(h?X6And that's all you need to know about declaring items.rh@j,hAhDhFhuhH}r(hL]hM]hK]hJ]hN]uhPKChQhh:]rhfX6And that's all you need to know about declaring items.rr}r(h?jh@jubaubeubhR)r}r(h?Uh@hShAhDhFhWhH}r(hL]hM]hK]hJ]rh*ahN]rh auhPKFhQhh:]r(h_)r}r(h?XWorking with Itemsrh@jhAhDhFhchH}r(hL]hM]hK]hJ]hN]uhPKFhQhh:]rhfXWorking with Itemsrr}r(h?jh@jubaubhr)r}r(h?XHere are some examples of common tasks performed with items, using the ``Product`` item :ref:`declared above `. You will notice the API is very similar to the `dict API`_.h@jhAhDhFhuhH}r(hL]hM]hK]hJ]hN]uhPKHhQhh:]r(hfXGHere are some examples of common tasks performed with items, using the rr}r(h?XGHere are some examples of common tasks performed with items, using the h@jubh)r}r(h?X ``Product``hH}r(hL]hM]hK]hJ]hN]uh@jh:]rhfXProductrr}r(h?Uh@jubahFhubhfX item rr}r(h?X item h@jubh{)r}r(h?X/:ref:`declared above `rh@jhAhDhFhhH}r(UreftypeXrefhhXtopics-items-declaringU refdomainXstdrhJ]hK]U refexplicithL]hM]hN]hhuhPKHh:]rcdocutils.nodes emphasis r)r}r(h?jhH}r(hL]hM]r(hjXstd-refrehK]hJ]hN]uh@jh:]r hfXdeclared abover r }r (h?Uh@jubahFUemphasisr ubaubhfX1. You will notice the API is very similar to the rr}r(h?X1. You will notice the API is very similar to the h@jubh)r}r(h?X `dict API`_hKh@jhFhhH}r(UnameXdict APIhX1http://docs.python.org/library/stdtypes.html#dictrhJ]hK]hL]hM]hN]uh:]rhfXdict APIrr}r(h?Uh@jubaubhfX.r}r(h?X.h@jubeubhR)r}r(h?Uh@jhAhDhFhWhH}r(hL]hM]hK]hJ]rh2ahN]rhauhPKMhQhh:]r (h_)r!}r"(h?XCreating itemsr#h@jhAhDhFhchH}r$(hL]hM]hK]hJ]hN]uhPKMhQhh:]r%hfXCreating itemsr&r'}r((h?j#h@j!ubaubh)r)}r*(h?Xm>>> product = Product(name='Desktop PC', price=1000) >>> print product Product(name='Desktop PC', price=1000)h@jhAhDhFhhH}r+(hhhJ]hK]hL]hM]hN]uhPKQhQhh:]r,hfXm>>> product = Product(name='Desktop PC', price=1000) >>> print product Product(name='Desktop PC', price=1000)r-r.}r/(h?Uh@j)ubaubeubhR)r0}r1(h?Uh@jhAhDhFhWhH}r2(hL]hM]hK]hJ]r3h1ahN]r4hauhPKVhQhh:]r5(h_)r6}r7(h?XGetting field valuesr8h@j0hAhDhFhchH}r9(hL]hM]hK]hJ]hN]uhPKVhQhh:]r:hfXGetting field valuesr;r<}r=(h?j8h@j6ubaubh)r>}r?(h?X>>> product['name'] Desktop PC >>> product.get('name') Desktop PC >>> product['price'] 1000 >>> product['last_updated'] Traceback (most recent call last): ... KeyError: 'last_updated' >>> product.get('last_updated', 'not set') not set >>> product['lala'] # getting unknown field Traceback (most recent call last): ... 
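To make the metadata mechanism concrete, here is a minimal sketch (not part of Scrapy) of how a component might consume per-field metadata through Item.fields; serialize_item and the unicode fallback are assumptions for illustration:

    def serialize_item(item):
        """Hypothetical component: apply each field's 'serializer' metadata."""
        serialized = {}
        for name, value in item.items():
            field = item.fields[name]               # the Field object (a dict)
            serializer = field.get('serializer', unicode)
            serialized[name] = serializer(value)
        return serialized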
Working with Items

Here are some examples of common tasks performed with items, using the Product item declared above. You will notice the API is very similar to the dict API (http://docs.python.org/library/stdtypes.html#dict).

Creating items

    >>> product = Product(name='Desktop PC', price=1000)
    >>> print product
    Product(name='Desktop PC', price=1000)

Getting field values

    >>> product['name']
    Desktop PC
    >>> product.get('name')
    Desktop PC

    >>> product['price']
    1000

    >>> product['last_updated']
    Traceback (most recent call last):
        ...
    KeyError: 'last_updated'

    >>> product.get('last_updated', 'not set')
    not set

    >>> product['lala'] # getting unknown field
    Traceback (most recent call last):
        ...
    KeyError: 'lala'

    >>> product.get('lala', 'unknown field')
    'unknown field'

    >>> 'name' in product  # is name field populated?
    True

    >>> 'last_updated' in product  # is last_updated populated?
    False

    >>> 'last_updated' in product.fields  # is last_updated a declared field?
    True

    >>> 'lala' in product.fields  # is lala a declared field?
    False

Setting field values

    >>> product['last_updated'] = 'today'
    >>> product['last_updated']
    today

    >>> product['lala'] = 'test' # setting unknown field
    Traceback (most recent call last):
        ...
    KeyError: 'Product does not support field: lala'

Accessing all populated values

To access all populated values, just use the typical dict API:

    >>> product.keys()
    ['price', 'name']

    >>> product.items()
    [('price', 1000), ('name', 'Desktop PC')]

Other common tasks

Copying items:

    >>> product2 = Product(product)
    >>> print product2
    Product(name='Desktop PC', price=1000)

    >>> product3 = product2.copy()
    >>> print product3
    Product(name='Desktop PC', price=1000)

Creating dicts from items:

    >>> dict(product) # create a dict from all populated values
    {'price': 1000, 'name': 'Desktop PC'}

Creating items from dicts:

    >>> Product({'name': 'Laptop PC', 'price': 1500})
    Product(price=1500, name='Laptop PC')

    >>> Product({'name': 'Laptop PC', 'lala': 1500}) # warning: unknown field in dict
    Traceback (most recent call last):
        ...
    KeyError: 'Product does not support field: lala'

Extending Items

You can extend Items (to add more fields or to change some metadata for some fields) by declaring a subclass of your original Item.

For example:

    class DiscountedProduct(Product):
        discount_percent = Field(serializer=str)
        discount_expiration_date = Field()

You can also extend field metadata by using the previous field metadata and appending more values, or changing existing values, like this:

    class SpecificProduct(Product):
        name = Field(Product.fields['name'], serializer=my_serializer)

That adds (or replaces) the serializer metadata key for the name field, keeping all the previously existing metadata values.
Item objects

class scrapy.item.Item([arg])

    Return a new Item optionally initialized from the given argument.

    Items replicate the standard `dict API`_, including its constructor. The only additional attribute provided by Items is:

    fields

        A dictionary containing *all declared fields* for this Item, not only those populated. The keys are the field names and the values are the :class:`Field` objects used in the :ref:`Item declaration <topics-items-declaring>`.
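The distinction between declared and populated fields can be checked directly; a sketch using the ``Product`` item from the examples above::

    >>> product = Product(name='Desktop PC')
    >>> product.keys()           # only populated fields
    ['name']
    >>> sorted(product.fields)   # all declared fields, populated or not
    ['last_updated', 'name', 'price']
    >>> product.fields['last_updated']  # the Field object holding its metadata
    {'serializer': <type 'str'>}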
.. _dict API: http://docs.python.org/library/stdtypes.html#dict

Field objects

class scrapy.item.Field([arg])

    The :class:`Field` class is just an alias to the built-in `dict`_ class and doesn't provide any extra functionality or attributes. In other words, :class:`Field` objects are plain-old Python dicts. A separate class is used to support the :ref:`item declaration syntax <topics-items-declaring>` based on class attributes.
.. _dict: http://docs.python.org/library/stdtypes.html#dict

scrapy-0.22/.doctrees/topics/debug.doctree (Debugging Spiders)
.. _topics-debug:

Debugging Spiders

This document explains the most common techniques for debugging spiders. Consider the following scrapy spider below::

    class MySpider(Spider):
        name = 'myspider'
        start_urls = (
            'http://example.com/page1',
            'http://example.com/page2',
        )

        def parse(self, response):
            # collect `item_urls`
            for item_url in item_urls:
                yield Request(url=item_url, callback=self.parse_item)

        def parse_item(self, response):
            item = MyItem()
            # populate `item` fields
            yield Request(url=item_details_url, meta={'item': item},
                          callback=self.parse_details)

        def parse_details(self, response):
            item = response.meta['item']
            # populate more `item` fields
            return item
Basically this is a simple spider which parses two pages of items (the start_urls). Items also have a details page with additional information, so we use the ``meta`` functionality of :class:`~scrapy.http.Request` to pass a partially populated item.

Parse Command

The most basic way of checking the output of your spider is to use the :command:`parse` command. It allows you to check the behaviour of the different parts of the spider at the method level. It has the advantage of being flexible and simple to use, but does not allow debugging code inside a method.

In order to see the item scraped from a specific url::

    $ scrapy parse --spider=myspider -c parse_item -d 2 <item_url>
    [ ... scrapy log lines crawling example.com spider ... ]

    >>> STATUS DEPTH LEVEL 2 <<<
    # Scraped Items  ------------------------------------------------------------
    [{'url': <item_url>}]

    # Requests  -----------------------------------------------------------------
    []
Using the ``--verbose`` or ``-v`` option we can see the status at each depth level::

    $ scrapy parse --spider=myspider -c parse_item -d 2 -v <item_url>
    [ ... scrapy log lines crawling example.com spider ... ]

    >>> DEPTH LEVEL: 1 <<<
    # Scraped Items  ------------------------------------------------------------
    []

    # Requests  -----------------------------------------------------------------
    []

    >>> DEPTH LEVEL: 2 <<<
    # Scraped Items  ------------------------------------------------------------
    [{'url': <item_url>}]

    # Requests  -----------------------------------------------------------------
    []

Checking items scraped from a single start_url can also be easily achieved using::

    $ scrapy parse --spider=myspider -d 3 'http://example.com/page1'

Scrapy Shell

While the :command:`parse` command is very useful for checking the behaviour of a spider, it is of little help to check what happens inside a callback, besides showing the response received and the output. How to debug the situation when ``parse_details`` sometimes receives no item?
Fortunately, the :command:`shell` is your bread and butter in this case (see :ref:`topics-shell-inspect-response`)::

    from scrapy.shell import inspect_response

    def parse_details(self, response):
        item = response.meta.get('item', None)
        if item:
            # populate more `item` fields
            return item
        else:
            inspect_response(response, self)

See also: :ref:`topics-shell-inspect-response`.
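Once ``inspect_response`` drops you into the shell, you can examine the response directly. A sketch of a typical session follows (the URL and XPath are placeholders; this assumes the shell pre-binds ``response`` and ``sel`` as described in the shell documentation)::

    >>> response.url
    'http://www.example.com/item1'
    >>> response.status
    200
    >>> sel.xpath('//h1/text()').extract()
    []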
Open in browser

Sometimes you just want to see how a certain response looks in a browser; you can use the ``open_in_browser`` function for that. Here is an example of how you would use it::

    from scrapy.utils.response import open_in_browser

    def parse_details(self, response):
        if "item name" not in response.body:
            open_in_browser(response)

``open_in_browser`` will open a browser with the response received by Scrapy at that point, adjusting the `base tag`_ so that images and styles are displayed properly.

Logging

Logging is another useful option for getting information about your spider run. Although not as convenient, it comes with the advantage that the logs will be available in all future runs should they be necessary again::

    from scrapy import log

    def parse_details(self, response):
        item = response.meta.get('item', None)
        if item:
            # populate more `item` fields
            return item
        else:
            self.log('No item received for %s' % response.url,
                     level=log.WARNING)

For more information, check the :ref:`topics-logging` section.
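The levels available in ``scrapy.log`` mirror the standard ones (``DEBUG``, ``INFO``, ``WARNING``, ``ERROR``, ``CRITICAL``), so you can pick a severity per message; a short sketch, with the messages and checks chosen only for illustration::

    from scrapy import log

    def parse_details(self, response):
        # routine progress information
        self.log('Parsing %s' % response.url, level=log.DEBUG)
        if not response.body:
            # something is clearly wrong with this response
            self.log('Empty body for %s' % response.url, level=log.ERROR)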
.. _base tag: http://www.w3schools.com/tags/tag_base.asp

scrapy-0.22/.doctrees/topics/request-response.doctree (Requests and Responses)
responsesq2NXscrapy.http.Request.copyq3Xscrapy.http.Response.headersq4X/topics-request-response-ref-response-subclassesq5Xscrapy.http.TextResponseq6Xscrapy.http.Response.flagsq7uUsubstitution_defsq8}q9Uparse_messagesq:]q;Ucurrent_sourceqKUnameidsq?}q@(hhhhhUshallow-copiedqAh U-topics-request-response-ref-request-userloginqBh U bindaddressqCh h h h h Umeta-http-equivqDhUrequest-objectsqEhUresponse-subclassesqFhhhhhhhUtopics-request-metaqGhhhUrequest-subclassesqHhU6topics-request-response-ref-request-callback-argumentsqIhUtwisted-failureqJhU.topics-request-response-ref-request-subclassesqKhhhhhhhUrequest-usage-examplesqLhUtopics-request-responseqMhU,using-formrequest-to-send-data-via-http-postqNhUrequest-meta-special-keysqOh Uhtmlresponse-objectsqPh!h!h"h"h#h#h$h$h%Uformrequest-objectsqQh&U8using-formrequest-from-response-to-simulate-a-user-loginqRh'h'h(h(h)U-passing-additional-data-to-callback-functionsqSh*h*h+h+h,Utextresponse-objectsqTh-Uresponse-objectsqUh.h.h/h/h0Uxmlresponse-objectsqVh1Ulxml-html-formsqWh2Urequests-and-responsesqXh3h3h4h4h5U/topics-request-response-ref-response-subclassesqYh6h6h7h7uUchildrenqZ]q[(cdocutils.nodes target q\)q]}q^(U rawsourceq_X.. _topics-request-response:Uparentq`hUsourceqacdocutils.nodes reprunicode qbXM/var/build/user_builds/scrapy/checkouts/0.22/docs/topics/request-response.rstqcqd}qebUtagnameqfUtargetqgU attributesqh}qi(Uidsqj]Ubackrefsqk]Udupnamesql]Uclassesqm]Unamesqn]UrefidqohMuUlineqpKUdocumentqqhhZ]ubcdocutils.nodes section qr)qs}qt(h_Uh`hhahdUexpect_referenced_by_namequ}qvhh]shfUsectionqwhh}qx(hl]hm]hk]hj]qy(Xmodule-scrapy.httpqzhXhMehn]q{(h2heuhpKhqhUexpect_referenced_by_idq|}q}hMh]shZ]q~(cdocutils.nodes title q)q}q(h_XRequests and Responsesqh`hshahdhfUtitleqhh}q(hl]hm]hk]hj]hn]uhpKhqhhZ]qcdocutils.nodes Text qXRequests and Responsesqq}q(h_hh`hubaubcsphinx.addnodes index q)q}q(h_Uh`hshahdhfUindexqhh}q(hj]hk]hl]hm]hn]Uentries]q(UsingleqXscrapy.http (module)Xmodule-scrapy.httpUtqauhpNhqhhZ]ubcdocutils.nodes paragraph q)q}q(h_XRScrapy uses :class:`Request` and :class:`Response` objects for crawling web sites.h`hshahdhfU paragraphqhh}q(hl]hm]hk]hj]hn]uhpK hqhhZ]q(hX Scrapy uses qq}q(h_X Scrapy uses h`hubcsphinx.addnodes pending_xref q)q}q(h_X:class:`Request`qh`hhahdhfU pending_xrefqhh}q(UreftypeXclassUrefwarnqU reftargetqXRequestU refdomainXpyqhj]hk]U refexplicithl]hm]hn]UrefdocqXtopics/request-responseqUpy:classqNU py:moduleqX scrapy.httpquhpK hZ]qcdocutils.nodes literal q)q}q(h_hhh}q(hl]hm]q(UxrefqhXpy-classqehk]hj]hn]uh`hhZ]qhXRequestqq}q(h_Uh`hubahfUliteralqubaubhX and qq}q(h_X and h`hubh)q}q(h_X:class:`Response`qh`hhahdhfhhh}q(UreftypeXclasshhXResponseU refdomainXpyqhj]hk]U refexplicithl]hm]hn]hhhNhhuhpK hZ]qh)q}q(h_hhh}q(hl]hm]q(hhXpy-classqehk]hj]hn]uh`hhZ]qhXResponseqŅq}q(h_Uh`hubahfhubaubhX objects for crawling web sites.qȅq}q(h_X objects for crawling web sites.h`hubeubh)q}q(h_XTypically, :class:`Request` objects are generated in the spiders and pass across the system until they reach the Downloader, which executes the request and returns a :class:`Response` object which travels back to the spider that issued the request.h`hshahdhfhhh}q(hl]hm]hk]hj]hn]uhpK hqhhZ]q(hX Typically, qυq}q(h_X Typically, h`hubh)q}q(h_X:class:`Request`qh`hhahdhfhhh}q(UreftypeXclasshhXRequestU refdomainXpyqhj]hk]U refexplicithl]hm]hn]hhhNhhuhpK hZ]qh)q}q(h_hhh}q(hl]hm]q(hhXpy-classqehk]hj]hn]uh`hhZ]qhXRequestqޅq}q(h_Uh`hubahfhubaubhX objects are generated in the spiders and pass across the system until they reach the Downloader, which executes the request and 
Both :class:`Request` and :class:`Response` classes have subclasses which add functionality not required in the base classes. These are described below in :ref:`topics-request-response-ref-request-subclasses` and :ref:`topics-request-response-ref-response-subclasses`.

Request objects

class scrapy.http.Request(url[, callback, method='GET', headers, body, cookies, meta, encoding='utf-8', priority=0, dont_filter=False, errback])
    A :class:`Request` object represents an HTTP request, which is usually generated in the Spider and executed by the Downloader, and thus generating a :class:`Response`.

    Parameters:

    * url (string) -- the URL of this request
    * callback (callable) -- the function that will be called with the response of this request (once it's downloaded) as its first parameter. For more information see :ref:`topics-request-response-ref-request-callback-arguments` below. If a Request doesn't specify a callback, the spider's :meth:`~scrapy.spider.Spider.parse` method will be used. Note that if exceptions are raised during processing, errback is called instead.

    * method (string) -- the HTTP method of this request. Defaults to ``'GET'``.

    * meta (dict) -- the initial values for the :attr:`Request.meta` attribute. If given, the dict passed in this parameter will be shallow copied.
    * body (str or unicode) -- the request body. If a ``unicode`` is passed, then it's encoded to ``str`` using the `encoding` passed (which defaults to ``utf-8``). If ``body`` is not given, an empty string is stored. Regardless of the type of this argument, the final value stored will be a ``str`` (never ``unicode`` or ``None``).

    * headers (dict) -- the headers of this request. The dict values can be strings (for single valued headers) or lists (for multi-valued headers). If ``None`` is passed as value, the HTTP header will not be sent at all.

    * cookies (dict or list) -- the request cookies. These can be sent in two forms:
These can be sent in two forms.rr}r(h_jh`jubaubcdocutils.nodes enumerated_list r)r}r(h_Uhh}r(UsuffixrU.hj]hk]hl]UprefixrUhm]hn]UenumtyperUarabicruh`jhZ]r(j6)r}r(h_XUsing a dict:: request_with_cookies = Request(url="http://www.example.com", cookies={'currency': 'USD', 'country': 'UY'})hh}r(hl]hm]hk]hj]hn]uh`jhZ]r(h)r}r(h_XUsing a dict::h`jhahdhfhhh}r(hl]hm]hk]hj]hn]uhpKBhZ]rhX Using a dict:rr}r(h_X Using a dict:h`jubaubcdocutils.nodes literal_block r)r}r(h_Xrequest_with_cookies = Request(url="http://www.example.com", cookies={'currency': 'USD', 'country': 'UY'})h`jhfU literal_blockrhh}r(U xml:spacerUpreserverhj]hk]hl]hm]hn]uhpKDhZ]rhXrequest_with_cookies = Request(url="http://www.example.com", cookies={'currency': 'USD', 'country': 'UY'})rr}r(h_Uh`jubaubehfjaubj6)r}r(h_XPUsing a list of dicts:: request_with_cookies = Request(url="http://www.example.com", cookies=[{'name': 'currency', 'value': 'USD', 'domain': 'example.com', 'path': '/currency'}]) hh}r(hl]hm]hk]hj]hn]uh`jhZ]r(h)r }r (h_XUsing a list of dicts::h`jhahdhfhhh}r (hl]hm]hk]hj]hn]uhpKFhZ]r hXUsing a list of dicts:r r}r(h_XUsing a list of dicts:h`j ubaubj)r}r(h_X1request_with_cookies = Request(url="http://www.example.com", cookies=[{'name': 'currency', 'value': 'USD', 'domain': 'example.com', 'path': '/currency'}])h`jhfjhh}r(jjhj]hk]hl]hm]hn]uhpKHhZ]rhX1request_with_cookies = Request(url="http://www.example.com", cookies=[{'name': 'currency', 'value': 'USD', 'domain': 'example.com', 'path': '/currency'}])rr}r(h_Uh`jubaubehfjaubehfUenumerated_listrubh)r}r(h_XThe latter form allows for customizing the ``domain`` and ``path`` attributes of the cookie. These is only useful if the cookies are saved for later requests.h`jhahdhfhhh}r(hl]hm]hk]hj]hn]uhpKNhZ]r(hX+The latter form allows for customizing the rr}r(h_X+The latter form allows for customizing the h`jubh)r}r (h_X ``domain``hh}r!(hl]hm]hk]hj]hn]uh`jhZ]r"hXdomainr#r$}r%(h_Uh`jubahfhubhX and r&r'}r((h_X and h`jubh)r)}r*(h_X``path``hh}r+(hl]hm]hk]hj]hn]uh`jhZ]r,hXpathr-r.}r/(h_Uh`j)ubahfhubhX\ attributes of the cookie. These is only useful if the cookies are saved for later requests.r0r1}r2(h_X\ attributes of the cookie. These is only useful if the cookies are saved for later requests.h`jubeubh)r3}r4(h_XxWhen some site returns cookies (in a response) those are stored in the cookies for that domain and will be sent again in future requests. That's the typical behaviour of any regular web browser. However, if, for some reason, you want to avoid merging with existing cookies you can instruct Scrapy to do so by setting the ``dont_merge_cookies`` key in the :attr:`Request.meta`.h`jhahdhfhhh}r5(hl]hm]hk]hj]hn]uhpKRhZ]r6(hXAWhen some site returns cookies (in a response) those are stored in the cookies for that domain and will be sent again in future requests. That's the typical behaviour of any regular web browser. However, if, for some reason, you want to avoid merging with existing cookies you can instruct Scrapy to do so by setting the r7r8}r9(h_XAWhen some site returns cookies (in a response) those are stored in the cookies for that domain and will be sent again in future requests. That's the typical behaviour of any regular web browser. 
      Example of a request without merging cookies::

          request_with_cookies = Request(url="http://www.example.com",
                                         cookies={'currency': 'USD', 'country': 'UY'},
                                         meta={'dont_merge_cookies': True})

      For more info see :ref:`cookies-mw`.

    * encoding (string) -- the encoding of this request (defaults to ``'utf-8'``). This encoding will be used to percent-encode the URL and to convert the body to ``str`` (if given as ``unicode``).

    * priority (int) -- the priority of this request (defaults to ``0``). The priority is used by the scheduler to define the order used to process requests.
    * dont_filter (boolean) -- indicates that this request should not be filtered by the scheduler. This is used when you want to perform an identical request multiple times, to ignore the duplicates filter. Use it with care, or you will get into crawling loops. Defaults to ``False``.

    * errback (callable) -- a function that will be called if any exception was raised while processing the request. This includes pages that failed with 404 HTTP errors and such. It receives a `Twisted Failure`_ instance as first parameter.
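    Putting several of these constructor arguments together, a minimal sketch (the URL, callback, and header/meta values are placeholders)::

        from scrapy.http import Request

        def parse_item(response):
            pass  # placeholder callback

        request = Request(
            url="http://www.example.com/products",
            callback=parse_item,               # called with the Response once downloaded
            method='GET',
            headers={'Accept-Language': 'en'},
            meta={'category': 'electronics'},  # shallow copied into request.meta
            priority=10,                       # used by the scheduler to order requests
            dont_filter=True,                  # skip the duplicates filter for this request
        )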
To change the URL of a Request use h`jxubh)r}r(h_X:meth:`replace`rh`jxhahdhfhhh}r(UreftypeXmethhhXreplaceU refdomainXpyrhj]hk]U refexplicithl]hm]hn]hhhjnhhuhpK~hZ]rh)r}r(h_jhh}r(hl]hm]r(hjXpy-methrehk]hj]hn]uh`jhZ]rhX replace()rr}r(h_Uh`jubahfhubaubhX.r}r(h_X.h`jxubeubeubeubh)r}r(h_Uh`jhahdhfhhh}r(hj]hk]hl]hm]hn]Uentries]r(hX&method (scrapy.http.Request attribute)h.UtrauhpNhqhhZ]ubjY)r}r(h_Uh`jhahdhfj\hh}r(j^j_Xpyhj]hk]hl]hm]hn]jaX attributerjcjuhpNhqhhZ]r(je)r}r(h_XRequest.methodh`jhahdhfjhhh}r(hj]rh.ajkhhk]hl]hm]hn]rh.ajmXRequest.methodjojnjpuhpKhqhhZ]rj)r}r(h_Xmethodh`jhahdhfjhh}r(hl]hm]hk]hj]hn]uhpKhqhhZ]rhXmethodrr}r(h_Uh`jubaubaubj)r}r(h_Uh`jhahdhfjhh}r(hl]hm]hk]hj]hn]uhpKhqhhZ]rh)r}r(h_XA string representing the HTTP method in the request. This is guaranteed to be uppercase. Example: ``"GET"``, ``"POST"``, ``"PUT"``, etch`jhahdhfhhh}r(hl]hm]hk]hj]hn]uhpKhqhhZ]r(hXcA string representing the HTTP method in the request. This is guaranteed to be uppercase. Example: rr}r(h_XcA string representing the HTTP method in the request. This is guaranteed to be uppercase. Example: h`jubh)r}r(h_X ``"GET"``hh}r(hl]hm]hk]hj]hn]uh`jhZ]rhX"GET"rr}r(h_Uh`jubahfhubhX, rr}r(h_X, h`jubh)r}r(h_X ``"POST"``hh}r(hl]hm]hk]hj]hn]uh`jhZ]rhX"POST"rr}r(h_Uh`jubahfhubhX, rr}r(h_X, h`jubh)r}r(h_X ``"PUT"``hh}r(hl]hm]hk]hj]hn]uh`jhZ]rhX"PUT"rr}r(h_Uh`jubahfhubhX, etcrr}r(h_X, etch`jubeubaubeubh)r}r(h_Uh`jhahdhfhhh}r(hj]hk]hl]hm]hn]Uentries]r(hX'headers (scrapy.http.Request attribute)h UtrauhpNhqhhZ]ubjY)r}r(h_Uh`jhahdhfj\hh}r(j^j_Xpyhj]hk]hl]hm]hn]jaX attributerjcjuhpNhqhhZ]r(je)r}r(h_XRequest.headersh`jhahdhfjhhh}r(hj]rh ajkhhk]hl]hm]hn]rh ajmXRequest.headersjojnjpuhpKhqhhZ]rj)r}r(h_Xheadersh`jhahdhfjhh}r(hl]hm]hk]hj]hn]uhpKhqhhZ]rhXheadersrr}r(h_Uh`jubaubaubj)r}r(h_Uh`jhahdhfjhh}r(hl]hm]hk]hj]hn]uhpKhqhhZ]rh)r}r(h_X<A dictionary-like object which contains the request headers.rh`jhahdhfhhh}r(hl]hm]hk]hj]hn]uhpKhqhhZ]rhX<A dictionary-like object which contains the request headers.rr}r(h_jh`jubaubaubeubh)r}r(h_Uh`jhahdhfhhh}r(hj]hk]hl]hm]hn]Uentries]r(hX$body (scrapy.http.Request attribute)hUtrauhpNhqhhZ]ubjY)r}r(h_Uh`jhahdhfj\hh}r(j^j_Xpyhj]hk]hl]hm]hn]jaX attributerjcjuhpNhqhhZ]r(je)r}r(h_X Request.bodyh`jhahdhfjhhh}r(hj]rhajkhhk]hl]hm]hn]rhajmX Request.bodyjojnjpuhpKhqhhZ]rj)r}r(h_Xbodyh`jhahdhfjhh}r(hl]hm]hk]hj]hn]uhpKhqhhZ]rhXbodyrr}r (h_Uh`jubaubaubj)r }r (h_Uh`jhahdhfjhh}r (hl]hm]hk]hj]hn]uhpKhqhhZ]r (h)r}r(h_X%A str that contains the request body.rh`j hahdhfhhh}r(hl]hm]hk]hj]hn]uhpKhqhhZ]rhX%A str that contains the request body.rr}r(h_jh`jubaubh)r}r(h_XQThis attribute is read-only. To change the body of a Request use :meth:`replace`.h`j hahdhfhhh}r(hl]hm]hk]hj]hn]uhpKhqhhZ]r(hXAThis attribute is read-only. To change the body of a Request use rr}r(h_XAThis attribute is read-only. 
    Request.meta

        A dict that contains arbitrary metadata for this request. This dict is empty for new Requests, and is usually populated by different Scrapy components (extensions, middlewares, etc). So the data contained in this dict depends on the extensions you have enabled.

        See :ref:`topics-request-meta` for a list of special meta keys recognized by Scrapy.

        This dict is `shallow copied`_ when the request is cloned using the ``copy()`` or ``replace()`` methods, and can also be accessed, in your spider, from the ``response.meta`` attribute.

.. _shallow copied: http://docs.python.org/library/copy.html

    Request.copy()

        Return a new Request which is a copy of this Request. See also: :ref:`topics-request-response-ref-request-callback-arguments`.

    Request.replace([url, method, headers, body, cookies, meta, encoding, dont_filter, callback, errback])
_shallow copied: http://docs.python.org/library/copy.htmlU referencedrKh`jhahdhfhghh}r(jIjthj]rhAahk]hl]hm]hn]rhauhpKhqhhZ]ubh)r}r(h_Uh`jhahdhfhhh}r(hj]hk]hl]hm]hn]Uentries]r(hX#copy() (scrapy.http.Request method)h3UtrauhpNhqhhZ]ubjY)r}r(h_Uh`jhahdhfj\hh}r(j^j_Xpyhj]hk]hl]hm]hn]jaXmethodrjcjuhpNhqhhZ]r(je)r}r(h_XRequest.copy()h`jhahdhfjhhh}r(hj]rh3ajkhhk]hl]hm]hn]rh3ajmX Request.copyjojnjpuhpKhqhhZ]r(j)r}r(h_Xcopyh`jhahdhfjhh}r(hl]hm]hk]hj]hn]uhpKhqhhZ]rhXcopyrr}r(h_Uh`jubaubj)r}r(h_Uh`jhahdhfjhh}r(hl]hm]hk]hj]hn]uhpKhqhhZ]ubeubj)r}r(h_Uh`jhahdhfjhh}r(hl]hm]hk]hj]hn]uhpKhqhhZ]rh)r}r(h_X~Return a new Request which is a copy of this Request. See also: :ref:`topics-request-response-ref-request-callback-arguments`.h`jhahdhfhhh}r(hl]hm]hk]hj]hn]uhpKhqhhZ]r(hX@Return a new Request which is a copy of this Request. See also: rr}r(h_X@Return a new Request which is a copy of this Request. See also: h`jubh)r}r(h_X=:ref:`topics-request-response-ref-request-callback-arguments`rh`jhahdhfhhh}r(UreftypeXrefhhX6topics-request-response-ref-request-callback-argumentsU refdomainXstdrhj]hk]U refexplicithl]hm]hn]hhuhpKhZ]rj')r}r(h_jhh}r(hl]hm]r(hjXstd-refrehk]hj]hn]uh`jhZ]rhX6topics-request-response-ref-request-callback-argumentsrr}r(h_Uh`jubahfj1ubaubhX.r}r(h_X.h`jubeubaubeubh)r}r(h_Uh`jhahdhfhhh}r(hj]hk]hl]hm]hn]Uentries]r(hX&replace() (scrapy.http.Request method)h$UtrauhpNhqhhZ]ubjY)r}r(h_Uh`jhahdhfj\hh}r(j^j_Xpyhj]hk]hl]hm]hn]jaXmethodrjcjuhpNhqhhZ]r(je)r}r(h_XfRequest.replace([url, method, headers, body, cookies, meta, encoding, dont_filter, callback, errback])h`jhahdhfjhhh}r(hj]rh$ajkhhk]hl]hm]hn]rh$ajmXRequest.replacejojnjpuhpKhqhhZ]r(j)r}r(h_Xreplaceh`jhahdhfjhh}r(hl]hm]hk]hj]hn]uhpKhqhhZ]rhXreplacerr}r(h_Uh`jubaubj)r}r(h_Uh`jhahdhfjhh}r(hl]hm]hk]hj]hn]uhpKhqhhZ]rj)r}r(h_Uhh}r(hl]hm]hk]hj]hn]uh`jhZ]r(j)r}r(h_Xurlhh}r(hl]hm]hk]hj]hn]uh`jhZ]rhXurlrr}r(h_Uh`jubahfjubj)r}r(h_Xmethodhh}r(hl]hm]hk]hj]hn]uh`jhZ]rhXmethodrr}r(h_Uh`jubahfjubj)r}r(h_Xheadershh}r(hl]hm]hk]hj]hn]uh`jhZ]rhXheadersrr}r (h_Uh`jubahfjubj)r }r (h_Xbodyhh}r (hl]hm]hk]hj]hn]uh`jhZ]r hXbodyrr}r(h_Uh`j ubahfjubj)r}r(h_Xcookieshh}r(hl]hm]hk]hj]hn]uh`jhZ]rhXcookiesrr}r(h_Uh`jubahfjubj)r}r(h_Xmetahh}r(hl]hm]hk]hj]hn]uh`jhZ]rhXmetarr}r(h_Uh`jubahfjubj)r}r (h_Xencodinghh}r!(hl]hm]hk]hj]hn]uh`jhZ]r"hXencodingr#r$}r%(h_Uh`jubahfjubj)r&}r'(h_X dont_filterhh}r((hl]hm]hk]hj]hn]uh`jhZ]r)hX dont_filterr*r+}r,(h_Uh`j&ubahfjubj)r-}r.(h_Xcallbackhh}r/(hl]hm]hk]hj]hn]uh`jhZ]r0hXcallbackr1r2}r3(h_Uh`j-ubahfjubj)r4}r5(h_Xerrbackhh}r6(hl]hm]hk]hj]hn]uh`jhZ]r7hXerrbackr8r9}r:(h_Uh`j4ubahfjubehfjubaubeubj)r;}r<(h_Uh`jhahdhfjhh}r=(hl]hm]hk]hj]hn]uhpKhqhhZ]r>h)r?}r@(h_X>Return a Request object with the same members, except for those members given new values by whichever keyword arguments are specified. The attribute :attr:`Request.meta` is copied by default (unless a new value is given in the ``meta`` argument). See also :ref:`topics-request-response-ref-request-callback-arguments`.h`j;hahdhfhhh}rA(hl]hm]hk]hj]hn]uhpKhqhhZ]rB(hXReturn a Request object with the same members, except for those members given new values by whichever keyword arguments are specified. The attribute rCrD}rE(h_XReturn a Request object with the same members, except for those members given new values by whichever keyword arguments are specified. 
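Since url and body are read-only, the usual pattern is to derive a modified
copy with replace(). A minimal sketch (the URLs are placeholders):

    from scrapy.http import Request

    req = Request("http://www.example.com/page?id=1")
    # Request.url cannot be assigned to; build a modified copy instead
    req2 = req.replace(url="http://www.example.com/page?id=2")
    # method, headers, meta, callback, etc. are carried over from req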
Passing additional data to callback functions

The callback of a request is a function that will be called when the
response of that request is downloaded. The callback function will be
called with the downloaded Response object as its first argument.

Example:

    def parse_page1(self, response):
        return Request("http://www.example.com/some_page.html",
                       callback=self.parse_page2)

    def parse_page2(self, response):
        # this would log http://www.example.com/some_page.html
        self.log("Visited %s" % response.url)

In some cases you may be interested in passing arguments to those callback
functions so you can receive the arguments later, in the second callback.
You can use the Request.meta attribute for that.
Here's an example of how to pass an item using this mechanism, to populate
different fields from different pages:

    def parse_page1(self, response):
        item = MyItem()
        item['main_url'] = response.url
        request = Request("http://www.example.com/some_page.html",
                          callback=self.parse_page2)
        request.meta['item'] = item
        return request

    def parse_page2(self, response):
        item = response.meta['item']
        item['other_url'] = response.url
        return item

Request.meta special keys

The Request.meta attribute can contain any arbitrary data, but there are
some special keys recognized by Scrapy and its built-in extensions.

Those are:

  * dont_redirect
  * dont_retry
  * handle_httpstatus_list
  * dont_merge_cookies (see cookies parameter of the Request constructor)
  * cookiejar
  * redirect_urls
  * bindaddress

bindaddress

The outgoing IP address to use for performing the request.
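A brief sketch of how such keys are typically set; the URL and the chosen
keys here are only an illustration:

    from scrapy.http import Request

    # opt this request out of the retry and redirect middlewares, and
    # let the spider handle 404 responses itself
    request = Request("http://www.example.com/maybe-missing",
                      meta={'dont_retry': True,
                            'dont_redirect': True,
                            'handle_httpstatus_list': [404]})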
Request subclasses

Here is the list of built-in Request subclasses. You can also subclass
Request to implement your own custom functionality.

FormRequest objects

The FormRequest class extends the base Request with functionality for
dealing with HTML forms. It uses lxml.html forms
(http://lxml.de/lxmlhtml.html#forms) to pre-populate form fields with form
data from Response objects.

class scrapy.http.FormRequest(url, [formdata, ...])

    The FormRequest class adds a new argument to the constructor. The
    remaining arguments are the same as for the Request class and are not
    documented here.

    Parameters:
      * formdata (dict or iterable of tuples) -- a dictionary (or iterable
        of (key, value) tuples) containing HTML form data which will be
        url-encoded and assigned to the body of the request.
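Because formdata accepts either a dict or an iterable of (key, value)
tuples, the tuple form can be used when the same field name must appear
more than once. A short sketch (URL and field names are placeholders):

    from scrapy.http import FormRequest

    # dict form: one value per field
    req = FormRequest("http://www.example.com/post/action",
                      formdata={'name': 'John Doe', 'age': '27'})

    # tuple form: allows repeated keys, e.g. multi-valued form fields
    req = FormRequest("http://www.example.com/post/action",
                      formdata=[('tag', 'python'), ('tag', 'scrapy')])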
The FormRequest objects support the following class method in addition to
the standard Request methods:

classmethod from_response(response, [formname=None, formnumber=0, formdata=None, formxpath=None, dont_click=False, ...])

    Returns a new FormRequest object with its form field values
    pre-populated with those found in the HTML <form> element contained in
    the given response. For an example see Using
    FormRequest.from_response() to simulate a user login.

    The policy is to automatically simulate a click, by default, on any
    form control that looks clickable, like a <input type="submit">. Even
    though this is quite convenient, and often the desired behaviour,
    sometimes it can cause problems which could be hard to debug. For
    example, when working with forms that are filled and/or submitted using
    javascript, the default from_response() behaviour may not be the most
    appropriate. To disable this behaviour you can set the dont_click
    argument to True. Also, if you want to change the control clicked
    (instead of disabling it) you can use the clickdata argument.

    Parameters:
      * response (Response object) -- the response containing an HTML form
        which will be used to pre-populate the form fields
      * formname (string) -- if given, the form with name attribute set to
        this value will be used.
      * formxpath (string) -- if given, the first form that matches the
        xpath will be used.
      * formnumber (integer) -- the number of form to use, when the
        response contains multiple forms. The first one (and also the
        default) is 0.
      * formdata (dict) -- fields to override in the form data. If a field
        was already present in the response <form> element, its value is
        overridden by the one passed in this parameter.
      * dont_click (boolean) -- if True, the form data will be submitted
        without clicking in any element.

    The other parameters of this class method are passed directly to the
    FormRequest constructor.

    New in version 0.10.3: The formname parameter.

    New in version 0.17: The formxpath parameter.
Request usage examples

Using FormRequest to send data via HTTP POST

If you want to simulate an HTML form POST in your spider and send a couple
of key-value fields, you can return a FormRequest object (from your
spider) like this:

    return [FormRequest(url="http://www.example.com/post/action",
                        formdata={'name': 'John Doe', 'age': '27'},
                        callback=self.after_post)]

Using FormRequest.from_response() to simulate a user login

It is usual for web sites to provide pre-populated form fields through
<input type="hidden"> elements, such as session related data or
authentication tokens (for login pages). When scraping, you'll want these
fields to be automatically pre-populated and only override a couple of
them, such as the user name and password. You can use the
FormRequest.from_response() method for this job. Here's an example spider
which uses it:

    class LoginSpider(Spider):
        name = 'example.com'
        start_urls = ['http://www.example.com/users/login.php']

        def parse(self, response):
            return [FormRequest.from_response(response,
                        formdata={'username': 'john', 'password': 'secret'},
                        callback=self.after_login)]

        def after_login(self, response):
            # check login succeed before going on
            if "authentication failed" in response.body:
                self.log("Login failed", level=log.ERROR)
                return

            # continue scraping with authenticated session...
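If the login page contains more than one form, from_response() can be
pointed at the right one through formname or, since 0.17, formxpath. A
sketch varying the parse method above, assuming the login form carries
id="login":

    def parse(self, response):
        return [FormRequest.from_response(response,
                    formxpath='//form[@id="login"]',
                    formdata={'username': 'john', 'password': 'secret'},
                    callback=self.after_login)]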
Response objects

class scrapy.http.Response(url, [status=200, headers, body, flags])

    A Response object represents an HTTP response, which is usually
    downloaded (by the Downloader) and fed to the Spiders for processing.

    Parameters:
      * url (string) -- the URL of this response
      * headers (dict) -- the headers of this response. The dict values can
        be strings (for single valued headers) or lists (for multi-valued
        headers).
      * status (integer) -- the HTTP status of the response. Defaults to
        200.
      * body (str) -- the response body. It must be str, not unicode,
        unless you're using an encoding-aware Response subclass, such as
        TextResponse.
      * meta (dict) -- the initial values for the Response.meta attribute.
        If given, the dict will be shallow copied.
      * flags (list) -- a list containing the initial values for the
        Response.flags attribute. If given, the list will be shallow
        copied.
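Responses are normally created by the framework, but constructing one by
hand can be useful, for example when unit-testing a spider callback. A
minimal sketch; the spider object is assumed to exist:

    from scrapy.http import TextResponse

    response = TextResponse(url="http://www.example.com",
                            status=200,
                            headers={'Content-Type': 'text/html'},
                            body="<html><body>Hello</body></html>")
    items = list(spider.parse(response))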
Response.url

    A string containing the URL of the response.

    This attribute is read-only. To change the URL of a Response use
    replace().

Response.status

    An integer representing the HTTP status of the response. Example: 200,
    404.

Response.headers

    A dictionary-like object which contains the response headers.
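The headers object is case-insensitive and, because a header may occur
several times in a response, also offers a getlist() accessor. A quick
sketch, assuming a response is at hand:

    # case-insensitive single-value lookup
    content_type = response.headers.get('Content-Type')

    # headers such as Set-Cookie can occur several times
    cookies = response.headers.getlist('Set-Cookie')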
Response.body

    A str containing the body of this Response. Keep in mind that
    Response.body is always a str. If you want the unicode version use
    TextResponse.body_as_unicode() (only available in TextResponse and
    subclasses).

    This attribute is read-only. To change the body of a Response use
    replace().

Response.request

    The Request object that generated this response. This attribute is
    assigned in the Scrapy engine, after the response and the request have
    passed through all Downloader Middlewares. In particular, this means
    that:

      * HTTP redirections will cause the original request (to the URL
        before redirection) to be assigned to the redirected response (with
        the final URL after redirection).
      * Response.request.url doesn't always equal Response.url.
      * This attribute is only available in the spider code, and in the
        Spider Middlewares, but not in Downloader Middlewares (although you
        have the Request available there by other means) and handlers of
        the response_downloaded signal.

Response.meta

    A shortcut to the Request.meta attribute of the Response.request
    object (i.e. self.request.meta).

    Unlike the Response.request attribute, the Response.meta attribute is
    propagated along redirects and retries, so you will get the original
    Request.meta sent from your spider.

    See also: Request.meta attribute.
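In a spider callback the shortcut makes the two spellings below
interchangeable; a small sketch reusing the item-passing example from
earlier:

    def parse_page2(self, response):
        # response.meta is simply a shortcut to response.request.meta
        item = response.meta['item']
        item = response.request.meta['item']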
And they're shown on the string representation of the Response (h`jubjM)r}r(h_X `__str__`hh}r(hl]hm]hk]hj]hn]uh`jhZ]rhX__str__rr}r(h_Uh`jubahfjUubhX1 method) which is used by the engine for logging.rr}r(h_X1 method) which is used by the engine for logging.h`jubeubaubeubh)r}r(h_Uh`jJ hahdhfhhh}r (hj]hk]hl]hm]hn]Uentries]r!(hX$copy() (scrapy.http.Response method)h*Utr"auhpNhqhhZ]ubjY)r#}r$(h_Uh`jJ hahdhfj\hh}r%(j^j_Xpyhj]hk]hl]hm]hn]jaXmethodr&jcj&uhpNhqhhZ]r'(je)r(}r)(h_XResponse.copy()h`j#hahdhfjhhh}r*(hj]r+h*ajkhhk]hl]hm]hn]r,h*ajmX Response.copyjoj jpuhpMhqhhZ]r-(j)r.}r/(h_Xcopyh`j(hahdhfjhh}r0(hl]hm]hk]hj]hn]uhpMhqhhZ]r1hXcopyr2r3}r4(h_Uh`j.ubaubj)r5}r6(h_Uh`j(hahdhfjhh}r7(hl]hm]hk]hj]hn]uhpMhqhhZ]ubeubj)r8}r9(h_Uh`j#hahdhfjhh}r:(hl]hm]hk]hj]hn]uhpMhqhhZ]r;h)r<}r=(h_X8Returns a new Response which is a copy of this Response.r>h`j8hahdhfhhh}r?(hl]hm]hk]hj]hn]uhpMhqhhZ]r@hX8Returns a new Response which is a copy of this Response.rArB}rC(h_j>h`j<ubaubaubeubh)rD}rE(h_Uh`jJ hahdhfhhh}rF(hj]hk]hl]hm]hn]Uentries]rG(hX'replace() (scrapy.http.Response method)h'UtrHauhpNhqhhZ]ubjY)rI}rJ(h_Uh`jJ hahdhfj\hh}rK(j^j_Xpyhj]hk]hl]hm]hn]jaXmethodrLjcjLuhpNhqhhZ]rM(je)rN}rO(h_XCResponse.replace([url, status, headers, body, request, flags, cls])h`jIhahdhfjhhh}rP(hj]rQh'ajkhhk]hl]hm]hn]rRh'ajmXResponse.replacejoj jpuhpMhqhhZ]rS(j)rT}rU(h_Xreplaceh`jNhahdhfjhh}rV(hl]hm]hk]hj]hn]uhpMhqhhZ]rWhXreplacerXrY}rZ(h_Uh`jTubaubj)r[}r\(h_Uh`jNhahdhfjhh}r](hl]hm]hk]hj]hn]uhpMhqhhZ]r^j)r_}r`(h_Uhh}ra(hl]hm]hk]hj]hn]uh`j[hZ]rb(j)rc}rd(h_Xurlhh}re(hl]hm]hk]hj]hn]uh`j_hZ]rfhXurlrgrh}ri(h_Uh`jcubahfjubj)rj}rk(h_Xstatushh}rl(hl]hm]hk]hj]hn]uh`j_hZ]rmhXstatusrnro}rp(h_Uh`jjubahfjubj)rq}rr(h_Xheadershh}rs(hl]hm]hk]hj]hn]uh`j_hZ]rthXheadersrurv}rw(h_Uh`jqubahfjubj)rx}ry(h_Xbodyhh}rz(hl]hm]hk]hj]hn]uh`j_hZ]r{hXbodyr|r}}r~(h_Uh`jxubahfjubj)r}r(h_Xrequesthh}r(hl]hm]hk]hj]hn]uh`j_hZ]rhXrequestrr}r(h_Uh`jubahfjubj)r}r(h_Xflagshh}r(hl]hm]hk]hj]hn]uh`j_hZ]rhXflagsrr}r(h_Uh`jubahfjubj)r}r(h_Xclshh}r(hl]hm]hk]hj]hn]uh`j_hZ]rhXclsrr}r(h_Uh`jubahfjubehfjubaubeubj)r}r(h_Uh`jIhahdhfjhh}r(hl]hm]hk]hj]hn]uhpMhqhhZ]rh)r}r(h_XReturns a Response object with the same members, except for those members given new values by whichever keyword arguments are specified. The attribute :attr:`Response.meta` is copied by default.h`jhahdhfhhh}r(hl]hm]hk]hj]hn]uhpMhqhhZ]r(hXReturns a Response object with the same members, except for those members given new values by whichever keyword arguments are specified. The attribute rr}r(h_XReturns a Response object with the same members, except for those members given new values by whichever keyword arguments are specified. The attribute h`jubh)r}r(h_X:attr:`Response.meta`rh`jhahdhfhhh}r(UreftypeXattrhhX Response.metaU refdomainXpyrhj]hk]U refexplicithl]hm]hn]hhhj hhuhpMhZ]rh)r}r(h_jhh}r(hl]hm]r(hjXpy-attrrehk]hj]hn]uh`jhZ]rhX Response.metarr}r(h_Uh`jubahfhubaubhX is copied by default.rr}r(h_X is copied by default.h`jubeubaubeubeubeubh\)r}r(h_X4.. _topics-request-response-ref-response-subclasses:h`j hahdhfhghh}r(hj]hk]hl]hm]hn]hohYuhpMhqhhZ]ubeubhr)r}r(h_Uh`hshahdhu}rh5jshfhwhh}r(hl]hm]hk]hj]r(hFhYehn]r(hh5euhpMhqhh|}rhYjshZ]r(h)r}r(h_XResponse subclassesrh`jhahdhfhhh}r(hl]hm]hk]hj]hn]uhpMhqhhZ]rhXResponse subclassesrr}r(h_jh`jubaubh)r}r(h_XHere is the list of available built-in Response subclasses. You can also subclass the Response class to implement your own functionality.rh`jhahdhfhhh}r(hl]hm]hk]hj]hn]uhpMhqhhZ]rhXHere is the list of available built-in Response subclasses. 
Response subclasses

Here is the list of available built-in Response subclasses. You can also
subclass the Response class to implement your own functionality.

TextResponse objects

class scrapy.http.TextResponse(url[, encoding[, ...]])

    TextResponse objects add encoding capabilities to the base Response
    class, which is meant to be used only for binary data, such as images,
    sounds or any media file.

    TextResponse objects support a new constructor argument, in addition to
    the base Response objects. The remaining functionality is the same as
    for the Response class and is not documented here.

    Parameters:
        encoding (string) -- a string which contains the encoding to use for
        this response. If you create a TextResponse object with a unicode
        body, it will be encoded using this encoding (remember the body
        attribute is always a string). If encoding is None (the default
        value), the encoding will be looked up in the response headers and
        body instead.

TextResponse objects support the following attributes in addition to the
standard Response ones:

TextResponse.encoding

    A string with the encoding of this response. The encoding is resolved by
    trying the following mechanisms, in order:

    1. the encoding passed in the constructor encoding argument
    2. the encoding declared in the Content-Type HTTP header. If this
       encoding is not valid (i.e. unknown), it is ignored and the next
       resolution mechanism is tried.
    3. the encoding declared in the response body. The TextResponse class
       doesn't provide any special functionality for this. However, the
       HtmlResponse and XmlResponse classes do.
    4. the encoding inferred by looking at the response body. This is the
       more fragile method but also the last one tried.

TextResponse objects support the following methods in addition to the
standard Response ones:

TextResponse.body_as_unicode()

    Returns the body of the response as unicode. This is equivalent to:

        response.body.decode(response.encoding)

    but not equivalent to:

        unicode(response.body)

    since, in the latter case, you would be using your system default
    encoding (typically ascii) to convert the body to unicode, instead of
    the response encoding.
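As a hedged illustration of that resolution order (the URL and body here are
made up, and the exact normalization of the reported encoding name may
vary), the Content-Type header is consulted when no encoding argument is
given:

    >>> from scrapy.http import TextResponse
    >>> r = TextResponse(url='http://www.example.com/',
    ...                  headers={'Content-Type': 'text/html; charset=utf-8'},
    ...                  body='caf\xc3\xa9')   # UTF-8 encoded bytes
    >>> r.encoding
    'utf-8'
    >>> r.body_as_unicode()
    u'caf\xe9'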
HtmlResponse objects

class scrapy.http.HtmlResponse(url[, ...])

    The HtmlResponse class is a subclass of TextResponse which adds encoding
    auto-discovering support by looking into the HTML meta http-equiv
    attribute (http://www.w3schools.com/TAGS/att_meta_http_equiv.asp). See
    TextResponse.encoding.

XmlResponse objects

class scrapy.http.XmlResponse(url[, ...])

    The XmlResponse class is a subclass of TextResponse which adds encoding
    auto-discovering support by looking into the XML declaration line. See
    TextResponse.encoding.
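A hedged sketch of that auto-discovery (hypothetical markup; the normalized
encoding name may differ slightly): with no encoding argument and no
Content-Type header, the meta http-equiv declaration in the body is used.

    >>> from scrapy.http import HtmlResponse
    >>> body = ('<html><head><meta http-equiv="Content-Type" '
    ...         'content="text/html; charset=utf-8"></head>'
    ...         '<body>caf\xc3\xa9</body></html>')
    >>> r = HtmlResponse(url='http://www.example.com/', body=body)
    >>> r.encoding
    'utf-8'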
Item Pipeline

After an item has been scraped by a spider, it is sent to the Item Pipeline,
which processes it through several components that are executed
sequentially.

Each item pipeline component (sometimes referred to simply as an "Item
Pipeline") is a Python class that implements a simple method. They receive
an Item and perform an action over it, also deciding if the Item should
continue through the pipeline or be dropped and no longer processed.

Typical uses for item pipelines are:

- cleansing HTML data
- validating scraped data (checking that the items contain certain fields)
- checking for duplicates (and dropping them)
- storing the scraped item in a database

Writing your own item pipeline

Writing your own item pipeline is easy. Each item pipeline component is a
single Python class that must implement the following method:

process_item(item, spider)

    This method is called for every item pipeline component and must either
    return an Item (or any descendant class) object or raise a DropItem
    exception. Dropped items are no longer processed by further pipeline
    components.

    Parameters:
        item (Item object) -- the item scraped
        spider (Spider object) -- the spider which scraped the item
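For instance, a minimal component honoring that contract might look like
this (a sketch, not from the original docs; the name field is hypothetical):

    from scrapy.exceptions import DropItem

    class RequiredFieldPipeline(object):
        """Return the item to pass it along, or raise DropItem to stop it."""

        def process_item(self, item, spider):
            if not item.get('name'):
                raise DropItem("Missing name in %s" % item)
            return item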
Additionally, they may also implement the following methods:

open_spider(spider)

    This method is called when the spider is opened.

    Parameters:
        spider (Spider object) -- the spider which was opened

close_spider(spider)

    This method is called when the spider is closed.

    Parameters:
        spider (Spider object) -- the spider which was closed

Item pipeline example

Price validation and dropping items with no prices

Let's take a look at the following hypothetical pipeline that adjusts the
price attribute for those items that do not include VAT
(price_excludes_vat attribute), and drops those items which don't contain a
price:

    from scrapy.exceptions import DropItem

    class PricePipeline(object):

        vat_factor = 1.15

        def process_item(self, item, spider):
            if item['price']:
                if item['price_excludes_vat']:
                    item['price'] = item['price'] * self.vat_factor
                return item
            else:
                raise DropItem("Missing price in %s" % item)

Write items to a JSON file

The following pipeline stores all scraped items (from all spiders) into a
single items.jl file, containing one item per line serialized in JSON
format:

    import json

    class JsonWriterPipeline(object):

        def __init__(self):
            self.file = open('items.jl', 'wb')

        def process_item(self, item, spider):
            line = json.dumps(dict(item)) + "\n"
            self.file.write(line)
            return item
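A hedged variant of the same writer (not the version from the docs) defers
opening the file until the spider actually starts, using the open_spider and
close_spider hooks described above:

    import json

    class JsonWriterPipeline(object):

        def open_spider(self, spider):
            # called once when the spider starts: acquire resources here
            self.file = open('items.jl', 'wb')

        def close_spider(self, spider):
            # called once when the spider finishes: release them here
            self.file.close()

        def process_item(self, item, spider):
            line = json.dumps(dict(item)) + "\n"
            self.file.write(line)
            return item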
Note: the purpose of JsonWriterPipeline is just to introduce how to write
item pipelines. If you really want to store all scraped items into a JSON
file you should use the Feed exports.

Duplicates filter

A filter that looks for duplicate items, and drops those items that were
already processed. Let's say that our items have a unique id, but our spider
returns multiple items with the same id:

    from scrapy.exceptions import DropItem

    class DuplicatesPipeline(object):

        def __init__(self):
            self.ids_seen = set()

        def process_item(self, item, spider):
            if item['id'] in self.ids_seen:
                raise DropItem("Duplicate item found: %s" % item)
            else:
                self.ids_seen.add(item['id'])
                return item

Activating an Item Pipeline component

To activate an Item Pipeline component you must add its class to the
ITEM_PIPELINES setting, like in the following example:

    ITEM_PIPELINES = {
        'myproject.pipeline.PricePipeline': 300,
        'myproject.pipeline.JsonWriterPipeline': 800,
    }
The integer values you assign to classes in this setting determine the order
they run in: items go through pipelines from lower to higher order numbers.
It's customary to define these numbers in the 0-1000 range.


Selectors

When you're scraping web pages, the most common task you need to perform is
to extract data from the HTML source. There are several libraries available
to achieve this:

- BeautifulSoup is a very popular screen scraping library among Python
  programmers which constructs a Python object based on the structure of
  the HTML code and also deals with bad markup reasonably well, but it has
  one drawback: it's slow.

- lxml is an XML parsing library (which also parses HTML) with a pythonic
  API based on ElementTree (which is not part of the Python standard
  library).

Scrapy comes with its own mechanism for extracting data. They're called
selectors because they "select" certain parts of the HTML document specified
either by XPath or CSS expressions.

XPath is a language for selecting nodes in XML documents, which can also be
used with HTML. CSS is a language for applying styles to HTML documents. It
defines selectors to associate those styles with specific HTML elements.
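As a hedged taste of both expression languages on the same data (hypothetical
markup; the Selector API is covered in detail below):

    >>> from scrapy.selector import Selector
    >>> sel = Selector(text='<span class="title">Example</span>', type="html")
    >>> sel.xpath('//span[@class="title"]/text()').extract()
    [u'Example']
    >>> sel.css('span.title::text').extract()
    [u'Example']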
Scrapy selectors are built over the lxml library, which means they're very
similar in speed and parsing accuracy.

This page explains how selectors work and describes their API, which is very
small and simple, unlike the lxml API, which is much bigger because the lxml
library can be used for many other tasks besides selecting markup documents.

For a complete reference of the selectors API see the Selector reference.

- BeautifulSoup: http://www.crummy.com/software/BeautifulSoup/
- lxml: http://codespeak.net/lxml/
- ElementTree: http://docs.python.org/library/xml.etree.elementtree.html
- cssselect: https://pypi.python.org/pypi/cssselect/
- XPath: http://www.w3.org/TR/xpath
- CSS: http://www.w3.org/TR/selectors

Using EXSLT extensions

Being built atop lxml, Scrapy selectors also support some EXSLT extensions
and come with these pre-registered namespaces to use in XPath expressions:

    prefix  namespace                              usage
    ------  -------------------------------------  -------------------
    re      http://exslt.org/regular-expressions   regular expressions
    set     http://exslt.org/sets                  set manipulation

- EXSLT: http://www.exslt.org/
- regular expressions: http://www.exslt.org/regexp/index.html
- set manipulation: http://www.exslt.org/set/index.html

Regular expressions
The test() function, for example, can prove quite useful when XPath's
starts-with() or contains() are not sufficient.

Example selecting links in list items with a "class" attribute ending with a
digit:

    >>> from scrapy.selector import Selector
    >>> doc = """
    ... <div>
    ...     <ul>
    ...         <li class="item-0"><a href="link1.html">first item</a></li>
    ...         <li class="item-1"><a href="link2.html">second item</a></li>
    ...         <li class="item-inactive"><a href="link3.html">third item</a></li>
    ...         <li class="item-1"><a href="link4.html">fourth item</a></li>
    ...         <li class="item-0"><a href="link5.html">fifth item</a></li>
    ...     </ul>
    ... </div>
    ... """
    >>> sel = Selector(text=doc, type="html")
    >>> sel.xpath('//li//@href').extract()
    [u'link1.html', u'link2.html', u'link3.html', u'link4.html', u'link5.html']
    >>> sel.xpath('//li[re:test(@class, "item-\d$")]//@href').extract()
    [u'link1.html', u'link2.html', u'link4.html', u'link5.html']
    >>>

Warning: the C library libxslt doesn't natively support EXSLT regular
expressions, so lxml's implementation uses hooks to Python's re module.
Thus, using regexp functions in your XPath expressions may add a small
performance penalty.

Set operations

These can be handy for excluding parts of a document tree before extracting
text elements, for example.

Example extracting microdata (sample content taken from
http://schema.org/Product) with groups of itemscopes and corresponding
itemprops:
              ... Kenmore White 17" Microwave ... Kenmore 17" Microwave ...
              ... Rated 3.5/5 ... based on 11 customer reviews ...
              ... ...
              ... $55.00 ... In stock ...
              ... ... Product description: ... 0.7 cubic feet countertop microwave. ... Has six preset cooking categories and convenience features like ... Add-A-Minute and Child Lock. ... ... Customer reviews: ... ...
              ... Not a happy camper - ... by , ... April 1, 2011 ...
              ... ... 1/ ... 5stars ...
              ... The lamp burned out and now I have to replace ... it. ...
              ... ...
              ... Value purchase - ... by , ... March 25, 2011 ...
              ... ... 4/ ... 5stars ...
              ... Great microwave for the price. It is small and ... fits in my apartment. ...
              ... ... ...
              ... """ >>> >>> for scope in sel.xpath('//div[@itemscope]'): ... print "current scope:", scope.xpath('@itemtype').extract() ... props = scope.xpath(''' ... set:difference(./descendant::*/@itemprop, ... .//*[@itemscope]/*/@itemprop)''') ... print " properties:", props.extract() ... print ... current scope: [u'http://schema.org/Product'] properties: [u'name', u'aggregateRating', u'offers', u'description', u'review', u'review'] current scope: [u'http://schema.org/AggregateRating'] properties: [u'ratingValue', u'reviewCount'] current scope: [u'http://schema.org/Offer'] properties: [u'price', u'availability'] current scope: [u'http://schema.org/Review'] properties: [u'name', u'author', u'datePublished', u'reviewRating', u'description'] current scope: [u'http://schema.org/Rating'] properties: [u'worstRating', u'ratingValue', u'bestRating'] current scope: [u'http://schema.org/Review'] properties: [u'name', u'author', u'datePublished', u'reviewRating', u'description'] current scope: [u'http://schema.org/Rating'] properties: [u'worstRating', u'ratingValue', u'bestRating'] >>>h:jghChFhMjhO}r(jjhQ]hR]hS]hT]hU]uhXM)hYhhZ]rhjXD >>> doc = """ ...
              ... Kenmore White 17" Microwave ... Kenmore 17" Microwave ...
              ... Rated 3.5/5 ... based on 11 customer reviews ...
              ... ...
              ... $55.00 ... In stock ...
              ... ... Product description: ... 0.7 cubic feet countertop microwave. ... Has six preset cooking categories and convenience features like ... Add-A-Minute and Child Lock. ... ... Customer reviews: ... ...
              ... Not a happy camper - ... by , ... April 1, 2011 ...
              ... ... 1/ ... 5stars ...
              ... The lamp burned out and now I have to replace ... it. ...
              ... ...
              ... Value purchase - ... by , ... March 25, 2011 ...
              ... ... 4/ ... 5stars ...
              ... Great microwave for the price. It is small and ... fits in my apartment. ...
              ... ... ...
              ... """ >>> >>> for scope in sel.xpath('//div[@itemscope]'): ... print "current scope:", scope.xpath('@itemtype').extract() ... props = scope.xpath(''' ... set:difference(./descendant::*/@itemprop, ... .//*[@itemscope]/*/@itemprop)''') ... print " properties:", props.extract() ... print ... current scope: [u'http://schema.org/Product'] properties: [u'name', u'aggregateRating', u'offers', u'description', u'review', u'review'] current scope: [u'http://schema.org/AggregateRating'] properties: [u'ratingValue', u'reviewCount'] current scope: [u'http://schema.org/Offer'] properties: [u'price', u'availability'] current scope: [u'http://schema.org/Review'] properties: [u'name', u'author', u'datePublished', u'reviewRating', u'description'] current scope: [u'http://schema.org/Rating'] properties: [u'worstRating', u'ratingValue', u'bestRating'] current scope: [u'http://schema.org/Review'] properties: [u'name', u'author', u'datePublished', u'reviewRating', u'description'] current scope: [u'http://schema.org/Rating'] properties: [u'worstRating', u'ratingValue', u'bestRating'] >>>rr}r(h9Uh:jubaubhn)r}r(h9XHere we first iterate over ``itemscope`` elements, and for each one, we look for all ``itemprops`` elements and exclude those that are themselves inside another ``itemscope``.h:jghChFhMhrhO}r(hS]hT]hR]hQ]hU]uhXM{hYhhZ]r(hjXHere we first iterate over rr}r(h9XHere we first iterate over h:jubj\)r}r(h9X ``itemscope``hO}r(hS]hT]hR]hQ]hU]uh:jhZ]rhjX itemscoperr}r(h9Uh:jubahMjdubhjX- elements, and for each one, we look for all rr}r(h9X- elements, and for each one, we look for all h:jubj\)r}r(h9X ``itemprops``hO}r(hS]hT]hR]hQ]hU]uh:jhZ]rhjX itempropsrr}r(h9Uh:jubahMjdubhjX? elements and exclude those that are themselves inside another rr}r(h9X? elements and exclude those that are themselves inside another h:jubj\)r}r(h9X ``itemscope``hO}r(hS]hT]hR]hQ]hU]uh:jhZ]rhjX itemscoperr}r (h9Uh:jubahMjdubhjX.r }r (h9X.h:jubeubhJ)r }r (h9X .. _EXSLT: http://www.exslt.org/h>Kh:jghChFhMhNhO}r(hjhQ]rUexsltrahR]hS]hT]hU]rh"auhXMhYhhZ]ubhJ)r}r(h9X?.. _regular expressions: http://www.exslt.org/regexp/index.htmlh>Kh:jghChFhMhNhO}r(hj hQ]rUid2rahR]hS]hT]hU]rjIauhXMhYhhZ]ubhJ)r}r(h9X9.. 
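For a smaller, self-contained illustration of the "excluding parts of a
document tree" idea, here is a sketch that uses ``set:difference`` to drop
the text found inside a (made-up) footer block::

    >>> from scrapy.selector import Selector
    >>> doc = '<div><p>keep me</p><div class="footer"><p>drop me</p></div></div>'
    >>> sel = Selector(text=doc, type="html")
    >>> sel.xpath('''set:difference(//div//text(),
    ...                             //div[@class="footer"]//text())''').extract()
    [u'keep me']

The first argument selects every text node under any ``<div>``, the second
selects the ones to exclude, and ``set:difference`` returns the first
node-set minus the second.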
.. _EXSLT: http://www.exslt.org/
.. _regular expressions: http://www.exslt.org/regexp/index.html
.. _set manipulation: http://www.exslt.org/set/index.html

Built-in Selectors reference
============================

.. module:: scrapy.selector

.. class:: Selector(response=None, text=None, type=None)

    An instance of :class:`Selector` is a wrapper over response to select
    certain parts of its content.

    ``response`` is a :class:`~scrapy.http.HtmlResponse` or
    :class:`~scrapy.http.XmlResponse` object that will be used for selecting
    and extracting data.

    ``text`` is a unicode string or utf-8 encoded text for cases when a
    ``response`` isn't available. Using ``text`` and ``response`` together is
    undefined behavior.

    ``type`` defines the selector type; it can be ``"html"``, ``"xml"`` or
    ``None`` (default).

        If ``type`` is ``None``, the selector automatically chooses the best
        type based on the ``response`` type (see below), or defaults to
        ``"html"`` in case it is used together with ``text``.

        If ``type`` is ``None`` and a ``response`` is passed, the selector
        type is inferred from the response type as follows:

            * ``"html"`` for :class:`~scrapy.http.HtmlResponse` type
            * ``"xml"`` for :class:`~scrapy.http.XmlResponse` type
            * ``"html"`` for anything else

        Otherwise, if ``type`` is set, the selector type will be forced and
        no detection will occur.
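    For instance, a minimal sketch of forcing the type when all you have is a
    string (the markup here is invented for the example)::

        >>> from scrapy.selector import Selector
        >>> body = '<?xml version="1.0"?><root><item>value</item></root>'
        >>> Selector(text=body, type="xml").xpath('//item/text()').extract()
        [u'value']

    Left as ``type=None``, the same call would fall back to ``"html"``, since
    only ``text`` was given.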
    .. method:: xpath(query)

        Find nodes matching the XPath ``query`` and return the result as a
        :class:`SelectorList` instance with all elements flattened. List
        elements implement the :class:`Selector` interface too.

        ``query`` is a string containing the XPath query to apply.

    .. method:: css(query)

        Apply the given CSS selector and return a :class:`SelectorList`
        instance.

        ``query`` is a string containing the CSS selector to apply.

        In the background, CSS queries are translated into XPath queries
        using the `cssselect`_ library and run through the ``.xpath()``
        method.

    .. method:: extract()

        Serialize and return the matched nodes as a list of unicode strings.
        Percent encoded content is unquoted.
    .. method:: re(regex)

        Apply the given regex and return a list of unicode strings with the
        matches.

        ``regex`` can be either a compiled regular expression or a string
        which will be compiled to a regular expression using
        ``re.compile(regex)``.
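        A quick sketch of both accepted forms, on invented markup::

            >>> import re
            >>> from scrapy.selector import Selector
            >>> sel = Selector(text='<p>Price: 42 usd</p>')
            >>> sel.re(r'Price:\s*(\d+)')              # pattern as a string
            [u'42']
            >>> sel.re(re.compile(r'Price:\s*(\d+)'))  # pre-compiled pattern
            [u'42']

        Pre-compiling pays off when the same pattern is applied to many
        responses.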
    .. method:: register_namespace(prefix, uri)

        Register the given namespace to be used in this :class:`Selector`.
        Without registering namespaces you can't select or extract data from
        non-standard namespaces. See examples below.

    .. method:: remove_namespaces()

        Remove all namespaces, allowing to traverse the document using
        namespace-less xpaths. See example below.

    .. method:: __nonzero__()

        Returns ``True`` if there is any real content selected or ``False``
        otherwise. In other words, the boolean value of a :class:`Selector`
        is given by the contents it selects.
SelectorList objects
--------------------

.. class:: SelectorList

    The :class:`SelectorList` class is a subclass of the builtin ``list``
    class, which provides a few additional methods.

    .. method:: xpath(query)

        Call the ``.xpath()`` method for each element in this list and
        return their results flattened as another :class:`SelectorList`.

        ``query`` is the same argument as the one in :meth:`Selector.xpath`.

    .. method:: css(query)

        Call the ``.css()`` method for each element in this list and return
        their results flattened as another :class:`SelectorList`.

        ``query`` is the same argument as the one in :meth:`Selector.css`.

    .. method:: extract()

        Call the ``.extract()`` method for each element in this list and
        return their results flattened, as a list of unicode strings.
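        The flattening is easy to see on a small invented document::

            >>> from scrapy.selector import Selector
            >>> sel = Selector(text='<ul><li>one</li></ul><ul><li>two</li></ul>')
            >>> sel.xpath('//ul').xpath('li/text()').extract()
            [u'one', u'two']

        Here ``sel.xpath('//ul')`` is a two-element :class:`SelectorList`,
        and the second ``.xpath()`` call runs once per ``<ul>``, merging both
        results into one flat list.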
    .. method:: re()

        Call the ``.re()`` method for each element in this list and return
        their results flattened, as a list of unicode strings.

    .. method:: __nonzero__()

        Returns ``True`` if the list is not empty, ``False`` otherwise.
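        For instance (a sketch with invented markup)::

            >>> from scrapy.selector import Selector
            >>> sel = Selector(text='<p class="a">hello</p>')
            >>> bool(sel.xpath('//p'))
            True
            >>> bool(sel.xpath('//div'))
            False

        This is what makes idioms like ``if sel.xpath('//div'):`` read
        naturally.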
Selector examples on HTML response
----------------------------------

Here's a couple of :class:`Selector` examples to illustrate several concepts.
In all cases, we assume there is already a :class:`Selector` instantiated
with a :class:`~scrapy.http.HtmlResponse` object like this::

    sel = Selector(html_response)

1. Select all ``<h1>`` elements from an HTML response body, returning a list
   of :class:`Selector` objects (i.e. a :class:`SelectorList` object)::

      sel.xpath("//h1")

2. Extract the text of all ``<h1>`` elements from an HTML response body,
   returning a list of unicode strings::

      sel.xpath("//h1").extract()         # this includes the h1 tag
      sel.xpath("//h1/text()").extract()  # this excludes the h1 tag

3. Iterate over all ``<p>`` tags and print their class attribute::

      for node in sel.xpath("//p"):
          print node.xpath("@class").extract()

Selector examples on XML response
---------------------------------

Here's a couple of examples to illustrate several concepts. In both cases we
assume there is already a :class:`Selector` instantiated with a
:class:`~scrapy.http.XmlResponse` object like this::

    sel = Selector(xml_response)

1. Select all ``<product>`` elements from an XML response body, returning a
   list of :class:`Selector` objects (i.e. a :class:`SelectorList` object)::

      sel.xpath("//product")

2. Extract all prices from a `Google Base XML feed`_ which requires
   registering a namespace::

      sel.register_namespace("g", "http://base.google.com/ns/1.0")
      sel.xpath("//g:price").extract()
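Self-contained, the same mechanism looks like this (the feed snippet is
invented; the namespace URI is the Google Base one from above)::

    >>> from scrapy.selector import Selector
    >>> doc = ('<rss xmlns:g="http://base.google.com/ns/1.0">'
    ...        '<g:price>12.99</g:price></rss>')
    >>> sel = Selector(text=doc, type="xml")
    >>> sel.register_namespace("g", "http://base.google.com/ns/1.0")
    >>> sel.xpath("//g:price/text()").extract()
    [u'12.99']

Without the ``register_namespace()`` call, the ``g:`` prefix in the query
would be undefined and the query would raise an error.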
.. _removing-namespaces:

Removing namespaces
-------------------

When dealing with scraping projects, it is often quite convenient to get rid
of namespaces altogether and just work with element names, to write more
simple/convenient XPaths. You can use the
:meth:`Selector.remove_namespaces` method for that.

Let's show an example that illustrates this with the Github blog atom feed.

First, we open the shell with the url we want to scrape::

    $ scrapy shell https://github.com/blog.atom

Once in the shell we can try selecting all ``<link>`` objects and see that it
doesn't work (because the Atom XML namespace is obfuscating those nodes)::

    >>> sel.xpath("//link")
    []

But once we call the :meth:`Selector.remove_namespaces` method, all nodes can
be accessed directly by their names::

    >>> sel.remove_namespaces()
    >>> sel.xpath("//link")
    [<Selector xpath='//link' data=u'<link xmlns="http://www.w3.org/2005/Atom'>,
     <Selector xpath='//link' data=u'<link xmlns="http://www.w3.org/2005/Atom'>,
     ...
If you wonder why the namespace removal procedure isn't always called by
default, instead of having to call it manually, this is because of two
reasons which, in order of relevance, are:

1. Removing namespaces requires to iterate and modify all nodes in the
   document, which is a reasonably expensive operation to perform for all
   documents crawled by Scrapy

2. There could be some cases where using namespaces is actually required, in
   case some element names clash between namespaces. These cases are very
   rare though. (The sketch below illustrates such a clash.)
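A minimal sketch of such a clash (the two-namespace snippet is invented):
after removal, both ``id`` elements answer to the same name::

    >>> from scrapy.selector import Selector
    >>> doc = ('<root xmlns:a="urn:a" xmlns:b="urn:b">'
    ...        '<a:id>1</a:id><b:id>2</b:id></root>')
    >>> sel = Selector(text=doc, type="xml")
    >>> sel.remove_namespaces()
    >>> sel.xpath('//id/text()').extract()
    [u'1', u'2']

That ambiguity is exactly when you would want to keep the namespaces
registered instead.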
.. _Google Base XML feed: http://base.google.com/support/bin/answer.py?hl=en&answer=59461

Using selectors
===============

Constructing selectors
----------------------

.. highlight:: python

Scrapy selectors are instances of the :class:`~scrapy.selector.Selector`
class, constructed by passing a `Response` object as first argument; the
response's body is what they're going to be "selecting"::

    from scrapy.spider import Spider
    from scrapy.selector import Selector

    class MySpider(Spider):
        # ...
        def parse(self, response):
            sel = Selector(response)

            # Using XPath query
            print sel.xpath('//p')

            # Using CSS query
            print sel.css('p')

            # Nesting queries
            print sel.xpath('//div[@foo="bar"]').css('span#bold')

Nesting selectors
-----------------

The selection methods (``.xpath()`` or ``.css()``) return a list of selectors
of the same type, so you can call the selection methods for those selectors
too. Here's an example::

    >>> links = sel.xpath('//a[contains(@href, "image")]')
    >>> links.extract()
    [u'<a href="image1.html">Name: My image 1 <br><img src="image1_thumb.jpg"></a>',
     u'<a href="image2.html">Name: My image 2 <br><img src="image2_thumb.jpg"></a>',
     u'<a href="image3.html">Name: My image 3 <br><img src="image3_thumb.jpg"></a>',
     u'<a href="image4.html">Name: My image 4 <br><img src="image4_thumb.jpg"></a>',
     u'<a href="image5.html">Name: My image 5 <br><img src="image5_thumb.jpg"></a>']

    >>> for index, link in enumerate(links):
    ...     args = (index, link.xpath('@href').extract(), link.xpath('img/@src').extract())
    ...     print 'Link number %d points to url %s and image %s' % args

    Link number 0 points to url [u'image1.html'] and image [u'image1_thumb.jpg']
    Link number 1 points to url [u'image2.html'] and image [u'image2_thumb.jpg']
    Link number 2 points to url [u'image3.html'] and image [u'image3_thumb.jpg']
    Link number 3 points to url [u'image4.html'] and image [u'image4_thumb.jpg']
    Link number 4 points to url [u'image5.html'] and image [u'image5_thumb.jpg']
Using selectors with regular expressions
----------------------------------------

:class:`~scrapy.selector.Selector` also has a ``.re()`` method for extracting
data using regular expressions. However, unlike the ``.xpath()`` or
``.css()`` methods, ``.re()`` returns a list of unicode strings, so you can't
construct nested ``.re()`` calls.

Here's an example used to extract image names from the :ref:`HTML code
<topics-selectors-htmlcode>` above::

    >>> sel.xpath('//a[contains(@href, "image")]/text()').re(r'Name:\s*(.*)')
    [u'My image 1',
     u'My image 2',
     u'My image 3',
     u'My image 4',
     u'My image 5']
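Note that what ``.re()`` returns is exactly what the capturing group matches,
so narrowing the group narrows the output; a variant sketch on the same
sample page::

    >>> sel.xpath('//a[contains(@href, "image")]/text()').re(r'Name:\s*My image (\d+)')
    [u'1', u'2', u'3', u'4', u'5']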
.. _topics-selectors-relative-xpaths:

Working with relative XPaths
----------------------------

Keep in mind that if you are nesting selectors and use an XPath that starts
with ``/``, that XPath will be absolute to the document and not relative to
the ``Selector`` you're calling it from.

For example, suppose you want to extract all ``<p>`` elements inside
``<div>`` elements. First, you would get all ``<div>`` elements::

    >>> divs = sel.xpath('//div')

At first, you may be tempted to use the following approach, which is wrong,
as it actually extracts all ``<p>`` elements from the document, not only
those inside ``<div>`` elements::

    >>> for p in divs.xpath('//p'):  # this is wrong - gets all <p> from the whole document
    ...     print p.extract()

This is the proper way to do it (note the dot prefixing the ``.//p``
XPath)::

    >>> for p in divs.xpath('.//p'):  # extracts all <p> inside
    ...     print p.extract()

Another common case would be to extract all direct ``<p>`` children::

    >>> for p in divs.xpath('p'):
    ...     print p.extract()

              rl rm }rn (h9Uh:jh ubahMjdubhjX children:ro rp }rq (h9X children:h:j` ubeubj)rr }rs (h9X6>>> for p in divs.xpath('p') >>> print p.extract()h:j hChFhMjhO}rt (jjhQ]hR]hS]hT]hU]uhXKhYhhZ]ru hjX6>>> for p in divs.xpath('p') >>> print p.extract()rv rw }rx (h9Uh:jr ubaubhn)ry }rz (h9XdFor more details about relative XPaths see the `Location Paths`_ section in the XPath specification.h:j hChFhMhrhO}r{ (hS]hT]hR]hQ]hU]uhXKhYhhZ]r| (hjX/For more details about relative XPaths see the r} r~ }r (h9X/For more details about relative XPaths see the h:jy ubh)r }r (h9X`Location Paths`_hKh:jy hMhhO}r (UnameXLocation PathshX)http://www.w3.org/TR/xpath#location-pathsr hQ]hR]hS]hT]hU]uhZ]r hjXLocation Pathsr r }r (h9Uh:j ubaubhjX$ section in the XPath specification.r r }r (h9X$ section in the XPath specification.h:jy ubeubhJ)r }r (h9X=.. _Location Paths: http://www.w3.org/TR/xpath#location-pathsh>Kh:j hChFhMhNhO}r (hj hQ]r Ulocation-pathsr ahR]hS]hT]hU]r hauhXKhYhhZ]ubeubjieubhChFhMh[hO}r (hS]r j ahT]hR]hQ]r Uid1r ahU]uhXKKhYhhZ]r (hc)r }r (h9XUsing selectorsr h:hhttp://doc.scrapy.org/en/latest/_static/selectors-sample1.htmlr h:j hChFhMhrhO}r (hS]hT]hR]hQ]hU]uhXKQhZ]r h)r }r (h9j hO}r (Urefurij hQ]hR]hS]hT]hU]uh:j hZ]r hjX>http://doc.scrapy.org/en/latest/_static/selectors-sample1.htmlr r }r (h9Uh:j ubahMhubaubaubhJ)r }r (h9X.. _topics-selectors-htmlcode:h:h Example website

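The difference is easy to verify in isolation. A self-contained sketch (assuming Scrapy 0.20+, where Selector accepts a text argument):

    >>> from scrapy.selector import Selector
    >>> doc = '<div><p>inside</p></div><p>outside</p>'
    >>> divs = Selector(text=doc).xpath('//div')
    >>> divs.xpath('//p/text()').extract()   # absolute: matches both paragraphs
    [u'inside', u'outside']
    >>> divs.xpath('.//p/text()').extract()  # relative: only the <p> inside the <div>
    [u'inside']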
              h:h Example website r r }r (h9Uh:j ubaubj )r }r (h9Uh:h` of that page, let's construct an XPath (using an HTML selector) for selecting the text inside the title tag::h:h`r h:j hChFhMj)hO}r (UreftypeXrefj+j,Xtopics-selectors-htmlcodeU refdomainXstdr hQ]hR]U refexplicithS]hT]hU]j.j/uhXKghZ]r j1)r }r (h9j hO}r (hS]hT]r (j6j Xstd-refr ehR]hQ]hU]uh:j hZ]r hjX HTML coder r }r (h9Uh:j ubahMj<ubaubhjXm of that page, let's construct an XPath (using an HTML selector) for selecting the text inside the title tag:r r }r (h9Xm of that page, let's construct an XPath (using an HTML selector) for selecting the text inside the title tag:h:j ubeubj)r! }r" (h9XH>>> sel.xpath('//title/text()') []h:h>> sel.xpath('//title/text()') []r% r& }r' (h9Uh:j! ubaubhn)r( }r) (h9XAs you can see, the ``.xpath()`` method returns an :class:`~scrapy.selector.SelectorList` instance, which is a list of new selectors. This API can be used quickly for extracting nested data.h:h j\)r? }r@ (h9j; hO}rA (hS]hT]rB (j6j= Xpy-classrC ehR]hQ]hU]uh:j9 hZ]rD hjX SelectorListrE rF }rG (h9Uh:j? ubahMjdubaubhjXe instance, which is a list of new selectors. This API can be used quickly for extracting nested data.rH rI }rJ (h9Xe instance, which is a list of new selectors. This API can be used quickly for extracting nested data.h:j( ubeubhn)rK }rL (h9XdTo actually extract the textual data, you must call the selector ``.extract()`` method, as follows::h:h>>> sel.xpath('//title/text()').extract() [u'Example website']h:h>>> sel.xpath('//title/text()').extract() [u'Example website']r` ra }rb (h9Uh:j\ ubaubhn)rc }rd (h9XYNotice that CSS selectors can select text or attribute nodes using CSS3 pseudo-elements::h:h>> sel.css('title::text').extract() [u'Example website']h:h>> sel.css('title::text').extract() [u'Example website']rn ro }rp (h9Uh:jj ubaubhn)rq }rr (h9X:Now we're going to get the base URL and some image links::rs h:h>> sel.xpath('//base/@href').extract() [u'http://example.com/'] >>> sel.css('base::attr(href)').extract() [u'http://example.com/'] >>> sel.xpath('//a[contains(@href, "image")]/@href').extract() [u'image1.html', u'image2.html', u'image3.html', u'image4.html', u'image5.html'] >>> sel.css('a[href*=image]::attr(href)').extract() [u'image1.html', u'image2.html', u'image3.html', u'image4.html', u'image5.html'] >>> sel.xpath('//a[contains(@href, "image")]/img/@src').extract() [u'image1_thumb.jpg', u'image2_thumb.jpg', u'image3_thumb.jpg', u'image4_thumb.jpg', u'image5_thumb.jpg'] >>> sel.css('a[href*=image] img::attr(src)').extract() [u'image1_thumb.jpg', u'image2_thumb.jpg', u'image3_thumb.jpg', u'image4_thumb.jpg', u'image5_thumb.jpg']h:h>> sel.xpath('//base/@href').extract() [u'http://example.com/'] >>> sel.css('base::attr(href)').extract() [u'http://example.com/'] >>> sel.xpath('//a[contains(@href, "image")]/@href').extract() [u'image1.html', u'image2.html', u'image3.html', u'image4.html', u'image5.html'] >>> sel.css('a[href*=image]::attr(href)').extract() [u'image1.html', u'image2.html', u'image3.html', u'image4.html', u'image5.html'] >>> sel.xpath('//a[contains(@href, "image")]/img/@src').extract() [u'image1_thumb.jpg', u'image2_thumb.jpg', u'image3_thumb.jpg', u'image4_thumb.jpg', u'image5_thumb.jpg'] >>> sel.css('a[href*=image] img::attr(src)').extract() [u'image1_thumb.jpg', u'image2_thumb.jpg', u'image3_thumb.jpg', u'image4_thumb.jpg', u'image5_thumb.jpg']r} r~ }r (h9Uh:jy ubaubj eubhChFhMUsystem_messager hO}r (hS]UlevelKhQ]hR]r j aUsourcehFhT]hU]UlineKKUtypeUINFOr uhXKKhYhhZ]r hn)r }r (h9UhO}r 
Ubuntu packages

The public GPG key used to sign these packages can be imported into your APT keyring as follows:

    curl -s http://archive.scrapy.org/ubuntu/archive.key | sudo apt-key add -

Scrapinghub: http://scrapinghub.com/
Github repo: https://github.com/scrapy/scrapy
Downloading Item Images

Scrapy provides an item pipeline for downloading images attached to a particular item, for example, when you scrape products and also want to download their images locally.

This pipeline, called the Images Pipeline and implemented in the ImagesPipeline class, provides a convenient way for downloading and storing images locally with some additional features:

- Convert all downloaded images to a common format (JPG) and mode (RGB)
- Avoid re-downloading images which were downloaded recently
- Thumbnail generation
- Check images width/height to make sure they meet a minimum constraint

This pipeline also keeps an internal queue of those images which are currently being scheduled for download, and connects those items that arrive containing the same image to that queue. This avoids downloading the same image more than once when it's shared by several items.

Pillow (https://github.com/python-imaging/Pillow) is used for thumbnailing and normalizing images to JPEG/RGB format, so you need to install this library in order to use the images pipeline. Python Imaging Library (http://www.pythonware.com/products/pil/) (PIL) should also work in most cases, but it is known to cause trouble in some setups, so we recommend using Pillow instead of PIL.
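A quick way to confirm that a suitable imaging library is available (a sketch; both Pillow and classic PIL expose the PIL namespace):

    >>> from PIL import Image   # succeeds if Pillow (or PIL) is installed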
Using the Images Pipeline

The typical workflow, when using the ImagesPipeline, goes like this:

1. In a Spider, you scrape an item and put the URLs of its images into an image_urls field.

2. The item is returned from the spider and goes to the item pipeline.

3. When the item reaches the ImagesPipeline, the URLs in the image_urls field are scheduled for download using the standard Scrapy scheduler and downloader (which means the scheduler and downloader middlewares are reused), but with a higher priority, processing them before other pages are scraped. The item remains "locked" at that particular pipeline stage until the images have finished downloading (or failed for some reason).

4. When the images are downloaded, another field (images) will be populated with the results. This field will contain a list of dicts with information about the images downloaded, such as the downloaded path, the original scraped url (taken from the image_urls field), and the image checksum. The images in the list of the images field will retain the same order of the original image_urls field. If some image failed downloading, an error will be logged and the image won't be present in the images field.

Usage example

In order to use the image pipeline you just need to enable it (see Enabling your Images Pipeline below) and define an item with the image_urls and images fields:

    from scrapy.item import Item, Field

    class MyItem(Item):

        # ... other item fields ...
        image_urls = Field()
        images = Field()

If you need something more complex and want to override the custom images pipeline behaviour, see Implementing your custom Images Pipeline below.
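For step 1 of the workflow, a spider fills the image_urls field. A minimal sketch, assuming Scrapy 0.22's Spider class; the start URL and XPath are placeholders, and note that image URLs must be absolute for the downloader to fetch them:

    from scrapy.spider import Spider
    from scrapy.selector import Selector

    class MyImagesSpider(Spider):
        name = 'myimages'
        start_urls = ['http://www.example.com/products']  # placeholder

        def parse(self, response):
            item = MyItem()
            # assumes the page uses absolute URLs in its <img> tags
            item['image_urls'] = Selector(response).xpath('//img/@src').extract()
            return item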
Enabling your Images Pipeline

To enable your images pipeline you must first add it to your project ITEM_PIPELINES setting:

    ITEM_PIPELINES = {'scrapy.contrib.pipeline.images.ImagesPipeline': 1}

And set the IMAGES_STORE setting to a valid directory that will be used for storing the downloaded images. Otherwise the pipeline will remain disabled, even if you include it in the ITEM_PIPELINES setting.

For example:

    IMAGES_STORE = '/path/to/valid/dir'

Images Storage

File system is currently the only officially supported storage, but there is also (undocumented) support for Amazon S3 (https://s3.amazonaws.com/).
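Put together, the relevant part of a project's settings.py would look like this (a sketch; the directory is a placeholder and must exist):

    # settings.py (sketch)
    ITEM_PIPELINES = {'scrapy.contrib.pipeline.images.ImagesPipeline': 1}
    IMAGES_STORE = '/path/to/valid/dir'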
File system storage

The images are stored in files (one per image), using a SHA1 hash (http://en.wikipedia.org/wiki/SHA_hash_functions) of their URLs for the file names.

For example, the following image URL:

    http://www.example.com/image.jpg

Whose SHA1 hash is:

    3afec3b4765f8f0a07b78f98c07b83f013567a0a

Will be downloaded and stored in the following file:

    <IMAGES_STORE>/full/3afec3b4765f8f0a07b78f98c07b83f013567a0a.jpg

Where:

- <IMAGES_STORE> is the directory defined in the IMAGES_STORE setting
- full is a sub-directory to separate full images from thumbnails (if used). For more info see Thumbnail generation below.
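The naming scheme is easy to reproduce by hand. A sketch using Python's hashlib (the store path is a placeholder; the hash value matches the example above):

    >>> import hashlib, os.path
    >>> url = 'http://www.example.com/image.jpg'
    >>> image_id = hashlib.sha1(url).hexdigest()
    >>> os.path.join('/path/to/valid/dir', 'full', image_id + '.jpg')
    '/path/to/valid/dir/full/3afec3b4765f8f0a07b78f98c07b83f013567a0a.jpg'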
Additional features

Image expiration

The Image Pipeline avoids downloading images that were downloaded recently. To adjust this retention delay use the IMAGES_EXPIRES setting, which specifies the delay in number of days:

    # 90 days of delay for image expiration
    IMAGES_EXPIRES = 90
Thumbnail generation

The Images Pipeline can automatically create thumbnails of the downloaded images.

In order to use this feature, you must set IMAGES_THUMBS to a dictionary where the keys are the thumbnail names and the values are their dimensions.

For example:

    IMAGES_THUMBS = {
        'small': (50, 50),
        'big': (270, 270),
    }

When you use this feature, the Images Pipeline will create thumbnails of each specified size with this format:

    <IMAGES_STORE>/thumbs/<size_name>/<image_id>.jpg

Where:

- <size_name> is the one specified in the IMAGES_THUMBS dictionary keys (small, big, etc)
- <image_id> is the SHA1 hash of the image url

Example of image files stored using small and big thumbnail names:

    <IMAGES_STORE>/full/63bbfea82b8880ed33cdb762aa11fab722a90a24.jpg
    <IMAGES_STORE>/thumbs/small/63bbfea82b8880ed33cdb762aa11fab722a90a24.jpg
    <IMAGES_STORE>/thumbs/big/63bbfea82b8880ed33cdb762aa11fab722a90a24.jpg

The first one is the full image, as downloaded from the site.

Filtering out small images

You can drop images which are too small, by specifying the minimum allowed size in the IMAGES_MIN_HEIGHT and IMAGES_MIN_WIDTH settings.

For example:

    IMAGES_MIN_HEIGHT = 110
    IMAGES_MIN_WIDTH = 110
Note: these size constraints don't affect thumbnail generation at all.

By default, there are no size constraints, so all images are processed.

Implementing your custom Images Pipeline

Here are the methods that you should override in your custom Images Pipeline:

class scrapy.contrib.pipeline.images.ImagesPipeline

get_media_requests(item, info)

As seen in the workflow, the pipeline will get the URLs of the images to download from the item.
In order to do this, you must override the get_media_requests() method and return a Request for each image URL:

    def get_media_requests(self, item, info):
        for image_url in item['image_urls']:
            yield Request(image_url)

Those requests will be processed by the pipeline and, when they have finished downloading, the results will be sent to the item_completed() method, as a list of 2-element tuples. Each tuple will contain (success, image_info_or_error), where:

- success is a boolean which is True if the image was downloaded successfully or False if it failed for some reason

- image_info_or_error is a dict containing the following keys (if success is True) or a Twisted Failure (http://twistedmatrix.com/documents/current/api/twisted.python.failure.Failure.html) if there was a problem:

  - url - the url where the image was downloaded from. This is the url of the request returned from the get_media_requests() method.
  - path - the path (relative to IMAGES_STORE) where the image was stored
  - checksum - an MD5 hash (http://en.wikipedia.org/wiki/MD5) of the image contents

The list of tuples received by item_completed() is guaranteed to retain the same order of the requests returned from the get_media_requests() method.
Here's a typical value of the results argument:

    [(True,
      {'checksum': '2b00042f7481c7b056c4b410d28f33cf',
       'path': 'full/7d97e98f8af710c7e7fe703abc8f639e0ee507c4.jpg',
       'url': 'http://www.example.com/images/product1.jpg'}),
     (True,
      {'checksum': 'b9628c4ab9b595f72f280b90c4fd093d',
       'path': 'full/1ca5879492b8fd606df1964ea3c1e2f4520f076f.jpg',
       'url': 'http://www.example.com/images/product2.jpg'}),
     (False, Failure(...))]

By default the get_media_requests() method returns None, which means there are no images to download for the item.
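The item and info arguments give get_media_requests() room for per-item customization. A hypothetical variation (the thumb_urls field name is our placeholder, not part of the pipeline's API):

    from scrapy.http import Request

    def get_media_requests(self, item, info):
        # hypothetical: read image URLs from a differently named item field
        for image_url in item.get('thumb_urls', []):
            yield Request(image_url)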
reason).hFjXhGhJhLhrhN}r^(hR]hS]hQ]hP]hT]uhVMhWhh@]r_(hkXThe r`ra}rb(hEXThe hFj\ubhx)rc}rd(hEX%:meth:`ImagesPipeline.item_completed`rehFj\hGhJhLh|hN}rf(UreftypeXmethhhXImagesPipeline.item_completedU refdomainXpyrghP]hQ]U refexplicithR]hS]hT]hhhjfhjkuhVMh@]rhh)ri}rj(hEjehN}rk(hR]hS]rl(hjgXpy-methrmehQ]hP]hT]uhFjch@]rnhkXImagesPipeline.item_completed()rorp}rq(hEUhFjiubahLhubaubhkX method called when all image requests for a single item have completed (either finished downloading, or failed for some reason).rrrs}rt(hEX method called when all image requests for a single item have completed (either finished downloading, or failed for some reason).hFj\ubeubho)ru}rv(hEXThe :meth:`~item_completed` method must return the output that will be sent to subsequent item pipeline stages, so you must return (or drop) the item, as you would in any pipeline.hFjXhGhJhLhrhN}rw(hR]hS]hQ]hP]hT]uhVMhWhh@]rx(hkXThe ryrz}r{(hEXThe hFjuubhx)r|}r}(hEX:meth:`~item_completed`r~hFjuhGhJhLh|hN}r(UreftypeXmethhhXitem_completedU refdomainXpyrhP]hQ]U refexplicithR]hS]hT]hhhjfhjkuhVMh@]rh)r}r(hEj~hN}r(hR]hS]r(hjXpy-methrehQ]hP]hT]uhFj|h@]rhkXitem_completed()rr}r(hEUhFjubahLhubaubhkX method must return the output that will be sent to subsequent item pipeline stages, so you must return (or drop) the item, as you would in any pipeline.rr}r(hEX method must return the output that will be sent to subsequent item pipeline stages, so you must return (or drop) the item, as you would in any pipeline.hFjuubeubho)r}r(hEXHere is an example of the :meth:`~item_completed` method where we store the downloaded image paths (passed in results) in the ``image_paths`` item field, and we drop the item if it doesn't contain any images::hFjXhGhJhLhrhN}r(hR]hS]hQ]hP]hT]uhVMhWhh@]r(hkXHere is an example of the rr}r(hEXHere is an example of the hFjubhx)r}r(hEX:meth:`~item_completed`rhFjhGhJhLh|hN}r(UreftypeXmethhhXitem_completedU refdomainXpyrhP]hQ]U refexplicithR]hS]hT]hhhjfhjkuhVMh@]rh)r}r(hEjhN}r(hR]hS]r(hjXpy-methrehQ]hP]hT]uhFjh@]rhkXitem_completed()rr}r(hEUhFjubahLhubaubhkXM method where we store the downloaded image paths (passed in results) in the rr}r(hEXM method where we store the downloaded image paths (passed in results) in the hFjubh)r}r(hEX``image_paths``hN}r(hR]hS]hQ]hP]hT]uhFjh@]rhkX image_pathsrr}r(hEUhFjubahLhubhkXC item field, and we drop the item if it doesn't contain any images:rr}r(hEXC item field, and we drop the item if it doesn't contain any images:hFjubeubj$)r}r(hEXfrom scrapy.exceptions import DropItem def item_completed(self, results, item, info): image_paths = [x['path'] for ok, x in results if ok] if not image_paths: raise DropItem("Item contains no images") item['image_paths'] = image_paths return itemhFjXhGhJhLj'hN}r(j)j*hP]hQ]hR]hS]hT]uhVMhWhh@]rhkXfrom scrapy.exceptions import DropItem def item_completed(self, results, item, info): image_paths = [x['path'] for ok, x in results if ok] if not image_paths: raise DropItem("Item contains no images") item['image_paths'] = image_paths return itemrr}r(hEUhFjubaubho)r}r(hEX?By default, the :meth:`item_completed` method returns the item.rhFjXhGhJhLhrhN}r(hR]hS]hQ]hP]hT]uhVM(hWhh@]r(hkXBy default, the rr}r(hEXBy default, the hFjubhx)r}r(hEX:meth:`item_completed`rhFjhGhJhLh|hN}r(UreftypeXmethhhXitem_completedU refdomainXpyrhP]hQ]U refexplicithR]hS]hT]hhhjfhjkuhVM(h@]rh)r}r(hEjhN}r(hR]hS]r(hjXpy-methrehQ]hP]hT]uhFjh@]rhkXitem_completed()rr}r(hEUhFjubahLhubaubhkX method returns the item.rr}r(hEX method returns the 
item.hFjubeubeubeubeubeubeubhX)r}r(hEUhFhYhGhJhLh]hN}r(hR]hS]hQ]hP]rh3ahT]rhauhVM,hWhh@]r(hd)r}r(hEXCustom Images pipeline examplerhFjhGhJhLhhhN}r(hR]hS]hQ]hP]hT]uhVM,hWhh@]rhkXCustom Images pipeline examplerr}r(hEjhFjubaubho)r}r(hEXSHere is a full example of the Images Pipeline whose methods are examplified above::hFjhGhJhLhrhN}r(hR]hS]hQ]hP]hT]uhVM.hWhh@]rhkXRHere is a full example of the Images Pipeline whose methods are examplified above:rr}r(hEXRHere is a full example of the Images Pipeline whose methods are examplified above:hFjubaubj$)r}r(hEX+from scrapy.contrib.pipeline.images import ImagesPipeline from scrapy.exceptions import DropItem from scrapy.http import Request class MyImagesPipeline(ImagesPipeline): def get_media_requests(self, item, info): for image_url in item['image_urls']: yield Request(image_url) def item_completed(self, results, item, info): image_paths = [x['path'] for ok, x in results if ok] if not image_paths: raise DropItem("Item contains no images") item['image_paths'] = image_paths return itemhFjhGhJhLj'hN}r(j)j*hP]hQ]hR]hS]hT]uhVM1hWhh@]rhkX+from scrapy.contrib.pipeline.images import ImagesPipeline from scrapy.exceptions import DropItem from scrapy.http import Request class MyImagesPipeline(ImagesPipeline): def get_media_requests(self, item, info): for image_url in item['image_urls']: yield Request(image_url) def item_completed(self, results, item, info): image_paths = [x['path'] for ok, x in results if ok] if not image_paths: raise DropItem("Item contains no images") item['image_paths'] = image_paths return itemrr}r(hEUhFjubaubhB)r}r(hEXg.. _Twisted Failure: http://twistedmatrix.com/documents/current/api/twisted.python.failure.Failure.htmlj!KhFjhGhJhLhMhN}r(hjOhP]rh-ahQ]hR]hS]hT]rh auhVMBhWhh@]ubhB)r}r(hEX... _MD5 hash: http://en.wikipedia.org/wiki/MD5j!KhFjhGhJhLhMhN}r(hjhP]rh:ahQ]hR]hS]hT]rhauhVMChWhh@]ubeubeubehEUU transformerrNU footnote_refsr}rUrefnamesr}r(Xtwisted failure]rjLaXmd5 hash]rjaX amazon s3]rjaXpython imaging library]rjaX sha1 hash]r(jjeXpillow]r(hj euUsymbol_footnotesr]rUautofootnote_refsr]rUsymbol_footnote_refsr]rU citationsr ]r hWhU current_liner NUtransform_messagesr ]r (cdocutils.nodes system_message r)r}r(hEUhN}r(hR]UlevelKhP]hQ]UsourcehJhS]hT]UlineKUtypeUINFOruh@]rho)r}r(hEUhN}r(hR]hS]hQ]hP]hT]uhFjh@]rhkX3Hyperlink target "topics-images" is not referenced.rr}r(hEUhFjubahLhrubahLUsystem_messagerubj)r}r(hEUhN}r(hR]UlevelKhP]hQ]UsourcehJhS]hT]UlineKQUtypejuh@]rho)r }r!(hEUhN}r"(hR]hS]hQ]hP]hT]uhFjh@]r#hkX<Hyperlink target "topics-images-enabling" is not referenced.r$r%}r&(hEUhFj ubahLhrubahLjubj)r'}r((hEUhN}r)(hR]UlevelKhP]hQ]UsourcehJhS]hT]UlineKWUtypejuh@]r*ho)r+}r,(hEUhN}r-(hR]hS]hQ]hP]hT]uhFj'h@]r.hkX>Hyperlink target "std:setting-IMAGES_STORE" is not referenced.r/r0}r1(hEUhFj+ubahLhrubahLjubj)r2}r3(hEUhN}r4(hR]UlevelKhP]hQ]UsourcehJhS]hT]UlineKUtypejuh@]r5ho)r6}r7(hEUhN}r8(hR]hS]hQ]hP]hT]uhFj2h@]r9hkX@Hyperlink target "std:setting-IMAGES_EXPIRES" is not referenced.r:r;}r<(hEUhFj6ubahLhrubahLjubj)r=}r>(hEUhN}r?(hR]UlevelKhP]hQ]UsourcehJhS]hT]UlineKUtypejuh@]r@ho)rA}rB(hEUhN}rC(hR]hS]hQ]hP]hT]uhFj=h@]rDhkX>Hyperlink target "topics-images-thumbnails" is not referenced.rErF}rG(hEUhFjAubahLhrubahLjubj)rH}rI(hEUhN}rJ(hR]UlevelKhP]hQ]UsourcehJhS]hT]UlineKUtypejuh@]rKho)rL}rM(hEUhN}rN(hR]hS]hQ]hP]hT]uhFjHh@]rOhkX?Hyperlink target "std:setting-IMAGES_THUMBS" is not referenced.rPrQ}rR(hEUhFjLubahLhrubahLjubj)rS}rT(hEUhN}rU(hR]UlevelKhP]hQ]UsourcehJhS]hT]UlineKUtypejuh@]rVho)rW}rX(hEUhN}rY(hR]hS]hQ]hP]hT]uhFjSh@]rZhkXCHyperlink target 
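To actually use the custom pipeline it must be enabled in the project settings. The snippet below is a minimal sketch, assuming the ``MyImagesPipeline`` class above lives in a hypothetical ``myproject/pipelines.py`` module; the pipeline order value and the storage directory are placeholder assumptions::

    # settings.py (sketch; module path and values are assumptions)
    ITEM_PIPELINES = {'myproject.pipelines.MyImagesPipeline': 1}

    # directory where the downloaded images are stored
    IMAGES_STORE = '/path/to/valid/dir'

With this in place, a spider only has to populate the ``image_urls`` field of its items; the pipeline downloads the images and fills ``image_paths`` as shown in ``item_completed()`` above.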
"std:setting-IMAGES_MIN_HEIGHT" is not referenced.r[r\}r](hEUhFjWubahLhrubahLjubj)r^}r_(hEUhN}r`(hR]UlevelKhP]hQ]UsourcehJhS]hT]UlineKUtypejuh@]raho)rb}rc(hEUhN}rd(hR]hS]hQ]hP]hT]uhFj^h@]rehkXBHyperlink target "std:setting-IMAGES_MIN_WIDTH" is not referenced.rfrg}rh(hEUhFjbubahLhrubahLjubj)ri}rj(hEUhN}rk(hR]UlevelKhP]hQ]UsourcehJhS]hT]UlineKUtypejuh@]rlho)rm}rn(hEUhN}ro(hR]hS]hQ]hP]hT]uhFjih@]rphkX<Hyperlink target "topics-images-override" is not referenced.rqrr}rs(hEUhFjmubahLhrubahLjubeUreporterrtNUid_startruKU autofootnotesrv]rwU citation_refsrx}ryUindirect_targetsrz]r{Usettingsr|(cdocutils.frontend Values r}or~}r(Ufootnote_backlinksrKUrecord_dependenciesrNU rfc_base_urlrUhttp://tools.ietf.org/html/rU tracebackrUpep_referencesrNUstrip_commentsrNU toc_backlinksrUentryrU language_coderUenrU datestamprNU report_levelrKU _destinationrNU halt_levelrKU strip_classesrNhhNUerror_encoding_error_handlerrUbackslashreplacerUdebugrNUembed_stylesheetrUoutput_encoding_error_handlerrUstrictrU sectnum_xformrKUdump_transformsrNU docinfo_xformrKUwarning_streamrNUpep_file_url_templaterUpep-%04drUexit_status_levelrKUconfigrNUstrict_visitorrNUcloak_email_addressesrUtrim_footnote_reference_spacerUenvrNUdump_pseudo_xmlrNUexpose_internalsrNUsectsubtitle_xformrU source_linkrNUrfc_referencesrNUoutput_encodingrUutf-8rU source_urlrNUinput_encodingrU utf-8-sigrU_disable_configrNU id_prefixrUU tab_widthrKUerror_encodingrUUTF-8rU_sourcerUC/var/build/user_builds/scrapy/checkouts/0.22/docs/topics/images.rstrUgettext_compactrU generatorrNUdump_internalsrNU smart_quotesrU pep_base_urlrUhttp://www.python.org/dev/peps/rUsyntax_highlightrUlongrUinput_encoding_error_handlerrjUauto_id_prefixrUidrUdoctitle_xformrUstrip_elements_with_classesrNU _config_filesr]Ufile_insertion_enabledrU raw_enabledrKU dump_settingsrNubUsymbol_footnote_startrKUidsr}r(h:jj`jeh)jh*jj9hB)r}r(hEUhFj4hGhJhLhMhN}r(hR]hP]rj9ahQ]UismodhS]hT]uhVNhWhh@]ubh=j,h-jh.j1h j2h2jjjh,jJh3jh;jjjjjh1jh7jh+jh>jh jdh0jJh6jh4hYhh Ufeed-storages-baseq?hU feed-exportsq@hUxmlqAhUstorage-uri-parametersqBhUfeed-uriqChUtopics-feed-format-xmlqDhUs3qEhUtopics-feed-storage-fsqFhUuriqGhUlocal-filesystemqHhUtopics-feed-format-jsonlinesqIhUtopics-feed-format-marshalqJhUjsonqKhUtopics-feed-format-pickleqLhUstandard-outputqMhUstoragesqNhUcsvqOhUfeed-store-emptyqPhUtopics-feed-format-csvqQh Userialization-formatsqRh!UftpqSh"Utopics-feed-storage-backendsqTh#Utopics-feed-storage-ftpqUh$U feed-formatqVh%Utopics-feed-storageqWh&U amazon-s3qXh'UbotoqYh(UsettingsqZh)Utopics-feed-formatq[h*Utopics-feed-format-jsonq\h+Ufeed-exporters-baseq]h,Ufeed-exportersq^h-Upickleq_h.Umarshalq`uUchildrenqa]qb(cdocutils.nodes target qc)qd}qe(U rawsourceqfX.. 
Feed exports

New in version 0.10.

One of the most frequently required features when implementing scrapers is being able to store the scraped data properly and, quite often, that means generating an "export file" with the scraped data (commonly called an "export feed") to be consumed by other systems.

Scrapy provides this functionality out of the box with the Feed Exports, which allow you to generate a feed with the scraped items, using multiple serialization formats and storage backends.
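As a quick orientation before the details below, enabling a feed export only takes a couple of settings. This is a minimal sketch using the ``FEED_URI`` and ``FEED_FORMAT`` settings documented later on this page; the file path is an arbitrary example::

    # settings.py (sketch)
    FEED_URI = 'file:///tmp/export.json'   # where the feed is stored
    FEED_FORMAT = 'json'                   # which serialization format to use

With these two settings, every run of the spider serializes the scraped items as JSON into ``/tmp/export.json``.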
Serialization formats

For serializing the scraped data, the feed exports use the Item exporters and these formats are supported out of the box:

* JSON
* JSON lines
* CSV
* XML

But you can also extend the supported formats through the FEED_EXPORTERS setting, as sketched below.
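For example, a project could register an additional format by mapping a format name to an Item exporter class path. The following is hypothetical: ``myproject.exporters.YamlItemExporter`` is an assumed class you would have to write yourself (e.g. by subclassing ``scrapy.contrib.exporter.BaseItemExporter``)::

    # settings.py (sketch; the exporter class is an assumption)
    FEED_EXPORTERS = {
        'yaml': 'myproject.exporters.YamlItemExporter',
    }

Setting ``FEED_FORMAT = 'yaml'`` would then select the custom exporter.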
JSON

* FEED_FORMAT: ``json``
* Exporter used: JsonItemExporter
* See this warning if you're using JSON with large feeds

JSON lines

* FEED_FORMAT: ``jsonlines``
* Exporter used: JsonLinesItemExporter
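To make the difference between the two JSON flavours concrete, here is illustrative output for the same two items (the item contents are made up)::

    # json: a single array, only complete when the crawl finishes
    [{"name": "Product 1"}, {"name": "Product 2"}]

    # jsonlines: one object per line, can be consumed incrementally
    {"name": "Product 1"}
    {"name": "Product 2"}

This is also why the warning above applies to ``json`` but not to ``jsonlines``: a single large JSON array does not lend itself to incremental (stream-mode) writing and parsing.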
CSV

* FEED_FORMAT: ``csv``
* Exporter used: CsvItemExporter

XML

* FEED_FORMAT: ``xml``
* Exporter used: XmlItemExporter
Pickle

* FEED_FORMAT: ``pickle``
* Exporter used: PickleItemExporter

Marshal

* FEED_FORMAT: ``marshal``
* Exporter used: MarshalItemExporter

Storages

When using the feed exports you define where to store the feed using a URI (through the FEED_URI setting).
The feed exports support multiple storage backend types which are defined by the URI scheme.

The storage backends supported out of the box are:

* Local filesystem
* FTP
* S3 (requires boto)
* Standard output

Some storage backends may be unavailable if the required external libraries are not available. For example, the S3 backend is only available if the boto library is installed.
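The feed URI and format do not have to live in the settings file; if your Scrapy version provides the ``-o`` and ``-t`` crawl options (they set FEED_URI and FEED_FORMAT respectively), the same configuration can be given on the command line. The spider name ``myspider`` is a placeholder::

    scrapy crawl myspider -o ftp://user:pass@ftp.example.com/export.json -t json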
Storage URI parameters

The storage URI can also contain parameters that get replaced when the feed is being created. These parameters are:

* ``%(time)s`` - gets replaced by a timestamp when the feed is being created
* ``%(name)s`` - gets replaced by the spider name

Any other named parameter gets replaced by the spider attribute of the same name. For example, ``%(site_id)s`` would get replaced by the ``spider.site_id`` attribute the moment the feed is being created.
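A short sketch of how such an attribute flows into the URI; the spider and its ``site_id`` attribute are hypothetical::

    # settings.py (sketch)
    FEED_URI = 'file:///tmp/exports/%(site_id)s-%(time)s.json'
    FEED_FORMAT = 'json'

    # myproject/spiders/example.py (sketch)
    from scrapy.spider import Spider

    class ExampleSpider(Spider):
        name = 'example'
        site_id = '42'  # substituted into %(site_id)s when the feed is created
        start_urls = ['http://www.example.com']

        def parse(self, response):
            pass  # extraction logic omitted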
Here are some examples to illustrate:

* Store in FTP using one directory per spider:

  * ``ftp://user:password@ftp.example.com/scraping/feeds/%(name)s/%(time)s.json``

* Store in S3 using one directory per spider:

  * ``s3://mybucket/scraping/feeds/%(name)s/%(time)s.json``

Storage backends

Local filesystem

The feeds are stored in the local filesystem.

* URI scheme: ``file``
* Example URI: ``file:///tmp/export.csv``
* Required external libraries: none

Note that for the local filesystem storage (only) you can omit the scheme if you specify an absolute path like ``/tmp/export.csv``. This only works on Unix systems though.
FTP

The feeds are stored on an FTP server.

* URI scheme: ``ftp``
* Example URI: ``ftp://user:pass@ftp.example.com/path/to/export.csv``
* Required external libraries: none
S3

The feeds are stored on Amazon S3.

* URI scheme: ``s3``
* Example URIs:

  * ``s3://mybucket/path/to/export.csv``
  * ``s3://aws_key:aws_secret@mybucket/path/to/export.csv``

* Required external libraries: boto

The AWS credentials can be passed as user/password in the URI, or they can be passed through the following settings:

* AWS_ACCESS_KEY_ID
* AWS_SECRET_ACCESS_KEY
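A sketch of the settings-based variant (the credential values are placeholders)::

    # settings.py (sketch; values are placeholders)
    AWS_ACCESS_KEY_ID = 'AKIA...'
    AWS_SECRET_ACCESS_KEY = '...'
    FEED_URI = 's3://mybucket/path/to/export.csv'

Keeping the credentials in the settings (or the environment) avoids embedding them in the URI, where they could leak into logs.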
Standard output

The feeds are written to the standard output of the Scrapy process.

* URI scheme: ``stdout``
* Example URI: ``stdout:``
* Required external libraries: none

Settings

These are the settings used for configuring the feed exports:

* FEED_URI (mandatory)
* FEED_FORMAT
* FEED_STORAGES
* FEED_EXPORTERS
* FEED_STORE_EMPTY
FEED_URI

Default: ``None``

The URI of the export feed. See Storage backends for supported URI schemes.

This setting is required for enabling the feed exports.

FEED_FORMAT

The serialization format to be used for the feed. See Serialization formats for possible values.

FEED_STORE_EMPTY

Default: ``False``

Whether to export empty feeds (i.e. feeds with no items).
FEED_STORAGES

Default: ``{}``

A dict containing additional feed storage backends supported by your project. The keys are URI schemes and the values are paths to storage classes, as sketched after FEED_STORAGES_BASE below.

FEED_STORAGES_BASE

Default::

    {
        '': 'scrapy.contrib.feedexport.FileFeedStorage',
        'file': 'scrapy.contrib.feedexport.FileFeedStorage',
        'stdout': 'scrapy.contrib.feedexport.StdoutFeedStorage',
        's3': 'scrapy.contrib.feedexport.S3FeedStorage',
        'ftp': 'scrapy.contrib.feedexport.FTPFeedStorage',
    }

A dict containing the built-in feed storage backends supported by Scrapy.
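A sketch of registering a project-specific backend via FEED_STORAGES; the ``sftp`` scheme and the ``myproject.storages.SFTPFeedStorage`` class are hypothetical, and you would implement the class yourself, mirroring the built-in backends listed in FEED_STORAGES_BASE::

    # settings.py (sketch; scheme and class path are assumptions)
    FEED_STORAGES = {
        'sftp': 'myproject.storages.SFTPFeedStorage',
    }

A ``FEED_URI`` starting with ``sftp://`` would then be routed to that class.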
FEED_EXPORTERS

Default: ``{}``

A dict containing additional exporters supported by your project. The keys are serialization format names and the values are paths to Item exporter classes.

FEED_EXPORTERS_BASE

Default::

    FEED_EXPORTERS_BASE = {
        'json': 'scrapy.contrib.exporter.JsonItemExporter',
        'jsonlines': 'scrapy.contrib.exporter.JsonLinesItemExporter',
        'csv': 'scrapy.contrib.exporter.CsvItemExporter',
        'xml': 'scrapy.contrib.exporter.XmlItemExporter',
        'marshal': 'scrapy.contrib.exporter.MarshalItemExporter',
    }

A dict containing the built-in feed exporters supported by Scrapy.

URI: http://en.wikipedia.org/wiki/Uniform_Resource_Identifier
Amazon S3: http://aws.amazon.com/s3/
boto: http://code.google.com/p/boto/
scrapy-0.22/.doctrees/topics/benchmarking.doctree

Benchmarking

New in version 0.17.

Scrapy comes with a simple benchmarking suite that spawns a local HTTP server and crawls it at the maximum possible speed. The goal of this benchmarking is to get an idea of how Scrapy performs on your hardware, in order to have a common baseline for comparisons.
It uses a simple spider that does nothing and just follows links.

To run it use:

    scrapy bench

You should see an output like this:

    2013-05-16 13:08:46-0300 [scrapy] INFO: Scrapy 0.17.0 started (bot: scrapybot)
    2013-05-16 13:08:47-0300 [follow] INFO: Spider opened
    2013-05-16 13:08:47-0300 [follow] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
    2013-05-16 13:08:48-0300 [follow] INFO: Crawled 74 pages (at 4440 pages/min), scraped 0 items (at 0 items/min)
    2013-05-16 13:08:49-0300 [follow] INFO: Crawled 143 pages (at 4140 pages/min), scraped 0 items (at 0 items/min)
    2013-05-16 13:08:50-0300 [follow] INFO: Crawled 210 pages (at 4020 pages/min), scraped 0 items (at 0 items/min)
    2013-05-16 13:08:51-0300 [follow] INFO: Crawled 274 pages (at 3840 pages/min), scraped 0 items (at 0 items/min)
    2013-05-16 13:08:52-0300 [follow] INFO: Crawled 343 pages (at 4140 pages/min), scraped 0 items (at 0 items/min)
    2013-05-16 13:08:53-0300 [follow] INFO: Crawled 410 pages (at 4020 pages/min), scraped 0 items (at 0 items/min)
    2013-05-16 13:08:54-0300 [follow] INFO: Crawled 474 pages (at 3840 pages/min), scraped 0 items (at 0 items/min)
    2013-05-16 13:08:55-0300 [follow] INFO: Crawled 538 pages (at 3840 pages/min), scraped 0 items (at 0 items/min)
    2013-05-16 13:08:56-0300 [follow] INFO: Crawled 602 pages (at 3840 pages/min), scraped 0 items (at 0 items/min)
    2013-05-16 13:08:57-0300 [follow] INFO: Closing spider (closespider_timeout)
    2013-05-16 13:08:57-0300 [follow] INFO: Crawled 666 pages (at 3840 pages/min), scraped 0 items (at 0 items/min)
    2013-05-16 13:08:57-0300 [follow] INFO: Dumping Scrapy stats:
        {'downloader/request_bytes': 231508,
         'downloader/request_count': 682,
         'downloader/request_method_count/GET': 682,
         'downloader/response_bytes': 1172802,
         'downloader/response_count': 682,
         'downloader/response_status_count/200': 682,
         'finish_reason': 'closespider_timeout',
         'finish_time': datetime.datetime(2013, 5, 16, 16, 8, 57, 985539),
         'log_count/INFO': 14,
         'request_depth_max': 34,
         'response_received_count': 682,
         'scheduler/dequeued': 682,
         'scheduler/dequeued/memory': 682,
         'scheduler/enqueued': 12767,
         'scheduler/enqueued/memory': 12767,
         'start_time': datetime.datetime(2013, 5, 16, 16, 8, 47, 676539)}
    2013-05-16 13:08:57-0300 [follow] INFO: Spider closed (closespider_timeout)

That tells you that Scrapy is able to crawl about 3900 pages per minute on the hardware where you run it. Note that this is a very simple spider intended to follow links; any custom spider you write will probably do more work, which results in slower crawl rates. How much slower depends on how much your spider does and how well it's written.

In the future, more cases will be added to the benchmarking suite to cover other common scenarios.
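The run above stopped because of the closespider_timeout condition. As a hedged aside rather than something this page documents: like other Scrapy commands, bench accepts -s setting overrides, so a longer run can presumably be requested like this:

    scrapy bench -s CLOSESPIDER_TIMEOUT=60   # assumption: extend the run to ~60 seconds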
Spider Middleware

The spider middleware is a framework of hooks into Scrapy's spider processing mechanism where you can plug custom functionality to process the requests that are sent to spiders for processing, and to process the responses and items that are generated from spiders.
Activating a spider middleware

To activate a spider middleware component, add it to the SPIDER_MIDDLEWARES setting, which is a dict whose keys are the middleware class paths and whose values are the middleware orders.

Here's an example:

    SPIDER_MIDDLEWARES = {
        'myproject.middlewares.CustomSpiderMiddleware': 543,
    }

The SPIDER_MIDDLEWARES setting is merged with the SPIDER_MIDDLEWARES_BASE setting defined in Scrapy (and not meant to be overridden) and then sorted by order to get the final sorted list of enabled middlewares: the first middleware is the one closer to the engine and the last is the one closer to the spider.

To decide which order to assign to your middleware, see the SPIDER_MIDDLEWARES_BASE setting and pick a value according to where you want to insert the middleware.
The order does matter because each middleware performs a different action and your middleware could depend on some previous (or subsequent) middleware being applied.

If you want to disable a builtin middleware (the ones defined in SPIDER_MIDDLEWARES_BASE and enabled by default) you must define it in your project's SPIDER_MIDDLEWARES setting and assign None as its value. For example, if you want to disable the off-site middleware:

    SPIDER_MIDDLEWARES = {
        'myproject.middlewares.CustomSpiderMiddleware': 543,
        'scrapy.contrib.spidermiddleware.offsite.OffsiteMiddleware': None,
    }

Finally, keep in mind that some middlewares may need to be enabled through a particular setting. See each middleware's documentation for more info.
Writing your own spider middleware

Writing your own spider middleware is easy. Each middleware component is a single Python class that defines one or more of the following methods (a sketch that stubs out all four appears after this reference):

class scrapy.contrib.spidermiddleware.SpiderMiddleware

    process_spider_input(response, spider)

        This method is called for each response that goes through the spider middleware and into the spider, for processing.

        process_spider_input() should return None or raise an exception.

        If it returns None, Scrapy will continue processing this response, executing all other middlewares until, finally, the response is handed to the spider for processing.
        If it raises an exception, Scrapy won't bother calling any other spider middleware's process_spider_input() and will call the request errback. The output of the errback is chained back in the other direction for process_spider_output() to process it, or process_spider_exception() if it raised an exception.

        Parameters:
        - response (Response object) - the response being processed
        - spider (Spider object) - the spider for which this response is intended
    process_spider_output(response, result, spider)

        This method is called with the results returned from the Spider, after it has processed the response.

        process_spider_output() must return an iterable of Request or Item objects.

        Parameters:
        - response (Response object) - the response which generated this output from the spider
        - result (an iterable of Request or Item objects) - the result returned by the spider
        - spider (Spider object) - the spider whose result is being processed
    process_spider_exception(response, exception, spider)

        This method is called when a spider or a process_spider_input() method (from another spider middleware) raises an exception.

        process_spider_exception() should return either None or an iterable of Response or Item objects.

        If it returns None, Scrapy will continue processing this exception, executing any other process_spider_exception() in the following middleware components, until no middleware components are left and the exception reaches the engine (where it's logged and discarded).

        If it returns an iterable, the process_spider_output() pipeline kicks in, and no other process_spider_exception() will be called.

        Parameters:
        - response (Response object) - the response being processed when the exception was raised
        - exception (Exception object) - the exception raised
        - spider (Spider object) - the spider which raised the exception
    process_start_requests(start_requests, spider)

        New in version 0.15.

        This method is called with the start requests of the spider, and works similarly to the process_spider_output() method, except that it doesn't have a response associated and must return only requests (not items).

        It receives an iterable (in the start_requests parameter) and must return another iterable of Request objects.

        Note: When implementing this method in your spider middleware, you should always return an iterable (that follows the input one) and not consume the whole start_requests iterator, because it can be very large (or even unbounded) and cause a memory overflow. The Scrapy engine is designed to pull start requests while it has capacity to process them, so the start requests iterator can be effectively endless where there is some other condition for stopping the spider (like a time limit or item/page count).

        Parameters:
        - start_requests (an iterable of Request objects) - the start requests
        - spider (Spider object) - the spider to whom the start requests belong
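As promised above, here is a minimal sketch of a middleware that stubs out all four hooks. It is an illustration under stated assumptions, not an official example: the module path (myproject/middlewares.py), the class name and the log messages are made up; only the four method signatures come from the reference above.

    # A minimal sketch, assuming a hypothetical myproject/middlewares.py;
    # only the four method signatures are taken from the reference above.
    from scrapy import log


    class LoggingSpiderMiddleware(object):

        def process_spider_input(self, response, spider):
            # Returning None lets the response continue into the spider;
            # raising an exception would trigger the request errback instead.
            log.msg("spider input: %s" % response.url, spider=spider)
            return None

        def process_spider_output(self, response, result, spider):
            # Must return an iterable of Request or Item objects; here we
            # simply pass everything through, lazily.
            for request_or_item in result:
                yield request_or_item

        def process_spider_exception(self, response, exception, spider):
            # Returning None hands the exception to the next middleware's
            # process_spider_exception(); returning an iterable would feed
            # the process_spider_output() chain instead.
            log.msg("error %r while processing %s" % (exception, response.url),
                    level=log.ERROR, spider=spider)
            return None

        def process_start_requests(self, start_requests, spider):
            # Follow the input iterable without consuming it up front, as
            # the note above recommends.
            for request in start_requests:
                yield request

It would be enabled like any other component, e.g. with SPIDER_MIDDLEWARES = {'myproject.middlewares.LoggingSpiderMiddleware': 543}, as shown in the activation section above.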
Built-in spider middleware reference

This page describes all spider middleware components that come with Scrapy. For information on how to use them and how to write your own spider middleware, see the spider middleware usage guide.

For a list of the components enabled by default (and their orders) see the SPIDER_MIDDLEWARES_BASE setting.

DepthMiddleware

class scrapy.contrib.spidermiddleware.depth.DepthMiddleware

    DepthMiddleware is a spider middleware used for tracking the depth of each Request inside the site being scraped. It can be used, for example, to limit the maximum depth to scrape.

    The DepthMiddleware can be configured through the following settings (see the settings documentation for more info; a hypothetical settings.py combining them follows the list):

    - DEPTH_LIMIT - The maximum depth that will be allowed to crawl for any site. If zero, no limit will be imposed.
    - DEPTH_STATS - Whether to collect depth stats.
    - DEPTH_PRIORITY - Whether to prioritize the requests based on their depth.
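The settings.py fragment referenced above; the values are illustrative assumptions, not Scrapy defaults:

    # Hypothetical settings.py fragment; values are illustrative, not defaults.
    DEPTH_LIMIT = 3       # drop requests more than 3 links away from a start URL
    DEPTH_STATS = True    # collect stats such as 'request_depth_max'
    DEPTH_PRIORITY = 1    # adjust request priority based on request depth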
HttpErrorMiddleware

class scrapy.contrib.spidermiddleware.httperror.HttpErrorMiddleware

    Filter out unsuccessful (erroneous) HTTP responses so that spiders don't have to deal with them, which (most of the time) imposes an overhead, consumes more resources, and makes the spider logic more complex.

According to the HTTP standard (http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html), successful responses are those whose status codes are in the 200-300 range.

If you still want to process response codes outside that range, you can specify which response codes the spider is able to handle using the handle_httpstatus_list spider attribute or the HTTPERROR_ALLOWED_CODES setting.

For example, if you want your spider to handle 404 responses you can do this:

    class MySpider(CrawlSpider):
        handle_httpstatus_list = [404]

The handle_httpstatus_list key of Request.meta can also be used to specify which response codes to allow on a per-request basis.
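A short sketch of the per-request form; the spider name, URL and callback are assumptions made for illustration:

    # Hypothetical spider: only this one request is allowed to receive 404s.
    from scrapy.http import Request
    from scrapy.spider import Spider


    class ExampleSpider(Spider):
        name = 'example'

        def start_requests(self):
            yield Request('http://www.example.com/maybe-missing',
                          callback=self.parse_page,
                          meta={'handle_httpstatus_list': [404]})

        def parse_page(self, response):
            if response.status == 404:
                self.log("got a 404 for %s" % response.url)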
Keep in mind, however, that it's usually a bad idea to handle non-200 responses unless you really know what you're doing.

For more information see: HTTP Status Code Definitions (http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html).

HttpErrorMiddleware settings

HTTPERROR_ALLOWED_CODES

    Default: []

    Pass all responses with non-200 status codes contained in this list.

HTTPERROR_ALLOW_ALL

    Default: False

    Pass all responses, regardless of their status code.
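A hypothetical settings.py fragment exercising both settings (the values are illustrative; enabling HTTPERROR_ALLOW_ALL effectively disables the filtering entirely):

    # Hypothetical settings.py fragment; values are illustrative.
    HTTPERROR_ALLOWED_CODES = [404, 410]  # let every spider see these codes
    # HTTPERROR_ALLOW_ALL = True          # or pass all responses through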
OffsiteMiddleware

class scrapy.contrib.spidermiddleware.offsite.OffsiteMiddleware

    Filters out Requests for URLs outside the domains covered by the spider.

    This middleware filters out every request whose host name isn't in the spider's allowed_domains attribute.

    When your spider returns a request for a domain not belonging to those covered by the spider, this middleware will log a debug message similar to this one:

        DEBUG: Filtered offsite request to 'www.othersite.com': <GET http://www.othersite.com/some/page.html>

    To avoid filling the log with too much noise, it will only print one of these messages for each new domain filtered. So, for example, if another request for www.othersite.com is filtered, no log message will be printed. But if a request for someothersite.com is filtered, a message will be printed (but only for the first request filtered).
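For concreteness, a hypothetical spider whose second request would be filtered as described (all names and URLs are made up):

    # Hypothetical spider; with the offsite middleware enabled (it is by
    # default), the second request below is dropped with the DEBUG message
    # shown above.
    from scrapy.http import Request
    from scrapy.spider import Spider


    class MainSiteSpider(Spider):
        name = 'mainsite'
        allowed_domains = ['example.com']   # subdomains like www.example.com match too
        start_urls = ['http://www.example.com/']

        def parse(self, response):
            yield Request('http://www.example.com/next')    # kept
            yield Request('http://www.othersite.com/page')  # filtered offsite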
    To avoid filling the log with too much noise, it will only print one of
    these messages for each new domain filtered. So, for example, if another
    request for www.othersite.com is filtered, no log message will be
    printed. But if a request for someothersite.com is filtered, a message
    will be printed (but only for the first request filtered).

    If the spider doesn't define an allowed_domains attribute, or the
    attribute is empty, the offsite middleware will allow all requests.

    If the request has the dont_filter attribute set, the offsite middleware
    will allow the request even if its domain is not listed in the allowed
    domains.
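For instance, a spider confined to a single site might declare its allowed
domains as in the sketch below (the spider name, domain and callback body are
hypothetical):

    from scrapy.spider import Spider

    class ExampleSpider(Spider):
        name = 'example'
        # OffsiteMiddleware drops requests whose host is not example.com
        # or one of its subdomains.
        allowed_domains = ['example.com']
        start_urls = ['http://www.example.com/']

        def parse(self, response):
            pass  # extract items and follow on-site links here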
RefererMiddleware

class scrapy.contrib.spidermiddleware.referer.RefererMiddleware

    Populates the Request Referer header, based on the URL of the Response
    which generated it.

RefererMiddleware settings

REFERER_ENABLED

New in version 0.15.

Default: True

Whether to enable the referer middleware.

UrlLengthMiddleware

class scrapy.contrib.spidermiddleware.urllength.UrlLengthMiddleware

    Filters out requests with URLs longer than URLLENGTH_LIMIT.

    The UrlLengthMiddleware can be configured through the following settings
    (see the settings documentation for more info):
    - URLLENGTH_LIMIT - The maximum URL length to allow for crawled URLs.
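Put together, a hypothetical settings.py fragment for the two settings above
could look like this (the values are illustrative assumptions):

    # settings.py -- illustrative values
    REFERER_ENABLED = True    # send the Referer header (the default)
    URLLENGTH_LIMIT = 2083    # drop requests whose URL is longer than this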
scrapy-0.22/.doctrees/topics/firefox.doctree

Using Firefox for scraping

Here is a list of tips and advice on using Firefox for scraping, along with
a list of useful Firefox add-ons to ease the scraping process.
Caveats with inspecting the live browser DOM

Since Firefox add-ons operate on a live browser DOM, what you'll actually
see when inspecting the page source is not the original HTML, but a modified
one after applying some browser clean up and executing Javascript code.
Firefox, in particular, is known for adding <tbody> elements to tables.
Scrapy, on the other hand, does not modify the original page HTML, so you
won't be able to extract any data if you use <tbody> in your XPath
expressions. Therefore:

- Never include <tbody> elements in your XPath expressions unless you
  really know what you're doing
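A small sketch of the difference, using a Scrapy selector (the HTML snippet
and variable names are made up for illustration):

    from scrapy.selector import Selector

    # Raw HTML as Scrapy sees it: no <tbody> element.
    html = "<table><tr><td>price</td></tr></table>"
    sel = Selector(text=html)

    # Works: // does not depend on the missing <tbody>.
    print sel.xpath("//table//td/text()").extract()

    # Would match in Firefox (which inserts <tbody>), but not here:
    print sel.xpath("//table/tbody/tr/td/text()").extract()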
Useful Firefox add-ons for scraping

Firebug

Firebug (http://getfirebug.com) is a widely known tool among web developers
and it's also very useful for scraping. In particular, its Inspect Element
feature (http://www.youtube.com/watch?v=-pT_pDe54aA) comes very handy when
you need to construct the XPaths for extracting data because it allows you
to view the HTML code of each page element while moving your mouse over it.

See "Using Firebug for scraping" for a detailed guide on how to use Firebug
with Scrapy.

XPather

XPather (https://addons.mozilla.org/firefox/addon/1192) allows you to test
XPath expressions directly on the pages.

XPath Checker

XPath Checker is another Firefox add-on for testing XPaths on your pages.
scrapy-0.22/.doctrees/topics/logging.doctree

Logging

Scrapy provides a logging facility which can be used through the scrapy.log
module.
The current underlying implementation uses Twisted logging
(http://twistedmatrix.com/projects/core/documentation/howto/logging.html)
but this may change in the future.

The logging service must be explicitly started through the
scrapy.log.start() function.

Log levels

Scrapy provides 5 logging levels:

1. scrapy.log.CRITICAL - for critical errors
2. scrapy.log.ERROR - for regular errors
3. scrapy.log.WARNING - for warning messages
4. scrapy.log.INFO - for informational messages
5. scrapy.log.DEBUG - for debugging messages
How to set the log level

You can set the log level using the --loglevel/-L command line option, or
using the LOG_LEVEL setting.

How to log messages

Here's a quick example of how to log a message using the WARNING level:

    from scrapy import log
    log.msg("This is a warning", level=log.WARNING)

Logging from Spiders

The recommended way to log from spiders is by using the Spider log() method,
which already populates the spider argument of the scrapy.log.msg()
function. The other arguments are passed directly to the msg() function.
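Inside a spider, that looks like the following sketch (the spider name and
message are made up for illustration):

    from scrapy import log
    from scrapy.spider import Spider

    class ExampleSpider(Spider):
        name = 'example'

        def parse(self, response):
            # self.log() forwards to scrapy.log.msg() with spider=self.
            self.log("Crawled %s" % response.url, level=log.INFO)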
scrapy.log module

scrapy.log.start(logfile=None, loglevel=None, logstdout=None)

    Start the logging facility. This must be called before actually logging
    any messages.
    Otherwise, messages logged before this call will get lost.

    Parameters:

    - logfile (str) - the file path to use for logging output. If omitted,
      the LOG_FILE setting will be used. If both are None, the log will be
      sent to standard error.
    - loglevel - the minimum logging level to log. Available values are:
      CRITICAL, ERROR, WARNING, INFO and DEBUG.
    - logstdout (boolean) - if True, all standard output (and error) of
      your application will be logged instead. For example, if you
      "print 'hello'" it will appear in the Scrapy log. If omitted, the
      LOG_STDOUT setting will be used.
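For example, a standalone script that uses the logging facility directly
might begin like this (the file name and level are illustrative
assumptions):

    from scrapy import log

    # Must run before any log.msg() call; earlier messages are lost.
    log.start(logfile='scrapy.log', loglevel=log.INFO)
    log.msg("logging is now active")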
scrapy.log.msg(message, level=INFO, spider=None)

    Log a message

    Parameters:

    - message (str) - the message to log
    - level - the log level for this message. See Log levels.
    - spider (Spider object) - the spider to use for logging this message.
      This parameter should always be used when logging things related to a
      particular spider.

scrapy.log.CRITICAL

    Log level for critical errors

scrapy.log.ERROR

    Log level for errors

scrapy.log.WARNING

    Log level for warnings

scrapy.log.INFO

    Log level for informational messages (recommended level for production
    deployments)

scrapy.log.DEBUG

    Log level for debugging messages (recommended level for development)
Logging settings

These settings can be used to configure the logging:

- LOG_ENABLED
- LOG_ENCODING
- LOG_FILE
- LOG_LEVEL
- LOG_STDOUT
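A hypothetical settings.py fragment exercising all five (the values shown
are illustrative assumptions, not defaults from the text above):

    # settings.py -- illustrative values
    LOG_ENABLED = True
    LOG_ENCODING = 'utf-8'
    LOG_FILE = 'crawl.log'    # None would send the log to standard error
    LOG_LEVEL = 'INFO'
    LOG_STDOUT = False        # True would also capture print output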
scrapy-0.22/.doctrees/topics/jobs.doctree

Jobs: pausing and resuming crawls

Sometimes, for big sites, it's desirable to pause crawls and be able to
resume them later.

Scrapy supports this functionality out of the box by providing the
following facilities:

- a scheduler that persists scheduled requests on disk
- a duplicates filter that persists visited requests on disk
- an extension that keeps some spider state (key/value pairs) persistent
  between batches
Job directory

To enable persistence support you just need to define a job directory
through the JOBDIR setting. This directory will be used for storing all the
data required to keep the state of a single job (i.e. a spider run).
It's important to note that this directory must not be shared by different
spiders, or even different jobs/runs of the same spider, as it's meant to be
used for storing the state of a single job.

How to use it

To start a spider with persistence support enabled, run it like this:

    scrapy crawl somespider -s JOBDIR=crawls/somespider-1

Then, you can stop the spider safely at any time (by pressing Ctrl-C or
sending a signal), and resume it later by issuing the same command:

    scrapy crawl somespider -s JOBDIR=crawls/somespider-1

Keeping persistent state between batches

Sometimes you'll want to keep some persistent spider state between
pause/resume batches. You can use the spider.state attribute for that,
which should be a dict.
There's a built-in extension that takes care of serializing, storing and
loading that attribute from the job directory, when the spider starts and
stops.

Here's an example of a callback that uses the spider state (other spider
code is omitted for brevity):

    def parse_item(self, response):
        # parse item here
        self.state['items_count'] = self.state.get('items_count', 0) + 1

Persistence gotchas

There are a few things to keep in mind if you want to be able to use the
Scrapy persistence support:

Cookies expiration

Cookies may expire. So, if you don't resume your spider quickly, the
scheduled requests may no longer work.
Persistence gotchas

There are a few things to keep in mind if you want to be able to use the Scrapy persistence support:

Cookies expiration

Cookies may expire. So, if you don't resume your spider quickly, the requests scheduled may no longer work. This won't be an issue if your spider doesn't rely on cookies.

Request serialization

Requests must be serializable by the `pickle`_ module in order for persistence to work, so you should make sure that your requests are serializable.

The most common issue here is using ``lambda`` functions as request callbacks, since they can't be persisted.

So, for example, this won't work::

    def some_callback(self, response):
        somearg = 'test'
        return Request('http://www.example.com',
                       callback=lambda r: self.other_callback(r, somearg))

    def other_callback(self, response, somearg):
        print "the argument passed is:", somearg

But this will::

    def some_callback(self, response):
        somearg = 'test'
        return Request('http://www.example.com',
                       meta={'somearg': somearg})

    def other_callback(self, response):
        somearg = response.meta['somearg']
        print "the argument passed is:", somearg

.. _pickle: http://docs.python.org/library/pickle.html
Downloader Middleware

The downloader middleware is a framework of hooks into Scrapy's request/response processing. It's a light, low-level system for globally altering Scrapy's requests and responses.

Activating a downloader middleware

To activate a downloader middleware component, add it to the ``DOWNLOADER_MIDDLEWARES`` setting, which is a dict whose keys are the middleware class paths and their values are the middleware orders.

Here's an example::

    DOWNLOADER_MIDDLEWARES = {
        'myproject.middlewares.CustomDownloaderMiddleware': 543,
    }

The ``DOWNLOADER_MIDDLEWARES`` setting is merged with the ``DOWNLOADER_MIDDLEWARES_BASE`` setting defined in Scrapy (and not meant to be overridden) and then sorted by order to get the final sorted list of enabled middlewares: the first middleware is the one closer to the engine and the last is the one closer to the downloader.
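To make that ordering concrete, here is a sketch with two hypothetical entries; the class names and order values are made up for illustration and are not middlewares Scrapy ships::

    DOWNLOADER_MIDDLEWARES = {
        # low order: closer to the engine; its process_request() runs
        # first and its process_response() runs last
        'myproject.middlewares.NearEngineMiddleware': 100,
        # high order: closer to the downloader; its process_request() runs
        # last and its process_response() runs first
        'myproject.middlewares.NearDownloaderMiddleware': 850,
    }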
To decide which order to assign to your middleware, see the ``DOWNLOADER_MIDDLEWARES_BASE`` setting and pick a value according to where you want to insert the middleware. The order does matter because each middleware performs a different action and your middleware could depend on some previous (or subsequent) middleware being applied.

If you want to disable a built-in middleware (the ones defined in ``DOWNLOADER_MIDDLEWARES_BASE`` and enabled by default) you must define it in your project's ``DOWNLOADER_MIDDLEWARES`` setting and assign ``None`` as its value.
For example, if you want to disable the user agent middleware::

    DOWNLOADER_MIDDLEWARES = {
        'myproject.middlewares.CustomDownloaderMiddleware': 543,
        'scrapy.contrib.downloadermiddleware.useragent.UserAgentMiddleware': None,
    }

Finally, keep in mind that some middlewares may need to be enabled through a particular setting. See each middleware documentation for more info.

Writing your own downloader middleware

Writing your own downloader middleware is easy. Each middleware component is a single Python class that defines one or more of the following methods:
class scrapy.contrib.downloadermiddleware.DownloaderMiddleware

process_request(request, spider)

This method is called for each request that goes through the download middleware.

``process_request()`` should either: return ``None``, return a ``Response`` object, return a ``Request`` object, or raise an ``IgnoreRequest`` exception.

If it returns ``None``, Scrapy will continue processing this request, executing all other middlewares until, finally, the appropriate downloader handler is called and the request performed (and its response downloaded).

If it returns a ``Response`` object, Scrapy won't bother calling *any* other ``process_request()`` or ``process_exception()`` methods, or the appropriate download function; it'll return that response. The ``process_response()`` methods of installed middleware are always called on every response.
If it returns a ``Request`` object, Scrapy will stop calling ``process_request()`` methods and reschedule the returned request. Once the newly returned request is performed, the appropriate middleware chain will be called on the downloaded response.

If it raises an ``IgnoreRequest`` exception, the ``process_exception()`` methods of installed downloader middleware will be called. If none of them handle the exception, the errback function of the request (``Request.errback``) is called. If no code handles the raised exception, it is ignored and not logged (unlike other exceptions).

Parameters:

* request (``Request`` object) -- the request being processed
* spider (``Spider`` object) -- the spider for which this request is intended
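Putting these return values together, here is a minimal sketch of a middleware whose ``process_request()`` adds a header and otherwise stays out of the way; the class and header names are hypothetical, not something Scrapy ships::

    class CustomHeaderMiddleware(object):
        """Add a fixed header to every outgoing request."""

        def process_request(self, request, spider):
            request.headers.setdefault('X-Example-Header', 'some-value')
            # returning None lets the remaining middlewares, and finally
            # the downloader handler, process the request as usual
            return None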
process_response(request, response, spider)

``process_response()`` should either: return a ``Response`` object, return a ``Request`` object or raise an ``IgnoreRequest`` exception.
If it returns a ``Response`` (it could be the same given response, or a brand-new one), that response will continue to be processed with the ``process_response()`` of the next middleware in the chain.

If it returns a ``Request`` object, the middleware chain is halted and the returned request is rescheduled to be downloaded in the future. This is the same behavior as if a request is returned from ``process_request()``.
If it raises an ``IgnoreRequest`` exception, the errback function of the request (``Request.errback``) is called. If no code handles the raised exception, it is ignored and not logged (unlike other exceptions).
Parameters:

* request (``Request`` object) -- the request that originated the response
* response (``Response`` object) -- the response being processed
* spider (``Spider`` object) -- the spider for which this response is intended
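For illustration, a minimal sketch of a ``process_response()`` that reschedules requests whose responses came back with a particular status code; the class name and the bare status check are hypothetical, and Scrapy's built-in RetryMiddleware handles this case far more carefully (retry limits, statistics, and so on)::

    class NaiveRetryMiddleware(object):
        """Reschedule requests that received a 503 response."""

        def process_response(self, request, response, spider):
            if response.status == 503:
                # returning a Request halts the middleware chain and
                # reschedules the request for a future download
                return request.copy()
            # returning the response passes it on to the next middleware
            return response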
process_exception(request, exception, spider)

Scrapy calls ``process_exception()`` when a download handler or a ``process_request()`` (from a downloader middleware) raises an exception (including an ``IgnoreRequest`` exception).

``process_exception()`` should return: either ``None``, a ``Response`` object, or a ``Request`` object.

If it returns ``None``, Scrapy will continue processing this exception, executing any other ``process_exception()`` methods of installed middleware, until no middleware is left and the default exception handling kicks in.

If it returns a ``Response`` object, the ``process_response()`` method chain of installed middleware is started, and Scrapy won't bother calling any other ``process_exception()`` methods of middleware.
If it returns a ``Request`` object, the returned request is rescheduled to be downloaded in the future. This stops the execution of ``process_exception()`` methods of the middleware the same as returning a response would.

Parameters:

* request (``Request`` object) -- the request that generated the exception
* exception (``Exception`` object) -- the raised exception
* spider (``Spider`` object) -- the spider for which this request is intended
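And a minimal sketch of a ``process_exception()`` that merely logs download timeouts and defers to the default handling; the class name is hypothetical::

    from twisted.internet.error import TimeoutError

    class TimeoutLoggerMiddleware(object):
        """Log download timeouts without swallowing them."""

        def process_exception(self, request, exception, spider):
            if isinstance(exception, TimeoutError):
                spider.log("download timed out: %s" % request.url)
            # returning None passes the exception to the remaining
            # process_exception() methods, then to default handling
            return None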
Built-in downloader middleware reference

This page describes all downloader middleware components that come with Scrapy. For information on how to use them and how to write your own downloader middleware, see the downloader middleware usage guide.

For a list of the components enabled by default (and their orders) see the ``DOWNLOADER_MIDDLEWARES_BASE`` setting.

CookiesMiddleware

class scrapy.contrib.downloadermiddleware.cookies.CookiesMiddleware

This middleware enables working with sites that require cookies, such as those that use sessions. It keeps track of cookies sent by web servers, and sends them back on subsequent requests (from that spider), just like web browsers do.
The following settings can be used to configure the cookie middleware:

* ``COOKIES_ENABLED``
* ``COOKIES_DEBUG``

Multiple cookie sessions per spider

New in version 0.15.

There is support for keeping multiple cookie sessions per spider by using the ``cookiejar`` Request meta key. By default it uses a single cookie jar (session), but you can pass an identifier to use different ones.

For example::

    for i, url in enumerate(urls):
        yield Request("http://www.example.com", meta={'cookiejar': i},
            callback=self.parse_page)

Keep in mind that the ``cookiejar`` meta key is not "sticky". You need to keep passing it along on subsequent requests. For example::
    def parse_page(self, response):
        # do some processing
        return Request("http://www.example.com/otherpage",
            meta={'cookiejar': response.meta['cookiejar']},
            callback=self.parse_other_page)

COOKIES_ENABLED

Default: ``True``

Whether to enable the cookies middleware. If disabled, no cookies will be sent to web servers.

COOKIES_DEBUG

Default: ``False``

If enabled, Scrapy will log all cookies sent in requests (i.e. ``Cookie`` header) and all cookies received in responses (i.e. ``Set-Cookie`` header).
Here's an example of a log with ``COOKIES_DEBUG`` enabled::

    2011-04-06 14:35:10-0300 [diningcity] INFO: Spider opened
    2011-04-06 14:35:10-0300 [diningcity] DEBUG: Sending cookies to: <GET http://www.diningcity.com/netherlands/index.html>
            Cookie: clientlanguage_nl=en_EN
    2011-04-06 14:35:14-0300 [diningcity] DEBUG: Received cookies from: <200 http://www.diningcity.com/netherlands/index.html>
            Set-Cookie: JSESSIONID=B~FA4DC0C496C8762AE4F1A620EAB34F38; Path=/
            Set-Cookie: ip_isocode=US
            Set-Cookie: clientlanguage_nl=en_EN; Expires=Thu, 07-Apr-2011 21:21:34 GMT; Path=/
    2011-04-06 14:49:50-0300 [diningcity] DEBUG: Crawled (200) <GET http://www.diningcity.com/netherlands/index.html> (referer: None)
    [...]

DefaultHeadersMiddleware
class scrapy.contrib.downloadermiddleware.defaultheaders.DefaultHeadersMiddleware

This middleware sets all default request headers specified in the ``DEFAULT_REQUEST_HEADERS`` setting.

DownloadTimeoutMiddleware

class scrapy.contrib.downloadermiddleware.downloadtimeout.DownloadTimeoutMiddleware

This middleware sets the download timeout for requests specified in the ``DOWNLOAD_TIMEOUT`` setting.
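Both middlewares are driven purely by settings. A minimal settings.py sketch, with illustrative values rather than Scrapy's defaults::

    # settings.py
    DEFAULT_REQUEST_HEADERS = {
        'Accept-Language': 'en',
    }
    DOWNLOAD_TIMEOUT = 60  # seconds; Scrapy's default is 180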
HttpAuthMiddleware

class scrapy.contrib.downloadermiddleware.httpauth.HttpAuthMiddleware

This middleware authenticates all requests generated from certain spiders
using Basic access authentication
(http://en.wikipedia.org/wiki/Basic_access_authentication), aka. HTTP auth.

To enable HTTP authentication from certain spiders, set the http_user and
http_pass attributes of those spiders.

Example:

    from scrapy.contrib.spiders import CrawlSpider

    class SomeIntranetSiteSpider(CrawlSpider):

        http_user = 'someuser'
        http_pass = 'somepass'
        name = 'intranet.example.com'

        # .. rest of the spider code omitted ...
HttpCacheMiddleware

class scrapy.contrib.downloadermiddleware.httpcache.HttpCacheMiddleware

This middleware provides low-level cache to all HTTP requests and responses.
It has to be combined with a cache storage backend as well as a cache policy.

Scrapy ships with two HTTP cache storage backends:

* Filesystem storage backend (default)
* DBM storage backend

You can change the HTTP cache storage backend with the HTTPCACHE_STORAGE
setting. Or you can also implement your own storage backend.
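As a rough orientation, a custom backend is a plain class; the method names below mirror the bundled backends (an assumption worth verifying against the Scrapy version in use), and returning None from retrieve_response signals a cache miss:

    # A minimal sketch of a custom cache storage backend (in-memory, per run).
    class InMemoryCacheStorage(object):

        def __init__(self, settings):
            self._cache = {}

        def open_spider(self, spider):
            pass

        def close_spider(self, spider):
            pass

        def retrieve_response(self, spider, request):
            # None means "not cached": the request is downloaded normally.
            return self._cache.get(request.url)

        def store_response(self, spider, request, response):
            self._cache[request.url] = response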
Scrapy ships with two HTTP cache policies:

* RFC2616 policy
* Dummy policy (default)

You can change the HTTP cache policy with the HTTPCACHE_POLICY setting. Or
you can also implement your own policy.
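A policy, too, is a plain class; the method names below mirror the bundled policies (again an assumption worth double-checking against the Scrapy version in use), shown here as a sketch that refuses to cache POST requests:

    class NoPostCachePolicy(object):

        def __init__(self, settings):
            pass

        def should_cache_request(self, request):
            # Never cache form submissions.
            return request.method != 'POST'

        def should_cache_response(self, response, request):
            return response.status == 200

        def is_cached_response_fresh(self, response, request):
            # Like the Dummy policy: whatever is cached counts as fresh.
            return True

        def is_cached_response_valid(self, cachedresponse, response, request):
            return True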
Dummy policy (default)

This policy has no awareness of any HTTP Cache-Control directives. Every
request and its corresponding response are cached. When the same request is
seen again, the response is returned without transferring anything from the
Internet.

The Dummy policy is useful for testing spiders faster (without having to wait
for downloads every time) and for trying your spider offline, when an
Internet connection is not available. The goal is to be able to "replay" a
spider run exactly as it ran before.

In order to use this policy, set:

* HTTPCACHE_POLICY to scrapy.contrib.httpcache.DummyPolicy

RFC2616 policy

This policy provides a RFC2616 compliant HTTP cache, i.e. with HTTP
Cache-Control awareness, aimed at production and used in continuous runs to
avoid downloading unmodified data (to save bandwidth and speed up crawls).

What is implemented:

* Do not attempt to store responses/requests with the no-store cache-control directive set
* Do not serve responses from cache if the no-cache cache-control directive is set, even for fresh responses
* Compute freshness lifetime from the max-age cache-control directive
* Compute freshness lifetime from the Expires response header
* Compute freshness lifetime from the Last-Modified response header (heuristic used by Firefox)
* Compute current age from the Age response header
* Compute current age from the Date header
* Revalidate stale responses based on the Last-Modified response header
* Revalidate stale responses based on the ETag response header
* Set the Date header for any received response missing it

What is missing:

* Pragma: no-cache support (http://www.mnot.net/cache_docs/#PRAGMA)
* Vary header support (http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html#sec13.6)
* Invalidation after updates or deletes (http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html#sec13.10)
* ... probably others ...

In order to use this policy, set:

* HTTPCACHE_POLICY to scrapy.contrib.httpcache.RFC2616Policy

Filesystem storage backend (default)

A file system storage backend is available for the HTTP cache middleware.

In order to use this storage backend, set:

* HTTPCACHE_STORAGE to scrapy.contrib.httpcache.FilesystemCacheStorage

Each request/response pair is stored in a different directory containing the
following files:
* request_body - the plain request body
* request_headers - the request headers (in raw HTTP format)
* response_body - the plain response body
* response_headers - the response headers (in raw HTTP format)
* meta - some metadata of this cache resource in Python repr() format (a grep-friendly format)
* pickled_meta - the same metadata as in meta, but pickled for more efficient deserialization

The directory name is made from the request fingerprint (see
scrapy.utils.request.fingerprint), and one level of subdirectories is used to
avoid creating too many files in the same directory (which is inefficient in
many file systems). An example directory could be:

    /path/to/cache/dir/example.com/72/72811f648e718090f041317756c03adb0ada46c7
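The layout can be mimicked with a short sketch; the SHA-1 call below merely stands in for Scrapy's real request fingerprint (which is derived from the canonicalized request, not the URL string alone), so treat the digest as illustrative:

    import hashlib
    import os

    def cache_dir_for(base_dir, domain, fingerprint):
        # Two-level layout: <base>/<domain>/<first 2 hex chars>/<fingerprint>
        return os.path.join(base_dir, domain, fingerprint[:2], fingerprint)

    fp = hashlib.sha1(b'GET http://example.com/').hexdigest()
    print(cache_dir_for('/path/to/cache/dir', 'example.com', fp))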
DBM storage backend

New in version 0.13.

A DBM (http://en.wikipedia.org/wiki/Dbm) storage backend is also available
for the HTTP cache middleware.

By default, it uses the anydbm module
(http://docs.python.org/library/anydbm.html), but you can change it with the
HTTPCACHE_DBM_MODULE setting.

In order to use this storage backend, set:

* HTTPCACHE_STORAGE to scrapy.contrib.httpcache.DbmCacheStorage
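Putting the pieces together, a filesystem-backed cache with the Dummy policy could be enabled with a few settings.py lines (each setting is documented below; the values shown are only one sensible combination):

    HTTPCACHE_ENABLED = True
    HTTPCACHE_EXPIRATION_SECS = 0  # 0 = cached requests never expire
    HTTPCACHE_DIR = 'httpcache'    # relative to the project data dir
    HTTPCACHE_STORAGE = 'scrapy.contrib.httpcache.FilesystemCacheStorage'
    HTTPCACHE_POLICY = 'scrapy.contrib.httpcache.DummyPolicy'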
HTTPCache middleware settings

The HttpCacheMiddleware can be configured through the following settings:

HTTPCACHE_ENABLED

New in version 0.11.

Default: False

Whether the HTTP cache will be enabled.

Changed in version 0.11: Before 0.11, HTTPCACHE_DIR was used to enable the
cache.

HTTPCACHE_EXPIRATION_SECS

Default: 0

Expiration time for cached requests, in seconds.

Cached requests older than this time will be re-downloaded. If zero, cached
requests will never expire.

Changed in version 0.11: Before 0.11, zero meant cached requests always
expire.

HTTPCACHE_DIR

Default: 'httpcache'

The directory to use for storing the (low-level) HTTP cache. If empty, the
HTTP cache will be disabled. If a relative path is given, it is taken
relative to the project data dir. For more info see the project structure
documentation (topics-project-structure).
HTTPCACHE_IGNORE_HTTP_CODES

New in version 0.10.

Default: []

Don't cache responses with these HTTP codes.

HTTPCACHE_IGNORE_MISSING

Default: False

If enabled, requests not found in the cache will be ignored instead of
downloaded.

HTTPCACHE_IGNORE_SCHEMES

New in version 0.10.

Default: ['file']

Don't cache responses with these URI schemes.

HTTPCACHE_STORAGE

Default: 'scrapy.contrib.httpcache.DbmCacheStorage'

The class which implements the cache storage backend.
HTTPCACHE_DBM_MODULE

New in version 0.13.

Default: 'anydbm'

The database module to use in the DBM storage backend. This setting is
specific to the DBM backend.

HTTPCACHE_POLICY

New in version 0.18.

Default: 'scrapy.contrib.httpcache.DummyPolicy'

The class which implements the cache policy.

HttpCompressionMiddleware

class scrapy.contrib.downloadermiddleware.httpcompression.HttpCompressionMiddleware

This middleware allows compressed (gzip, deflate) traffic to be sent/received
from web sites.
HttpCompressionMiddleware Settings

COMPRESSION_ENABLED

Default: True

Whether the Compression middleware will be enabled.

ChunkedTransferMiddleware

class scrapy.contrib.downloadermiddleware.chunked.ChunkedTransferMiddleware

This middleware adds support for chunked transfer encoding
(http://en.wikipedia.org/wiki/Chunked_transfer_encoding).

HttpProxyMiddleware

New in version 0.8.

class scrapy.contrib.downloadermiddleware.httpproxy.HttpProxyMiddleware
This middleware sets the HTTP proxy to use for requests, by setting the proxy
meta value on Request objects.

Like the Python standard library modules urllib
(http://docs.python.org/library/urllib.html) and urllib2
(http://docs.python.org/library/urllib2.html), it obeys the following
environment variables:

* http_proxy
* https_proxy
* no_proxy
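The proxy meta key can also be set by hand, bypassing the environment variables; a minimal sketch of a spider method (the URL and proxy address are illustrative):

    from scrapy.http import Request

    def start_requests(self):
        # Route this one request through a local proxy.
        yield Request('http://www.example.com',
                      meta={'proxy': 'http://localhost:8080'})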
RedirectMiddleware

class scrapy.contrib.downloadermiddleware.redirect.RedirectMiddleware

This middleware handles redirection of requests based on response status.

The URLs which the request goes through (while being redirected) can be found
in the redirect_urls Request.meta key.

The RedirectMiddleware can be configured through the following settings (see
the settings documentation for more info):

* REDIRECT_ENABLED
* REDIRECT_MAX_TIMES

If Request.meta contains the dont_redirect key, the request will be ignored
by this middleware.
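For example, a spider callback that wants to inspect a raw 30x response can set that key; a minimal sketch (the URL and the parse_redirect_page callback are hypothetical):

    from scrapy.http import Request

    def parse(self, response):
        # Keep the 30x response instead of following its Location header.
        yield Request('http://www.example.com/old-page',
                      meta={'dont_redirect': True},
                      callback=self.parse_redirect_page)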
RedirectMiddleware settings

REDIRECT_ENABLED

New in version 0.13.

Default: True

Whether the Redirect middleware will be enabled.

REDIRECT_MAX_TIMES

Default: 20

The maximum number of redirections that will be followed for a single
request.

MetaRefreshMiddleware

class scrapy.contrib.downloadermiddleware.redirect.MetaRefreshMiddleware

This middleware handles redirection of requests based on the meta-refresh
html tag.
The MetaRefreshMiddleware can be configured through the following settings
(see the settings documentation for more info):

* METAREFRESH_ENABLED
* METAREFRESH_MAXDELAY

This middleware obeys the REDIRECT_MAX_TIMES setting and the dont_redirect
and redirect_urls request meta keys, as described for RedirectMiddleware.

MetaRefreshMiddleware settings

METAREFRESH_ENABLED

New in version 0.17.

Default: True

Whether the Meta Refresh middleware will be enabled.
REDIRECT_MAX_METAREFRESH_DELAY

Default: 100

The maximum meta-refresh delay (in seconds) to follow the redirection.

RetryMiddleware

class scrapy.contrib.downloadermiddleware.retry.RetryMiddleware

A middleware to retry failed requests that are potentially caused by
temporary problems such as a connection timeout or an HTTP 500 error.

Failed pages are collected during the scraping process and rescheduled at the
end, once the spider has finished crawling all regular (non-failed) pages.
Once there are no more failed pages to retry, this middleware sends a signal
(retry_complete), so other extensions could connect to that signal.

The RetryMiddleware can be configured through the following settings (see the
settings documentation for more info):

* RETRY_ENABLED
* RETRY_TIMES
* RETRY_HTTP_CODES

About HTTP errors to consider:

You may want to remove 400 from RETRY_HTTP_CODES, if you stick to the HTTP
protocol. It's included by default because it's a common code used to
indicate server overload, which would be something we want to retry.
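A stricter retry policy along those lines might look like this in settings.py (the values are illustrative; each setting is documented below):

    RETRY_ENABLED = True
    RETRY_TIMES = 2  # in addition to the first download
    RETRY_HTTP_CODES = [500, 502, 503, 504, 408]  # 400 removed, as suggested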
It's included by default because it's a common code used to indicate server overload, which would be something we want to retry.hjRubeubj)rk}rl(hUhjhhhjh}rm(h]h]h]h]h]Uentries]rn(XpairXdont_retry; reqmetaXstd:reqmeta-dont_retryroUtrpauhMhhh]ubh)rq}rr(hUhjhhhhh}rs(h]h]h]h]h]hjouhMhhh]ubh)rt}ru(hXIf :attr:`Request.meta ` contains the ``dont_retry`` key, the request will be ignored by this middleware.hjhhh}hhh}rv(h]h]h]h]rwjoah]uhMhhh}rxjojqsh]ry(hXIf rzr{}r|(hXIf hjtubh)r}}r~(hX/:attr:`Request.meta `rhjthhhhh}r(UreftypeXattrhjXscrapy.http.Request.metaU refdomainXpyrh]h]U refexplicith]h]h]jjjUNjVjuhMh]rj)r}r(hjh}r(h]h]r(j jXpy-attrreh]h]h]uhj}h]rhX Request.metarr}r(hUhjubahjubaubhX contains the rr}r(hX contains the hjtubj)r}r(hX``dont_retry``h}r(h]h]h]h]h]uhjth]rhX dont_retryrr}r(hUhjubahjubhX5 key, the request will be ignored by this middleware.rr}r(hX5 key, the request will be ignored by this middleware.hjtubeubh)r}r(hUhjhhhhh}r(h]h]h]h]rhiah]rh auhMhhh]r(h)r}r(hXRetryMiddleware Settingsrhjhhhhh}r(h]h]h]h]h]uhMhhh]rhXRetryMiddleware Settingsrr}r(hjhjubaubj)r}r(hUhjhhhjh}r(h]h]h]h]h]Uentries]r(XpairXRETRY_ENABLED; settingXstd:setting-RETRY_ENABLEDrUtrauhMhhh]ubh)r}r(hUhjhhhhh}r(h]h]h]h]h]hjuhMhhh]ubh)r}r(hUhjhhh}hhh}r(h]h]h]h]r(h{jeh]rh"auhMhhh}rjjsh]r(h)r}r(hX RETRY_ENABLEDrhjhhhhh}r(h]h]h]h]h]uhMhhh]rhX RETRY_ENABLEDrr}r(hjhjubaubj/)r}r(hUhjhhhj2h}r(j4X0.13h]h]h]h]h]j5X versionaddedruhMhhh]rh)r}r(hUhjhhhhh}r(h]h]h]h]h]uhMhhh]rj<)r}r(hUh}r(h]h]rj2ah]h]h]uhjh]rhXNew in version 0.13.rr}r(hUhjubahjEubaubaubh)r}r(hXDefault: ``True``rhjhhhhh}r(h]h]h]h]h]uhMhhh]r(hX Default: rr}r(hX Default: hjubj)r}r(hX``True``h}r(h]h]h]h]h]uhjh]rhXTruerr}r(hUhjubahjubeubh)r}r(hX-Whether the Retry middleware will be enabled.rhjhhhhh}r(h]h]h]h]h]uhMhhh]rhX-Whether the Retry middleware will be enabled.rr}r(hjhjubaubj)r}r(hUhjhhhjh}r(h]h]h]h]h]Uentries]r(XpairXRETRY_TIMES; settingXstd:setting-RETRY_TIMESrUtrauhMhhh]ubh)r}r(hUhjhhhhh}r(h]h]h]h]h]hjuhMhhh]ubeubh)r}r(hUhjhhh}hhh}r(h]h]h]h]r(hjeh]rh?auhMhhh}rjjsh]r(h)r}r(hX RETRY_TIMESrhjhhhhh}r(h]h]h]h]h]uhMhhh]rhX RETRY_TIMESrr}r(hjhjubaubh)r}r(hXDefault: ``2``rhjhhhhh}r(h]h]h]h]h]uhMhhh]r(hX Default: rr}r(hX Default: hjubj)r}r(hX``2``h}r (h]h]h]h]h]uhjh]r hX2r }r (hUhjubahjubeubh)r }r(hXDMaximum number of times to retry, in addition to the first download.rhjhhhhh}r(h]h]h]h]h]uhMhhh]rhXDMaximum number of times to retry, in addition to the first download.rr}r(hjhj ubaubj)r}r(hUhjhhhjh}r(h]h]h]h]h]Uentries]r(XpairXRETRY_HTTP_CODES; settingXstd:setting-RETRY_HTTP_CODESrUtrauhMhhh]ubh)r}r(hUhjhhhhh}r(h]h]h]h]h]hjuhMhhh]ubeubh)r}r(hUhjhhh}hhh}r (h]h]h]h]r!(hjeh]r"hPauhMhhh}r#jjsh]r$(h)r%}r&(hXRETRY_HTTP_CODESr'hjhhhhh}r((h]h]h]h]h]uhMhhh]r)hXRETRY_HTTP_CODESr*r+}r,(hj'hj%ubaubh)r-}r.(hX+Default: ``[500, 502, 503, 504, 400, 408]``r/hjhhhhh}r0(h]h]h]h]h]uhMhhh]r1(hX Default: r2r3}r4(hX Default: hj-ubj)r5}r6(hX"``[500, 502, 503, 504, 400, 408]``h}r7(h]h]h]h]h]uhj-h]r8hX[500, 502, 503, 504, 400, 408]r9r:}r;(hUhj5ubahjubeubh)r<}r=(hXoWhich HTTP response codes to retry. Other errors (DNS lookup issues, connections lost, etc) are always retried.r>hjhhhhh}r?(h]h]h]h]h]uhMhhh]r@hXoWhich HTTP response codes to retry. Other errors (DNS lookup issues, connections lost, etc) are always retried.rArB}rC(hj>hj<ubaubh)rD}rE(hX.. 
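To make this concrete, here is a minimal sketch of tuning these settings in settings.py and of skipping retries for a single request through Request.meta (the values, spider callback and URL are illustrative, not recommendations):

    # settings.py -- illustrative values
    RETRY_ENABLED = True
    RETRY_TIMES = 5                          # retry up to 5 times after the first download
    RETRY_HTTP_CODES = [500, 502, 503, 504]  # e.g. with 400 and 408 removed

    # in a spider callback: this request is ignored by RetryMiddleware
    from scrapy.http import Request

    def parse(self, response):
        yield Request('http://example.com/optional',
                      meta={'dont_retry': True},
                      callback=self.parse_optional)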
RobotsTxtMiddleware

class scrapy.contrib.downloadermiddleware.robotstxt.RobotsTxtMiddleware

    This middleware filters out requests forbidden by the robots.txt exclusion standard.

    To make sure Scrapy respects robots.txt, make sure the middleware is enabled and the ROBOTSTXT_OBEY setting is enabled.

    Warning: keep in mind that, if you crawl using multiple concurrent requests per domain, Scrapy could still download some forbidden pages if they were requested before the robots.txt file was downloaded. This is a known limitation of the current robots.txt middleware and will be fixed in the future.

DownloaderStats

class scrapy.contrib.downloadermiddleware.stats.DownloaderStats

    Middleware that stores stats of all requests, responses and exceptions that pass through it.

    To use this middleware you must enable the DOWNLOADER_STATS setting.

UserAgentMiddleware

class scrapy.contrib.downloadermiddleware.useragent.UserAgentMiddleware

    Middleware that allows spiders to override the default user agent.

    In order for a spider to override the default user agent, its user_agent attribute must be set.
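For instance, a minimal sketch of such an override (the spider name and user-agent string are hypothetical):

    from scrapy.spider import Spider

    class MySpider(Spider):
        name = 'myspider'
        start_urls = ['http://example.com']
        # UserAgentMiddleware reads this attribute and uses it for the
        # requests issued by this spider instead of the project-wide default.
        user_agent = 'my-crawler/1.0 (+http://example.com/bot)'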
AjaxCrawlMiddleware

class scrapy.contrib.downloadermiddleware.ajaxcrawl.AjaxCrawlMiddleware

    Middleware that finds 'AJAX crawlable' page variants based on the meta-fragment html tag. See https://developers.google.com/webmasters/ajax-crawling/docs/getting-started for more info.

    Note: Scrapy finds 'AJAX crawlable' pages for URLs like 'http://example.com/!#foo=bar' even without this middleware. AjaxCrawlMiddleware is necessary when the URL doesn't contain '!#'. This is often the case for 'index' or 'main' website pages.

AjaxCrawlMiddleware Settings

AJAXCRAWL_ENABLED

Default: False

Whether the AjaxCrawlMiddleware will be enabled. You may want to enable it for broad crawls.
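For example, a broad-crawl oriented settings.py might opt in like this (a sketch; the concurrency value is purely illustrative context, not part of the middleware):

    # settings.py
    AJAXCRAWL_ENABLED = True     # disabled by default, so enable it explicitly
    CONCURRENT_REQUESTS = 100    # typical broad-crawl tuning, shown for context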
Telnet Console

Scrapy comes with a built-in telnet console for inspecting and controlling a running Scrapy process. The telnet console is just a regular Python shell running inside the Scrapy process, so you can do literally anything from it.

The telnet console is a built-in Scrapy extension which comes enabled by default, but you can also disable it if you want. For more information about the extension itself see the telnet console extension reference.

How to access the telnet console

The telnet console listens on the TCP port defined in the TELNETCONSOLE_PORT setting, which defaults to 6023. To access the console you need to type:

    telnet localhost 6023
    >>>

You need the telnet program, which comes installed by default on Windows and most Linux distros.

Available variables in the telnet console

The telnet console is like a regular Python shell running inside the Scrapy process, so you can do anything from it, including importing new modules, etc.

However, the telnet console comes with some default variables defined for convenience:
    Shortcut      Description
    --------      -----------
    crawler       the Scrapy Crawler (scrapy.crawler.Crawler object)
    engine        Crawler.engine attribute
    spider        the active spider
    slot          the engine slot
    extensions    the Extension Manager (Crawler.extensions attribute)
    stats         the Stats Collector (Crawler.stats attribute)
    settings      the Scrapy settings object (Crawler.settings attribute)
    est           print a report of the engine status
    prefs         for memory debugging (see Debugging memory leaks)
    p             a shortcut to the pprint.pprint function
    hpy           for memory debugging (see Debugging memory leaks)

(pprint.pprint: http://docs.python.org/library/pprint.html#pprint.pprint)
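For example, two of these shortcuts combined in a console session (the output shown is illustrative):

    >>> p(stats.get_stats())
    {'downloader/request_count': 123,
     'downloader/response_count': 121}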
Telnet console usage examples

Here are some example tasks you can do with the telnet console:

View engine status

You can use the est() method of the Scrapy engine to quickly show its state using the telnet console:

    telnet localhost 6023
    >>> est()
    Execution engine status

    time()-engine.start_time                        : 9.24237799644
    engine.has_capacity()                           : False
    engine.downloader.is_idle()                     : False
    len(engine.downloader.slots)                    : 2
    len(engine.downloader.active)                   : 16
    engine.scraper.is_idle()                        : False
    Spider:
      engine.spider_is_idle(spider)                 : False
      engine.slots[spider].closing                  : False
      len(engine.slots[spider].inprogress)          : 21
      len(engine.slots[spider].scheduler.dqs or []) : 0
      len(engine.slots[spider].scheduler.mqs)       : 4453
      len(engine.scraper.slot.queue)                : 0
      len(engine.scraper.slot.active)               : 5
      engine.scraper.slot.active_size               : 1515069
      engine.scraper.slot.itemproc_size             : 0
      engine.scraper.slot.needs_backout()           : False

Pause, resume and stop the Scrapy engine

To pause:

    telnet localhost 6023
    >>> engine.pause()
    >>>

To resume:

    telnet localhost 6023
    >>> engine.unpause()
    >>>

To stop:

    telnet localhost 6023
    >>> engine.stop()
    Connection closed by foreign host.

Telnet Console signals

scrapy.telnet.update_telnet_vars(telnet_vars)

    Sent just before the telnet console is opened. You can hook up to this signal to add, remove or update the variables that will be available in the telnet local namespace.
    In order to do that, you need to update the telnet_vars dict in your handler.

    Parameters: telnet_vars (dict) -- the dict of telnet variables
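For example, a minimal sketch of a handler hooked up to this signal (the extension class and variable name are hypothetical, and it assumes the class is registered through the EXTENSIONS setting):

    from scrapy.telnet import update_telnet_vars

    class TelnetVarsExtension(object):
        # Hypothetical extension that adds a variable to the telnet namespace.

        @classmethod
        def from_crawler(cls, crawler):
            ext = cls()
            crawler.signals.connect(ext.add_telnet_vars,
                                    signal=update_telnet_vars)
            return ext

        def add_telnet_vars(self, telnet_vars):
            # Anything placed in this dict becomes a local variable
            # in the telnet console session.
            telnet_vars['project_name'] = 'myproject'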
Telnet settings

These are the settings that control the telnet console's behaviour:

TELNETCONSOLE_PORT

Default: [6023, 6073]

The port range to use for the telnet console. If set to None or 0, a dynamically assigned port is used.

TELNETCONSOLE_HOST

Default: '0.0.0.0'

The interface the telnet console should listen on.

Scrapyd

Scrapyd has been moved into a separate project.

Its documentation is now hosted at:

    http://scrapyd.readthedocs.org/
Extensions

The extensions framework provides a mechanism for inserting your own custom functionality into Scrapy.

Extensions are just regular classes that are instantiated at Scrapy startup, when extensions are initialized.

Extension settings

Extensions use the Scrapy settings to manage their settings, just like any other Scrapy code.

It is customary for extensions to prefix their settings with their own name, to avoid collisions with existing (and future) extensions. For example, a hypothetical extension to handle Google Sitemaps (http://en.wikipedia.org/wiki/Sitemaps) would use settings like GOOGLESITEMAP_ENABLED, GOOGLESITEMAP_DEPTH, and so on.
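To illustrate, a sketch of how such a hypothetical extension might read its prefixed settings (the class, setting names and default value are all illustrative):

    from scrapy.exceptions import NotConfigured

    class GoogleSitemapExtension(object):
        # Hypothetical extension, shown only to illustrate setting prefixes.

        def __init__(self, depth):
            self.depth = depth

        @classmethod
        def from_crawler(cls, crawler):
            settings = crawler.settings
            if not settings.getbool('GOOGLESITEMAP_ENABLED'):
                raise NotConfigured  # available but disabled
            return cls(settings.getint('GOOGLESITEMAP_DEPTH', 3))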
Loading & activating extensions

Extensions are loaded and activated at startup by instantiating a single instance of the extension class. Therefore, all the extension initialization code must be performed in the class constructor (__init__ method).

To make an extension available, add it to the EXTENSIONS setting in your Scrapy settings. In EXTENSIONS, each extension is represented by a string: the full Python path to the extension's class name. For example:

    EXTENSIONS = {
        'scrapy.contrib.corestats.CoreStats': 500,
        'scrapy.webservice.WebService': 500,
        'scrapy.telnet.TelnetConsole': 500,
    }

As you can see, the EXTENSIONS setting is a dict where the keys are the extension paths, and their values are the orders, which define the extension loading order. Extension orders are not as important as middleware orders, though, and they are typically irrelevant: it doesn't matter in which order the extensions are loaded because they don't depend on each other [1].

However, this feature can be exploited if you need to add an extension which depends on other extensions already being loaded.

[1] This is why the EXTENSIONS_BASE setting in Scrapy (which contains all built-in extensions enabled by default) defines all the extensions with the same order (500).

Available, enabled and disabled extensions

Not all available extensions will be enabled. Some of them usually depend on a particular setting. For example, the HTTP Cache extension is available by default but disabled unless the HTTPCACHE_ENABLED setting is set.
Disabling an extension

In order to disable an extension that comes enabled by default (i.e. those included in the EXTENSIONS_BASE setting) you must set its order to None. For example:

    EXTENSIONS = {
        'scrapy.contrib.corestats.CoreStats': None,
    }

Writing your own extension

Writing your own extension is easy. Each extension is a single Python class which doesn't need to implement any particular method.

The main entry point for a Scrapy extension (this also includes middlewares and pipelines) is the from_crawler class method, which receives a Crawler instance: the main object controlling the Scrapy crawler. Through that object you can access settings, signals, stats, and also control the crawler behaviour, if your extension needs to do such a thing.
Typically, extensions connect to signals (see topics-signals) and perform tasks triggered by them.

Finally, if the from_crawler method raises the scrapy.exceptions.NotConfigured exception, the extension will be disabled. Otherwise, the extension will be enabled.

Sample extension

Here we will implement a simple extension to illustrate the concepts described in the previous section. This extension will log a message every time:

- a spider is opened
- a spider is closed
- a specific number of items are scraped

The extension will be enabled through the MYEXT_ENABLED setting and the number of items will be specified through the MYEXT_ITEMCOUNT setting.
Here is the code of such extension:

    from scrapy import signals
    from scrapy.exceptions import NotConfigured

    class SpiderOpenCloseLogging(object):

        def __init__(self, item_count):
            self.item_count = item_count
            self.items_scraped = 0

        @classmethod
        def from_crawler(cls, crawler):
            # first check if the extension should be enabled and raise
            # NotConfigured otherwise
            if not crawler.settings.getbool('MYEXT_ENABLED'):
                raise NotConfigured

            # get the number of items from settings
            item_count = crawler.settings.getint('MYEXT_ITEMCOUNT', 1000)

            # instantiate the extension object
            ext = cls(item_count)

            # connect the extension object to signals
            crawler.signals.connect(ext.spider_opened, signal=signals.spider_opened)
            crawler.signals.connect(ext.spider_closed, signal=signals.spider_closed)
            crawler.signals.connect(ext.item_scraped, signal=signals.item_scraped)

            # return the extension object
            return ext

        def spider_opened(self, spider):
            spider.log("opened spider %s" % spider.name)

        def spider_closed(self, spider):
            spider.log("closed spider %s" % spider.name)

        def item_scraped(self, item, spider):
            self.items_scraped += 1
            if self.items_scraped == self.item_count:
                spider.log("scraped %d items, resetting counter" % self.items_scraped)
                # reset the scraped-items counter, as the log message says
                # (zeroing item_count here instead would make the comparison
                # above never match again)
                self.items_scraped = 0
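A sketch of the settings needed to try this extension, assuming the class is saved in a hypothetical myproject/extensions.py module:

    EXTENSIONS = {
        'myproject.extensions.SpiderOpenCloseLogging': 500,
    }

    MYEXT_ENABLED = True   # required, otherwise from_crawler raises NotConfigured
    MYEXT_ITEMCOUNT = 100  # log (and reset the counter) every 100 scraped items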
Built-in extensions reference

General purpose extensions

Log Stats extension

class scrapy.contrib.logstats.LogStats

Log basic stats like crawled pages and scraped items.

Core Stats extension

class scrapy.contrib.corestats.CoreStats

Enable the collection of core statistics, provided the stats collection is enabled (see topics-stats).
Web service extension

class scrapy.webservice.WebService

See topics-webservice.

Telnet console extension

class scrapy.telnet.TelnetConsole

Provides a telnet console for getting into a Python interpreter inside the currently running Scrapy process, which can be very useful for debugging.

The telnet console must be enabled by the TELNETCONSOLE_ENABLED setting, and the server will listen on the port specified in TELNETCONSOLE_PORT.
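For example, a minimal sketch of enabling the console in a project's settings; the port range shown mirrors the usual [min, max] form of TELNETCONSOLE_PORT, and the values are illustrative rather than a recommendation:

    TELNETCONSOLE_ENABLED = True
    TELNETCONSOLE_PORT = [6023, 6073]  # the first free port in this range is used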
Memory usage extension

class scrapy.contrib.memusage.MemoryUsage

Note: This extension does not work in Windows.

Monitors the memory used by the Scrapy process that runs the spider and:

1. sends a notification e-mail when it exceeds a certain value
2. closes the spider when it exceeds a certain value
The notification e-mails can be triggered when a certain warning value is reached (MEMUSAGE_WARNING_MB) and when the maximum value is reached (MEMUSAGE_LIMIT_MB), which will also cause the spider to be closed and the Scrapy process to be terminated.

This extension is enabled by the MEMUSAGE_ENABLED setting and can be configured with the following settings (see the sketch after this list for illustrative values):

- MEMUSAGE_LIMIT_MB
- MEMUSAGE_WARNING_MB
- MEMUSAGE_NOTIFY_MAIL
- MEMUSAGE_REPORT
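A minimal sketch of a configuration wiring these settings together; the thresholds and the address are illustrative assumptions, not defaults:

    MEMUSAGE_ENABLED = True
    MEMUSAGE_WARNING_MB = 1536                  # warn by e-mail at 1.5 GB
    MEMUSAGE_LIMIT_MB = 2048                    # close the spider at 2 GB
    MEMUSAGE_NOTIFY_MAIL = ['dev@example.com']  # where the notifications go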
Memory debugger extension

class scrapy.contrib.memdebug.MemoryDebugger

An extension for debugging memory usage. It collects information about:

- objects uncollected by the Python garbage collector
- objects left alive that shouldn't be (for more info, see topics-leaks-trackrefs)

To enable this extension, turn on the MEMDEBUG_ENABLED setting. The info will be stored in the stats.
Close spider extension

class scrapy.contrib.closespider.CloseSpider

Closes a spider automatically when some conditions are met, using a specific closing reason for each condition.

The conditions for closing a spider can be configured through the following settings, documented below:

- CLOSESPIDER_TIMEOUT
- CLOSESPIDER_ITEMCOUNT
- CLOSESPIDER_PAGECOUNT
- CLOSESPIDER_ERRORCOUNT
CLOSESPIDER_TIMEOUT

Default: 0

An integer which specifies a number of seconds. If the spider remains open for more than that number of seconds, it will be automatically closed with the reason closespider_timeout. If zero (or not set), spiders won't be closed by timeout.

CLOSESPIDER_ITEMCOUNT

Default: 0

An integer which specifies a number of items. If the spider scrapes more than that number of items and those items are passed by the item pipeline, the spider will be closed with the reason closespider_itemcount. If zero (or not set), spiders won't be closed by number of passed items.
CLOSESPIDER_PAGECOUNT

New in version 0.11.

Default: 0

An integer which specifies the maximum number of responses to crawl. If the spider crawls more than that, the spider will be closed with the reason closespider_pagecount. If zero (or not set), spiders won't be closed by number of crawled responses.

CLOSESPIDER_ERRORCOUNT

New in version 0.11.

Default: 0

An integer which specifies the maximum number of errors to receive before closing the spider. If the spider generates more than that number of errors, it will be closed with the reason closespider_errorcount. If zero (or not set), spiders won't be closed by number of errors.
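A sketch combining these conditions; the numbers are illustrative assumptions. Whichever condition is met first closes the spider, with its own closing reason:

    CLOSESPIDER_TIMEOUT = 3600     # close after one hour...
    CLOSESPIDER_ITEMCOUNT = 5000   # ...or after 5000 items passed the pipeline
    CLOSESPIDER_ERRORCOUNT = 10    # ...or after 10 errors
    CLOSESPIDER_PAGECOUNT = 0      # never close by crawled page count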
If the spider generates more than that number of errors, it will be closed with the reason hcjubj)r}r(hbX``closespider_errorcount``hk}r(ho]hp]hn]hm]hq]uhcjh]]rhXclosespider_errorcountrr}r(hbUhcjubahijubhXD. If zero (or non set), spiders won't be closed by number of errors.rr}r(hbXD. If zero (or non set), spiders won't be closed by number of errors.hcjubeubeubeubhu)r}r(hbUhcjhdhghihzhk}r(ho]hp]hn]hm]r(X!module-scrapy.contrib.statsmailerrh\ehq]rh1auhsMHhthh]]r(h)r}r(hbXStatsMailer extensionrhcjhdhghihhk}r(ho]hp]hn]hm]hq]uhsMHhthh]]rhXStatsMailer extensionrr}r(hbjhcjubaubj)r}r(hbUhcjhdhghijhk}r(hm]hn]ho]hp]hq]Uentries]r(jX#scrapy.contrib.statsmailer (module)X!module-scrapy.contrib.statsmailerUtrauhsNhthh]]ubj)r}r(hbUhcjhdNhijhk}r(hm]hn]ho]hp]hq]Uentries]r(jXLscrapy.contrib.statsmailer.StatsMailer (class in scrapy.contrib.statsmailer)hUtrauhsNhthh]]ubj)r}r(hbUhcjhdNhij hk}r(j"j#Xpyhm]hn]ho]hp]hq]j$Xclassrj&juhsNhthh]]r(j()r}r(hbX&scrapy.contrib.statsmailer.StatsMailerhcjhdhghij,hk}r(hm]rhaj/Xscrapy.contrib.statsmailerrhn]ho]hp]hq]rhaj2X&scrapy.contrib.statsmailer.StatsMailerj3Xscrapy.contrib.statsmailerj4uhsMNhthh]]r(j6)r}r(hbXclass hcjhdhghij9hk}r(ho]hp]hn]hm]hq]uhsMNhthh]]rhXclass rr}r(hbUhcjubaubj?)r}r(hbXscrapy.contrib.statsmailer.hcjhdhghijBhk}r(ho]hp]hn]hm]hq]uhsMNhthh]]rhXscrapy.contrib.statsmailer.rr}r(hbUhcjubaubjH)r}r(hbX StatsMailerhcjhdhghijKhk}r(ho]hp]hn]hm]hq]uhsMNhthh]]rhX StatsMailerrr}r(hbUhcjubaubeubjQ)r}r(hbUhcjhdhghijThk}r(ho]hp]hn]hm]hq]uhsMNhthh]]ubeubh)r}r(hbXThis simple extension can be used to send a notification e-mail every time a domain has finished scraping, including the Scrapy stats collected. The email will be sent to all recipients specified in the :setting:`STATSMAILER_RCPTS` setting.hcjhdhghihhk}r(ho]hp]hn]hm]hq]uhsMOhthh]]r(hXThis simple extension can be used to send a notification e-mail every time a domain has finished scraping, including the Scrapy stats collected. The email will be sent to all recipients specified in the rr}r(hbXThis simple extension can be used to send a notification e-mail every time a domain has finished scraping, including the Scrapy stats collected. 
Debugging extensions

Stack trace dump extension

class scrapy.contrib.debug.StackTraceDump

Dumps information about the running process when a SIGQUIT (http://en.wikipedia.org/wiki/SIGQUIT) or SIGUSR2 (http://en.wikipedia.org/wiki/SIGUSR1_and_SIGUSR2) signal is received. The information dumped is the following:
1. engine status (using scrapy.utils.engine.get_engine_status())
2. live references (see topics-leaks-trackrefs)
3. stack trace of all threads

After the stack trace and engine status is dumped, the Scrapy process continues running normally.

This extension only works on POSIX-compliant platforms (i.e. not Windows), because the SIGQUIT and SIGUSR2 signals are not available on Windows.
There are at least two ways to send Scrapy the SIGQUIT signal:

1. By pressing Ctrl-\ while a Scrapy process is running (Linux only?)
2. By running this command (assuming <pid> is the process id of the Scrapy process):

    kill -QUIT <pid>
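A third option, sketched here, is to send the signal from Python, assuming you know the process id of the running Scrapy process:

    import os
    import signal

    SCRAPY_PID = 12345  # placeholder: the pid of the running Scrapy process

    # triggers the stack trace dump, exactly like `kill -QUIT <pid>`
    os.kill(SCRAPY_PID, signal.SIGQUIT)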
Debugger extension

class scrapy.contrib.debug.Debugger

Invokes a Python debugger (http://docs.python.org/library/pdb.html) inside a running Scrapy process when a SIGUSR2 signal is received. After the debugger is exited, the Scrapy process continues running normally.

For more info see Debugging in Python (http://www.ferg.org/papers/debugging_in_python.html).

This extension only works on POSIX-compliant platforms (i.e. not Windows).
scrapy-0.22/.doctrees/topics/api.doctree Core API — Scrapy 0.22.0 documentation
_topics-api-crawler:hLh_hMhPhRhShT}q(hV]hW]hX]hY]hZ]h[h`.hLhhMhPhRhhT}q(hX]hY]hW]hV]hZ]uh\Kh]hhF]q(hqX{The Extension Manager is responsible for loading and keeping track of installed extensions and it's configured through the q酁q}q(hKX{The Extension Manager is responsible for loading and keeping track of installed extensions and it's configured through the hLhubh)q}q(hKX:setting:`EXTENSIONS`qhLhhMhPhRhhT}q(UreftypeXsettinghhX EXTENSIONSU refdomainXstdqhV]hW]U refexplicithX]hY]hZ]hhuh\KhF]qh)q}q(hKhhT}q(hX]hY]q(hhX std-settingqehW]hV]hZ]uhLhhF]qhqX EXTENSIONSqq}q(hKUhLhubahRhubaubhqXd setting which contains a dictionary of all available extensions and their order similar to how you qq}q(hKXd setting which contains a dictionary of all available extensions and their order similar to how you hLhubh)q}q(hKXR:ref:`configure the downloader middlewares `rhLhhMhPhRhhT}r(UreftypeXrefhhX$topics-downloader-middleware-settingU refdomainXstdrhV]hW]U refexplicithX]hY]hZ]hhuh\KhF]rcdocutils.nodes emphasis r)r}r(hKjhT}r(hX]hY]r(hjXstd-refr ehW]hV]hZ]uhLhhF]r hqX$configure the downloader middlewaresr r }r (hKUhLjubahRUemphasisrubaubhqX.r}r(hKX.hLhubeubh)r}r(hKUhLhhMNhRhhT}r(hV]hW]hX]hY]hZ]Uentries]r(hX!Crawler (class in scrapy.crawler)hUtrauh\Nh]hhF]ubcsphinx.addnodes desc r)r}r(hKUhLhhMNhRUdescrhT}r(UnoindexrUdomainrXpyhV]hW]hX]hY]hZ]UobjtyperXclassrUdesctyperjuh\Nh]hhF]r (csphinx.addnodes desc_signature r!)r"}r#(hKXCrawler(settings)hLjhMhPhRUdesc_signaturer$hT}r%(hV]r&haUmoduler'Xscrapy.crawlerr(hW]hX]hY]hZ]r)haUfullnamer*XCrawlerr+Uclassr,UUfirstr-uh\Kih]hhF]r.(csphinx.addnodes desc_annotation r/)r0}r1(hKXclass hLj"hMhPhRUdesc_annotationr2hT}r3(hX]hY]hW]hV]hZ]uh\Kih]hhF]r4hqXclass r5r6}r7(hKUhLj0ubaubcsphinx.addnodes desc_addname r8)r9}r:(hKXscrapy.crawler.hLj"hMhPhRU desc_addnamer;hT}r<(hX]hY]hW]hV]hZ]uh\Kih]hhF]r=hqXscrapy.crawler.r>r?}r@(hKUhLj9ubaubcsphinx.addnodes desc_name rA)rB}rC(hKj+hLj"hMhPhRU desc_namerDhT}rE(hX]hY]hW]hV]hZ]uh\Kih]hhF]rFhqXCrawlerrGrH}rI(hKUhLjBubaubcsphinx.addnodes desc_parameterlist rJ)rK}rL(hKUhLj"hMhPhRUdesc_parameterlistrMhT}rN(hX]hY]hW]hV]hZ]uh\Kih]hhF]rOcsphinx.addnodes desc_parameter rP)rQ}rR(hKXsettingshT}rS(hX]hY]hW]hV]hZ]uhLjKhF]rThqXsettingsrUrV}rW(hKUhLjQubahRUdesc_parameterrXubaubeubcsphinx.addnodes desc_content rY)rZ}r[(hKUhLjhMhPhRU desc_contentr\hT}r](hX]hY]hW]hV]hZ]uh\Kih]hhF]r^(h)r_}r`(hKXXThe Crawler object must be instantiated with a :class:`scrapy.settings.Settings` object.hLjZhMhPhRhhT}ra(hX]hY]hW]hV]hZ]uh\K!h]hhF]rb(hqX/The Crawler object must be instantiated with a rcrd}re(hKX/The Crawler object must be instantiated with a hLj_ubh)rf}rg(hKX!:class:`scrapy.settings.Settings`rhhLj_hMhPhRhhT}ri(UreftypeXclasshhXscrapy.settings.SettingsU refdomainXpyrjhV]hW]U refexplicithX]hY]hZ]hhhj+hj(uh\K!hF]rkh)rl}rm(hKjhhT}rn(hX]hY]ro(hjjXpy-classrpehW]hV]hZ]uhLjfhF]rqhqXscrapy.settings.Settingsrrrs}rt(hKUhLjlubahRhubaubhqX object.rurv}rw(hKX object.hLj_ubeubh)rx}ry(hKUhLjZhMhPhRhhT}rz(hV]hW]hX]hY]hZ]Uentries]r{(hX+settings (scrapy.crawler.Crawler attribute)h1Utr|auh\Nh]hhF]ubj)r}}r~(hKUhLjZhMhPhRjhT}r(jjXpyhV]hW]hX]hY]hZ]jX attributerjjuh\Nh]hhF]r(j!)r}r(hKXsettingsrhLj}hMhPhRj$hT}r(hV]rh1aj'j(hW]hX]hY]hZ]rh1aj*XCrawler.settingsj,j+j-uh\K.h]hhF]rjA)r}r(hKjhLjhMhPhRjDhT}r(hX]hY]hW]hV]hZ]uh\K.h]hhF]rhqXsettingsrr}r(hKUhLjubaubaubjY)r}r(hKUhLj}hMhPhRj\hT}r(hX]hY]hW]hV]hZ]uh\K.h]hhF]r(h)r}r(hKX%The settings manager of this crawler.rhLjhMhPhRhhT}r(hX]hY]hW]hV]hZ]uh\K&h]hhF]rhqX%The settings manager of this crawler.rr}r(hKjhLjubaubh)r}r(hKXWThis is used by extensions & 
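For instance, a project could enable extensions from its settings.py as in the sketch below; the extension paths and order values are illustrative, not a recommended default:

    # Illustrative EXTENSIONS setting in a project's settings.py:
    # keys are extension import paths, integer values control their order.
    EXTENSIONS = {
        'scrapy.contrib.corestats.CoreStats': 0,
        'scrapy.telnet.TelnetConsole': 10,
        'myproject.extensions.MyCustomExtension': 500,  # hypothetical
    }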
class scrapy.crawler.Crawler(settings)

  The Crawler object must be instantiated with a scrapy.settings.Settings object.

  settings

    The settings manager of this crawler.

    This is used by extensions & middlewares to access the Scrapy settings of this crawler.

    For an introduction on Scrapy settings see the Settings topic. For the API see the Settings class below.

  signals

    The signals manager of this crawler.

    This is used by extensions & middlewares to hook themselves into Scrapy functionality.

    For an introduction on signals see the Signals topic. For the API see the SignalManager class below.

  stats

    The stats collector of this crawler.

    This is used from extensions & middlewares to record stats of their behaviour, or access stats collected by other extensions.

    For an introduction on stats collection see the Stats Collection topic. For the API see the StatsCollector class below.

  extensions

    The extension manager that keeps track of enabled extensions.

    Most extensions won't need to access this attribute.

    For an introduction on extensions and a list of available extensions on Scrapy see the Extensions topic.

  spiders

    The spider manager which takes care of loading and instantiating spiders.

    Most extensions won't need to access this attribute.

  engine

    The execution engine, which coordinates the core crawling logic between the scheduler, downloader and spiders.

    Some extensions may want to access the Scrapy engine, to inspect or modify the downloader and scheduler behaviour, although this is an advanced use and this API is not yet stable.

  configure()

    Configure the crawler.

    This loads extensions, middlewares and spiders, leaving the crawler ready to be started. It also configures the execution engine.

  start()

    Start the crawler. This calls configure() if it hasn't been called yet. Returns a deferred that is fired when the crawl is finished.
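To sketch how an extension consumes this object: under the conventions of this release an extension can expose a from_crawler class method that receives the Crawler and reads the attributes above. This is a minimal sketch; the class name and the MYEXT_INTERVAL setting are hypothetical:

    # Minimal extension sketch (class and setting names are hypothetical).
    from scrapy import signals

    class MyExtension(object):

        def __init__(self, crawler):
            # Read configuration through the crawler's settings manager...
            self.interval = crawler.settings.getint('MYEXT_INTERVAL', 60)
            # ...hook into Scrapy through the signals manager...
            crawler.signals.connect(self.spider_opened, signals.spider_opened)
            # ...and keep the stats collector for later use.
            self.stats = crawler.stats

        @classmethod
        def from_crawler(cls, crawler):
            return cls(crawler)

        def spider_opened(self, spider):
            self.stats.inc_value('myext/spiders_opened')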
Settings API

class scrapy.settings.Settings

  This object provides access to Scrapy settings.

  overrides

    Global overrides are the ones that take most precedence, and are usually populated by command-line options.

    Overrides should be populated before configuring the Crawler object (through the configure() method), otherwise they won't have any effect. You don't typically need to worry about overrides unless you are implementing your own Scrapy command.
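As a sketch of that ordering constraint, assuming a custom command that holds a not-yet-configured crawler (overrides is used here as a plain mapping, the way Scrapy's own commands populate it):

    # Populate overrides before configure(), or they have no effect.
    crawler.settings.overrides['LOG_LEVEL'] = 'DEBUG'
    crawler.settings.overrides['DOWNLOAD_DELAY'] = 2.0
    crawler.configure()  # overrides assigned after this point are ignored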
  get(name, default=None)

    Get a setting value without affecting its original type.

    Parameters:
    - name (string): the setting name
    - default (any): the value to return if no setting is found

  getbool(name, default=False)

    Get a setting value as a boolean. For example, both 1 and '1', and True return True, while 0, '0', False and None return False.

    For example, settings populated through environment variables set to '0' will return False when using this method.

    Parameters:
    - name (string): the setting name
    - default (any): the value to return if no setting is found

  getint(name, default=0)

    Get a setting value as an int.

    Parameters:
    - name (string): the setting name
    - default (any): the value to return if no setting is found

  getfloat(name, default=0.0)

    Get a setting value as a float.

    Parameters:
    - name (string): the setting name
    - default (any): the value to return if no setting is found

  getlist(name, default=None)

    Get a setting value as a list. If the setting's original type is a list it will be returned verbatim. If it's a string it will be split by ",".

    For example, settings populated through environment variables set to 'one,two' will return a list ['one', 'two'] when using this method.

    Parameters:
    - name (string): the setting name
    - default (any): the value to return if no setting is found
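A short usage sketch of these typed accessors, assuming a Settings instance named settings (MYEXT_ENABLED and MYEXT_ALLOWED_CODES are hypothetical setting names):

    # Typed access to settings; values coming from the environment or the
    # command line arrive as strings and are coerced by the get* helpers.
    settings.get('BOT_NAME')                 # original type, no coercion
    settings.getbool('MYEXT_ENABLED')        # '0' -> False, '1' -> True
    settings.getint('CONCURRENT_REQUESTS', 16)
    settings.getfloat('DOWNLOAD_DELAY')
    settings.getlist('MYEXT_ALLOWED_CODES')  # 'one,two' -> ['one', 'two']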
Signals API

class scrapy.signalmanager.SignalManager

  connect(receiver, signal)

    Connect a receiver function to a signal.

    The signal can be any object, although Scrapy comes with some predefined signals that are documented in the Signals topic.

    Parameters:
    - receiver (callable): the function to be connected
    - signal (object): the signal to connect to

  send_catch_log(signal, **kwargs)

    Send a signal, catch exceptions and log them.

    The keyword arguments are passed to the signal handlers (connected through the connect() method).

  send_catch_log_deferred(signal, **kwargs)

    Like send_catch_log() but supports returning deferreds (http://twistedmatrix.com/documents/current/core/howto/defer.html) from signal handlers.

    Returns a deferred that gets fired once all signal handlers' deferreds were fired. The keyword arguments are passed to the signal handlers (connected through the connect() method).

  disconnect(receiver, signal)

    Disconnect a receiver function from a signal. This has the opposite effect of the connect() method, and the arguments are the same.

  disconnect_all(signal)

    Disconnect all receivers from the given signal.

    Parameters:
    - signal (object): the signal to disconnect from
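A minimal sketch of this API in isolation; the custom signal object and handler below are illustrative (in practice you would usually connect to Scrapy's predefined signals through crawler.signals):

    # Any object can act as a signal; handlers receive the kwargs by name.
    from scrapy.signalmanager import SignalManager

    item_archived = object()  # an illustrative custom signal

    def on_item_archived(item, spider):
        print("archived %r from spider %r" % (item, spider))

    signals = SignalManager()
    signals.connect(on_item_archived, signal=item_archived)
    # Handler exceptions are caught and logged rather than propagated:
    signals.send_catch_log(signal=item_archived, item={'id': 1}, spider='example')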
Stats Collector API

There are several Stats Collectors available under the scrapy.statscol module and they all implement the Stats Collector API defined by the StatsCollector class (which they all inherit from).

class scrapy.statscol.StatsCollector

  get_value(key, default=None)

    Return the value for the given stats key or default if it doesn't exist.

  get_stats()

    Get all stats from the currently running spider as a dict.

  set_value(key, value)

    Set the given value for the given stats key.

  set_stats(stats)

    Override the current stats with the dict passed in the stats argument.

  inc_value(key, count=1, start=0)

    Increment the value of the given stats key, by the given count, assuming the start value given (when it's not set).

  max_value(key, value)

    Set the given value for the given key only if the current value for the same key is lower than value. If there is no current value for the given key, the value is always set.

  min_value(key, value)

    Set the given value for the given key only if the current value for the same key is greater than value. If there is no current value for the given key, the value is always set.

  clear_stats()

    Clear all stats.

The following methods are not part of the stats collection api but instead used when implementing custom stats collectors:

  open_spider(spider)

    Open the given spider for stats collection.

  close_spider(spider)

    Close the given spider. After this is called, no more specific stats can be accessed or collected.
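A sketch of the collector semantics described above, assuming crawler is the Crawler object available to an extension (the myext/* key names are illustrative):

    # Recording and reading stats through the crawler's stats collector.
    stats = crawler.stats
    stats.set_value('myext/start_reason', 'scheduled')
    stats.inc_value('myext/pages_seen')           # missing key: 0 -> 1
    stats.inc_value('myext/pages_seen', count=4)  # 1 -> 5
    stats.max_value('myext/max_depth', 3)         # no current value: set to 3
    stats.max_value('myext/max_depth', 2)         # 2 < 3: kept at 3
    assert stats.get_value('myext/pages_seen') == 5
    assert 'myext/max_depth' in stats.get_stats()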
Architecture overview — Scrapy 0.22.0 documentation

Architecture overview

This document describes the architecture of Scrapy and how its components interact.

Overview

The following diagram shows an overview of the Scrapy architecture with its components and an outline of the data flow that takes place inside the system (shown by the green arrows). A brief description of the components is included below with links for more detailed information about them. The data flow is also described below.

[image: Scrapy architecture diagram (_images/scrapy_architecture.png)]

Components

Scrapy Engine

The engine is responsible for controlling the data flow between all components of the system, and for triggering events when certain actions occur. See the Data Flow section below for more details.

Scheduler

The Scheduler receives requests from the engine and enqueues them for feeding them later (also to the engine) when the engine requests them.

Downloader

The Downloader is responsible for fetching web pages and feeding them to the engine which, in turn, feeds them to the spiders.

Spiders

Spiders are custom classes written by Scrapy users to parse responses and extract items (aka scraped items) from them, or additional URLs (requests) to follow. Each spider is able to handle a specific domain (or group of domains). For more information see Spiders.

Item Pipeline

The Item Pipeline is responsible for processing the items once they have been extracted (or scraped) by the spiders. Typical tasks include cleansing, validation and persistence (like storing the item in a database). For more information see Item Pipeline.

Downloader middlewares

Downloader middlewares are specific hooks that sit between the Engine and the Downloader and process requests when they pass from the Engine to the Downloader, and responses that pass from the Downloader to the Engine. They provide a convenient mechanism for extending Scrapy functionality by plugging in custom code. For more information see Downloader Middleware.

Spider middlewares

Spider middlewares are specific hooks that sit between the Engine and the Spiders and are able to process spider input (responses) and output (items and requests). They provide a convenient mechanism for extending Scrapy functionality by plugging in custom code. For more information see Spider Middleware.
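
To make these roles concrete, here is a minimal spider sketch (the class names and URL are illustrative, and the code assumes the 0.22-era Python 2 API): it turns a response into items for the Item Pipeline and into new requests for the Scheduler:

    import urlparse

    from scrapy.http import Request
    from scrapy.item import Item, Field
    from scrapy.selector import Selector
    from scrapy.spider import Spider

    class PageItem(Item):
        url = Field()

    class ExampleSpider(Spider):
        name = "example"  # hypothetical spider name
        start_urls = ["http://www.example.com/"]  # hypothetical URL

        def parse(self, response):
            # Scraped data goes back to the engine as items, which the
            # engine forwards to the item pipeline.
            yield PageItem(url=response.url)
            # New URLs go back as Requests, which the engine hands to
            # the scheduler for later downloading.
            for href in Selector(response).xpath("//a/@href").extract():
                yield Request(urlparse.urljoin(response.url, href),
                              callback=self.parse)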

Data flow

The data flow in Scrapy is controlled by the execution engine, and goes like this:

1. The Engine opens a domain, locates the Spider that handles that domain, and asks the spider for the first URLs to crawl.
2. The Engine gets the first URLs to crawl from the Spider and schedules them in the Scheduler, as Requests.
3. The Engine asks the Scheduler for the next URLs to crawl.
4. The Scheduler returns the next URLs to crawl to the Engine and the Engine sends them to the Downloader, passing through the Downloader Middleware (request direction).
5. Once the page finishes downloading, the Downloader generates a Response (with that page) and sends it to the Engine, passing through the Downloader Middleware (response direction).
6. The Engine receives the Response from the Downloader and sends it to the Spider for processing, passing through the Spider Middleware (input direction).
7. The Spider processes the Response and returns scraped Items and new Requests (to follow) to the Engine.
8. The Engine sends scraped Items (returned by the Spider) to the Item Pipeline and Requests (returned by the spider) to the Scheduler.
9. The process repeats (from step 2) until there are no more requests from the Scheduler, and the Engine closes the domain.

Event-driven networking

Scrapy is written with Twisted (http://twistedmatrix.com/trac/), a popular event-driven networking framework for Python. Thus, it is implemented using non-blocking (aka asynchronous) code for concurrency.

For more information about asynchronous programming and Twisted see these links:

- Asynchronous Programming with Twisted (http://twistedmatrix.com/projects/core/documentation/howto/async.html)
- Twisted - hello, asynchronous programming (http://jessenoller.com/2009/02/11/twisted-hello-asynchronous-programming/)
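
As a rough illustration of the non-blocking style (plain Twisted, independent of Scrapy; the URL is a placeholder and getPage is Twisted's classic HTTP helper), the download below is started and control returns immediately; a callback fires once the body arrives:

    from twisted.internet import reactor
    from twisted.web.client import getPage

    def on_body(body):
        print("downloaded %d bytes" % len(body))
        reactor.stop()

    # getPage() returns a Deferred right away; nothing blocks while
    # the download is in progress.
    d = getPage("http://example.com/")
    d.addCallback(on_body)
    d.addErrback(lambda failure: reactor.stop())  # minimal error handling
    reactor.run()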

Signals — Scrapy 0.22.0 documentation

Signals

Scrapy uses signals extensively to notify when certain events occur. You can catch some of those signals in your Scrapy project (using an extension, for example) to perform additional tasks or extend Scrapy to add functionality not provided out of the box.

Even though signals provide several arguments, the handlers that catch them don't need to accept all of them: the signal dispatching mechanism will only deliver the arguments that the handler receives.

You can connect to signals (or send your own) through the Signals API.

Deferred signal handlers

Some signals support returning Twisted deferreds (http://twistedmatrix.com/documents/current/core/howto/defer.html) from their handlers; see the Built-in signals reference below to know which ones.
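
For example, a spider can connect one of its own methods to a signal through the dispatcher bundled with Scrapy (a minimal sketch; the spider and handler names are hypothetical). Although spider_closed is sent with both spider and reason arguments, the handler below declares only spider, so that is all it receives:

    from scrapy import signals
    from scrapy.spider import Spider
    from scrapy.xlib.pydispatch import dispatcher

    class SignalsDemoSpider(Spider):
        name = "signals_demo"  # hypothetical spider name

        def __init__(self, *args, **kwargs):
            super(SignalsDemoSpider, self).__init__(*args, **kwargs)
            # Run self.closed_handler whenever spider_closed is sent.
            dispatcher.connect(self.closed_handler, signals.spider_closed)

        def closed_handler(self, spider):
            self.log("spider %s closed" % spider.name)

Extensions typically do the same through crawler.signals.connect(), as in the sketches further below.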

Built-in signals reference

Here's the list of Scrapy built-in signals and their meaning.

engine_started

scrapy.signals.engine_started()

Sent when the Scrapy engine has started crawling.

This signal supports returning deferreds from their handlers.

Note: this signal may be fired after the spider_opened signal, depending on how the spider was started. So don't rely on this signal getting fired before spider_opened.

engine_stopped

scrapy.signals.engine_stopped()

Sent when the Scrapy engine is stopped (for example, when a crawling process has finished).

This signal supports returning deferreds from their handlers.

item_scraped

scrapy.signals.item_scraped(item, response, spider)

Sent when an item has been scraped, after it has passed all the Item Pipeline stages (without being dropped).

This signal supports returning deferreds from their handlers.

Parameters:
- item (Item object) – the item scraped
- response (Response object) – the response from where the item was scraped
- spider (Spider object) – the spider which scraped the item

item_dropped

scrapy.signals.item_dropped(item, spider, exception)

Sent after an item has been dropped from the Item Pipeline when some stage raised a DropItem exception.

This signal supports returning deferreds from their handlers.

Parameters:
- item (Item object) – the item dropped from the Item Pipeline
- spider (Spider object) – the spider which scraped the item
- exception (DropItem exception) – the exception (which must be a DropItem subclass) which caused the item to be dropped
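
These two signals are often used together for accounting. As an illustration, here is a small extension sketch (the class name is hypothetical, and it is assumed to be enabled through the EXTENSIONS setting) that tallies both; note that the item_scraped handler declares only the arguments it needs, as described above:

    from scrapy import signals

    class ItemCountExtension(object):
        """Hypothetical extension that tallies scraped vs. dropped items."""

        def __init__(self):
            self.scraped = 0
            self.dropped = 0

        @classmethod
        def from_crawler(cls, crawler):
            ext = cls()
            crawler.signals.connect(ext.item_scraped, signal=signals.item_scraped)
            crawler.signals.connect(ext.item_dropped, signal=signals.item_dropped)
            return ext

        def item_scraped(self, item, spider):
            # sent with (item, response, spider); `response` is omitted here
            self.scraped += 1

        def item_dropped(self, item, spider, exception):
            self.dropped += 1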

spider_closed

scrapy.signals.spider_closed(spider, reason)

Sent after a spider has been closed. This can be used to release per-spider resources reserved on spider_opened.

This signal supports returning deferreds from their handlers.

Parameters:
- spider (Spider object) – the spider which has been closed
- reason (str) – a string which describes the reason why the spider was closed. If it was closed because the spider has completed scraping, the reason is 'finished'. Otherwise, if the spider was manually closed by calling the close_spider engine method, then the reason is the one passed in the reason argument of that method (which defaults to 'cancelled'). If the engine was shut down (for example, by hitting Ctrl-C to stop it) the reason will be 'shutdown'.

spider_opened

scrapy.signals.spider_opened(spider)

Sent after a spider has been opened for crawling. This is typically used to reserve per-spider resources, but can be used for any task that needs to be performed when a spider is opened.

This signal supports returning deferreds from their handlers.

Parameters:
- spider (Spider object) – the spider which has been opened
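
spider_opened and spider_closed are often paired for exactly this kind of per-spider resource management. A sketch (hypothetical extension that keeps one log file per running spider, assumed to be enabled through the EXTENSIONS setting):

    from scrapy import signals

    class PerSpiderFiles(object):
        """Hypothetical extension: one open file per running spider."""

        def __init__(self):
            self.files = {}

        @classmethod
        def from_crawler(cls, crawler):
            ext = cls()
            crawler.signals.connect(ext.spider_opened, signal=signals.spider_opened)
            crawler.signals.connect(ext.spider_closed, signal=signals.spider_closed)
            return ext

        def spider_opened(self, spider):
            # reserve the per-spider resource when the spider opens...
            self.files[spider.name] = open("%s.log" % spider.name, "w")

        def spider_closed(self, spider, reason):
            # ...and release it when the spider closes; `reason` is
            # 'finished', 'cancelled' or 'shutdown', as described above
            f = self.files.pop(spider.name)
            f.write("closed: %s\n" % reason)
            f.close()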

spider_idle

scrapy.signals.spider_idle(spider)

Sent when a spider has gone idle, which means the spider has no further:

- requests waiting to be downloaded
- requests scheduled
- items being processed in the item pipeline

If the idle state persists after all handlers of this signal have finished, the engine starts closing the spider. After the spider has finished closing, the spider_closed signal is sent.

You can, for example, schedule some requests in your spider_idle handler to prevent the spider from being closed.

This signal does not support returning deferreds from their handlers.

Parameters:
- spider (Spider object) – the spider which has gone idle
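
A common idiom for this, sketched below under the assumption of the 0.22-era crawler API, is to feed the engine another request from the handler and raise scrapy.exceptions.DontCloseSpider (the extension name and the pending_urls attribute are hypothetical):

    from scrapy import signals
    from scrapy.exceptions import DontCloseSpider
    from scrapy.http import Request

    class KeepAlive(object):
        """Hypothetical extension: re-feeds the engine when a spider idles."""

        def __init__(self, crawler):
            self.crawler = crawler
            crawler.signals.connect(self.spider_idle, signal=signals.spider_idle)

        @classmethod
        def from_crawler(cls, crawler):
            return cls(crawler)

        def spider_idle(self, spider):
            pending = getattr(spider, "pending_urls", [])  # hypothetical attribute
            if pending:
                self.crawler.engine.crawl(Request(pending.pop()), spider)
                # tells the engine not to close the idle spider
                raise DontCloseSpider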

spider_error

scrapy.signals.spider_error(failure, response, spider)

Sent when a spider callback generates an error (i.e. raises an exception).

Parameters:
- failure (Failure object) – the exception raised as a Twisted Failure object (http://twistedmatrix.com/documents/current/api/twisted.python.failure.Failure.html)
- response (Response object) – the response being processed when the exception was raised
- spider (Spider object) – the spider which raised the exception
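
A minimal sketch of a spider_error handler (hypothetical function name; in a real project you would typically connect it from an extension's from_crawler method rather than at module level):

    from scrapy import signals
    from scrapy.xlib.pydispatch import dispatcher

    def log_spider_error(failure, response, spider):
        # `failure` wraps the original exception; getErrorMessage() is a
        # standard twisted.python.failure.Failure method
        spider.log("callback error on %s: %s"
                   % (response.url, failure.getErrorMessage()))

    dispatcher.connect(log_spider_error, signals.spider_error)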

response_received

scrapy.signals.response_received(response, request, spider)

Sent when the engine receives a new Response from the downloader.

This signal does not support returning deferreds from their handlers.

Parameters:
- response (Response object) – the response received
- request (Request object) – the request that generated the response
- spider (Spider object) – the spider for which the response is intended

response_downloaded

scrapy.signals.response_downloaded(response, request, spider)

Sent by the downloader right after an HTTPResponse is downloaded.

This signal does not support returning deferreds from their handlers.

Parameters:
- response (Response object) – the response downloaded
- request (Request object) – the request that generated the response
- spider (Spider object) – the spider for which the response is intended
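
Both response signals are handy for lightweight monitoring of the download side. A minimal sketch (hypothetical handler name, assuming the 0.22-era dispatcher API) that logs every response the engine receives:

    from scrapy import signals
    from scrapy.xlib.pydispatch import dispatcher

    def log_response(response, request, spider):
        spider.log("got %s (HTTP %d) for %s"
                   % (response.url, response.status, request.url))

    dispatcher.connect(log_response, signals.response_received)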

scrapy-0.22/intro/examples.html Examples — Scrapy 0.22.0 documentation

              Examples

The best way to learn is with examples, and Scrapy is no exception. For this reason, there is an example Scrapy project named dirbot that you can use to play with and learn more about Scrapy. It contains the dmoz spider described in the tutorial.

The dirbot project is available at: https://github.com/scrapy/dirbot

              It contains a README file with a detailed description of the project contents.

If you’re familiar with git, you can check out the code. Otherwise you can download a tarball or zip file of the project by clicking on Downloads.

              The scrapy tag on Snipplr is used for sharing code snippets such as spiders, middlewares, extensions, or scripts. Feel free (and encouraged!) to share any code there.


              Scrapy at a glance

              Scrapy is an application framework for crawling web sites and extracting structured data which can be used for a wide range of useful applications, like data mining, information processing or historical archival.

              Even though Scrapy was originally designed for screen scraping (more precisely, web scraping), it can also be used to extract data using APIs (such as Amazon Associates Web Services) or as a general purpose web crawler.

              The purpose of this document is to introduce you to the concepts behind Scrapy so you can get an idea of how it works and decide if Scrapy is what you need.

              When you’re ready to start a project, you can start with the tutorial.

              Pick a website

              So you need to extract some information from a website, but the website doesn’t provide any API or mechanism to access that info programmatically. Scrapy can help you extract that information.

Let’s say we want to extract the URL, name, description and size of all torrent files added today on the Mininova site.

The list of all torrents added today can be found on this page: http://www.mininova.org/today

              Define the data you want to scrape

              The first thing is to define the data we want to scrape. In Scrapy, this is done through Scrapy Items (Torrent files, in this case).

              This would be our Item:

              from scrapy.item import Item, Field
              
              class TorrentItem(Item):
                  url = Field()
                  name = Field()
                  description = Field()
                  size = Field()
              

              Write a Spider to extract the data

              The next thing is to write a Spider which defines the start URL (http://www.mininova.org/today), the rules for following links and the rules for extracting the data from pages.

              If we take a look at that page content we’ll see that all torrent URLs are like http://www.mininova.org/tor/NUMBER where NUMBER is an integer. We’ll use that to construct the regular expression for the links to follow: /tor/\d+.

We’ll use XPath for selecting the data to extract from the web page HTML source. Let’s take one of those torrent pages: http://www.mininova.org/tor/2676093

              And look at the page HTML source to construct the XPath to select the data we want which is: torrent name, description and size.

              By looking at the page HTML source we can see that the file name is contained inside a <h1> tag:

              <h1>Darwin - The Evolution Of An Exhibition</h1>
              

              An XPath expression to extract the name could be:

              //h1/text()
              

              And the description is contained inside a <div> tag with id="description":

              <h2>Description:</h2>
              
              <div id="description">
              Short documentary made for Plymouth City Museum and Art Gallery regarding the setup of an exhibit about Charles Darwin in conjunction with the 200th anniversary of his birth.
              
              ...
              

              An XPath expression to select the description could be:

              //div[@id='description']
              

Finally, the file size is contained in the second <p> tag inside the <div> tag with id="specifications":

              <div id="specifications">
              
              <p>
              <strong>Category:</strong>
              <a href="/cat/4">Movies</a> &gt; <a href="/sub/35">Documentary</a>
              </p>
              
              <p>
              <strong>Total size:</strong>
              150.62&nbsp;megabyte</p>
              

              An XPath expression to select the file size could be:

              //div[@id='specifications']/p[2]/text()[2]
              

              For more information about XPath see the XPath reference.

              Finally, here’s the spider code:

              from scrapy.contrib.spiders import CrawlSpider, Rule
              from scrapy.contrib.linkextractors.sgml import SgmlLinkExtractor
              from scrapy.selector import Selector
              
              class MininovaSpider(CrawlSpider):
              
                  name = 'mininova'
                  allowed_domains = ['mininova.org']
                  start_urls = ['http://www.mininova.org/today']
                  rules = [Rule(SgmlLinkExtractor(allow=['/tor/\d+']), 'parse_torrent')]
              
                  def parse_torrent(self, response):
                      sel = Selector(response)
                      torrent = TorrentItem()
                      torrent['url'] = response.url
                      torrent['name'] = sel.xpath("//h1/text()").extract()
                      torrent['description'] = sel.xpath("//div[@id='description']").extract()
        torrent['size'] = sel.xpath("//div[@id='specifications']/p[2]/text()[2]").extract()
                      return torrent
              

              The TorrentItem class is defined above.

              Run the spider to extract the data

Finally, we’ll run the spider to crawl the site and output a file scraped_data.json with the scraped data in JSON format:

              scrapy crawl mininova -o scraped_data.json -t json

              This uses feed exports to generate the JSON file. You can easily change the export format (XML or CSV, for example) or the storage backend (FTP or Amazon S3, for example).

              You can also write an item pipeline to store the items in a database very easily.
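For example, a pipeline along the following lines could store each item as it is scraped. This is only a sketch under assumed names (TorrentSqlitePipeline, the torrents.db file); it uses Python’s built-in sqlite3 module for illustration, and would still need to be enabled through the ITEM_PIPELINES setting:

    import sqlite3

    class TorrentSqlitePipeline(object):

        def open_spider(self, spider):
            # Called when the spider is opened: set up the database.
            self.conn = sqlite3.connect('torrents.db')
            self.conn.execute(
                'CREATE TABLE IF NOT EXISTS torrent (url TEXT, name TEXT)')

        def close_spider(self, spider):
            # Called when the spider is closed: persist and clean up.
            self.conn.commit()
            self.conn.close()

        def process_item(self, item, spider):
            # Selector results are lists, so join them before storing.
            self.conn.execute('INSERT INTO torrent VALUES (?, ?)',
                              (item['url'], u''.join(item['name'])))
            return item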

              Review scraped data

              If you check the scraped_data.json file after the process finishes, you’ll see the scraped items there:

              [{"url": "http://www.mininova.org/tor/2676093", "name": ["Darwin - The Evolution Of An Exhibition"], "description": ["Short documentary made for Plymouth ..."], "size": ["150.62 megabyte"]},
              # ... other items ...
              ]
              

You’ll notice that all field values (except for the url which was assigned directly) are actually lists. This is because the selectors return lists. You may want to store single values, or perform some additional parsing/cleansing on the values. That’s what Item Loaders are for.
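As a taste of what that looks like, here is a sketch of how the spider callback above could use an Item Loader to keep only the first value extracted for each field. This is only an illustration: the TorrentLoader name is hypothetical, and the import paths follow the scrapy.contrib layout used by this Scrapy version.

    from scrapy.contrib.loader import ItemLoader
    from scrapy.contrib.loader.processor import TakeFirst

    class TorrentLoader(ItemLoader):
        # Keep the first extracted value instead of the whole list.
        default_output_processor = TakeFirst()

    def parse_torrent(self, response):
        l = TorrentLoader(item=TorrentItem(), response=response)
        l.add_value('url', response.url)
        l.add_xpath('name', "//h1/text()")
        l.add_xpath('description', "//div[@id='description']")
        l.add_xpath('size', "//div[@id='specifications']/p[2]/text()[2]")
        return l.load_item()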

              What else?

              You’ve seen how to extract and store items from a website using Scrapy, but this is just the surface. Scrapy provides a lot of powerful features for making scraping easy and efficient, such as:

              • Built-in support for selecting and extracting data from HTML and XML sources
              • Built-in support for cleaning and sanitizing the scraped data using a collection of reusable filters (called Item Loaders) shared between all the spiders.
              • Built-in support for generating feed exports in multiple formats (JSON, CSV, XML) and storing them in multiple backends (FTP, S3, local filesystem)
              • A media pipeline for automatically downloading images (or any other media) associated with the scraped items
              • Support for extending Scrapy by plugging your own functionality using signals and a well-defined API (middlewares, extensions, and pipelines).
              • Wide range of built-in middlewares and extensions for:
                • cookies and session handling
                • HTTP compression
                • HTTP authentication
                • HTTP cache
                • user-agent spoofing
                • robots.txt
                • crawl depth restriction
                • and more
              • Robust encoding support and auto-detection, for dealing with foreign, non-standard and broken encoding declarations.
              • Support for creating spiders based on pre-defined templates, to speed up spider creation and make their code more consistent on large projects. See genspider command for more details.
              • Extensible stats collection for multiple spider metrics, useful for monitoring the performance of your spiders and detecting when they get broken
              • An Interactive shell console for trying XPaths, very useful for writing and debugging your spiders
• A System service designed to ease deploying and running your spiders in production.
              • A built-in Web service for monitoring and controlling your bot
              • A Telnet console for hooking into a Python console running inside your Scrapy process, to introspect and debug your crawler
              • Logging facility that you can hook on to for catching errors during the scraping process.
              • Support for crawling based on URLs discovered through Sitemaps
              • A caching DNS resolver

              What’s next?

              The next obvious steps are for you to download Scrapy, read the tutorial and join the community. Thanks for your interest!


              Installation guide

              Pre-requisites

              The installation steps assume that you have the following things installed:

              Installing Scrapy

              You can install Scrapy using easy_install or pip (which is the canonical way to distribute and install Python packages).

              To install using pip:

              pip install Scrapy

              To install using easy_install:

              easy_install Scrapy

              Platform specific installation notes

              Windows

              After installing Python, follow these steps before installing Scrapy:

              • add the C:\python27\Scripts and C:\python27 folders to the system path by adding those directories to the PATH environment variable from the Control Panel.
              • install OpenSSL by following these steps:
                1. go to Win32 OpenSSL page
                2. download Visual C++ 2008 redistributables for your Windows and architecture
                3. download OpenSSL for your Windows and architecture (the regular version, not the light one)
  4. add the c:\openssl-win32\bin (or similar) directory to your PATH, the same way you added python27 in the first step
• some binary packages that Scrapy depends on (like Twisted, lxml and pyOpenSSL) require a compiler to be available at install time, and fail if you don’t have Visual Studio installed. You can find Windows installers for those in the following links. Make sure you respect your Python version and Windows architecture.

Finally, this page contains many precompiled Python binary libraries, which may come in handy to fulfill Scrapy dependencies.

              Ubuntu 9.10 or above

Don’t use the python-scrapy package provided by Ubuntu; it is typically too old and slow to catch up with the latest Scrapy release.

              Instead, use the official Ubuntu Packages, which already solve all dependencies for you and are continuously updated with the latest bug fixes.


              Scrapy Tutorial

              In this tutorial, we’ll assume that Scrapy is already installed on your system. If that’s not the case, see Installation guide.

              We are going to use Open directory project (dmoz) as our example domain to scrape.

              This tutorial will walk you through these tasks:

              1. Creating a new Scrapy project
              2. Defining the Items you will extract
              3. Writing a spider to crawl a site and extract Items
              4. Writing an Item Pipeline to store the extracted Items

              Scrapy is written in Python. If you’re new to the language you might want to start by getting an idea of what the language is like, to get the most out of Scrapy. If you’re already familiar with other languages, and want to learn Python quickly, we recommend Learn Python The Hard Way. If you’re new to programming and want to start with Python, take a look at this list of Python resources for non-programmers.

              Creating a project

Before you start scraping, you will have to set up a new Scrapy project. Enter a directory where you’d like to store your code and then run:

              scrapy startproject tutorial

              This will create a tutorial directory with the following contents:

              tutorial/
                  scrapy.cfg
                  tutorial/
                      __init__.py
                      items.py
                      pipelines.py
                      settings.py
                      spiders/
                          __init__.py
                          ...

              These are basically:

              • scrapy.cfg: the project configuration file
• tutorial/: the project’s python module; you’ll later import your code from here.
              • tutorial/items.py: the project’s items file.
              • tutorial/pipelines.py: the project’s pipelines file.
              • tutorial/settings.py: the project’s settings file.
              • tutorial/spiders/: a directory where you’ll later put your spiders.

              Defining our Item

Items are containers that will be loaded with the scraped data; they work like simple python dicts but provide additional protection against populating undeclared fields, to prevent typos.

They are declared by creating a scrapy.item.Item subclass and defining its attributes as scrapy.item.Field objects, much like you would in an ORM (don’t worry if you’re not familiar with ORMs; you will see that this is an easy task).

We begin by modeling the item that we will use to hold the site data obtained from dmoz.org. As we want to capture the name, url and description of the sites, we define fields for each of these three attributes. To do that, we edit items.py, found in the tutorial directory. Our Item class looks like this:

              from scrapy.item import Item, Field
              
              class DmozItem(Item):
                  title = Field()
                  link = Field()
                  desc = Field()
              

This may seem complicated at first, but defining the item allows you to use other handy components of Scrapy that need to know what your item looks like.

              Our first Spider

              Spiders are user-written classes used to scrape information from a domain (or group of domains).

              They define an initial list of URLs to download, how to follow links, and how to parse the contents of those pages to extract items.

              To create a Spider, you must subclass scrapy.spider.Spider, and define the three main, mandatory, attributes:

              • name: identifies the Spider. It must be unique, that is, you can’t set the same name for different Spiders.

              • start_urls: is a list of URLs where the Spider will begin to crawl from. So, the first pages downloaded will be those listed here. The subsequent URLs will be generated successively from data contained in the start URLs.

              • parse() is a method of the spider, which will be called with the downloaded Response object of each start URL. The response is passed to the method as the first and only argument.

  This method is responsible for parsing the response data and extracting scraped data (as Item objects) and more URLs to follow (as Request objects).

              This is the code for our first Spider; save it in a file named dmoz_spider.py under the tutorial/spiders directory:

              from scrapy.spider import Spider
              
              class DmozSpider(Spider):
                  name = "dmoz"
                  allowed_domains = ["dmoz.org"]
                  start_urls = [
                      "http://www.dmoz.org/Computers/Programming/Languages/Python/Books/",
                      "http://www.dmoz.org/Computers/Programming/Languages/Python/Resources/"
                  ]
              
                  def parse(self, response):
                      filename = response.url.split("/")[-2]
                      open(filename, 'wb').write(response.body)
              

              Crawling

              To put our spider to work, go to the project’s top level directory and run:

              scrapy crawl dmoz

              The crawl dmoz command runs the spider for the dmoz.org domain. You will get an output similar to this:

              2008-08-20 03:51:13-0300 [scrapy] INFO: Started project: dmoz
              2008-08-20 03:51:13-0300 [tutorial] INFO: Enabled extensions: ...
              2008-08-20 03:51:13-0300 [tutorial] INFO: Enabled downloader middlewares: ...
              2008-08-20 03:51:13-0300 [tutorial] INFO: Enabled spider middlewares: ...
              2008-08-20 03:51:13-0300 [tutorial] INFO: Enabled item pipelines: ...
              2008-08-20 03:51:14-0300 [dmoz] INFO: Spider opened
              2008-08-20 03:51:14-0300 [dmoz] DEBUG: Crawled <http://www.dmoz.org/Computers/Programming/Languages/Python/Resources/> (referer: <None>)
              2008-08-20 03:51:14-0300 [dmoz] DEBUG: Crawled <http://www.dmoz.org/Computers/Programming/Languages/Python/Books/> (referer: <None>)
              2008-08-20 03:51:14-0300 [dmoz] INFO: Spider closed (finished)

Pay attention to the lines containing [dmoz], which correspond to our spider. You can see a log line for each URL defined in start_urls. Because these URLs are the starting ones, they have no referrers, which is shown at the end of the log line, where it says (referer: <None>).

More interestingly, as our parse method instructs, two files have been created: Books and Resources, with the content of both URLs.

              What just happened under the hood?

              Scrapy creates scrapy.http.Request objects for each URL in the start_urls attribute of the Spider, and assigns them the parse method of the spider as their callback function.

              These Requests are scheduled, then executed, and scrapy.http.Response objects are returned and then fed back to the spider, through the parse() method.
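The same mechanism works for URLs you discover while parsing: a callback can return more Request objects, each carrying its own callback. A minimal sketch (the URLs and the parse_page2 name are illustrative only):

    from scrapy.http import Request
    from scrapy.spider import Spider

    class FollowSpider(Spider):
        name = "follow"
        start_urls = ["http://www.example.com/index.html"]

        def parse(self, response):
            # Returning Requests schedules more downloads; each response
            # is fed back through the callback given here.
            return [Request("http://www.example.com/page2.html",
                            callback=self.parse_page2)]

        def parse_page2(self, response):
            self.log("Visited %s" % response.url)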

              Extracting Items

              Introduction to Selectors

              There are several ways to extract data from web pages. Scrapy uses a mechanism based on XPath or CSS expressions called Scrapy Selectors. For more information about selectors and other extraction mechanisms see the Selectors documentation.

              Here are some examples of XPath expressions and their meanings:

• /html/head/title: selects the <title> element, inside the <head> element of an HTML document
              • /html/head/title/text(): selects the text inside the aforementioned <title> element.
              • //td: selects all the <td> elements
              • //div[@class="mine"]: selects all div elements which contain an attribute class="mine"

              These are just a couple of simple examples of what you can do with XPath, but XPath expressions are indeed much more powerful. To learn more about XPath we recommend this XPath tutorial.

For working with XPaths, Scrapy provides a Selector class, which is instantiated with an HtmlResponse or XmlResponse object as its first argument.

You can see selectors as objects that represent nodes in the document structure. So, the first instantiated selectors are associated with the root node, or the entire document.

              Selectors have four basic methods (click on the method to see the complete API documentation).

              • xpath(): returns a list of selectors, each of them representing the nodes selected by the xpath expression given as argument.
• css(): returns a list of selectors, each of them representing the nodes selected by the CSS expression given as argument (a brief example follows this list).
              • extract(): returns a unicode string with the selected data.
              • re(): returns a list of unicode strings extracted by applying the regular expression given as argument.
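For instance, the xpath() and css() methods can express the same query. A small sketch, assuming sel wraps an HTML response (the ::text pseudo-element selects text nodes):

    sel.xpath('//title/text()')   # XPath form
    sel.css('title::text')        # equivalent CSS form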

              Trying Selectors in the Shell

To illustrate the use of Selectors we’re going to use the built-in Scrapy shell, which requires IPython (an extended Python console) to be installed on your system.

              To start a shell, you must go to the project’s top level directory and run:

              scrapy shell "http://www.dmoz.org/Computers/Programming/Languages/Python/Books/"

              Note

Remember to always enclose urls in quotes when running Scrapy shell from the command line; otherwise urls containing arguments (i.e. the & character) will not work.

              This is what the shell looks like:

              [ ... Scrapy log here ... ]
              
              [s] Available Scrapy objects:
              [s] 2010-08-19 21:45:59-0300 [default] INFO: Spider closed (finished)
              [s]   sel        <Selector (http://www.dmoz.org/Computers/Programming/Languages/Python/Books/) xpath=None>
              [s]   item       Item()
              [s]   request    <GET http://www.dmoz.org/Computers/Programming/Languages/Python/Books/>
              [s]   response   <200 http://www.dmoz.org/Computers/Programming/Languages/Python/Books/>
              [s]   spider     <Spider 'default' at 0x1b6c2d0>
              [s] Useful shortcuts:
              [s]   shelp()           Print this help
              [s]   fetch(req_or_url) Fetch a new request or URL and update shell objects
              [s]   view(response)    View response in a browser
              
              In [1]:

              After the shell loads, you will have the response fetched in a local response variable, so if you type response.body you will see the body of the response, or you can type response.headers to see its headers.

The shell also pre-instantiates a selector for this response in the variable sel; the selector automatically chooses the best parsing rules (XML vs. HTML) based on the response’s type.

              So let’s try it:

              In [1]: sel.xpath('//title')
              Out[1]: [<Selector (title) xpath=//title>]
              
              In [2]: sel.xpath('//title').extract()
              Out[2]: [u'<title>Open Directory - Computers: Programming: Languages: Python: Books</title>']
              
              In [3]: sel.xpath('//title/text()')
              Out[3]: [<Selector (text) xpath=//title/text()>]
              
              In [4]: sel.xpath('//title/text()').extract()
              Out[4]: [u'Open Directory - Computers: Programming: Languages: Python: Books']
              
              In [5]: sel.xpath('//title/text()').re('(\w+):')
              Out[5]: [u'Computers', u'Programming', u'Languages', u'Python']

              Extracting the data

              Now, let’s try to extract some real information from those pages.

You could type response.body in the console and inspect the source code to figure out the XPaths you need to use. However, inspecting raw HTML there quickly becomes tedious. To make this easier, you can use Firefox extensions like Firebug. For more information see Using Firebug for scraping and Using Firefox for scraping.

After inspecting the page source, you’ll find that the websites’ information is inside a <ul> element, in fact the second <ul> element.

              So we can select each <li> element belonging to the sites list with this code:

              sel.xpath('//ul/li')
              

              And from them, the sites descriptions:

              sel.xpath('//ul/li/text()').extract()
              

              The sites titles:

              sel.xpath('//ul/li/a/text()').extract()
              

              And the sites links:

              sel.xpath('//ul/li/a/@href').extract()
              

As we said before, each .xpath() call returns a list of selectors, so we can chain further .xpath() calls to dig deeper into a node. We are going to use that property here:

              sites = sel.xpath('//ul/li')
              for site in sites:
                  title = site.xpath('a/text()').extract()
                  link = site.xpath('a/@href').extract()
                  desc = site.xpath('text()').extract()
                  print title, link, desc
              

              Note

              For a more detailed description of using nested selectors, see Nesting selectors and Working with relative XPaths in the Selectors documentation

              Let’s add this code to our spider:

              from scrapy.spider import Spider
              from scrapy.selector import Selector
              
              class DmozSpider(Spider):
                  name = "dmoz"
                  allowed_domains = ["dmoz.org"]
                  start_urls = [
                      "http://www.dmoz.org/Computers/Programming/Languages/Python/Books/",
                      "http://www.dmoz.org/Computers/Programming/Languages/Python/Resources/"
                  ]
              
                  def parse(self, response):
                      sel = Selector(response)
                      sites = sel.xpath('//ul/li')
                      for site in sites:
                          title = site.xpath('a/text()').extract()
                          link = site.xpath('a/@href').extract()
                          desc = site.xpath('text()').extract()
                          print title, link, desc
              

Notice that we import our Selector class from scrapy.selector and instantiate a new Selector object. We can now specify our XPaths just as we did in the shell. To try crawling the dmoz.org domain again and see sites printed in your output, run:

              scrapy crawl dmoz

              Using our item

              Item objects are custom python dicts; you can access the values of their fields (attributes of the class we defined earlier) using the standard dict syntax like:

              >>> item = DmozItem()
              >>> item['title'] = 'Example title'
              >>> item['title']
              'Example title'
              

              Spiders are expected to return their scraped data inside Item objects. So, in order to return the data we’ve scraped so far, the final code for our Spider would be like this:

              from scrapy.spider import Spider
              from scrapy.selector import Selector
              
              from tutorial.items import DmozItem
              
              class DmozSpider(Spider):
                 name = "dmoz"
                 allowed_domains = ["dmoz.org"]
                 start_urls = [
                     "http://www.dmoz.org/Computers/Programming/Languages/Python/Books/",
                     "http://www.dmoz.org/Computers/Programming/Languages/Python/Resources/"
                 ]
              
                 def parse(self, response):
                     sel = Selector(response)
                     sites = sel.xpath('//ul/li')
                     items = []
                     for site in sites:
                         item = DmozItem()
                         item['title'] = site.xpath('a/text()').extract()
                         item['link'] = site.xpath('a/@href').extract()
                         item['desc'] = site.xpath('text()').extract()
                         items.append(item)
                     return items
              

              Note

              You can find a fully-functional variant of this spider in the dirbot project available at https://github.com/scrapy/dirbot

Now doing a crawl on the dmoz.org domain yields DmozItem objects:

              [dmoz] DEBUG: Scraped from <200 http://www.dmoz.org/Computers/Programming/Languages/Python/Books/>
     {'desc': [u' - By David Mertz; Addison Wesley. Book in progress, full text, ASCII format. Asks for feedback. [author website, Gnosis Software, Inc.]\n'],
                    'link': [u'http://gnosis.cx/TPiP/'],
                    'title': [u'Text Processing in Python']}
              [dmoz] DEBUG: Scraped from <200 http://www.dmoz.org/Computers/Programming/Languages/Python/Books/>
                   {'desc': [u' - By Sean McGrath; Prentice Hall PTR, 2000, ISBN 0130211192, has CD-ROM. Methods to build XML applications fast, Python tutorial, DOM and SAX, new Pyxie open source XML processing library. [Prentice Hall PTR]\n'],
                    'link': [u'http://www.informit.com/store/product.aspx?isbn=0130211192'],
                    'title': [u'XML Processing with Python']}

              Storing the scraped data

              The simplest way to store the scraped data is by using the Feed exports, with the following command:

              scrapy crawl dmoz -o items.json -t json

That will generate an items.json file containing all scraped items, serialized in JSON.

In small projects (like the one in this tutorial), that should be enough. However, if you want to perform more complex things with the scraped items, you can write an Item Pipeline. As with Items, a placeholder file for Item Pipelines has been set up for you when the project is created, in tutorial/pipelines.py, though you don’t need to implement any item pipeline if you just want to store the scraped items.
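If you do write one, a pipeline is simply a class with a process_item method. As a sketch, a hypothetical pipeline that discards items missing a title could look like this (DropItem is the standard exception for dropping items; the pipeline would be enabled by adding 'tutorial.pipelines.RequireTitlePipeline' to the ITEM_PIPELINES setting):

    from scrapy.exceptions import DropItem

    class RequireTitlePipeline(object):

        def process_item(self, item, spider):
            # The 'title' field is a list of extracted strings; drop
            # items where nothing was extracted.
            if not item.get('title'):
                raise DropItem("Missing title in %s" % item)
            return item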

              Next steps

              This tutorial covers only the basics of Scrapy, but there’s a lot of other features not mentioned here. Check the What else? section in Scrapy at a glance chapter for a quick overview of the most important ones.

              Then, we recommend you continue by playing with an example project (see Examples), and then continue with the section Basic concepts.


              Stats Collection

              Scrapy provides a convenient facility for collecting stats in the form of key/values, where values are often counters. The facility is called the Stats Collector, and can be accessed through the stats attribute of the Crawler API, as illustrated by the examples in the Common Stats Collector uses section below.

              However, the Stats Collector is always available, so you can always import it in your module and use its API (to increment or set new stat keys), regardless of whether the stats collection is enabled or not. If it’s disabled, the API will still work but it won’t collect anything. This is aimed at simplifying the stats collector usage: you should spend no more than one line of code for collecting stats in your spider, Scrapy extension, or whatever code you’re using the Stats Collector from.

Another feature of the Stats Collector is that it’s very efficient when enabled, and almost unnoticeable when disabled.

              The Stats Collector keeps a stats table per open spider which is automatically opened when the spider is opened, and closed when the spider is closed.

              Common Stats Collector uses

Access the stats collector through the stats attribute. Here is an example of an extension that accesses stats:

              class ExtensionThatAccessStats(object):
              
                  def __init__(self, stats):
                      self.stats = stats
              
                  @classmethod
                  def from_crawler(cls, crawler):
                      return cls(crawler.stats)
              

              Set stat value:

              stats.set_value('hostname', socket.gethostname())
              

              Increment stat value:

              stats.inc_value('pages_crawled')
              

              Set stat value only if greater than previous:

              stats.max_value('max_items_scraped', value)
              

              Set stat value only if lower than previous:

              stats.min_value('min_free_memory_percent', value)
              

              Get stat value:

              >>> stats.get_value('pages_crawled')
              8
              

              Get all stats:

              >>> stats.get_stats()
              {'pages_crawled': 1238, 'start_time': datetime.datetime(2009, 7, 14, 21, 47, 28, 977139)}
              

              Available Stats Collectors

              Besides the basic StatsCollector there are other Stats Collectors available in Scrapy which extend the basic Stats Collector. You can select which Stats Collector to use through the STATS_CLASS setting. The default Stats Collector used is the MemoryStatsCollector.

              MemoryStatsCollector

              class scrapy.statscol.MemoryStatsCollector

A simple stats collector that keeps the stats of the last scraping run (for each spider) in memory, after the spiders are closed. The stats can be accessed through the spider_stats attribute, which is a dict keyed by spider name.

              This is the default Stats Collector used in Scrapy.

              spider_stats

              A dict of dicts (keyed by spider name) containing the stats of the last scraping run for each spider.

              DummyStatsCollector

              class scrapy.statscol.DummyStatsCollector

A Stats collector which does nothing but is very efficient (because it does nothing). This stats collector can be set via the STATS_CLASS setting, to disable stats collection in order to improve performance. However, the performance penalty of stats collection is usually marginal compared to other Scrapy workloads, such as parsing pages.
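To disable stats collection this way, a project’s settings.py would contain a line like:

    STATS_CLASS = 'scrapy.statscol.DummyStatsCollector'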


              Command line tool

              New in version 0.10.

Scrapy is controlled through the scrapy command-line tool, referred to here as the “Scrapy tool” to differentiate it from its sub-commands, which we just call “commands” or “Scrapy commands”.

              The Scrapy tool provides several commands, for multiple purposes, and each one accepts a different set of arguments and options.

              Default structure of Scrapy projects

              Before delving into the command-line tool and its sub-commands, let’s first understand the directory structure of a Scrapy project.

Even though it can be modified, all Scrapy projects have the same file structure by default, similar to this:

              scrapy.cfg
              myproject/
                  __init__.py
                  items.py
                  pipelines.py
                  settings.py
                  spiders/
                      __init__.py
                      spider1.py
                      spider2.py
                      ...

              The directory where the scrapy.cfg file resides is known as the project root directory. That file contains the name of the python module that defines the project settings. Here is an example:

              [settings]
              default = myproject.settings
              

              Using the scrapy tool

              You can start by running the Scrapy tool with no arguments and it will print some usage help and the available commands:

              Scrapy X.Y - no active project
              
              Usage:
                scrapy <command> [options] [args]
              
              Available commands:
                crawl         Run a spider
                fetch         Fetch a URL using the Scrapy downloader
              [...]

The first line will print the currently active project, if you’re inside a Scrapy project. In this example, it was run from outside a project. If run from inside a project it would have printed something like this:

              Scrapy X.Y - project: myproject
              
              Usage:
                scrapy <command> [options] [args]
              
              [...]

              Creating projects

              The first thing you typically do with the scrapy tool is create your Scrapy project:

              scrapy startproject myproject

              That will create a Scrapy project under the myproject directory.

              Next, you go inside the new project directory:

              cd myproject

              And you’re ready to use the scrapy command to manage and control your project from there.

              Controlling projects

              You use the scrapy tool from inside your projects to control and manage them.

              For example, to create a new spider:

              scrapy genspider mydomain mydomain.com

              Some Scrapy commands (like crawl) must be run from inside a Scrapy project. See the commands reference below for more information on which commands must be run from inside projects, and which not.

              Also keep in mind that some commands may have slightly different behaviours when running them from inside projects. For example, the fetch command will use spider-overridden behaviours (such as the user_agent attribute to override the user-agent) if the url being fetched is associated with some specific spider. This is intentional, as the fetch command is meant to be used to check how spiders are downloading pages.

              Available tool commands

              This section contains a list of the available built-in commands with a description and some usage examples. Remember you can always get more info about each command by running:

              scrapy <command> -h
              

              And you can see all available commands with:

              scrapy -h
              

There are two kinds of commands: those that only work from inside a Scrapy project (Project-specific commands) and those that also work without an active Scrapy project (Global commands), though they may behave slightly differently when run from inside a project (as they would use the project’s overridden settings).

              Global commands:

              Project-only commands:

              startproject

              • Syntax: scrapy startproject <project_name>
              • Requires project: no

              Creates a new Scrapy project named project_name, under the project_name directory.

              Usage example:

              $ scrapy startproject myproject

              genspider

              • Syntax: scrapy genspider [-t template] <name> <domain>
              • Requires project: yes

              Create a new spider in the current project.

              This is just a convenient shortcut command for creating spiders based on pre-defined templates, but certainly not the only way to create spiders. You can just create the spider source code files yourself, instead of using this command.

              Usage example:

              $ scrapy genspider -l
              Available templates:
                basic
                crawl
                csvfeed
                xmlfeed
              
              $ scrapy genspider -d basic
              from scrapy.spider import Spider
              
              class $classname(Spider):
                  name = "$name"
                  allowed_domains = ["$domain"]
                  start_urls = (
                      'http://www.$domain/',
                      )
              
                  def parse(self, response):
                      pass
              
              $ scrapy genspider -t basic example example.com
              Created spider 'example' using template 'basic' in module:
                mybot.spiders.example

              crawl

              • Syntax: scrapy crawl <spider>
              • Requires project: yes

              Start crawling a spider.

              Usage examples:

              $ scrapy crawl myspider
              [ ... myspider starts crawling ... ]

              check

              • Syntax: scrapy check [-l] <spider>
              • Requires project: yes

              Run contract checks.

              Usage examples:

              $ scrapy check -l
              first_spider
                * parse
                * parse_item
              second_spider
                * parse
                * parse_item
              
              $ scrapy check
              [FAILED] first_spider:parse_item
              >>> 'RetailPricex' field is missing
              
              [FAILED] first_spider:parse
              >>> Returned 92 requests, expected 0..4

              list

              • Syntax: scrapy list
              • Requires project: yes

              List all available spiders in the current project. The output is one spider per line.

              Usage example:

              $ scrapy list
              spider1
              spider2

              edit

              • Syntax: scrapy edit <spider>
              • Requires project: yes

              Edit the given spider using the editor defined in the EDITOR setting.

This command is provided only as a convenient shortcut for the most common case; the developer is of course free to choose any tool or IDE to write and debug their spiders.

              Usage example:

              $ scrapy edit spider1

              fetch

              • Syntax: scrapy fetch <url>
              • Requires project: no

              Downloads the given URL using the Scrapy downloader and writes the contents to standard output.

The interesting thing about this command is that it fetches the page the way the spider would download it. For example, if the spider has a USER_AGENT attribute which overrides the User Agent, it will use that one.

So this command can be used to “see” how your spider would fetch a certain page.

If used outside a project, no particular per-spider behaviour is applied and it will just use the default Scrapy downloader settings.

              Usage examples:

              $ scrapy fetch --nolog http://www.example.com/some/page.html
              [ ... html content here ... ]
              
              $ scrapy fetch --nolog --headers http://www.example.com/
              {'Accept-Ranges': ['bytes'],
               'Age': ['1263   '],
               'Connection': ['close     '],
               'Content-Length': ['596'],
               'Content-Type': ['text/html; charset=UTF-8'],
               'Date': ['Wed, 18 Aug 2010 23:59:46 GMT'],
               'Etag': ['"573c1-254-48c9c87349680"'],
               'Last-Modified': ['Fri, 30 Jul 2010 15:30:18 GMT'],
               'Server': ['Apache/2.2.3 (CentOS)']}

              view

              • Syntax: scrapy view <url>
              • Requires project: no

              Opens the given URL in a browser, as your Scrapy spider would “see” it. Sometimes spiders see pages differently from regular users, so this can be used to check what the spider “sees” and confirm it’s what you expect.

              Usage example:

              $ scrapy view http://www.example.com/some/page.html
              [ ... browser starts ... ]

              shell

              • Syntax: scrapy shell [url]
              • Requires project: no

Starts the Scrapy shell for the given URL (if given), or empty if no URL is given. See Scrapy shell for more info.

              Usage example:

              $ scrapy shell http://www.example.com/some/page.html
              [ ... scrapy shell starts ... ]

              parse

              • Syntax: scrapy parse <url> [options]
              • Requires project: yes

              Fetches the given URL and parses with the spider that handles it, using the method passed with the --callback option, or parse if not given.

              Supported options:

              • --callback or -c: spider method to use as callback for parsing the response
              • --rules or -r: use CrawlSpider rules to discover the callback (ie. spider method) to use for parsing the response
              • --noitems: don’t show scraped items
              • --nolinks: don’t show extracted links
              • --depth or -d: depth level for which the requests should be followed recursively (default: 1)
              • --verbose or -v: display information for each depth level

              Usage example:

              $ scrapy parse http://www.example.com/ -c parse_item
              [ ... scrapy log lines crawling example.com spider ... ]
              
              >>> STATUS DEPTH LEVEL 1 <<<
              # Scraped Items  ------------------------------------------------------------
              [{'name': u'Example item',
               'category': u'Furniture',
               'length': u'12 cm'}]
              
              # Requests  -----------------------------------------------------------------
              []

              settings

              • Syntax: scrapy settings [options]
              • Requires project: no

              Get the value of a Scrapy setting.

              If used inside a project it’ll show the project setting value, otherwise it’ll show the default Scrapy value for that setting.

              Example usage:

              $ scrapy settings --get BOT_NAME
              scrapybot
              $ scrapy settings --get DOWNLOAD_DELAY
              0

              runspider

              • Syntax: scrapy runspider <spider_file.py>
              • Requires project: no

              Run a spider self-contained in a Python file, without having to create a project.

              Example usage:

              $ scrapy runspider myspider.py
              [ ... spider starts crawling ... ]

              version

              • Syntax: scrapy version [-v]
              • Requires project: no

              Prints the Scrapy version. If used with -v it also prints Python, Twisted and Platform info, which is useful for bug reports.

              deploy

              New in version 0.11.

              • Syntax: scrapy deploy [ <target:project> | -l <target> | -L ]
              • Requires project: yes

              Deploy the project into a Scrapyd server. See Deploying your project.

              bench

              New in version 0.17.

              • Syntax: scrapy bench
              • Requires project: no

Run a quick benchmark test. See Benchmarking.

              Custom project commands

              You can also add your custom project commands by using the COMMANDS_MODULE setting. See the Scrapy commands in scrapy/commands for examples on how to implement your commands.

              COMMANDS_MODULE

              Default: '' (empty string)

A module to use for looking up custom Scrapy commands. This is used to add custom commands for your Scrapy project.

              Example:

              COMMANDS_MODULE = 'mybot.commands'
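As a rough sketch of what such a module might contain, a file like mybot/commands/botname.py (module and file names hypothetical) could define a command along these lines, subclassing the ScrapyCommand base class this version ships with:

    from scrapy.command import ScrapyCommand

    class Command(ScrapyCommand):

        requires_project = True

        def short_desc(self):
            return "Print the configured bot name"

        def run(self, args, opts):
            # self.settings is populated by the command-line machinery.
            print self.settings['BOT_NAME']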
              

              Core API

              New in version 0.15.

              This section documents the Scrapy core API, and it’s intended for developers of extensions and middlewares.

              Crawler API

              The main entry point to Scrapy API is the Crawler object, passed to extensions through the from_crawler class method. This object provides access to all Scrapy core components, and it’s the only way for extensions to access them and hook their functionality into Scrapy.

The Extension Manager is responsible for loading and keeping track of installed extensions, and it’s configured through the EXTENSIONS setting, which contains a dictionary of all available extensions and their order, similar to how you configure the downloader middlewares.

              class scrapy.crawler.Crawler(settings)

              The Crawler object must be instantiated with a scrapy.settings.Settings object.

              settings

              The settings manager of this crawler.

              This is used by extensions & middlewares to access the Scrapy settings of this crawler.

              For an introduction on Scrapy settings see Settings.

              For the API see Settings class.

              signals

              The signals manager of this crawler.

              This is used by extensions & middlewares to hook themselves into Scrapy functionality.

              For an introduction on signals see Signals.

              For the API see SignalManager class.

              stats

              The stats collector of this crawler.

              This is used from extensions & middlewares to record stats of their behaviour, or access stats collected by other extensions.

              For an introduction on stats collection see Stats Collection.

              For the API see StatsCollector class.

              extensions

              The extension manager that keeps track of enabled extensions.

              Most extensions won’t need to access this attribute.

              For an introduction on extensions and a list of available extensions on Scrapy see Extensions.

              spiders

              The spider manager which takes care of loading and instantiating spiders.

              Most extensions won’t need to access this attribute.

              engine

              The execution engine, which coordinates the core crawling logic between the scheduler, downloader and spiders.

Some extensions may want to access the Scrapy engine, to inspect or modify the downloader and scheduler behaviour, although this is an advanced use and this API is not yet stable.

              configure()

              Configure the crawler.

              This loads extensions, middlewares and spiders, leaving the crawler ready to be started. It also configures the execution engine.

              start()

              Start the crawler. This calls configure() if it hasn’t been called yet. Returns a deferred that is fired when the crawl is finished.
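Putting these two methods together, a common pattern for driving a crawl from an external script with this API looks roughly like the following sketch (the spider import is hypothetical):

    from twisted.internet import reactor
    from scrapy import log, signals
    from scrapy.crawler import Crawler
    from scrapy.utils.project import get_project_settings

    from myproject.spiders.dmoz import DmozSpider  # hypothetical import

    spider = DmozSpider()
    crawler = Crawler(get_project_settings())
    # Stop the Twisted reactor once the spider finishes.
    crawler.signals.connect(reactor.stop, signal=signals.spider_closed)
    crawler.configure()
    crawler.crawl(spider)
    crawler.start()
    log.start()
    reactor.run()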

              Settings API

              class scrapy.settings.Settings

This object provides access to Scrapy settings.

              overrides

              Global overrides are the ones that take most precedence, and are usually populated by command-line options.

              Overrides should be populated before configuring the Crawler object (through the configure() method), otherwise they won’t have any effect. You don’t typically need to worry about overrides unless you are implementing your own Scrapy command.

              get(name, default=None)

              Get a setting value without affecting its original type.

              Parameters:
              • name (string) – the setting name
              • default (any) – the value to return if no setting is found
              getbool(name, default=False)

Get a setting value as a boolean. For example, both 1 and '1', and True return True, while 0, '0', False and None return False.

              For example, settings populated through environment variables set to '0' will return False when using this method.

              Parameters:
              • name (string) – the setting name
              • default (any) – the value to return if no setting is found
              getint(name, default=0)

              Get a setting value as an int

              Parameters:
              • name (string) – the setting name
              • default (any) – the value to return if no setting is found
              getfloat(name, default=0.0)

              Get a setting value as a float

              Parameters:
              • name (string) – the setting name
              • default (any) – the value to return if no setting is found
              getlist(name, default=None)

Get a setting value as a list. If the setting’s original type is a list it will be returned verbatim. If it’s a string it will be split by ','.

For example, settings populated through environment variables set to 'one,two' will return a list ['one', 'two'] when using this method.

              Parameters:
              • name (string) – the setting name
              • default (any) – the value to return if no setting is found
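
For example, a quick sketch of the typed getters as used from extension or middleware code through crawler.settings (the MY_HOSTS setting name is hypothetical):

settings = crawler.settings

settings.getint('DOWNLOAD_TIMEOUT')    # 180 -> 180
settings.getbool('COOKIES_ENABLED')    # '0' -> False, '1' -> True
settings.getfloat('DOWNLOAD_DELAY')    # '0.5' -> 0.5
settings.getlist('MY_HOSTS')           # 'a.com,b.com' -> ['a.com', 'b.com']
settings.get('LOG_LEVEL', 'DEBUG')     # returned with its original type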

              Signals API

              class scrapy.signalmanager.SignalManager
              connect(receiver, signal)

              Connect a receiver function to a signal.

              The signal can be any object, although Scrapy comes with some predefined signals that are documented in the Signals section.

              Parameters:
              • receiver (callable) – the function to be connected
              • signal (object) – the signal to connect to
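
For example, a minimal sketch of connecting a handler to the built-in spider_closed signal from extension code (the handler name is arbitrary, and a crawler object is assumed to be in scope):

from scrapy import signals

def spider_closed_handler(spider, reason):
    print("Spider %s closed (%s)" % (spider.name, reason))

crawler.signals.connect(spider_closed_handler, signal=signals.spider_closed)
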
              send_catch_log(signal, **kwargs)

              Send a signal, catch exceptions and log them.

              The keyword arguments are passed to the signal handlers (connected through the connect() method).

              send_catch_log_deferred(signal, **kwargs)

              Like send_catch_log() but supports returning deferreds from signal handlers.

Returns a deferred that gets fired once all signal handler deferreds have been fired.

              The keyword arguments are passed to the signal handlers (connected through the connect() method).

              disconnect(receiver, signal)

              Disconnect a receiver function from a signal. This has the opposite effect of the connect() method, and the arguments are the same.

              disconnect_all(signal)

              Disconnect all receivers from the given signal.

              Parameters:signal (object) – the signal to disconnect from
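
Since any object can act as a signal, here is a sketch of defining and sending a custom signal; note that handlers only receive the keyword arguments they accept:

item_batch_done = object()  # a custom, project-defined signal

def batch_handler(batch_size):
    print("batch of %d items flushed" % batch_size)

crawler.signals.connect(batch_handler, signal=item_batch_done)
crawler.signals.send_catch_log(signal=item_batch_done, batch_size=50)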

              Stats Collector API

              There are several Stats Collectors available under the scrapy.statscol module and they all implement the Stats Collector API defined by the StatsCollector class (which they all inherit from).

              class scrapy.statscol.StatsCollector
              get_value(key, default=None)

              Return the value for the given stats key or default if it doesn’t exist.

              get_stats()

              Get all stats from the currently running spider as a dict.

              set_value(key, value)

              Set the given value for the given stats key.

              set_stats(stats)

Override the current stats with the dict passed in the stats argument.

              inc_value(key, count=1, start=0)

Increment the value of the given stats key by the given count, assuming the given start value when the key is not yet set.

              max_value(key, value)

Set the given value for the given key only if the current value for the same key is lower than value. If there is no current value for the given key, the value is always set.

              min_value(key, value)

Set the given value for the given key only if the current value for the same key is greater than value. If there is no current value for the given key, the value is always set.

              clear_stats()

              Clear all stats.

The following methods are not part of the stats collection API, but are instead used when implementing custom stats collectors:

              open_spider(spider)

              Open the given spider for stats collection.

              close_spider(spider)

              Close the given spider. After this is called, no more specific stats can be accessed or collected.
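
For example, a minimal sketch of using this API from an extension or middleware, through crawler.stats (the key names are hypothetical):

stats = crawler.stats

stats.inc_value('myext/pages_seen')       # 0 -> 1 when the key is unset
stats.max_value('myext/max_depth', 5)     # stored only if greater than current
stats.get_value('myext/pages_seen', 0)    # -> 1
stats.get_stats()                         # -> dict with all stats so far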


              Web Service

              Scrapy comes with a built-in web service for monitoring and controlling a running crawler. The service exposes most resources using the JSON-RPC 2.0 protocol, but there are also other (read-only) resources which just output JSON data.

Provides an extensible web service for managing a Scrapy process. It’s enabled by the WEBSERVICE_ENABLED setting. The web server will listen on the port specified in WEBSERVICE_PORT, and will log to the file specified in WEBSERVICE_LOGFILE.

              The web service is a built-in Scrapy extension which comes enabled by default, but you can also disable it if you’re running tight on memory.

              Web service resources

              The web service contains several resources, defined in the WEBSERVICE_RESOURCES setting. Each resource provides a different functionality. See Available JSON-RPC resources for a list of resources available by default.

              Although you can implement your own resources using any protocol, there are two kinds of resources bundled with Scrapy:

              • Simple JSON resources - which are read-only and just output JSON data
              • JSON-RPC resources - which provide direct access to certain Scrapy objects using the JSON-RPC 2.0 protocol

              Available JSON-RPC resources

              These are the JSON-RPC resources available by default in Scrapy:

              Crawler JSON-RPC resource

              class scrapy.contrib.webservice.crawler.CrawlerResource

              Provides access to the main Crawler object that controls the Scrapy process.

              Available by default at: http://localhost:6080/crawler

              Stats Collector JSON-RPC resource

              class scrapy.contrib.webservice.stats.StatsResource

              Provides access to the Stats Collector used by the crawler.

              Available by default at: http://localhost:6080/stats

              Spider Manager JSON-RPC resource

              You can access the spider manager JSON-RPC resource through the Crawler JSON-RPC resource at: http://localhost:6080/crawler/spiders

              Extension Manager JSON-RPC resource

You can access the extension manager JSON-RPC resource through the Crawler JSON-RPC resource at: http://localhost:6080/crawler/extensions

              Available JSON resources

              These are the JSON resources available by default:

              Engine status JSON resource

              class scrapy.contrib.webservice.enginestatus.EngineStatusResource

              Provides access to engine status metrics.

              Available by default at: http://localhost:6080/enginestatus

              Web service settings

              These are the settings that control the web service behaviour:

              WEBSERVICE_ENABLED

              Default: True

              A boolean which specifies if the web service will be enabled (provided its extension is also enabled).

              WEBSERVICE_LOGFILE

              Default: None

A file to use for logging HTTP requests made to the web service. If unset, the web service log is sent to the standard Scrapy log.

              WEBSERVICE_PORT

              Default: [6080, 7030]

              The port range to use for the web service. If set to None or 0, a dynamically assigned port is used.

              WEBSERVICE_HOST

              Default: '0.0.0.0'

The interface the web service should listen on.

              WEBSERVICE_RESOURCES

              Default: {}

              The list of web service resources enabled for your project. See Web service resources. These are added to the ones available by default in Scrapy, defined in the WEBSERVICE_RESOURCES_BASE setting.

              WEBSERVICE_RESOURCES_BASE

              Default:

              {
                  'scrapy.contrib.webservice.crawler.CrawlerResource': 1,
                  'scrapy.contrib.webservice.enginestatus.EngineStatusResource': 1,
                  'scrapy.contrib.webservice.stats.StatsResource': 1,
              }
              

The list of web service resources available by default in Scrapy. You shouldn’t change this setting in your project, change WEBSERVICE_RESOURCES instead. If you want to disable some resource, set its value to None in WEBSERVICE_RESOURCES.
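
For example, a sketch of a project settings module that adds a hypothetical custom resource and disables one of the defaults:

# settings.py
WEBSERVICE_RESOURCES = {
    'myproject.webservice.MyResource': 1,  # hypothetical custom resource
    # disable a default resource by setting it to None:
    'scrapy.contrib.webservice.enginestatus.EngineStatusResource': None,
}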

              Writing a web service resource

              Web service resources are implemented using the Twisted Web API. See this Twisted Web guide for more information on Twisted web and Twisted web resources.

To write a web service resource you should subclass the JsonResource or JsonRpcResource classes and implement the render_GET method.

              class scrapy.webservice.JsonResource

A subclass of twisted.web.resource.Resource that implements a JSON web service resource.

              ws_name

The name by which the Scrapy web service will know this resource, and also the path where this resource will listen. For example, assuming the Scrapy web service is listening on http://localhost:6080/ and the ws_name is 'resource1', the URL for that resource will be:

http://localhost:6080/resource1/

              class scrapy.webservice.JsonRpcResource(crawler, target=None)

This is a subclass of JsonResource for implementing JSON-RPC resources. JSON-RPC resources wrap Python (Scrapy) objects around a JSON-RPC API. The resource wrapped must be returned by the get_target() method, which returns the target passed in the constructor by default.

              get_target()

Return the object wrapped by this JSON-RPC resource. By default, it returns the object passed in the constructor.

              Examples of web service resources

              StatsResource (JSON-RPC resource)

              from scrapy.webservice import JsonRpcResource
              
              class StatsResource(JsonRpcResource):
              
                  ws_name = 'stats'
              
                  def __init__(self, crawler):
                      JsonRpcResource.__init__(self, crawler, crawler.stats)
              

              EngineStatusResource (JSON resource)

              from scrapy.webservice import JsonResource
              from scrapy.utils.engine import get_engine_status
              
              class EngineStatusResource(JsonResource):
              
                  ws_name = 'enginestatus'
              
                  def __init__(self, crawler, spider_name=None):
                      JsonResource.__init__(self, crawler)
                      self._spider_name = spider_name
                      self.isLeaf = spider_name is not None
              
                  def render_GET(self, txrequest):
                      status = get_engine_status(self.crawler.engine)
                      if self._spider_name is None:
                          return status
                      for sp, st in status['spiders'].items():
                          if sp.name == self._spider_name:
                              return st
              
    def getChild(self, name, txrequest):
        # return a child resource scoped to a single spider name
        return EngineStatusResource(self.crawler, name)
              

              Example of web service client

              scrapy-ws.py script

              #!/usr/bin/env python
              """
              Example script to control a Scrapy server using its JSON-RPC web service.
              
              It only provides a reduced functionality as its main purpose is to illustrate
how to write a web service client. Feel free to improve or write your own.
              
              Also, keep in mind that the JSON-RPC API is not stable. The recommended way for
              controlling a Scrapy server is through the execution queue (see the "queue"
              command).
              
              """
              
              from __future__ import print_function
              import sys, optparse, urllib, json
              from urlparse import urljoin
              
              from scrapy.utils.jsonrpc import jsonrpc_client_call, JsonRpcError
              
              def get_commands():
                  return {
                      'help': cmd_help,
                      'stop': cmd_stop,
                      'list-available': cmd_list_available,
                      'list-running': cmd_list_running,
                      'list-resources': cmd_list_resources,
                      'get-global-stats': cmd_get_global_stats,
                      'get-spider-stats': cmd_get_spider_stats,
                  }
              
              def cmd_help(args, opts):
                  """help - list available commands"""
                  print("Available commands:")
                  for _, func in sorted(get_commands().items()):
                      print("  ", func.__doc__)
              
              def cmd_stop(args, opts):
                  """stop <spider> - stop a running spider"""
                  jsonrpc_call(opts, 'crawler/engine', 'close_spider', args[0])
              
              def cmd_list_running(args, opts):
                  """list-running - list running spiders"""
                  for x in json_get(opts, 'crawler/engine/open_spiders'):
                      print(x)
              
              def cmd_list_available(args, opts):
                  """list-available - list name of available spiders"""
                  for x in jsonrpc_call(opts, 'crawler/spiders', 'list'):
                      print(x)
              
              def cmd_list_resources(args, opts):
                  """list-resources - list available web service resources"""
                  for x in json_get(opts, '')['resources']:
                      print(x)
              
              def cmd_get_spider_stats(args, opts):
                  """get-spider-stats <spider> - get stats of a running spider"""
                  stats = jsonrpc_call(opts, 'stats', 'get_stats', args[0])
                  for name, value in stats.items():
                      print("%-40s %s" % (name, value))
              
              def cmd_get_global_stats(args, opts):
                  """get-global-stats - get global stats"""
                  stats = jsonrpc_call(opts, 'stats', 'get_stats')
                  for name, value in stats.items():
                      print("%-40s %s" % (name, value))
              
              def get_wsurl(opts, path):
    return urljoin("http://%s:%s/" % (opts.host, opts.port), path)
              
              def jsonrpc_call(opts, path, method, *args, **kwargs):
                  url = get_wsurl(opts, path)
                  return jsonrpc_client_call(url, method, *args, **kwargs)
              
              def json_get(opts, path):
                  url = get_wsurl(opts, path)
                  return json.loads(urllib.urlopen(url).read())
              
              def parse_opts():
                  usage = "%prog [options] <command> [arg] ..."
                  description = "Scrapy web service control script. Use '%prog help' " \
                      "to see the list of available commands."
                  op = optparse.OptionParser(usage=usage, description=description)
                  op.add_option("-H", dest="host", default="localhost", \
                      help="Scrapy host to connect to")
                  op.add_option("-P", dest="port", type="int", default=6080, \
                      help="Scrapy port to connect to")
                  opts, args = op.parse_args()
                  if not args:
                      op.print_help()
                      sys.exit(2)
    cmdname, cmdargs = args[0], args[1:]
                  commands = get_commands()
                  if cmdname not in commands:
                      sys.stderr.write("Unknown command: %s\n\n" % cmdname)
                      cmd_help(None, None)
                      sys.exit(1)
                  return commands[cmdname], cmdargs, opts
              
              def main():
                  cmd, args, opts = parse_opts()
                  try:
                      cmd(args, opts)
                  except IndexError:
                      print(cmd.__doc__)
                  except JsonRpcError as e:
                      print(str(e))
                      if e.data:
                          print("Server Traceback below:")
                          print(e.data)
              
              
              if __name__ == '__main__':
                  main()
              

              Sending e-mail

Although Python makes sending e-mails relatively easy via the smtplib library, Scrapy provides its own facility for sending e-mail, which is very easy to use and is implemented using Twisted non-blocking IO, to avoid interfering with the non-blocking IO of the crawler. It also provides a simple API for sending attachments and is very easy to configure, with a few settings.

              Quick example

              There are two ways to instantiate the mail sender. You can instantiate it using the standard constructor:

              from scrapy.mail import MailSender
              mailer = MailSender()
              

              Or you can instantiate it passing a Scrapy settings object, which will respect the settings:

              mailer = MailSender.from_settings(settings)
              

              And here is how to use it to send an e-mail (without attachments):

              mailer.send(to=["someone@example.com"], subject="Some subject", body="Some body", cc=["another@example.com"])
              

              MailSender class reference

              MailSender is the preferred class to use for sending emails from Scrapy, as it uses Twisted non-blocking IO, like the rest of the framework.

              class scrapy.mail.MailSender(smtphost=None, mailfrom=None, smtpuser=None, smtppass=None, smtpport=None)
              Parameters:
              • smtphost (str) – the SMTP host to use for sending the emails. If omitted, the MAIL_HOST setting will be used.
              • mailfrom (str) – the address used to send emails (in the From: header). If omitted, the MAIL_FROM setting will be used.
              • smtpuser – the SMTP user. If omitted, the MAIL_USER setting will be used. If not given, no SMTP authentication will be performed.
              • smtppass (str) – the SMTP pass for authentication.
• smtpport (int) – the SMTP port to connect to
              • smtptls – enforce using SMTP STARTTLS
              • smtpssl – enforce using a secure SSL connection
              classmethod from_settings(settings)

              Instantiate using a Scrapy settings object, which will respect these Scrapy settings.

Parameters:settings (scrapy.settings.Settings object) – the settings object to use
              send(to, subject, body, cc=None, attachs=())

              Send email to the given recipients.

              Parameters:
              • to (list) – the e-mail recipients
              • subject (str) – the subject of the e-mail
              • cc (list) – the e-mails to CC
              • body (str) – the e-mail body
• attachs (iterable) – an iterable of tuples (attach_name, mimetype, file_object) where attach_name is a string with the name that will appear on the e-mail’s attachment, mimetype is the mimetype of the attachment and file_object is a readable file object with the contents of the attachment (see the example below)
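
For example, a minimal sketch of sending an e-mail with one attachment (the file name is hypothetical, and a settings object is assumed to be in scope):

from scrapy.mail import MailSender

mailer = MailSender.from_settings(settings)
report = open('report.csv', 'rb')  # hypothetical local file
mailer.send(to=["someone@example.com"], subject="Crawl report",
            body="Report attached.",
            attachs=[('report.csv', 'text/csv', report)])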

              Mail settings

These settings define the default constructor values of the MailSender class, and can be used to configure e-mail notifications in your project without writing any code (for those extensions and code that use MailSender).

              MAIL_FROM

              Default: 'scrapy@localhost'

              Sender email to use (From: header) for sending emails.

              MAIL_HOST

              Default: 'localhost'

              SMTP host to use for sending emails.

              MAIL_PORT

              Default: 25

              SMTP port to use for sending emails.

              MAIL_USER

              Default: None

User to use for SMTP authentication. If disabled, no SMTP authentication will be performed.

              MAIL_PASS

              Default: None

              Password to use for SMTP authentication, along with MAIL_USER.

              MAIL_TLS

              Default: False

              Enforce using STARTTLS. STARTTLS is a way to take an existing insecure connection, and upgrade it to a secure connection using SSL/TLS.

              MAIL_SSL

              Default: False

Enforce connecting using an SSL encrypted connection.


              Items

              The main goal in scraping is to extract structured data from unstructured sources, typically, web pages. Scrapy provides the Item class for this purpose.

              Item objects are simple containers used to collect the scraped data. They provide a dictionary-like API with a convenient syntax for declaring their available fields.

              Declaring Items

              Items are declared using a simple class definition syntax and Field objects. Here is an example:

              from scrapy.item import Item, Field
              
              class Product(Item):
                  name = Field()
                  price = Field()
                  stock = Field()
                  last_updated = Field(serializer=str)
              

              Note

Those familiar with Django will notice that Scrapy Items are declared similarly to Django Models, except that Scrapy Items are much simpler as there is no concept of different field types.

              Item Fields

              Field objects are used to specify metadata for each field. For example, the serializer function for the last_updated field illustrated in the example above.

You can specify any kind of metadata for each field. There is no restriction on the values accepted by Field objects. For this same reason, there is no reference list of all available metadata keys. Each key defined in Field objects could be used by different components, and only those components know about it. You can also define and use any other Field key in your project, for your own needs.

The main goal of Field objects is to provide a way to define all field metadata in one place. Typically, those components whose behaviour depends on each field use certain field keys to configure that behaviour. You must refer to their documentation to see which metadata keys are used by each component.

              It’s important to note that the Field objects used to declare the item do not stay assigned as class attributes. Instead, they can be accessed through the Item.fields attribute.
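
For example, with the Product item declared above, the declared fields and their metadata can be inspected roughly like this (Python 2 output, as in the rest of this document):

>>> Product.fields
{'last_updated': {'serializer': <type 'str'>},
 'name': {},
 'price': {},
 'stock': {}}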

              And that’s all you need to know about declaring items.

              Working with Items

              Here are some examples of common tasks performed with items, using the Product item declared above. You will notice the API is very similar to the dict API.

              Creating items

              >>> product = Product(name='Desktop PC', price=1000)
              >>> print product
              Product(name='Desktop PC', price=1000)
              

              Getting field values

              >>> product['name']
              Desktop PC
              >>> product.get('name')
              Desktop PC
              
              >>> product['price']
              1000
              
              >>> product['last_updated']
              Traceback (most recent call last):
                  ...
              KeyError: 'last_updated'
              
              >>> product.get('last_updated', 'not set')
              not set
              
              >>> product['lala'] # getting unknown field
              Traceback (most recent call last):
                  ...
              KeyError: 'lala'
              
              >>> product.get('lala', 'unknown field')
              'unknown field'
              
              >>> 'name' in product  # is name field populated?
              True
              
              >>> 'last_updated' in product  # is last_updated populated?
              False
              
              >>> 'last_updated' in product.fields  # is last_updated a declared field?
              True
              
              >>> 'lala' in product.fields  # is lala a declared field?
              False
              

              Setting field values

              >>> product['last_updated'] = 'today'
              >>> product['last_updated']
              today
              
              >>> product['lala'] = 'test' # setting unknown field
              Traceback (most recent call last):
                  ...
              KeyError: 'Product does not support field: lala'
              

              Accessing all populated values

              To access all populated values, just use the typical dict API:

              >>> product.keys()
              ['price', 'name']
              
              >>> product.items()
              [('price', 1000), ('name', 'Desktop PC')]
              

              Other common tasks

              Copying items:

              >>> product2 = Product(product)
              >>> print product2
              Product(name='Desktop PC', price=1000)
              
              >>> product3 = product2.copy()
              >>> print product3
              Product(name='Desktop PC', price=1000)
              

              Creating dicts from items:

              >>> dict(product) # create a dict from all populated values
              {'price': 1000, 'name': 'Desktop PC'}
              

              Creating items from dicts:

              >>> Product({'name': 'Laptop PC', 'price': 1500})
              Product(price=1500, name='Laptop PC')
              
              >>> Product({'name': 'Laptop PC', 'lala': 1500}) # warning: unknown field in dict
              Traceback (most recent call last):
                  ...
              KeyError: 'Product does not support field: lala'
              

              Extending Items

              You can extend Items (to add more fields or to change some metadata for some fields) by declaring a subclass of your original Item.

              For example:

              class DiscountedProduct(Product):
                  discount_percent = Field(serializer=str)
                  discount_expiration_date = Field()
              

              You can also extend field metadata by using the previous field metadata and appending more values, or changing existing values, like this:

              class SpecificProduct(Product):
                  name = Field(Product.fields['name'], serializer=my_serializer)
              

              That adds (or replaces) the serializer metadata key for the name field, keeping all the previously existing metadata values.

              Item objects

              class scrapy.item.Item([arg])

              Return a new Item optionally initialized from the given argument.

              Items replicate the standard dict API, including its constructor. The only additional attribute provided by Items is:

              fields

              A dictionary containing all declared fields for this Item, not only those populated. The keys are the field names and the values are the Field objects used in the Item declaration.

              Field objects

              class scrapy.item.Field([arg])

              The Field class is just an alias to the built-in dict class and doesn’t provide any extra functionality or attributes. In other words, Field objects are plain-old Python dicts. A separate class is used to support the item declaration syntax based on class attributes.


              Using Firefox for scraping

              Here is a list of tips and advice on using Firefox for scraping, along with a list of useful Firefox add-ons to ease the scraping process.

              Caveats with inspecting the live browser DOM

Since Firefox add-ons operate on a live browser DOM, what you’ll actually see when inspecting the page source is not the original HTML, but a modified one after applying some browser clean up and executing Javascript code. Firefox, in particular, is known for adding <tbody> elements to tables. Scrapy, on the other hand, does not modify the original page HTML, so you won’t be able to extract any data if you use <tbody> in your XPath expressions.

              Therefore, you should keep in mind the following things when working with Firefox and XPath:

              • Disable Firefox Javascript while inspecting the DOM looking for XPaths to be used in Scrapy
              • Never use full XPath paths, use relative and clever ones based on attributes (such as id, class, width, etc) or any identifying features like contains(@href, 'image').
              • Never include <tbody> elements in your XPath expressions unless you really know what you’re doing

              Useful Firefox add-ons for scraping

              Firebug

Firebug is a widely known tool among web developers and it’s also very useful for scraping. In particular, its Inspect Element feature comes in very handy when you need to construct the XPaths for extracting data, because it allows you to view the HTML code of each page element while moving your mouse over it.

              See Using Firebug for scraping for a detailed guide on how to use Firebug with Scrapy.

              XPather

              XPather allows you to test XPath expressions directly on the pages.

              XPath Checker

              XPath Checker is another Firefox add-on for testing XPaths on your pages.

              Tamper Data

Tamper Data is a Firefox add-on which allows you to view and modify the HTTP request headers sent by Firefox. Firebug also allows you to view HTTP headers, but not to modify them.

              Firecookie

              Firecookie makes it easier to view and manage cookies. You can use this extension to create a new cookie, delete existing cookies, see a list of cookies for the current site, manage cookies permissions and a lot more.


              Broad Crawls

              Scrapy defaults are optimized for crawling specific sites. These sites are often handled by a single Scrapy spider, although this is not necessary or required (for example, there are generic spiders that handle any given site thrown at them).

In addition to this “focused crawl”, there is another common type of crawling which covers a large (potentially unlimited) number of domains, and is only limited by time or other arbitrary constraint, rather than stopping when the domain was crawled to completion or when there are no more requests to perform. These are called “broad crawls” and are the typical crawls employed by search engines.

              These are some common properties often found in broad crawls:

              • they crawl many domains (often, unbounded) instead of a specific set of sites
• they don’t necessarily crawl domains to completion, because it would be impractical (or impossible) to do so, and instead limit the crawl by time or number of pages crawled
              • they are simpler in logic (as opposed to very complex spiders with many extraction rules) because data is often post-processed in a separate stage
              • they crawl many domains concurrently, which allows them to achieve faster crawl speeds by not being limited by any particular site constraint (each site is crawled slowly to respect politeness, but many sites are crawled in parallel)

As said above, Scrapy default settings are optimized for focused crawls, not broad crawls. However, due to its asynchronous architecture, Scrapy is very well suited for performing fast broad crawls. This page summarizes some things you need to keep in mind when using Scrapy for doing broad crawls, along with concrete suggestions of Scrapy settings to tune in order to achieve an efficient broad crawl.

              Increase concurrency

              Concurrency is the number of requests that are processed in parallel. There is a global limit and a per-domain limit.

The default global concurrency limit in Scrapy is not suitable for crawling many different domains in parallel, so you will want to increase it. How much to increase it will depend on how much CPU your crawler will have available. A good starting point is 100, but the best way to find out is by doing some trials and identifying at what concurrency your Scrapy process becomes CPU-bound. For optimum performance, you should pick a concurrency where CPU usage is at 80-90%.

              To increase the global concurrency use:

              CONCURRENT_REQUESTS = 100
              

              Reduce log level

When doing broad crawls you are often only interested in the crawl rates you get and any errors found. These stats are reported by Scrapy when using the INFO log level. In order to save CPU (and log storage requirements) you should not use the DEBUG log level when performing large broad crawls in production. Using the DEBUG level when developing your (broad) crawler may be fine though.

              To set the log level use:

              LOG_LEVEL = 'INFO'
              

              Disable cookies

Disable cookies unless you really need them. Cookies are often not needed when doing broad crawls (search engine crawlers ignore them), and disabling them improves performance by saving some CPU cycles and reducing the memory footprint of your Scrapy crawler.

              To disable cookies use:

              COOKIES_ENABLED = False
              

              Disable retries

Retrying failed HTTP requests can slow down the crawls substantially, especially when sites are very slow (or fail) to respond, thus causing a timeout error which gets retried many times, unnecessarily preventing crawler capacity from being reused for other domains.

              To disable retries use:

              RETRY_ENABLED = False
              

              Reduce download timeout

              Unless you are crawling from a very slow connection (which shouldn’t be the case for broad crawls) reduce the download timeout so that stuck requests are discarded quickly and free up capacity to process the next ones.

              To reduce the download timeout use:

              DOWNLOAD_TIMEOUT = 15
              

              Disable redirects

Consider disabling redirects, unless you are interested in following them. When doing broad crawls it’s common to save redirects and resolve them when revisiting the site at a later crawl. This also helps to keep the number of requests constant per crawl batch, otherwise redirect loops may cause the crawler to dedicate too many resources on any specific domain.

              To disable redirects use:

              REDIRECT_ENABLED = False
              

              Enable crawling of “Ajax Crawlable Pages”

Some pages (up to 1%, based on empirical data from year 2013) declare themselves as ajax crawlable. This means they provide a plain HTML version of content that is usually available only via AJAX. Pages can indicate it in two ways:

              1. by using #! in URL - this is the default way;
              2. by using a special meta tag - this way is used on “main”, “index” website pages.

              Scrapy handles (1) automatically; to handle (2) enable AjaxCrawlMiddleware:

              AJAXCRAWL_ENABLED = True
              

              When doing broad crawls it’s common to crawl a lot of “index” web pages; AjaxCrawlMiddleware helps to crawl them correctly. It is turned OFF by default because it has some performance overhead, and enabling it for focused crawls doesn’t make much sense.
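
Putting it all together, here is a sketch of a settings module applying the tuning suggested in the sections above:

# settings.py - broad crawl tuning
CONCURRENT_REQUESTS = 100
LOG_LEVEL = 'INFO'
COOKIES_ENABLED = False
RETRY_ENABLED = False
DOWNLOAD_TIMEOUT = 15
REDIRECT_ENABLED = False
AJAXCRAWL_ENABLED = True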


              Signals

              Scrapy uses signals extensively to notify when certain events occur. You can catch some of those signals in your Scrapy project (using an extension, for example) to perform additional tasks or extend Scrapy to add functionality not provided out of the box.

              Even though signals provide several arguments, the handlers that catch them don’t need to accept all of them - the signal dispatching mechanism will only deliver the arguments that the handler receives.

              You can connect to signals (or send your own) through the Signals API.

              Deferred signal handlers

              Some signals support returning Twisted deferreds from their handlers, see the Built-in signals reference below to know which ones.
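
For example, a sketch of a handler for a deferred-aware signal such as spider_closed; the engine will wait until the returned deferred fires before proceeding (the two-second delay stands in for real asynchronous cleanup, and a crawler object is assumed to be in scope):

from twisted.internet import defer, reactor
from scrapy import signals

def spider_closed_handler(spider, reason):
    d = defer.Deferred()
    reactor.callLater(2, d.callback, None)  # fire after pretend cleanup
    return d

crawler.signals.connect(spider_closed_handler, signal=signals.spider_closed)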

              Built-in signals reference

              Here’s the list of Scrapy built-in signals and their meaning.

              engine_started

              scrapy.signals.engine_started()

              Sent when the Scrapy engine has started crawling.

              This signal supports returning deferreds from their handlers.

              Note

              This signal may be fired after the spider_opened signal, depending on how the spider was started. So don’t rely on this signal getting fired before spider_opened.

              engine_stopped

              scrapy.signals.engine_stopped()

              Sent when the Scrapy engine is stopped (for example, when a crawling process has finished).

              This signal supports returning deferreds from their handlers.

              item_scraped

              scrapy.signals.item_scraped(item, response, spider)

              Sent when an item has been scraped, after it has passed all the Item Pipeline stages (without being dropped).

              This signal supports returning deferreds from their handlers.

              Parameters:
              • item (Item object) – the item scraped
              • response (Response object) – the response from where the item was scraped
              • spider (Spider object) – the spider which scraped the item

              item_dropped

              scrapy.signals.item_dropped(item, spider, exception)

              Sent after an item has been dropped from the Item Pipeline when some stage raised a DropItem exception.

              This signal supports returning deferreds from their handlers.

              Parameters:
              • item (Item object) – the item dropped from the Item Pipeline
              • spider (Spider object) – the spider which scraped the item
              • exception (DropItem exception) – the exception (which must be a DropItem subclass) which caused the item to be dropped

              spider_closed

              scrapy.signals.spider_closed(spider, reason)

              Sent after a spider has been closed. This can be used to release per-spider resources reserved on spider_opened.

              This signal supports returning deferreds from their handlers.

              Parameters:
              • spider (Spider object) – the spider which has been closed
              • reason (str) – a string which describes the reason why the spider was closed. If it was closed because the spider has completed scraping, the reason is 'finished'. Otherwise, if the spider was manually closed by calling the close_spider engine method, then the reason is the one passed in the reason argument of that method (which defaults to 'cancelled'). If the engine was shutdown (for example, by hitting Ctrl-C to stop it) the reason will be 'shutdown'.

              spider_opened

              scrapy.signals.spider_opened(spider)

              Sent after a spider has been opened for crawling. This is typically used to reserve per-spider resources, but can be used for any task that needs to be performed when a spider is opened.

              This signal supports returning deferreds from their handlers.

              Parameters:spider (Spider object) – the spider which has been opened

              spider_idle

              scrapy.signals.spider_idle(spider)

              Sent when a spider has gone idle, which means the spider has no further:

              • requests waiting to be downloaded
              • requests scheduled
              • items being processed in the item pipeline

              If the idle state persists after all handlers of this signal have finished, the engine starts closing the spider. After the spider has finished closing, the spider_closed signal is sent.

              You can, for example, schedule some requests in your spider_idle handler to prevent the spider from being closed.

              This signal does not support returning deferreds from their handlers.

              Parameters:spider (Spider object) – the spider which has gone idle

              spider_error

              scrapy.signals.spider_error(failure, response, spider)

              Sent when a spider callback generates an error (ie. raises an exception).

              Parameters:
              • failure (Failure object) – the exception raised as a Twisted Failure object
              • response (Response object) – the response being processed when the exception was raised
              • spider (Spider object) – the spider which raised the exception

              response_received

              scrapy.signals.response_received(response, request, spider)

              Sent when the engine receives a new Response from the downloader.

              This signal does not support returning deferreds from their handlers.

              Parameters:
              • response (Response object) – the response received
              • request (Request object) – the request that generated the response
              • spider (Spider object) – the spider for which the response is intended

              response_downloaded

              scrapy.signals.response_downloaded(response, request, spider)

Sent by the downloader right after an HTTPResponse is downloaded.

              This signal does not support returning deferreds from their handlers.

              Parameters:
              • response (Response object) – the response downloaded
              • request (Request object) – the request that generated the response
              • spider (Spider object) – the spider for which the response is intended

              Benchmarking

              New in version 0.17.

              Scrapy comes with a simple benchmarking suite that spawns a local HTTP server and crawls it at the maximum possible speed. The goal of this benchmarking is to get an idea of how Scrapy performs in your hardware, in order to have a common baseline for comparisons. It uses a simple spider that does nothing and just follows links.

              To run it use:

              scrapy bench

              You should see an output like this:

              2013-05-16 13:08:46-0300 [scrapy] INFO: Scrapy 0.17.0 started (bot: scrapybot)
              2013-05-16 13:08:47-0300 [follow] INFO: Spider opened
              2013-05-16 13:08:47-0300 [follow] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
              2013-05-16 13:08:48-0300 [follow] INFO: Crawled 74 pages (at 4440 pages/min), scraped 0 items (at 0 items/min)
              2013-05-16 13:08:49-0300 [follow] INFO: Crawled 143 pages (at 4140 pages/min), scraped 0 items (at 0 items/min)
              2013-05-16 13:08:50-0300 [follow] INFO: Crawled 210 pages (at 4020 pages/min), scraped 0 items (at 0 items/min)
              2013-05-16 13:08:51-0300 [follow] INFO: Crawled 274 pages (at 3840 pages/min), scraped 0 items (at 0 items/min)
              2013-05-16 13:08:52-0300 [follow] INFO: Crawled 343 pages (at 4140 pages/min), scraped 0 items (at 0 items/min)
              2013-05-16 13:08:53-0300 [follow] INFO: Crawled 410 pages (at 4020 pages/min), scraped 0 items (at 0 items/min)
              2013-05-16 13:08:54-0300 [follow] INFO: Crawled 474 pages (at 3840 pages/min), scraped 0 items (at 0 items/min)
              2013-05-16 13:08:55-0300 [follow] INFO: Crawled 538 pages (at 3840 pages/min), scraped 0 items (at 0 items/min)
              2013-05-16 13:08:56-0300 [follow] INFO: Crawled 602 pages (at 3840 pages/min), scraped 0 items (at 0 items/min)
              2013-05-16 13:08:57-0300 [follow] INFO: Closing spider (closespider_timeout)
              2013-05-16 13:08:57-0300 [follow] INFO: Crawled 666 pages (at 3840 pages/min), scraped 0 items (at 0 items/min)
              2013-05-16 13:08:57-0300 [follow] INFO: Dumping Scrapy stats:
                  {'downloader/request_bytes': 231508,
                   'downloader/request_count': 682,
                   'downloader/request_method_count/GET': 682,
                   'downloader/response_bytes': 1172802,
                   'downloader/response_count': 682,
                   'downloader/response_status_count/200': 682,
                   'finish_reason': 'closespider_timeout',
                   'finish_time': datetime.datetime(2013, 5, 16, 16, 8, 57, 985539),
                   'log_count/INFO': 14,
                   'request_depth_max': 34,
                   'response_received_count': 682,
                   'scheduler/dequeued': 682,
                   'scheduler/dequeued/memory': 682,
                   'scheduler/enqueued': 12767,
                   'scheduler/enqueued/memory': 12767,
                   'start_time': datetime.datetime(2013, 5, 16, 16, 8, 47, 676539)}
              2013-05-16 13:08:57-0300 [follow] INFO: Spider closed (closespider_timeout)

That tells you that Scrapy is able to crawl about 3900 pages per minute in the hardware where you run it. Note that this is a very simple spider intended to follow links; any custom spider you write will probably do more stuff, which results in slower crawl rates. How much slower depends on how much your spider does and how well it’s written.

              In the future, more cases will be added to the benchmarking suite to cover other common scenarios.


              Requests and Responses

              Scrapy uses Request and Response objects for crawling web sites.

              Typically, Request objects are generated in the spiders and pass across the system until they reach the Downloader, which executes the request and returns a Response object which travels back to the spider that issued the request.

              Both Request and Response classes have subclasses which add functionality not required in the base classes. These are described below in Request subclasses and Response subclasses.

              Request objects

              class scrapy.http.Request(url[, callback, method='GET', headers, body, cookies, meta, encoding='utf-8', priority=0, dont_filter=False, errback])

              A Request object represents an HTTP request, which is usually generated in the Spider and executed by the Downloader, and thus generating a Response.

              Parameters:
              • url (string) – the URL of this request
              • callback (callable) – the function that will be called with the response of this request (once its downloaded) as its first parameter. For more information see Passing additional data to callback functions below. If a Request doesn’t specify a callback, the spider’s parse() method will be used. Note that if exceptions are raised during processing, errback is called instead.
              • method (string) – the HTTP method of this request. Defaults to 'GET'.
              • meta (dict) – the initial values for the Request.meta attribute. If given, the dict passed in this parameter will be shallow copied.
• body (str or unicode) – the request body. If a unicode is passed, then it’s encoded to str using the encoding passed (which defaults to utf-8). If body is not given, an empty string is stored. Regardless of the type of this argument, the final value stored will be a str (never unicode or None).
              • headers (dict) – the headers of this request. The dict values can be strings (for single valued headers) or lists (for multi-valued headers). If None is passed as value, the HTTP header will not be sent at all.
              • cookies (dict or list) –

                the request cookies. These can be sent in two forms.

                1. Using a dict:
                  request_with_cookies = Request(url="http://www.example.com",
                                                 cookies={'currency': 'USD', 'country': 'UY'})
                  
                2. Using a list of dicts:
                  request_with_cookies = Request(url="http://www.example.com",
                                                 cookies=[{'name': 'currency',
                                                          'value': 'USD',
                                                          'domain': 'example.com',
                                                          'path': '/currency'}])
                  

The latter form allows for customizing the domain and path attributes of the cookie. This is only useful if the cookies are saved for later requests.

                When some site returns cookies (in a response) those are stored in the cookies for that domain and will be sent again in future requests. That’s the typical behaviour of any regular web browser. However, if, for some reason, you want to avoid merging with existing cookies you can instruct Scrapy to do so by setting the dont_merge_cookies key in the Request.meta.

                Example of request without merging cookies:

                request_with_cookies = Request(url="http://www.example.com",
                                               cookies={'currency': 'USD', 'country': 'UY'},
                                               meta={'dont_merge_cookies': True})
                

                For more info see CookiesMiddleware.

              • encoding (string) – the encoding of this request (defaults to 'utf-8'). This encoding will be used to percent-encode the URL and to convert the body to str (if given as unicode).
              • priority (int) – the priority of this request (defaults to 0). The priority is used by the scheduler to define the order used to process requests.
• dont_filter (boolean) – indicates that this request should not be filtered by the scheduler. This is used when you want to perform an identical request multiple times, to ignore the duplicates filter. Use it with care, or you will get into crawling loops. Defaults to False.
              • errback (callable) – a function that will be called if any exception was raised while processing the request. This includes pages that failed with 404 HTTP errors and such. It receives a Twisted Failure instance as first parameter.
              url

              A string containing the URL of this request. Keep in mind that this attribute contains the escaped URL, so it can differ from the URL passed in the constructor.

              This attribute is read-only. To change the URL of a Request use replace().

              method

              A string representing the HTTP method in the request. This is guaranteed to be uppercase. Example: "GET", "POST", "PUT", etc

              headers

              A dictionary-like object which contains the request headers.

              body

              A str that contains the request body.

              This attribute is read-only. To change the body of a Request use replace().

              meta

              A dict that contains arbitrary metadata for this request. This dict is empty for new Requests, and is usually populated by different Scrapy components (extensions, middlewares, etc). So the data contained in this dict depends on the extensions you have enabled.

              See Request.meta special keys for a list of special meta keys recognized by Scrapy.

              This dict is shallow copied when the request is cloned using the copy() or replace() methods, and can also be accessed, in your spider, from the response.meta attribute.

              copy()

              Return a new Request which is a copy of this Request. See also: Passing additional data to callback functions.

              replace([url, method, headers, body, cookies, meta, encoding, dont_filter, callback, errback])

              Return a Request object with the same members, except for those members given new values by whichever keyword arguments are specified. The attribute Request.meta is copied by default (unless a new value is given in the meta argument). See also Passing additional data to callback functions.
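
              For example, a minimal sketch that re-issues the request of a response with a different callback, bypassing the duplicates filter (the callback name is illustrative):

              req = response.request.replace(callback=self.parse_retry, dont_filter=True)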

              Passing additional data to callback functions

              The callback of a request is a function that will be called when the response of that request is downloaded. The callback function will be called with the downloaded Response object as its first argument.

              Example:

              def parse_page1(self, response):
                  return Request("http://www.example.com/some_page.html",
                                 callback=self.parse_page2)
              
              def parse_page2(self, response):
                  # this would log http://www.example.com/some_page.html
                  self.log("Visited %s" % response.url)
              

              In some cases you may be interested in passing arguments to those callback functions so you can receive the arguments later, in the second callback. You can use the Request.meta attribute for that.

              Here’s an example of how to pass an item using this mechanism, to populate different fields from different pages:

              def parse_page1(self, response):
                  item = MyItem()
                  item['main_url'] = response.url
                  request = Request("http://www.example.com/some_page.html",
                                    callback=self.parse_page2)
                  request.meta['item'] = item
                  return request
              
              def parse_page2(self, response):
                  item = response.meta['item']
                  item['other_url'] = response.url
                  return item
              

              Request.meta special keys

              The Request.meta attribute can contain any arbitrary data, but there are some special keys recognized by Scrapy and its built-in extensions.

              Those are:

              bindaddress

              The outgoing IP address to use for performing the request.

              Request subclasses

              Here is the list of built-in Request subclasses. You can also subclass Request to implement your own custom functionality.

              FormRequest objects

              The FormRequest class extends the base Request with functionality for dealing with HTML forms. It uses lxml.html forms to pre-populate form fields with form data from Response objects.

              class scrapy.http.FormRequest(url[, formdata, ...])

              The FormRequest class adds a new argument to the constructor. The remaining arguments are the same as for the Request class and are not documented here.

              Parameters: formdata (dict or iterable of tuples) – is a dictionary (or iterable of (key, value) tuples) containing HTML Form data which will be url-encoded and assigned to the body of the request.
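
              For example, a sketch using an iterable of tuples instead of a dict, which allows repeated keys (the URL and callback are illustrative):

              return FormRequest(url="http://www.example.com/post/action",
                                 formdata=[('tag', 'python'), ('tag', 'scrapy')],
                                 callback=self.after_post)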

              The FormRequest objects support the following class method in addition to the standard Request methods:

              classmethod from_response(response[, formname=None, formnumber=0, formdata=None, formxpath=None, dont_click=False, ...])

              Returns a new FormRequest object with its form field values pre-populated with those found in the HTML <form> element contained in the given response. For an example see Using FormRequest.from_response() to simulate a user login.

              The policy is to automatically simulate a click, by default, on any form control that looks clickable, like an <input type="submit">. Even though this is quite convenient, and often the desired behaviour, sometimes it can cause problems which could be hard to debug. For example, when working with forms that are filled and/or submitted using javascript, the default from_response() behaviour may not be the most appropriate. To disable this behaviour you can set the dont_click argument to True. Also, if you want to change the control clicked (instead of disabling it) you can use the clickdata argument.

              Parameters:
              • response (Response object) – the response containing an HTML form which will be used to pre-populate the form fields
              • formname (string) – if given, the form with name attribute set to this value will be used.
              • formxpath (string) – if given, the first form that matches the xpath will be used.
              • formnumber (integer) – the number of the form to use, when the response contains multiple forms. The first one (and also the default) is 0.
              • formdata (dict) – fields to override in the form data. If a field was already present in the response <form> element, its value is overridden by the one passed in this parameter.
              • dont_click (boolean) – If True, the form data will be submitted without clicking on any element.

              The other parameters of this class method are passed directly to the FormRequest constructor.

              New in version 0.10.3: The formname parameter.

              New in version 0.17: The formxpath parameter.
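
              For instance, a sketch selecting a specific form by XPath (the XPath, field names and callback are illustrative):

              def parse(self, response):
                  return FormRequest.from_response(response,
                              formxpath='//form[@action="login.php"]',
                              formdata={'username': 'john', 'password': 'secret'},
                              callback=self.after_login)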

              Request usage examples

              Using FormRequest to send data via HTTP POST

              If you want to simulate an HTML form POST in your spider and send a couple of key-value fields, you can return a FormRequest object (from your spider) like this:

              return [FormRequest(url="http://www.example.com/post/action",
                                  formdata={'name': 'John Doe', 'age': '27'},
                                  callback=self.after_post)]
              

              Using FormRequest.from_response() to simulate a user login

              It is usual for web sites to provide pre-populated form fields through <input type="hidden"> elements, such as session-related data or authentication tokens (for login pages). When scraping, you’ll want these fields to be automatically pre-populated and only override a couple of them, such as the user name and password. You can use the FormRequest.from_response() method for this job. Here’s an example spider which uses it:

              class LoginSpider(Spider):
                  name = 'example.com'
                  start_urls = ['http://www.example.com/users/login.php']
              
                  def parse(self, response):
                      return [FormRequest.from_response(response,
                                  formdata={'username': 'john', 'password': 'secret'},
                                  callback=self.after_login)]
              
                  def after_login(self, response):
                      # check login succeeded before going on
                      if "authentication failed" in response.body:
                          self.log("Login failed", level=log.ERROR)
                          return
              
                      # continue scraping with authenticated session...
              

              Response objects

              class scrapy.http.Response(url[, status=200, headers, body, flags])

              A Response object represents an HTTP response, which is usually downloaded (by the Downloader) and fed to the Spiders for processing.

              Parameters:
              • url (string) – the URL of this response
              • headers (dict) – the headers of this response. The dict values can be strings (for single valued headers) or lists (for multi-valued headers).
              • status (integer) – the HTTP status of the response. Defaults to 200.
              • body (str) – the response body. It must be str, not unicode, unless you’re using an encoding-aware Response subclass, such as TextResponse.
              • meta (dict) – the initial values for the Response.meta attribute. If given, the dict will be shallow copied.
              • flags (list) – is a list containing the initial values for the Response.flags attribute. If given, the list will be shallow copied.
              url

              A string containing the URL of the response.

              This attribute is read-only. To change the URL of a Response use replace().

              status

              An integer representing the HTTP status of the response. Example: 200, 404.

              headers

              A dictionary-like object which contains the response headers.

              body

              A str containing the body of this Response. Keep in mind that Response.body is always a str. If you want the unicode version use TextResponse.body_as_unicode() (only available in TextResponse and subclasses).

              This attribute is read-only. To change the body of a Response use replace().

              request

              The Request object that generated this response. This attribute is assigned in the Scrapy engine, after the response and the request have passed through all Downloader Middlewares. In particular, this means that:

              • HTTP redirections will cause the original request (to the URL before redirection) to be assigned to the redirected response (with the final URL after redirection).
              • Response.request.url doesn’t always equal Response.url
              • This attribute is only available in the spider code, and in the Spider Middlewares, but not in Downloader Middlewares (although you have the Request available there by other means) or in handlers of the response_downloaded signal.
              meta

              A shortcut to the Request.meta attribute of the Response.request object (ie. self.request.meta).

              Unlike the Response.request attribute, the Response.meta attribute is propagated along redirects and retries, so you will get the original Request.meta sent from your spider.

              See also

              Request.meta attribute

              flags

              A list that contains flags for this response. Flags are labels used for tagging Responses. For example: 'cached', 'redirected', etc. And they’re shown on the string representation of the Response (__str__ method) which is used by the engine for logging.

              copy()

              Returns a new Response which is a copy of this Response.

              replace([url, status, headers, body, request, flags, cls])

              Returns a Response object with the same members, except for those members given new values by whichever keyword arguments are specified. The attribute Response.meta is copied by default.

              Response subclasses

              Here is the list of available built-in Response subclasses. You can also subclass the Response class to implement your own functionality.

              TextResponse objects

              class scrapy.http.TextResponse(url[, encoding[, ...]])

              TextResponse objects add encoding capabilities to the base Response class, which is meant to be used only for binary data, such as images, sounds or any media file.

              TextResponse objects support a new constructor argument, in addition to the base Response arguments. The remaining functionality is the same as for the Response class and is not documented here.

              Parameters: encoding (string) – is a string which contains the encoding to use for this response. If you create a TextResponse object with a unicode body, it will be encoded using this encoding (remember the body attribute is always a str). If encoding is None (default value), the encoding will be looked up in the response headers and body instead.

              TextResponse objects support the following attributes in addition to the standard Response ones:

              encoding

              A string with the encoding of this response. The encoding is resolved by trying the following mechanisms, in order:

              1. the encoding passed in the constructor encoding argument
              2. the encoding declared in the Content-Type HTTP header. If this encoding is not valid (ie. unknown), it is ignored and the next resolution mechanism is tried.
              3. the encoding declared in the response body. The TextResponse class doesn’t provide any special functionality for this. However, the HtmlResponse and XmlResponse classes do.
              4. the encoding inferred by looking at the response body. This is the most fragile method, but also the last one tried.

              TextResponse objects support the following methods in addition to the standard Response ones:

              body_as_unicode()

              Returns the body of the response as unicode. This is equivalent to:

              response.body.decode(response.encoding)
              

              But not equivalent to:

              unicode(response.body)
              

              Since, in the latter case, you would be using your system default encoding (typically ascii) to convert the body to unicode, instead of the response encoding.

              HtmlResponse objects

              class scrapy.http.HtmlResponse(url[, ...])

              The HtmlResponse class is a subclass of TextResponse which adds encoding auto-discovering support by looking into the HTML meta http-equiv attribute. See TextResponse.encoding.

              XmlResponse objects

              class scrapy.http.XmlResponse(url[, ...])

              The XmlResponse class is a subclass of TextResponse which adds encoding auto-discovering support by looking into the XML declaration line. See TextResponse.encoding.

              Item Exporters — Scrapy 0.22.0 documentation

              Item Exporters

              Once you have scraped your Items, you often want to persist or export those items, to use the data in some other application. That is, after all, the whole purpose of the scraping process.

              For this purpose Scrapy provides a collection of Item Exporters for different output formats, such as XML, CSV or JSON.

              Using Item Exporters

              If you are in a hurry, and just want to use an Item Exporter to output scraped data see the Feed exports. Otherwise, if you want to know how Item Exporters work or need more custom functionality (not covered by the default exports), continue reading below.

              In order to use an Item Exporter, you must instantiate it with its required args. Each Item Exporter requires different arguments, so check each exporter’s documentation in the Built-in Item Exporters reference. After you have instantiated your exporter, you have to:

              1. call the method start_exporting() in order to signal the beginning of the exporting process

              2. call the export_item() method for each item you want to export

              3. and finally call finish_exporting() to signal the end of the exporting process

              Here you can see an Item Pipeline which uses an Item Exporter to export scraped items to different files, one per spider:

              from scrapy import signals
              from scrapy.contrib.exporter import XmlItemExporter
              
              class XmlExportPipeline(object):
              
                  def __init__(self):
                      self.files = {}
              
                  @classmethod
                  def from_crawler(cls, crawler):
                      pipeline = cls()
                      crawler.signals.connect(pipeline.spider_opened, signals.spider_opened)
                      crawler.signals.connect(pipeline.spider_closed, signals.spider_closed)
                      return pipeline
              
                  def spider_opened(self, spider):
                      file = open('%s_products.xml' % spider.name, 'w+b')
                      self.files[spider] = file
                      self.exporter = XmlItemExporter(file)
                      self.exporter.start_exporting()
              
                  def spider_closed(self, spider):
                      self.exporter.finish_exporting()
                      file = self.files.pop(spider)
                      file.close()
              
                  def process_item(self, item, spider):
                      self.exporter.export_item(item)
                      return item

              Serialization of item fields

              By default, the field values are passed unmodified to the underlying serialization library, and the decision of how to serialize them is delegated to each particular serialization library.

              However, you can customize how each field value is serialized before it is passed to the serialization library.

              There are two ways to customize how a field will be serialized, which are described next.

              1. Declaring a serializer in the field

              You can declare a serializer in the field metadata. The serializer must be a callable which receives a value and returns its serialized form.

              Example:

              from scrapy.item import Item, Field
              
              def serialize_price(value):
                  return '$ %s' % str(value)
              
              class Product(Item):
                  name = Field()
                  price = Field(serializer=serialize_price)
              

              2. Overriding the serialize_field() method

              You can also override the serialize_field() method to customize how your field value will be exported.

              Make sure you call the base class serialize_field() method after your custom code.

              Example:

              from scrapy.contrib.exporter import XmlItemExporter
              
              class ProductXmlExporter(XmlItemExporter):
              
                  def serialize_field(self, field, name, value):
                      if name == 'price':
                          return '$ %s' % str(value)
                      return super(ProductXmlExporter, self).serialize_field(field, name, value)
              

              Built-in Item Exporters reference

              Here is a list of the Item Exporters bundled with Scrapy. Some of them contain output examples, which assume you’re exporting these two items:

              Item(name='Color TV', price='1200')
              Item(name='DVD player', price='200')
              

              BaseItemExporter

              class scrapy.contrib.exporter.BaseItemExporter(fields_to_export=None, export_empty_fields=False, encoding='utf-8')

              This is the (abstract) base class for all Item Exporters. It provides support for common features used by all (concrete) Item Exporters, such as defining what fields to export, whether to export empty fields, or which encoding to use.

              These features can be configured through the constructor arguments which populate their respective instance attributes: fields_to_export, export_empty_fields, encoding.

              export_item(item)

              Exports the given item. This method must be implemented in subclasses.

              serialize_field(field, name, value)

              Return the serialized value for the given field. You can override this method (in your custom Item Exporters) if you want to control how a particular field or value will be serialized/exported.

              By default, this method looks for a serializer declared in the item field and returns the result of applying that serializer to the value. If no serializer is found, it returns the value unchanged except for unicode values which are encoded to str using the encoding declared in the encoding attribute.

              Parameters:
              • field (Field object) – the field being serialized
              • name (str) – the name of the field being serialized
              • value – the value being serialized
              start_exporting()

              Signal the beginning of the exporting process. Some exporters may use this to generate some required header (for example, the XmlItemExporter). You must call this method before exporting any items.

              finish_exporting()

              Signal the end of the exporting process. Some exporters may use this to generate some required footer (for example, the XmlItemExporter). You must always call this method after you have no more items to export.

              fields_to_export

              A list with the name of the fields that will be exported, or None if you want to export all fields. Defaults to None.

              Some exporters (like CsvItemExporter) respect the order of the fields defined in this attribute.

              export_empty_fields

              Whether to include empty/unpopulated item fields in the exported data. Defaults to False. Some exporters (like CsvItemExporter) ignore this attribute and always export all empty fields.

              encoding

              The encoding that will be used to encode unicode values. This only affects unicode values (which are always serialized to str using this encoding). Other value types are passed unchanged to the specific serialization library.
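
              As a minimal sketch of configuring these options through a concrete exporter (the file name is illustrative, and items is a placeholder for your scraped items):

              from scrapy.contrib.exporter import CsvItemExporter

              f = open('products.csv', 'w+b')
              exporter = CsvItemExporter(f, fields_to_export=['name', 'price'])
              exporter.start_exporting()
              for item in items:  # items is a placeholder for your scraped items
                  exporter.export_item(item)
              exporter.finish_exporting()
              f.close()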

              XmlItemExporter

              class scrapy.contrib.exporter.XmlItemExporter(file, item_element='item', root_element='items', **kwargs)

              Exports Items in XML format to the specified file object.

              Parameters:
              • file – the file-like object to use for exporting the data.
              • root_element (str) – The name of the root element in the exported XML.
              • item_element (str) – The name of each item element in the exported XML.

              The additional keyword arguments of this constructor are passed to the BaseItemExporter constructor.

              A typical output of this exporter would be:

              <?xml version="1.0" encoding="utf-8"?>
              <items>
                <item>
                  <name>Color TV</name>
                  <price>1200</price>
                </item>
                <item>
                  <name>DVD player</name>
                  <price>200</price>
                </item>
              </items>
              

              Unless overridden in the serialize_field() method, multi-valued fields are exported by serializing each value inside a <value> element. This is for convenience, as multi-valued fields are very common.

              For example, the item:

              Item(name=['John', 'Doe'], age='23')
              

              Would be serialized as:

              <?xml version="1.0" encoding="utf-8"?>
              <items>
                <item>
                  <name>
                    <value>John</value>
                    <value>Doe</value>
                  </name>
                  <age>23</age>
                </item>
              </items>
              

              CsvItemExporter

              class scrapy.contrib.exporter.CsvItemExporter(file, include_headers_line=True, join_multivalued=', ', **kwargs)

              Exports Items in CSV format to the given file-like object. If the fields_to_export attribute is set, it will be used to define the CSV columns and their order. The export_empty_fields attribute has no effect on this exporter.

              Parameters:
              • file – the file-like object to use for exporting the data.
              • include_headers_line (boolean) – If enabled, makes the exporter output a header line with the field names taken from BaseItemExporter.fields_to_export or the fields of the first exported item.
              • join_multivalued – The char (or chars) that will be used for joining multi-valued fields, if found.

              The additional keyword arguments of this constructor are passed to the BaseItemExporter constructor, and the leftover arguments to the csv.writer constructor, so you can use any csv.writer constructor argument to customize this exporter.
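
              For example, a sketch passing a csv.writer argument through (delimiter is a standard csv.writer argument; the file object f is assumed to be open for writing):

              from scrapy.contrib.exporter import CsvItemExporter

              exporter = CsvItemExporter(f, include_headers_line=False, delimiter=';')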

              A typical output of this exporter would be:

              name,price
              Color TV,1200
              DVD player,200
              

              PickleItemExporter

              class scrapy.contrib.exporter.PickleItemExporter(file, protocol=0, **kwargs)

              Exports Items in pickle format to the given file-like object.

              Parameters:
              • file – the file-like object to use for exporting the data.
              • protocol (int) – The pickle protocol to use.

              For more information, refer to the pickle module documentation.

              The additional keyword arguments of this constructor are passed to the BaseItemExporter constructor.

              Pickle isn’t a human-readable format, so no output examples are provided.

              PprintItemExporter

              class scrapy.contrib.exporter.PprintItemExporter(file, **kwargs)

              Exports Items in pretty print format to the specified file object.

              Parameters: file – the file-like object to use for exporting the data.

              The additional keyword arguments of this constructor are passed to the BaseItemExporter constructor.

              A typical output of this exporter would be:

              {'name': 'Color TV', 'price': '1200'}
              {'name': 'DVD player', 'price': '200'}
              

              Longer lines (when present) are pretty-formatted.

              JsonItemExporter

              class scrapy.contrib.exporter.JsonItemExporter(file, **kwargs)

              Exports Items in JSON format to the specified file-like object, writing all objects as a list of objects. The additional constructor arguments are passed to the BaseItemExporter constructor, and the leftover arguments to the JSONEncoder constructor, so you can use any JSONEncoder constructor argument to customize this exporter.

              Parameters: file – the file-like object to use for exporting the data.

              A typical output of this exporter would be:

              [{"name": "Color TV", "price": "1200"},
              {"name": "DVD player", "price": "200"}]
              

              Warning

              JSON is a very simple and flexible serialization format, but it doesn’t scale well for large amounts of data since incremental (aka. stream-mode) parsing is not well supported (if at all) among JSON parsers (in any language), and most of them just parse the entire object in memory. If you want the power and simplicity of JSON with a more stream-friendly format, consider using JsonLinesItemExporter instead, or splitting the output in multiple chunks.

              JsonLinesItemExporter

              class scrapy.contrib.exporter.JsonLinesItemExporter(file, **kwargs)

              Exports Items in JSON format to the specified file-like object, writing one JSON-encoded item per line. The additional constructor arguments are passed to the BaseItemExporter constructor, and the leftover arguments to the JSONEncoder constructor, so you can use any JSONEncoder constructor argument to customize this exporter.

              Parameters: file – the file-like object to use for exporting the data.

              A typical output of this exporter would be:

              {"name": "Color TV", "price": "1200"}
              {"name": "DVD player", "price": "200"}
              

              Unlike the one produced by JsonItemExporter, the format produced by this exporter is well suited for serializing large amounts of data.

              Spider Middleware — Scrapy 0.22.0 documentation

              Spider Middleware

              The spider middleware is a framework of hooks into Scrapy’s spider processing mechanism where you can plug custom functionality to process the responses that are sent to Spiders for processing and to process the requests and items that are generated from spiders.

              Activating a spider middleware

              To activate a spider middleware component, add it to the SPIDER_MIDDLEWARES setting, which is a dict whose keys are the middleware class paths and whose values are the middleware orders.

              Here’s an example:

              SPIDER_MIDDLEWARES = {
                  'myproject.middlewares.CustomSpiderMiddleware': 543,
              }
              

              The SPIDER_MIDDLEWARES setting is merged with the SPIDER_MIDDLEWARES_BASE setting defined in Scrapy (and not meant to be overridden) and then sorted by order to get the final sorted list of enabled middlewares: the first middleware is the one closer to the engine and the last is the one closer to the spider.

              To decide which order to assign to your middleware see the SPIDER_MIDDLEWARES_BASE setting and pick a value according to where you want to insert the middleware. The order does matter because each middleware performs a different action and your middleware could depend on some previous (or subsequent) middleware being applied.

              If you want to disable a built-in middleware (the ones defined in SPIDER_MIDDLEWARES_BASE, and enabled by default) you must define it in your project SPIDER_MIDDLEWARES setting and assign None as its value. For example, if you want to disable the off-site middleware:

              SPIDER_MIDDLEWARES = {
                  'myproject.middlewares.CustomSpiderMiddleware': 543,
                  'scrapy.contrib.spidermiddleware.offsite.OffsiteMiddleware': None,
              }
              

              Finally, keep in mind that some middlewares may need to be enabled through a particular setting. See each middleware documentation for more info.

              Writing your own spider middleware

              Writing your own spider middleware is easy. Each middleware component is a single Python class that defines one or more of the following methods:

              class scrapy.contrib.spidermiddleware.SpiderMiddleware
              process_spider_input(response, spider)

              This method is called for each response that goes through the spider middleware and into the spider, for processing.

              process_spider_input() should return None or raise an exception.

              If it returns None, Scrapy will continue processing this response, executing all other middlewares until, finally, the response is handed to the spider for processing.

              If it raises an exception, Scrapy won’t bother calling any other spider middleware process_spider_input() and will call the request errback. The output of the errback is chained back in the other direction for process_spider_output() to process it, or process_spider_exception() if it raised an exception.

              Parameters:
              • response (Response object) – the response being processed
              • spider (Spider object) – the spider for which this response is intended
              process_spider_output(response, result, spider)

              This method is called with the results returned from the Spider, after it has processed the response.

              process_spider_output() must return an iterable of Request or Item objects.

              Parameters:
              • response (Response object) – the response which generated this output from the spider
              • result (an iterable of Request or Item objects) – the result returned by the spider
              • spider (Spider object) – the spider whose result is being processed
              process_spider_exception(response, exception, spider)

              This method is called when a spider or process_spider_input() method (from other spider middleware) raises an exception.

              process_spider_exception() should return either None or an iterable of Response or Item objects.

              If it returns None, Scrapy will continue processing this exception, executing any other process_spider_exception() in the following middleware components, until no middleware components are left and the exception reaches the engine (where it’s logged and discarded).

              If it returns an iterable the process_spider_output() pipeline kicks in, and no other process_spider_exception() will be called.

              Parameters:
              • response (Response object) – the response being processed when the exception was raised
              • exception (Exception object) – the exception raised
              • spider (scrapy.spider.Spider object) – the spider which raised the exception
              process_start_requests(start_requests, spider)

              New in version 0.15.

              This method is called with the start requests of the spider, and works similarly to the process_spider_output() method, except that it doesn’t have a response associated and must return only requests (not items).

              It receives an iterable (in the start_requests parameter) and must return another iterable of Request objects.

              Note

              When implementing this method in your spider middleware, you should always return an iterable (that follows the input one) and avoid consuming the whole start_requests iterator, because it can be very large (or even unbounded) and cause a memory overflow. The Scrapy engine is designed to pull start requests while it has capacity to process them, so the start requests iterator can be effectively endless when there is some other condition for stopping the spider (like a time limit or item/page count).

              Parameters:
              • start_requests (an iterable of Request) – the start requests
              • spider (Spider object) – the spider to whom the start requests belong
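
              As an illustration, here is a hypothetical middleware sketch that drops certain requests via process_spider_output() (the class name and URL check are illustrative):

              from scrapy.http import Request

              class UrlFilterSpiderMiddleware(object):

                  def process_spider_output(self, response, result, spider):
                      # pass items through untouched; drop requests whose
                      # URL contains 'logout'
                      for x in result:
                          if isinstance(x, Request) and 'logout' in x.url:
                              continue
                          yield x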

              Built-in spider middleware reference

              This page describes all spider middleware components that come with Scrapy. For information on how to use them and how to write your own spider middleware, see the spider middleware usage guide.

              For a list of the components enabled by default (and their orders) see the SPIDER_MIDDLEWARES_BASE setting.

              DepthMiddleware

              class scrapy.contrib.spidermiddleware.depth.DepthMiddleware

              DepthMiddleware is a spider middleware used for tracking the depth of each Request inside the site being scraped. It can be used, among other things, to limit the maximum depth to scrape.

              The DepthMiddleware can be configured through the following settings (see the settings documentation for more info):

              • DEPTH_LIMIT - The maximum depth that will be allowed to crawl for any site. If zero, no limit will be imposed.
              • DEPTH_STATS - Whether to collect depth stats.
              • DEPTH_PRIORITY - Whether to prioritize the requests based on their depth.

              HttpErrorMiddleware

              class scrapy.contrib.spidermiddleware.httperror.HttpErrorMiddleware

              Filter out unsuccessful (erroneous) HTTP responses so that spiders don’t have to deal with them, which (most of the time) imposes an overhead, consumes more resources, and makes the spider logic more complex.

              According to the HTTP standard, successful responses are those whose status codes are in the 200-300 range.

              If you still want to process response codes outside that range, you can specify which response codes the spider is able to handle using the handle_httpstatus_list spider attribute or HTTPERROR_ALLOWED_CODES setting.

              For example, if you want your spider to handle 404 responses you can do this:

              class MySpider(CrawlSpider):
                  handle_httpstatus_list = [404]
              

              The handle_httpstatus_list key of Request.meta can also be used to specify which response codes to allow on a per-request basis.
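
              For example, a sketch allowing 404 responses for a single request, inside a spider callback (the URL and callback name are illustrative, and Request is assumed to be imported from scrapy.http):

              yield Request("http://www.example.com/missing.html",
                            meta={'handle_httpstatus_list': [404]},
                            callback=self.parse_missing)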

              Keep in mind, however, that it’s usually a bad idea to handle non-200 responses, unless you really know what you’re doing.

              For more information see: HTTP Status Code Definitions.

              HttpErrorMiddleware settings

              HTTPERROR_ALLOWED_CODES

              Default: []

              Pass all responses with non-200 status codes contained in this list.

              HTTPERROR_ALLOW_ALL

              Default: False

              Pass all responses, regardless of their status codes.

              OffsiteMiddleware

              class scrapy.contrib.spidermiddleware.offsite.OffsiteMiddleware

              Filters out Requests for URLs outside the domains covered by the spider.

              This middleware filters out every request whose hostname isn’t in the spider’s allowed_domains attribute.

              When your spider returns a request for a domain not belonging to those covered by the spider, this middleware will log a debug message similar to this one:

              DEBUG: Filtered offsite request to 'www.othersite.com': <GET http://www.othersite.com/some/page.html>

              To avoid filling the log with too much noise, it will only print one of these messages for each new domain filtered. So, for example, if another request for www.othersite.com is filtered, no log message will be printed. But if a request for someothersite.com is filtered, a message will be printed (but only for the first request filtered).

              If the spider doesn’t define an allowed_domains attribute, or the attribute is empty, the offsite middleware will allow all requests.

              If the request has the dont_filter attribute set, the offsite middleware will allow the request even if its domain is not listed in allowed domains.
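
              As a minimal sketch (the domain is illustrative), requests to any host outside example.com and its subdomains would be filtered:

              from scrapy.spider import Spider

              class MySpider(Spider):
                  name = 'example'
                  allowed_domains = ['example.com']
                  start_urls = ['http://www.example.com/']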

              RefererMiddleware

              class scrapy.contrib.spidermiddleware.referer.RefererMiddleware

              Populates Request Referer header, based on the URL of the Response which generated it.

              RefererMiddleware settings

              REFERER_ENABLED

              New in version 0.15.

              Default: True

              Whether to enable referer middleware.

              UrlLengthMiddleware

              class scrapy.contrib.spidermiddleware.urllength.UrlLengthMiddleware

              Filters out requests with URLs longer than URLLENGTH_LIMIT.

              The UrlLengthMiddleware can be configured through the following settings (see the settings documentation for more info):

              • URLLENGTH_LIMIT - The maximum URL length to allow for crawled URLs.

              Spiders Contracts — Scrapy 0.22.0 documentation

              Spiders Contracts

              New in version 0.15.

              Note

              This is a new feature (introduced in Scrapy 0.15) and may be subject to minor functionality/API updates. Check the release notes to be notified of updates.

              Testing spiders can get particularly annoying and, while nothing prevents you from writing unit tests, the task gets cumbersome quickly. Scrapy offers an integrated way of testing your spiders by means of contracts.

              This allows you to test each callback of your spider by hardcoding a sample url and checking various constraints for how the callback processes the response. Each contract is prefixed with an @ and included in the docstring. See the following example:

              def parse(self, response):
                  """ This function parses a sample response. Some contracts are mingled
                  with this docstring.
              
                  @url http://www.amazon.com/s?field-keywords=selfish+gene
                  @returns items 1 16
                  @returns requests 0 0
                  @scrapes Title Author Year Price
                  """
              

              This callback is tested using three built-in contracts:

              class scrapy.contracts.default.UrlContract

              This contract (@url) sets the sample url used when checking other contract conditions for this spider. This contract is mandatory. All callbacks lacking this contract are ignored when running the checks:

              @url url

              class scrapy.contracts.default.ReturnsContract

              This contract (@returns) sets lower and upper bounds for the items and requests returned by the spider. The upper bound is optional:

              @returns item(s)|request(s) [min [max]]

              class scrapy.contracts.default.ScrapesContract

              This contract (@scrapes) checks that all the items returned by the callback have the specified fields:

              @scrapes field_1 field_2 ...

              Use the check command to run the contract checks.

              Custom Contracts

              If you find you need more power than the built-in scrapy contracts you can create and load your own contracts in the project by using the SPIDER_CONTRACTS setting:

              SPIDER_CONTRACTS = {
                  'myproject.contracts.ResponseCheck': 10,
                  'myproject.contracts.ItemValidate': 10,
              }
              

              Each contract must inherit from scrapy.contracts.Contract and can override three methods:

              class scrapy.contracts.Contract(method, *args)
              Parameters:
              • method (function) – callback function to which the contract is associated
              • args (list) – list of arguments passed into the docstring (whitespace separated)
              adjust_request_args(args)

              This receives a dict as an argument containing default arguments for the Request object. It must return the same dict, or a modified version of it.
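
              For instance, a hypothetical contract sketch using adjust_request_args to disable cookie merging for the sample request (the contract name and meta key choice are illustrative):

              from scrapy.contracts import Contract

              class NoMergeCookiesContract(Contract):
                  """ Demo contract which disables cookie merging for the sample request
                      @no_merge_cookies
                  """

                  name = 'no_merge_cookies'

                  def adjust_request_args(self, args):
                      # args holds the default keyword arguments for the Request
                      args['meta'] = {'dont_merge_cookies': True}
                      return args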

              pre_process(response)

              This allows hooking in various checks on the response received from the sample request, before it is passed to the callback.

              post_process(output)

              This allows processing the output of the callback. Iterators are listified before being passed to this hook.

              Here is a demo contract which checks the presence of a custom header in the response received. Raise scrapy.exceptions.ContractFail in order to get the failures pretty printed:

              from scrapy.contracts import Contract
              from scrapy.exceptions import ContractFail
              
              class HasHeaderContract(Contract):
                  """ Demo contract which checks the presence of a custom header
                      @has_header X-CustomHeader
                  """
              
                  name = 'has_header'
              
                  def pre_process(self, response):
                      for header in self.args:
                          if header not in response.headers:
                              raise ContractFail('%s not present' % header)
              
              Feed exports — Scrapy 0.22.0 documentation

              Feed exports

              New in version 0.10.

              One of the most frequently required features when implementing scrapers is being able to store the scraped data properly and, quite often, that means generating an “export file” with the scraped data (commonly called “export feed”) to be consumed by other systems.

              Scrapy provides this functionality out of the box with the Feed Exports, which allow you to generate a feed with the scraped items, using multiple serialization formats and storage backends.

              Serialization formats

              For serializing the scraped data, the feed exports use the Item exporters and these formats are supported out of the box:

              But you can also extend the supported formats through the FEED_EXPORTERS setting.

              JSON

              • FEED_FORMAT: json
              • Exporter used: JsonItemExporter

              JSON lines

              • FEED_FORMAT: jsonlines
              • Exporter used: JsonLinesItemExporter

              CSV

              • FEED_FORMAT: csv
              • Exporter used: CsvItemExporter

              XML

              • FEED_FORMAT: xml
              • Exporter used: XmlItemExporter

              Pickle

              • FEED_FORMAT: pickle
              • Exporter used: PickleItemExporter

              Marshal

              • FEED_FORMAT: marshal
              • Exporter used: MarshalItemExporter

              Storages

              When using the feed exports you define where to store the feed using a URI (through the FEED_URI setting). The feed exports support multiple storage backend types which are defined by the URI scheme.

              The storage backends supported out of the box are:

              • Local filesystem
              • FTP
              • S3 (requires the boto library)
              • Standard output

              Some storage backends may be unavailable if the required external libraries are not available. For example, the S3 backend is only available if the boto library is installed.

              Storage URI parameters

              The storage URI can also contain parameters that get replaced when the feed is being created. These parameters are:

              • %(time)s - gets replaced by a timestamp when the feed is being created
              • %(name)s - gets replaced by the spider name

              Any other named parameter gets replaced by the spider attribute of the same name. For example, %(site_id)s would get replaced by the spider.site_id attribute the moment the feed is being created.

              Here are some examples to illustrate:

              • Store in FTP using one directory per spider:
                • ftp://user:password@ftp.example.com/scraping/feeds/%(name)s/%(time)s.json
              • Store in S3 using one directory per spider:
                • s3://mybucket/scraping/feeds/%(name)s/%(time)s.json

              Storage backends

              Local filesystem

              The feeds are stored in the local filesystem.

              • URI scheme: file
              • Example URI: file:///tmp/export.csv
              • Required external libraries: none

              Note that for the local filesystem storage (only) you can omit the scheme if you specify an absolute path like /tmp/export.csv. This only works on Unix systems though.

              FTP

              The feeds are stored on an FTP server.

              • URI scheme: ftp
              • Example URI: ftp://user:pass@ftp.example.com/path/to/export.csv
              • Required external libraries: none

              S3

              The feeds are stored on Amazon S3.

              • URI scheme: s3
              • Example URIs:
                • s3://mybucket/path/to/export.csv
                • s3://aws_key:aws_secret@mybucket/path/to/export.csv
              • Required external libraries: boto

              The AWS credentials can be passed as user/password in the URI, or they can be passed through the following settings:

              • AWS_ACCESS_KEY_ID
              • AWS_SECRET_ACCESS_KEY

              Standard output

              The feeds are written to the standard output of the Scrapy process.

              • URI scheme: stdout
              • Example URI: stdout:
              • Required external libraries: none

              Settings

              These are the settings used for configuring the feed exports:

              FEED_URI

              Default: None

              The URI of the export feed. See Storage backends for supported URI schemes.

              This setting is required for enabling the feed exports.

              FEED_FORMAT

              The serialization format to be used for the feed. See Serialization formats for possible values.

              FEED_STORE_EMPTY

              Default: False

              Whether to export empty feeds (ie. feeds with no items).

              FEED_STORAGES

              Default: {}

              A dict containing additional feed storage backends supported by your project. The keys are URI schemes and the values are paths to storage classes.
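
              For example, a sketch registering a hypothetical storage backend for an sftp URI scheme (the class path is illustrative):

              FEED_STORAGES = {
                  'sftp': 'myproject.storages.SFTPFeedStorage',
              }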

              FEED_STORAGES_BASE

              Default:

              {
                  '': 'scrapy.contrib.feedexport.FileFeedStorage',
                  'file': 'scrapy.contrib.feedexport.FileFeedStorage',
                  'stdout': 'scrapy.contrib.feedexport.StdoutFeedStorage',
                  's3': 'scrapy.contrib.feedexport.S3FeedStorage',
                  'ftp': 'scrapy.contrib.feedexport.FTPFeedStorage',
              }
              

              A dict containing the built-in feed storage backends supported by Scrapy.

              FEED_EXPORTERS

              Default: {}

              A dict containing additional exporters supported by your project. The keys are URI schemes and the values are paths to Item exporter classes.

              FEED_EXPORTERS_BASE

              Default:

              FEED_EXPORTERS_BASE = {
                  'json': 'scrapy.contrib.exporter.JsonItemExporter',
                  'jsonlines': 'scrapy.contrib.exporter.JsonLinesItemExporter',
                  'csv': 'scrapy.contrib.exporter.CsvItemExporter',
                  'xml': 'scrapy.contrib.exporter.XmlItemExporter',
                  'marshal': 'scrapy.contrib.exporter.MarshalItemExporter',
              }
              

              A dict containing the built-in feed exporters supported by Scrapy.

              Jobs: pausing and resuming crawls — Scrapy 0.22.0 documentation

              Jobs: pausing and resuming crawls

              Sometimes, for big sites, it’s desirable to pause crawls and be able to resume them later.

              Scrapy supports this functionality out of the box by providing the following facilities:

              • a scheduler that persists scheduled requests on disk
              • a duplicates filter that persists visited requests on disk
              • an extension that keeps some spider state (key/value pairs) persistent between batches

              Job directory

              To enable persistence support you just need to define a job directory through the JOBDIR setting. This directory will be used for storing all required data to keep the state of a single job (ie. a spider run). It’s important to note that this directory must not be shared by different spiders, or even different jobs/runs of the same spider, as it’s meant to be used for storing the state of a single job.

              How to use it

              To start a spider with persistence support enabled, run it like this:

              scrapy crawl somespider -s JOBDIR=crawls/somespider-1

              Then, you can stop the spider safely at any time (by pressing Ctrl-C or sending a signal), and resume it later by issuing the same command:

              scrapy crawl somespider -s JOBDIR=crawls/somespider-1

              Keeping persistent state between batches

              Sometimes you’ll want to keep some persistent spider state between pause/resume batches. You can use the spider.state attribute for that, which should be a dict. There’s a built-in extension that takes care of serializing, storing and loading that attribute from the job directory, when the spider starts and stops.

              Here’s an example of a callback that uses the spider state (other spider code is omitted for brevity):

              def parse_item(self, response):
                  # parse item here
                  self.state['items_count'] = self.state.get('items_count', 0) + 1
              

              Persistence gotchas

              There are a few things to keep in mind if you want to be able to use the Scrapy persistence support:

              Cookies expiration

              Cookies may expire. So, if you don’t resume your spider quickly the requests scheduled may no longer work. This won’t be an issue if your spider doesn’t rely on cookies.

              Request serialization

              In order for persistence to work, requests must be serializable by the pickle module, so you should make sure that your requests are serializable.

              The most common issue here is using lambda functions as request callbacks, since they can’t be persisted.

              So, for example, this won’t work:

              def some_callback(self, response):
                  somearg = 'test'
                  return Request('http://www.example.com', callback=lambda r: self.other_callback(r, somearg))
              
              def other_callback(self, response, somearg):
                  print "the argument passed is:", somearg
              

              But this will:

              def some_callback(self, response):
                  somearg = 'test'
                  return Request('http://www.example.com', meta={'somearg': somearg})
              
              def other_callback(self, response):
                  somearg = response.meta['somearg']
                  print "the argument passed is:", somearg
              
              Debugging Spiders — Scrapy 0.22.0 documentation

              Debugging Spiders

              This document explains the most common techniques for debugging spiders. Consider the following Scrapy spider:

              class MySpider(Spider):
                  name = 'myspider'
                  start_urls = (
                      'http://example.com/page1',
                      'http://example.com/page2',
                      )
              
                  def parse(self, response):
                      # collect `item_urls`
                      for item_url in item_urls:
                          yield Request(url=item_url, callback=self.parse_item)
              
                  def parse_item(self, response):
                      item = MyItem()
                      # populate `item` fields
                      yield Request(url=item_details_url, meta={'item': item},
                          callback=self.parse_details)
              
                  def parse_details(self, response):
                      item = response.meta['item']
                      # populate more `item` fields
                      return item
              

              Basically this is a simple spider which parses two pages of items (the start_urls). Items also have a details page with additional information, so we use the meta functionality of Request to pass a partially populated item.

              Parse Command

The most basic way of checking the output of your spider is to use the parse command. It allows you to check the behaviour of different parts of the spider at the method level. It has the advantage of being flexible and simple to use, but it does not allow debugging code inside a method.

              In order to see the item scraped from a specific url:

              $ scrapy parse --spider=myspider -c parse_item -d 2 <item_url>
              [ ... scrapy log lines crawling example.com spider ... ]
              
              >>> STATUS DEPTH LEVEL 2 <<<
              # Scraped Items  ------------------------------------------------------------
              [{'url': <item_url>}]
              
              # Requests  -----------------------------------------------------------------
              []

              Using the --verbose or -v option we can see the status at each depth level:

              $ scrapy parse --spider=myspider -c parse_item -d 2 -v <item_url>
              [ ... scrapy log lines crawling example.com spider ... ]
              
              >>> DEPTH LEVEL: 1 <<<
              # Scraped Items  ------------------------------------------------------------
              []
              
              # Requests  -----------------------------------------------------------------
              [<GET item_details_url>]
              
              
              >>> DEPTH LEVEL: 2 <<<
              # Scraped Items  ------------------------------------------------------------
              [{'url': <item_url>}]
              
              # Requests  -----------------------------------------------------------------
              []

Checking items scraped from a single start_url can also be easily achieved using:

              $ scrapy parse --spider=myspider -d 3 'http://example.com/page1'

              Scrapy Shell

While the parse command is very useful for checking the behaviour of a spider, it is of little help for checking what happens inside a callback, besides showing the response received and the output. How do you debug the situation when parse_details sometimes receives no item?

              Fortunately, the shell is your bread and butter in this case (see Invoking the shell from spiders to inspect responses):

              from scrapy.shell import inspect_response
              
              def parse_details(self, response):
                  item = response.meta.get('item', None)
                  if item:
                      # populate more `item` fields
                      return item
                  else:
                      inspect_response(response, self)
              

              See also: Invoking the shell from spiders to inspect responses.

              Open in browser

Sometimes you just want to see how a certain response looks in a browser; you can use the open_in_browser function for that. Here is an example of how you would use it:

              from scrapy.utils.response import open_in_browser
              
              def parse_details(self, response):
                  if "item name" not in response.body:
                      open_in_browser(response)
              

              open_in_browser will open a browser with the response received by Scrapy at that point, adjusting the base tag so that images and styles are displayed properly.

              Logging

              Logging is another useful option for getting information about your spider run. Although not as convenient, it comes with the advantage that the logs will be available in all future runs should they be necessary again:

              from scrapy import log
              
              def parse_details(self, response):
                  item = response.meta.get('item', None)
                  if item:
                      # populate more `item` fields
                      return item
                  else:
                      self.log('No item received for %s' % response.url,
                          level=log.WARNING)
              

              For more information, check the Logging section.

Using Firebug for scraping — Scrapy 0.22.0 documentation

              Using Firebug for scraping

              Note

Google Directory, the example website used in this guide, is no longer available as it has been shut down by Google. The concepts in this guide are still valid though. If you want to update this guide to use a new (working) site, your contribution will be more than welcome! See Contributing to Scrapy for information on how to do so.

              Introduction

              This document explains how to use Firebug (a Firefox add-on) to make the scraping process easier and more fun. For other useful Firefox add-ons see Useful Firefox add-ons for scraping. There are some caveats with using Firefox add-ons to inspect pages, see Caveats with inspecting the live browser DOM.

              In this example, we’ll show how to use Firebug to scrape data from the Google Directory, which contains the same data as the Open Directory Project used in the tutorial but with a different face.

              Firebug comes with a very useful feature called Inspect Element which allows you to inspect the HTML code of the different page elements just by hovering your mouse over them. Otherwise you would have to search for the tags manually through the HTML body which can be a very tedious task.

              In the following screenshot you can see the Inspect Element tool in action.

              Inspecting elements with Firebug

At first sight, we can see that the directory is divided into categories, which are also divided into subcategories.

              However, it seems that there are more subcategories than the ones being shown in this page, so we’ll keep looking:

              Inspecting elements with Firebug

              As expected, the subcategories contain links to other subcategories, and also links to actual websites, which is the purpose of the directory.

              Extracting the data

              Now we’re going to write the code to extract data from those pages.

With the help of Firebug, we'll take a look at some page containing links to websites (say http://directory.google.com/Top/Arts/Awards/) and find out how we can extract those links using Selectors. We'll also use the Scrapy shell to test those XPaths and make sure they work as we expect.

              Inspecting elements with Firebug

As you can see, the page markup is not very descriptive: the elements don't contain id, class or any attribute that clearly identifies them, so we'll use the ranking bars as a reference point to select the data to extract when we construct our XPaths.

After using Firebug, we can see that each link is inside a td tag, which is itself inside a tr tag that also contains the link's ranking bar (in another td).

              So we can select the ranking bar, then find its parent (the tr), and then finally, the link’s td (which contains the data we want to scrape).

              This results in the following XPath:

              //td[descendant::a[contains(@href, "#pagerank")]]/following-sibling::td//a

              It’s important to use the Scrapy shell to test these complex XPath expressions and make sure they work as expected.

Basically, that expression will look for the ranking bar's td element, and then select any following td element that has a descendant a element whose href attribute contains the string #pagerank.

Of course, this is not the only XPath, and maybe not the simplest one to select that data. Another approach could be, for example, to find any font tags that have the grey colour of the links.

              Finally, we can write our parse_category() method:

              def parse_category(self, response):
                  sel = Selector(response)
              
                  # The path to website links in directory page
                  links = sel.xpath('//td[descendant::a[contains(@href, "#pagerank")]]/following-sibling::td/font')
              
                  for link in links:
                      item = DirectoryItem()
                      item['name'] = link.xpath('a/text()').extract()
                      item['url'] = link.xpath('a/@href').extract()
                      item['description'] = link.xpath('font[2]/text()').extract()
                      yield item
              

Be aware that you may find some elements which appear in Firebug but not in the original HTML, such as the typical case of <tbody> elements, or tags which differ from those in the page HTML source, since Firebug inspects the live DOM.

Telnet Console — Scrapy 0.22.0 documentation

              Telnet Console

Scrapy comes with a built-in telnet console for inspecting and controlling a running Scrapy process. The telnet console is just a regular python shell running inside the Scrapy process, so you can do literally anything from it.

              The telnet console is a built-in Scrapy extension which comes enabled by default, but you can also disable it if you want. For more information about the extension itself see Telnet console extension.

              How to access the telnet console

The telnet console listens on the TCP port defined in the TELNETCONSOLE_PORT setting, which defaults to 6023. To access the console you need to type:

              telnet localhost 6023
              >>>
              

You need the telnet program, which comes installed by default on Windows and most Linux distros.

              Available variables in the telnet console

              The telnet console is like a regular Python shell running inside the Scrapy process, so you can do anything from it including importing new modules, etc.

              However, the telnet console comes with some default variables defined for convenience:

Shortcut: Description
crawler: the Scrapy Crawler (scrapy.crawler.Crawler object)
engine: the Crawler.engine attribute
spider: the active spider
slot: the engine slot
extensions: the Extension Manager (Crawler.extensions attribute)
stats: the Stats Collector (Crawler.stats attribute)
settings: the Scrapy settings object (Crawler.settings attribute)
est: print a report of the engine status
prefs: for memory debugging (see Debugging memory leaks)
p: a shortcut to the pprint.pprint function
hpy: for memory debugging (see Debugging memory leaks)

              Telnet console usage examples

              Here are some example tasks you can do with the telnet console:

              View engine status

You can use the est() shortcut to quickly show the engine state from the telnet console:

              telnet localhost 6023
              >>> est()
              Execution engine status
              
              time()-engine.start_time                        : 9.24237799644
              engine.has_capacity()                           : False
              engine.downloader.is_idle()                     : False
              len(engine.downloader.slots)                    : 2
              len(engine.downloader.active)                   : 16
              engine.scraper.is_idle()                        : False
              
              Spider: <GayotSpider 'gayotcom' at 0x2dc2b10>
                engine.spider_is_idle(spider)                      : False
                engine.slots[spider].closing                       : False
                len(engine.slots[spider].inprogress)               : 21
                len(engine.slots[spider].scheduler.dqs or [])      : 0
                len(engine.slots[spider].scheduler.mqs)            : 4453
                len(engine.scraper.slot.queue)                     : 0
                len(engine.scraper.slot.active)                    : 5
                engine.scraper.slot.active_size                    : 1515069
                engine.scraper.slot.itemproc_size                  : 0
                engine.scraper.slot.needs_backout()                : False
              

              Pause, resume and stop the Scrapy engine

              To pause:

              telnet localhost 6023
              >>> engine.pause()
              >>>
              

              To resume:

              telnet localhost 6023
              >>> engine.unpause()
              >>>
              

              To stop:

              telnet localhost 6023
              >>> engine.stop()
              Connection closed by foreign host.
              

              Telnet Console signals

              scrapy.telnet.update_telnet_vars(telnet_vars)

              Sent just before the telnet console is opened. You can hook up to this signal to add, remove or update the variables that will be available in the telnet local namespace. In order to do that, you need to update the telnet_vars dict in your handler.

Parameters: telnet_vars (dict) – the dict of telnet variables
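
For example, a minimal extension sketch that hooks this signal to expose an extra shortcut in the console (the extension class and variable name here are illustrative, not part of Scrapy):

from scrapy.telnet import update_telnet_vars

class TelnetVarsExtension(object):
    """Hypothetical extension adding a custom telnet shortcut."""

    def __init__(self, crawler):
        # connect to the signal documented above
        crawler.signals.connect(self.add_vars, signal=update_telnet_vars)

    @classmethod
    def from_crawler(cls, crawler):
        return cls(crawler)

    def add_vars(self, telnet_vars):
        # update the dict in place; new keys become names in the console
        telnet_vars['answer'] = 42

The extension would then be enabled through the EXTENSIONS setting.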

              Telnet settings

              These are the settings that control the telnet console’s behaviour:

              TELNETCONSOLE_PORT

              Default: [6023, 6073]

              The port range to use for the telnet console. If set to None or 0, a dynamically assigned port is used.

              TELNETCONSOLE_HOST

              Default: '0.0.0.0'

The interface the telnet console should listen on.
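
For instance, a project that wants a loopback-only console pinned to one port could override both settings in its settings.py. This is a sketch, assuming the port range is interpreted as [min, max]:

# settings.py
TELNETCONSOLE_HOST = '127.0.0.1'   # listen on the loopback interface only
TELNETCONSOLE_PORT = [6023, 6023]  # pin the console to a single port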

Ubuntu packages — Scrapy 0.22.0 documentation

              Ubuntu packages

              New in version 0.10.

Scrapinghub publishes apt-gettable packages which are generally fresher than those in Ubuntu, and more stable too, since they're continuously built from the GitHub repo (master & stable branches) and so they contain the latest bug fixes.

              To use the packages, just add the following line to your /etc/apt/sources.list, and then run aptitude update and apt-get install scrapy-0.18:

              deb http://archive.scrapy.org/ubuntu DISTRO main

Replacing DISTRO with the name of your Ubuntu release, which you can get with the command:

              lsb_release -cs
              

              Supported Ubuntu releases are: precise, quantal, raring.

              For Ubuntu Raring (13.04):

              deb http://archive.scrapy.org/ubuntu raring main

              For Ubuntu Quantal (12.10):

              deb http://archive.scrapy.org/ubuntu quantal main

              For Ubuntu Precise (12.04):

              deb http://archive.scrapy.org/ubuntu precise main

              Warning

              Please note that these packages are updated frequently, and so if you find you can’t download the packages, try updating your apt package lists first, e.g., with apt-get update or aptitude update.

The public GPG key used to sign these packages can be imported into your APT keyring as follows:

              curl -s http://archive.scrapy.org/ubuntu/archive.key | sudo apt-key add -
Common Practices — Scrapy 0.22.0 documentation

              Common Practices

              This section documents common practices when using Scrapy. These are things that cover many topics and don’t often fall into any other specific section.

              Run Scrapy from a script

              You can use the API to run Scrapy from a script, instead of the typical way of running Scrapy via scrapy crawl.

Remember that Scrapy is built on top of the Twisted asynchronous networking library, so you need to run it inside the Twisted reactor.

Note that you will also have to shut down the Twisted reactor yourself after the spider is finished. This can be achieved by connecting a handler to the signals.spider_closed signal.

What follows is a working example of how to do that, using the testspiders project as an example.

              from twisted.internet import reactor
              from scrapy.crawler import Crawler
              from scrapy import log, signals
              from testspiders.spiders.followall import FollowAllSpider
              from scrapy.utils.project import get_project_settings
              
              spider = FollowAllSpider(domain='scrapinghub.com')
              settings = get_project_settings()
              crawler = Crawler(settings)
              crawler.signals.connect(reactor.stop, signal=signals.spider_closed)
              crawler.configure()
              crawler.crawl(spider)
              crawler.start()
              log.start()
reactor.run() # the script will block here until the spider_closed signal is sent
              

              Running multiple spiders in the same process

              By default, Scrapy runs a single spider per process when you run scrapy crawl. However, Scrapy supports running multiple spiders per process using the internal API.

              Here is an example, using the testspiders project:

              from twisted.internet import reactor
              from scrapy.crawler import Crawler
              from scrapy import log
              from testspiders.spiders.followall import FollowAllSpider
              from scrapy.utils.project import get_project_settings
              
              def setup_crawler(domain):
                  spider = FollowAllSpider(domain=domain)
                  settings = get_project_settings()
                  crawler = Crawler(settings)
                  crawler.configure()
                  crawler.crawl(spider)
                  crawler.start()
              
              for domain in ['scrapinghub.com', 'insophia.com']:
                  setup_crawler(domain)
              log.start()
              reactor.run()
              

              Distributed crawls

Scrapy doesn't provide any built-in facility for running crawls in a distributed (multi-server) manner. However, there are some ways to distribute crawls, which vary depending on how you plan to distribute them.

              If you have many spiders, the obvious way to distribute the load is to setup many Scrapyd instances and distribute spider runs among those.

              If you instead want to run a single (big) spider through many machines, what you usually do is partition the urls to crawl and send them to each separate spider. Here is a concrete example:

First, you prepare the list of urls to crawl and put them into separate URL-list files:

              http://somedomain.com/urls-to-crawl/spider1/part1.list
              http://somedomain.com/urls-to-crawl/spider1/part2.list
              http://somedomain.com/urls-to-crawl/spider1/part3.list

Then you fire a spider run on 3 different Scrapyd servers. The spider would receive a part spider argument with the number of the partition to crawl:

              curl http://scrapy1.mycompany.com:6800/schedule.json -d project=myproject -d spider=spider1 -d part=1
              curl http://scrapy2.mycompany.com:6800/schedule.json -d project=myproject -d spider=spider1 -d part=2
              curl http://scrapy3.mycompany.com:6800/schedule.json -d project=myproject -d spider=spider1 -d part=3
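
On the spider side, a minimal sketch of how that part argument could be consumed (the spider class and the file-naming scheme follow the example above and are assumptions):

from scrapy.http import Request
from scrapy.spider import Spider

class MySpider(Spider):
    name = 'spider1'

    def __init__(self, part=None, *args, **kwargs):
        super(MySpider, self).__init__(*args, **kwargs)
        # build the seed-list URL from the partition number passed by Scrapyd
        self.start_urls = [
            'http://somedomain.com/urls-to-crawl/spider1/part%s.list' % part]

    def parse(self, response):
        # each line of the fetched list file is a URL to crawl
        for url in response.body.splitlines():
            yield Request(url, callback=self.parse_page)

    def parse_page(self, response):
        pass # ... scrape here ...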

              Avoiding getting banned

              Some websites implement certain measures to prevent bots from crawling them, with varying degrees of sophistication. Getting around those measures can be difficult and tricky, and may sometimes require special infrastructure. Please consider contacting commercial support if in doubt.

Here are some tips to keep in mind when dealing with these kinds of sites:

              • rotate your user agent from a pool of well-known ones from browsers (google around to get a list of them)
              • disable cookies (see COOKIES_ENABLED) as some sites may use cookies to spot bot behaviour
• use download delays (2 or higher). See DOWNLOAD_DELAY setting, and the settings sketch after this list
              • if possible, use Google cache to fetch pages, instead of hitting the sites directly
              • use a pool of rotating IPs. For example, the free Tor project or paid services like ProxyMesh
              • use a highly distributed downloader that circumvents bans internally, so you can just focus on parsing clean pages. One example of such downloaders is Crawlera
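
As a concrete starting point, the cookie and delay tips above translate directly into settings. A sketch for a project's settings.py (the user agent value is a placeholder):

# settings.py
COOKIES_ENABLED = False  # some sites may use cookies to spot bot behaviour
DOWNLOAD_DELAY = 2       # seconds to wait between requests to the same website
USER_AGENT = 'Mozilla/5.0 (compatible; ...)'  # placeholder; rotate from a pool in practice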

              If you are still unable to prevent your bot getting banned, consider contacting commercial support.

              Dynamic Creation of Item Classes

For applications in which the structure of the item class is to be determined by user input, or other changing conditions, you can dynamically create item classes instead of manually coding them.

              from scrapy.item import DictItem, Field
              
def create_item_class(class_name, field_list):
    """Return a new Item subclass with a Field for each requested name."""
    field_dict = {}
    for field_name in field_list:
        field_dict[field_name] = Field()

    # type() expects a tuple of base classes, not a bare class
    return type(class_name, (DictItem,), field_dict)
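
For example, assuming the function above (the class and field names are illustrative):

>>> BookItem = create_item_class('BookItem', ['title', 'price'])
>>> item = BookItem(title='Some Book', price='15')
>>> item['title']
'Some Book'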
              
Link Extractors — Scrapy 0.22.0 documentation

Exceptions — Scrapy 0.22.0 documentation

              Exceptions

              Built-in Exceptions reference

              Here’s a list of all exceptions included in Scrapy and their usage.

              DropItem

              exception scrapy.exceptions.DropItem

              The exception that must be raised by item pipeline stages to stop processing an Item. For more information see Item Pipeline.
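
For example, a pipeline that drops items lacking a price could raise it from process_item. This is a sketch; the field name and pipeline class are illustrative:

from scrapy.exceptions import DropItem

class RequirePricePipeline(object):

    def process_item(self, item, spider):
        if not item.get('price'):
            raise DropItem('Missing price in %s' % item)
        return item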

              CloseSpider

              exception scrapy.exceptions.CloseSpider(reason='cancelled')

              This exception can be raised from a spider callback to request the spider to be closed/stopped. Supported arguments:

Parameters: reason (str) – the reason for closing

              For example:

from scrapy.exceptions import CloseSpider

def parse_page(self, response):
    if 'Bandwidth exceeded' in response.body:
        raise CloseSpider('bandwidth_exceeded')
              

              IgnoreRequest

              exception scrapy.exceptions.IgnoreRequest

              This exception can be raised by the Scheduler or any downloader middleware to indicate that the request should be ignored.
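
For example, a sketch of a downloader middleware that uses it to drop requests for a hypothetical blocklist:

from scrapy.exceptions import IgnoreRequest

class BlocklistMiddleware(object):
    """Hypothetical middleware: refuse requests to blocked hosts."""

    blocked_hosts = ('ads.example.com',)

    def process_request(self, request, spider):
        if any(host in request.url for host in self.blocked_hosts):
            raise IgnoreRequest('blocked: %s' % request.url)
        # returning None lets the request continue through the chain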

              NotConfigured

              exception scrapy.exceptions.NotConfigured

              This exception can be raised by some components to indicate that they will remain disabled. Those components include:

              • Extensions
              • Item pipelines
• Downloader middlewares
              • Spider middlewares

              The exception must be raised in the component constructor.
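
A typical pattern is to check a setting in the constructor (or a from_crawler class method) and bail out if it is missing. A sketch, with an illustrative setting name:

from scrapy.exceptions import NotConfigured

class MyExtension(object):

    @classmethod
    def from_crawler(cls, crawler):
        # stay disabled unless the (hypothetical) MYEXT_ENABLED setting is set
        if not crawler.settings.getbool('MYEXT_ENABLED'):
            raise NotConfigured
        return cls()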

              NotSupported

              exception scrapy.exceptions.NotSupported

              This exception is raised to indicate an unsupported feature.

Scrapyd — Scrapy 0.22.0 documentation

              Scrapyd

              Scrapyd has been moved into a separate project.

              Its documentation is now hosted at:

Spiders — Scrapy 0.22.0 documentation

              Spiders

Spiders are classes which define how a certain site (or group of sites) will be scraped, including how to perform the crawl (i.e. follow links) and how to extract structured data from their pages (i.e. scraping items). In other words, Spiders are the place where you define the custom behaviour for crawling and parsing pages for a particular site (or, in some cases, a group of sites).

              For spiders, the scraping cycle goes through something like this:

1. You start by generating the initial Requests to crawl the first URLs, and specifying a callback function to be called with the response downloaded from those requests.

  The first requests to perform are obtained by calling the start_requests() method, which (by default) generates Requests for the URLs specified in start_urls, with the parse method as their callback function.

              2. In the callback function, you parse the response (web page) and return either Item objects, Request objects, or an iterable of both. Those Requests will also contain a callback (maybe the same) and will then be downloaded by Scrapy and then their response handled by the specified callback.

              3. In callback functions, you parse the page contents, typically using Selectors (but you can also use BeautifulSoup, lxml or whatever mechanism you prefer) and generate items with the parsed data.

              4. Finally, the items returned from the spider will be typically persisted to a database (in some Item Pipeline) or written to a file using Feed exports.

              Even though this cycle applies (more or less) to any kind of spider, there are different kinds of default spiders bundled into Scrapy for different purposes. We will talk about those types here.

              Spider arguments

              Spiders can receive arguments that modify their behaviour. Some common uses for spider arguments are to define the start URLs or to restrict the crawl to certain sections of the site, but they can be used to configure any functionality of the spider.

              Spider arguments are passed through the crawl command using the -a option. For example:

              scrapy crawl myspider -a category=electronics

              Spiders receive arguments in their constructors:

              class MySpider(Spider):
                  name = 'myspider'
              
                  def __init__(self, category=None, *args, **kwargs):
                      super(MySpider, self).__init__(*args, **kwargs)
                      self.start_urls = ['http://www.example.com/categories/%s' % category]
                      # ...
              

              Spider arguments can also be passed through the Scrapyd schedule.json API. See Scrapyd documentation.

              Built-in spiders reference

Scrapy comes with some useful generic spiders that you can use to subclass your spiders from. Their aim is to provide convenient functionality for a few common scraping cases, like following all links on a site based on certain rules, crawling from Sitemaps, or parsing an XML/CSV feed.

              For the examples used in the following spiders, we’ll assume you have a project with a TestItem declared in a myproject.items module:

from scrapy.item import Item, Field
              
              class TestItem(Item):
                  id = Field()
                  name = Field()
                  description = Field()
              

              Spider

              class scrapy.spider.Spider

This is the simplest spider, and the one from which every other spider must inherit (including the ones that come bundled with Scrapy, as well as the spiders that you write yourself). It doesn't provide any special functionality. It just requests the given start_urls/start_requests, and calls the spider's parse method for each of the resulting responses.

              name

              A string which defines the name for this spider. The spider name is how the spider is located (and instantiated) by Scrapy, so it must be unique. However, nothing prevents you from instantiating more than one instance of the same spider. This is the most important spider attribute and it’s required.

              If the spider scrapes a single domain, a common practice is to name the spider after the domain, or without the TLD. So, for example, a spider that crawls mywebsite.com would often be called mywebsite.

              allowed_domains

              An optional list of strings containing domains that this spider is allowed to crawl. Requests for URLs not belonging to the domain names specified in this list won’t be followed if OffsiteMiddleware is enabled.

              start_urls

              A list of URLs where the spider will begin to crawl from, when no particular URLs are specified. So, the first pages downloaded will be those listed here. The subsequent URLs will be generated successively from data contained in the start URLs.

              start_requests()

              This method must return an iterable with the first Requests to crawl for this spider.

This is the method called by Scrapy when the spider is opened for scraping when no particular URLs are specified. If particular URLs are specified, make_requests_from_url() is used instead to create the Requests. This method is also called only once by Scrapy, so it's safe to implement it as a generator.

              The default implementation uses make_requests_from_url() to generate Requests for each url in start_urls.

              If you want to change the Requests used to start scraping a domain, this is the method to override. For example, if you need to start by logging in using a POST request, you could do:

              def start_requests(self):
                  return [FormRequest("http://www.example.com/login",
                                      formdata={'user': 'john', 'pass': 'secret'},
                                      callback=self.logged_in)]
              
              def logged_in(self, response):
                  # here you would extract links to follow and return Requests for
                  # each of them, with another callback
                  pass
              
              make_requests_from_url(url)

              A method that receives a URL and returns a Request object (or a list of Request objects) to scrape. This method is used to construct the initial requests in the start_requests() method, and is typically used to convert urls to requests.

              Unless overridden, this method returns Requests with the parse() method as their callback function, and with dont_filter parameter enabled (see Request class for more info).
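
For example, a sketch that overrides it to tag every initial request with its seed URL (the meta key is an assumption):

from scrapy.http import Request
from scrapy.spider import Spider

class MySpider(Spider):
    name = 'example.com'
    start_urls = ['http://www.example.com']

    def make_requests_from_url(self, url):
        # keep the default callback and dont_filter behaviour described above
        return Request(url, callback=self.parse, dont_filter=True,
                       meta={'seed_url': url})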

              parse(response)

              This is the default callback used by Scrapy to process downloaded responses, when their requests don’t specify a callback.

              The parse method is in charge of processing the response and returning scraped data and/or more URLs to follow. Other Requests callbacks have the same requirements as the Spider class.

              This method, as well as any other Request callback, must return an iterable of Request and/or Item objects.

Parameters: response (Response) – the response to parse
              log(message[, level, component])

              Log a message using the scrapy.log.msg() function, automatically populating the spider argument with the name of this spider. For more information see Logging.

              Spider example

              Let’s see an example:

              from scrapy import log # This module is useful for printing out debug information
              from scrapy.spider import Spider
              
              class MySpider(Spider):
                  name = 'example.com'
                  allowed_domains = ['example.com']
                  start_urls = [
                      'http://www.example.com/1.html',
                      'http://www.example.com/2.html',
                      'http://www.example.com/3.html',
                  ]
              
                  def parse(self, response):
                      self.log('A response from %s just arrived!' % response.url)
              

Another example returning multiple Requests and Items from a single callback:

              from scrapy.selector import Selector
              from scrapy.spider import Spider
              from scrapy.http import Request
              from myproject.items import MyItem
              
              class MySpider(Spider):
                  name = 'example.com'
                  allowed_domains = ['example.com']
                  start_urls = [
                      'http://www.example.com/1.html',
                      'http://www.example.com/2.html',
                      'http://www.example.com/3.html',
                  ]
              
                  def parse(self, response):
                      sel = Selector(response)
                      for h3 in sel.xpath('//h3').extract():
                          yield MyItem(title=h3)
              
                      for url in sel.xpath('//a/@href').extract():
                          yield Request(url, callback=self.parse)
              

              CrawlSpider

              class scrapy.contrib.spiders.CrawlSpider

              This is the most commonly used spider for crawling regular websites, as it provides a convenient mechanism for following links by defining a set of rules. It may not be the best suited for your particular web sites or project, but it’s generic enough for several cases, so you can start from it and override it as needed for more custom functionality, or just implement your own spider.

              Apart from the attributes inherited from Spider (that you must specify), this class supports a new attribute:

              rules

Which is a list of one (or more) Rule objects. Each Rule defines a certain behaviour for crawling the site. Rule objects are described below. If multiple rules match the same link, the first one will be used, according to the order they're defined in this attribute.

              This spider also exposes an overrideable method:

              parse_start_url(response)

This method is called for the start_urls responses. It allows you to parse the initial responses and must return either an Item object, a Request object, or an iterable containing any of them.
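
For instance, if the start pages themselves contain items, a sketch could simply delegate to the item callback (assuming a parse_item method exists):

def parse_start_url(self, response):
    # treat the start page like any other item page
    return self.parse_item(response)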

              Crawling rules

              class scrapy.contrib.spiders.Rule(link_extractor, callback=None, cb_kwargs=None, follow=None, process_links=None, process_request=None)

              link_extractor is a Link Extractor object which defines how links will be extracted from each crawled page.

              callback is a callable or a string (in which case a method from the spider object with that name will be used) to be called for each link extracted with the specified link_extractor. This callback receives a response as its first argument and must return a list containing Item and/or Request objects (or any subclass of them).

              Warning

              When writing crawl spider rules, avoid using parse as callback, since the CrawlSpider uses the parse method itself to implement its logic. So if you override the parse method, the crawl spider will no longer work.

cb_kwargs is a dict containing the keyword arguments to be passed to the callback function.

follow is a boolean which specifies if links should be followed from each response extracted with this rule. If callback is None, follow defaults to True; otherwise it defaults to False.

              process_links is a callable, or a string (in which case a method from the spider object with that name will be used) which will be called for each list of links extracted from each response using the specified link_extractor. This is mainly used for filtering purposes.

              process_request is a callable, or a string (in which case a method from the spider object with that name will be used) which will be called with every request extracted by this rule, and must return a request or None (to filter out the request).

              CrawlSpider example

              Let’s now take a look at an example CrawlSpider with rules:

              from scrapy.contrib.spiders import CrawlSpider, Rule
              from scrapy.contrib.linkextractors.sgml import SgmlLinkExtractor
              from scrapy.selector import Selector
from myproject.items import TestItem
              
              class MySpider(CrawlSpider):
                  name = 'example.com'
                  allowed_domains = ['example.com']
                  start_urls = ['http://www.example.com']
              
                  rules = (
                      # Extract links matching 'category.php' (but not matching 'subsection.php')
                      # and follow links from them (since no callback means follow=True by default).
                      Rule(SgmlLinkExtractor(allow=('category\.php', ), deny=('subsection\.php', ))),
              
                      # Extract links matching 'item.php' and parse them with the spider's method parse_item
                      Rule(SgmlLinkExtractor(allow=('item\.php', )), callback='parse_item'),
                  )
              
                  def parse_item(self, response):
                      self.log('Hi, this is an item page! %s' % response.url)
              
                      sel = Selector(response)
item = TestItem()
                      item['id'] = sel.xpath('//td[@id="item_id"]/text()').re(r'ID: (\d+)')
                      item['name'] = sel.xpath('//td[@id="item_name"]/text()').extract()
                      item['description'] = sel.xpath('//td[@id="item_description"]/text()').extract()
                      return item
              

This spider would start crawling example.com's home page, collecting category links and item links, parsing the latter with the parse_item method. For each item response, some data will be extracted from the HTML using XPath, and an Item will be filled with it.

              XMLFeedSpider

              class scrapy.contrib.spiders.XMLFeedSpider

              XMLFeedSpider is designed for parsing XML feeds by iterating through them by a certain node name. The iterator can be chosen from: iternodes, xml, and html. It’s recommended to use the iternodes iterator for performance reasons, since the xml and html iterators generate the whole DOM at once in order to parse it. However, using html as the iterator may be useful when parsing XML with bad markup.

              To set the iterator and the tag name, you must define the following class attributes:

              iterator

              A string which defines the iterator to use. It can be either:

              • 'iternodes' - a fast iterator based on regular expressions
              • 'html' - an iterator which uses Selector. Keep in mind this uses DOM parsing and must load all DOM in memory which could be a problem for big feeds
              • 'xml' - an iterator which uses Selector. Keep in mind this uses DOM parsing and must load all DOM in memory which could be a problem for big feeds

              It defaults to: 'iternodes'.

              itertag

              A string with the name of the node (or element) to iterate in. Example:

              itertag = 'product'
              
              namespaces

              A list of (prefix, uri) tuples which define the namespaces available in that document that will be processed with this spider. The prefix and uri will be used to automatically register namespaces using the register_namespace() method.

              You can then specify nodes with namespaces in the itertag attribute.

              Example:

              class YourSpider(XMLFeedSpider):
              
                  namespaces = [('n', 'http://www.sitemaps.org/schemas/sitemap/0.9')]
                  itertag = 'n:url'
                  # ...
              

              Apart from these new attributes, this spider has the following overrideable methods too:

              adapt_response(response)

              A method that receives the response as soon as it arrives from the spider middleware, before the spider starts parsing it. It can be used to modify the response body before parsing it. This method receives a response and also returns a response (it could be the same or another one).
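
A sketch of an override that strips leading whitespace or junk before the XML root element (an assumed preprocessing need, not part of the default behaviour):

def adapt_response(self, response):
    body = response.body.lstrip()
    # Response.replace() returns a copy of the response with the new body
    return response.replace(body=body)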

              parse_node(response, selector)

This method is called for the nodes matching the provided tag name (itertag). Receives the response and a Selector for each node. Overriding this method is mandatory; otherwise, your spider won't work. This method must return either an Item object, a Request object, or an iterable containing any of them.

              process_results(response, results)

              This method is called for each result (item or request) returned by the spider, and it’s intended to perform any last time processing required before returning the results to the framework core, for example setting the item IDs. It receives a list of results and the response which originated those results. It must return a list of results (Items or Requests).
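
A sketch of an override that stamps each scraped item with the URL of the feed it came from (it assumes from scrapy.item import Item at module level and an item with a declared url field):

def process_results(self, response, results):
    for result in results:
        if isinstance(result, Item):  # requests pass through untouched
            result['url'] = response.url
    return results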

              XMLFeedSpider example

These spiders are pretty easy to use; let's have a look at one example:

              from scrapy import log
              from scrapy.contrib.spiders import XMLFeedSpider
              from myproject.items import TestItem
              
              class MySpider(XMLFeedSpider):
                  name = 'example.com'
                  allowed_domains = ['example.com']
                  start_urls = ['http://www.example.com/feed.xml']
                  iterator = 'iternodes' # This is actually unnecessary, since it's the default value
                  itertag = 'item'
              
                  def parse_node(self, response, node):
                      log.msg('Hi, this is a <%s> node!: %s' % (self.itertag, ''.join(node.extract())))
              
item = TestItem()
                      item['id'] = node.xpath('@id').extract()
                      item['name'] = node.xpath('name').extract()
                      item['description'] = node.xpath('description').extract()
                      return item
              

              Basically what we did up there was to create a spider that downloads a feed from the given start_urls, and then iterates through each of its item tags, prints them out, and stores some random data in an Item.

              CSVFeedSpider

              class scrapy.contrib.spiders.CSVFeedSpider

              This spider is very similar to the XMLFeedSpider, except that it iterates over rows, instead of nodes. The method that gets called in each iteration is parse_row().

              delimiter

A string with the separator character for each field in the CSV file. Defaults to ',' (comma).

              headers

A list of the column names in the CSV file, which will be used to extract fields from each row.

              parse_row(response, row)

              Receives a response and a dict (representing each row) with a key for each provided (or detected) header of the CSV file. This spider also gives the opportunity to override adapt_response and process_results methods for pre- and post-processing purposes.

              CSVFeedSpider example

              Let’s see an example similar to the previous one, but using a CSVFeedSpider:

              from scrapy import log
              from scrapy.contrib.spiders import CSVFeedSpider
              from myproject.items import TestItem
              
              class MySpider(CSVFeedSpider):
                  name = 'example.com'
                  allowed_domains = ['example.com']
                  start_urls = ['http://www.example.com/feed.csv']
                  delimiter = ';'
                  headers = ['id', 'name', 'description']
              
                  def parse_row(self, response, row):
                      log.msg('Hi, this is a row!: %r' % row)
              
                      item = TestItem()
                      item['id'] = row['id']
                      item['name'] = row['name']
                      item['description'] = row['description']
                      return item
              

              SitemapSpider

              class scrapy.contrib.spiders.SitemapSpider

              SitemapSpider allows you to crawl a site by discovering the URLs using Sitemaps.

              It supports nested sitemaps and discovering sitemap urls from robots.txt.

              sitemap_urls

              A list of urls pointing to the sitemaps whose urls you want to crawl.

              You can also point to a robots.txt and it will be parsed to extract sitemap urls from it.

              sitemap_rules

              A list of tuples (regex, callback) where:

              • regex is a regular expression to match urls extracted from sitemaps. regex can be either a str or a compiled regex object.
              • callback is the callback to use for processing the urls that match the regular expression. callback can be a string (indicating the name of a spider method) or a callable.

              For example:

              sitemap_rules = [('/product/', 'parse_product')]
              

              Rules are applied in order, and only the first one that matches will be used.

              If you omit this attribute, all urls found in sitemaps will be processed with the parse callback.

              sitemap_follow

A list of regexes matching the sitemaps that should be followed. This is only for sites that use Sitemap index files that point to other sitemap files.

              By default, all sitemaps are followed.

sitemap_alternate_links

Specifies if alternate links for one url should be followed. These are links for the same website in another language passed within the same url block.

              For example:

              <url>
                  <loc>http://example.com/</loc>
                  <xhtml:link rel="alternate" hreflang="de" href="http://example.com/de"/>
              </url>

              With sitemap_alternate_links set, this would retrieve both URLs. With sitemap_alternate_links disabled, only http://example.com/ would be retrieved.

sitemap_alternate_links is disabled by default.

              SitemapSpider examples

              Simplest example: process all urls discovered through sitemaps using the parse callback:

              from scrapy.contrib.spiders import SitemapSpider
              
              class MySpider(SitemapSpider):
                  sitemap_urls = ['http://www.example.com/sitemap.xml']
              
                  def parse(self, response):
                      pass # ... scrape item here ...
              

              Process some urls with certain callback and other urls with a different callback:

              from scrapy.contrib.spiders import SitemapSpider
              
              class MySpider(SitemapSpider):
                  sitemap_urls = ['http://www.example.com/sitemap.xml']
                  sitemap_rules = [
                      ('/product/', 'parse_product'),
                      ('/category/', 'parse_category'),
                  ]
              
                  def parse_product(self, response):
                      pass # ... scrape product ...
              
                  def parse_category(self, response):
                      pass # ... scrape category ...
              

              Follow sitemaps defined in the robots.txt file and only follow sitemaps whose url contains /sitemap_shop:

              from scrapy.contrib.spiders import SitemapSpider
              
              class MySpider(SitemapSpider):
                  sitemap_urls = ['http://www.example.com/robots.txt']
                  sitemap_rules = [
                      ('/shop/', 'parse_shop'),
                  ]
                  sitemap_follow = ['/sitemap_shops']
              
                  def parse_shop(self, response):
                      pass # ... scrape shop here ...
              

              Combine SitemapSpider with other sources of urls:

from scrapy.http import Request
from scrapy.contrib.spiders import SitemapSpider
              
              class MySpider(SitemapSpider):
                  sitemap_urls = ['http://www.example.com/robots.txt']
                  sitemap_rules = [
                      ('/shop/', 'parse_shop'),
                  ]
              
                  other_urls = ['http://www.example.com/about']
              
                  def start_requests(self):
                      requests = list(super(MySpider, self).start_requests())
                      requests += [Request(x, callback=self.parse_other) for x in self.other_urls]
                      return requests
              
                  def parse_shop(self, response):
                      pass # ... scrape shop here ...
              
                  def parse_other(self, response):
                      pass # ... scrape other here ...
              
Architecture overview — Scrapy 0.22.0 documentation

              Architecture overview

              This document describes the architecture of Scrapy and how its components interact.

              Overview

              The following diagram shows an overview of the Scrapy architecture with its components and an outline of the data flow that takes place inside the system (shown by the green arrows). A brief description of the components is included below with links for more detailed information about them. The data flow is also described below.

              Scrapy architecture

              Components

              Scrapy Engine

              The engine is responsible for controlling the data flow between all components of the system, and triggering events when certain actions occur. See the Data Flow section below for more details.

              Scheduler

              The Scheduler receives requests from the engine and enqueues them for feeding them later (also to the engine) when the engine requests them.

              Downloader

              The Downloader is responsible for fetching web pages and feeding them to the engine which, in turn, feeds them to the spiders.

              Spiders

              Spiders are custom classes written by Scrapy users to parse responses and extract items (aka scraped items) from them or additional URLs (requests) to follow. Each spider is able to handle a specific domain (or group of domains). For more information see Spiders.

              Item Pipeline

              The Item Pipeline is responsible for processing the items once they have been extracted (or scraped) by the spiders. Typical tasks include cleansing, validation and persistence (like storing the item in a database). For more information see Item Pipeline.

              Downloader middlewares

              Downloader middlewares are specific hooks that sit between the Engine and the Downloader and process requests when they pass from the Engine to the Downloader, and responses that pass from Downloader to the Engine. They provide a convenient mechanism for extending Scrapy functionality by plugging custom code. For more information see Downloader Middleware.

              Spider middlewares

              Spider middlewares are specific hooks that sit between the Engine and the Spiders and are able to process spider input (responses) and output (items and requests). They provide a convenient mechanism for extending Scrapy functionality by plugging custom code. For more information see Spider Middleware.

              Data flow

              The data flow in Scrapy is controlled by the execution engine, and goes like this:

              1. The Engine opens a domain, locates the Spider that handles that domain, and asks the spider for the first URLs to crawl.
              2. The Engine gets the first URLs to crawl from the Spider and schedules them in the Scheduler, as Requests.
              3. The Engine asks the Scheduler for the next URLs to crawl.
              4. The Scheduler returns the next URLs to crawl to the Engine and the Engine sends them to the Downloader, passing through the Downloader Middleware (request direction).
              5. Once the page finishes downloading the Downloader generates a Response (with that page) and sends it to the Engine, passing through the Downloader Middleware (response direction).
              6. The Engine receives the Response from the Downloader and sends it to the Spider for processing, passing through the Spider Middleware (input direction).
              7. The Spider processes the Response and returns scraped Items and new Requests (to follow) to the Engine.
8. The Engine sends scraped Items (returned by the Spider) to the Item Pipeline and Requests (returned by the Spider) to the Scheduler.
              9. The process repeats (from step 2) until there are no more requests from the Scheduler, and the Engine closes the domain.

              Event-driven networking

Scrapy is written with Twisted, a popular event-driven networking framework for Python. Thus, it’s implemented using non-blocking (aka asynchronous) code for concurrency.
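
As a minimal illustration of that style (this is plain Twisted, not Scrapy code, and assumes Twisted is installed), the following sketch downloads a page without blocking: getPage() returns a Deferred immediately, and the callback fires once the body arrives:

from twisted.internet import reactor
from twisted.web.client import getPage

def print_size(body):
    # runs later, when the download has finished
    print("downloaded %d bytes" % len(body))
    reactor.stop()

d = getPage("http://scrapy.org")  # returns immediately with a Deferred
d.addCallback(print_size)
reactor.run()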

For more information about asynchronous programming and Twisted, see the Twisted documentation.

Item Pipeline — Scrapy 0.22.0 documentation

              Item Pipeline

After an item has been scraped by a spider, it is sent to the Item Pipeline, which processes it through several components that are executed sequentially.

Each item pipeline component (sometimes referred to as just “Item Pipeline”) is a Python class that implements a simple method. They receive an Item and perform an action over it, also deciding if the Item should continue through the pipeline or be dropped and no longer processed.

Typical uses for item pipelines are:

              • cleansing HTML data
              • validating scraped data (checking that the items contain certain fields)
              • checking for duplicates (and dropping them)
              • storing the scraped item in a database

              Writing your own item pipeline

              Writing your own item pipeline is easy. Each item pipeline component is a single Python class that must implement the following method:

              process_item(item, spider)

This method is called for every item pipeline component and must either return an Item (or any descendant class) object or raise a DropItem exception. Dropped items are no longer processed by further pipeline components.

              Parameters:
              • item (Item object) – the item scraped
              • spider (Spider object) – the spider which scraped the item

              Additionally, they may also implement the following methods:

              open_spider(spider)

              This method is called when the spider is opened.

              Parameters:spider (Spider object) – the spider which was opened
              close_spider(spider)

              This method is called when the spider is closed.

              Parameters:spider (Spider object) – the spider which was closed

              Item pipeline example

              Price validation and dropping items with no prices

              Let’s take a look at the following hypothetical pipeline that adjusts the price attribute for those items that do not include VAT (price_excludes_vat attribute), and drops those items which don’t contain a price:

              from scrapy.exceptions import DropItem
              
              class PricePipeline(object):
              
                  vat_factor = 1.15
              
                  def process_item(self, item, spider):
                      if item['price']:
                          if item['price_excludes_vat']:
                              item['price'] = item['price'] * self.vat_factor
                          return item
                      else:
                          raise DropItem("Missing price in %s" % item)
              

              Write items to a JSON file

The following pipeline stores all scraped items (from all spiders) into a single items.jl file, containing one item per line serialized in JSON format:

              import json
              
              class JsonWriterPipeline(object):
              
                  def __init__(self):
                      self.file = open('items.jl', 'wb')
              
                  def process_item(self, item, spider):
                      line = json.dumps(dict(item)) + "\n"
                      self.file.write(line)
                      return item
              

              Note

              The purpose of JsonWriterPipeline is just to introduce how to write item pipelines. If you really want to store all scraped items into a JSON file you should use the Feed exports.
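
As a side note, the pipeline above opens the file in its constructor and never closes it. A sketch of a variant that ties the file handle to the spider’s lifecycle, using the open_spider() and close_spider() hooks described earlier, could look like this:

import json

class JsonWriterPipeline(object):  # a variant of the pipeline above

    def open_spider(self, spider):
        # open the file when the spider starts
        self.file = open('items.jl', 'wb')

    def close_spider(self, spider):
        # release the file handle when the spider finishes
        self.file.close()

    def process_item(self, item, spider):
        self.file.write(json.dumps(dict(item)) + "\n")
        return item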

              Duplicates filter

A filter that looks for duplicate items, and drops those items that were already processed. Let’s say that our items have a unique id, but our spider returns multiple items with the same id:

              from scrapy.exceptions import DropItem
              
              class DuplicatesPipeline(object):
              
                  def __init__(self):
                      self.ids_seen = set()
              
                  def process_item(self, item, spider):
                      if item['id'] in self.ids_seen:
                          raise DropItem("Duplicate item found: %s" % item)
                      else:
                          self.ids_seen.add(item['id'])
                          return item
              

              Activating an Item Pipeline component

              To activate an Item Pipeline component you must add its class to the ITEM_PIPELINES setting, like in the following example:

              ITEM_PIPELINES = {
                  'myproject.pipeline.PricePipeline': 300,
                  'myproject.pipeline.JsonWriterPipeline': 800,
              }
              

The integer values you assign to classes in this setting determine the order in which they run: items go through pipelines from lower to higher order numbers. It’s customary to define these numbers in the 0-1000 range.

Extensions — Scrapy 0.22.0 documentation

              Extensions

              The extensions framework provides a mechanism for inserting your own custom functionality into Scrapy.

              Extensions are just regular classes that are instantiated at Scrapy startup, when extensions are initialized.

              Extension settings

              Extensions use the Scrapy settings to manage their settings, just like any other Scrapy code.

It is customary for extensions to prefix their settings with their own name, to avoid collision with existing (and future) extensions. For example, a hypothetical extension to handle Google Sitemaps would use settings like GOOGLESITEMAP_ENABLED, GOOGLESITEMAP_DEPTH, and so on.

              Loading & activating extensions

              Extensions are loaded and activated at startup by instantiating a single instance of the extension class. Therefore, all the extension initialization code must be performed in the class constructor (__init__ method).

              To make an extension available, add it to the EXTENSIONS setting in your Scrapy settings. In EXTENSIONS, each extension is represented by a string: the full Python path to the extension’s class name. For example:

              EXTENSIONS = {
                  'scrapy.contrib.corestats.CoreStats': 500,
                  'scrapy.webservice.WebService': 500,
                  'scrapy.telnet.TelnetConsole': 500,
              }
              

As you can see, the EXTENSIONS setting is a dict where the keys are the extension paths, and their values are the orders, which define the extension loading order. Extension orders are not as important as middleware orders, though, and they are typically irrelevant: it doesn’t matter in which order the extensions are loaded because they don’t depend on each other [1].

              However, this feature can be exploited if you need to add an extension which depends on other extensions already loaded.

[1] This is why the EXTENSIONS_BASE setting in Scrapy (which contains all built-in extensions enabled by default) defines all the extensions with the same order (500).

              Available, enabled and disabled extensions

              Not all available extensions will be enabled. Some of them usually depend on a particular setting. For example, the HTTP Cache extension is available by default but disabled unless the HTTPCACHE_ENABLED setting is set.
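
For instance, turning the HTTP cache on is a one-line change in your project’s settings:

# settings.py
HTTPCACHE_ENABLED = True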

              Disabling an extension

              In order to disable an extension that comes enabled by default (ie. those included in the EXTENSIONS_BASE setting) you must set its order to None. For example:

              EXTENSIONS = {
                  'scrapy.contrib.corestats.CoreStats': None,
              }
              

              Writing your own extension

              Writing your own extension is easy. Each extension is a single Python class which doesn’t need to implement any particular method.

The main entry point for a Scrapy extension (this also includes middlewares and pipelines) is the from_crawler class method, which receives a Crawler instance: the main object controlling the Scrapy crawler. Through that object you can access settings, signals, stats, and also control the crawler behaviour, if your extension needs such a thing.

              Typically, extensions connect to signals and perform tasks triggered by them.

              Finally, if the from_crawler method raises the NotConfigured exception, the extension will be disabled. Otherwise, the extension will be enabled.

              Sample extension

              Here we will implement a simple extension to illustrate the concepts described in the previous section. This extension will log a message every time:

              • a spider is opened
              • a spider is closed
              • a specific number of items are scraped

              The extension will be enabled through the MYEXT_ENABLED setting and the number of items will be specified through the MYEXT_ITEMCOUNT setting.

Here is the code of such an extension:

              from scrapy import signals
              from scrapy.exceptions import NotConfigured
              
              class SpiderOpenCloseLogging(object):
              
                  def __init__(self, item_count):
                      self.item_count = item_count
                      self.items_scraped = 0
              
                  @classmethod
                  def from_crawler(cls, crawler):
                      # first check if the extension should be enabled and raise
                      # NotConfigured otherwise
                      if not crawler.settings.getbool('MYEXT_ENABLED'):
                          raise NotConfigured
              
                      # get the number of items from settings
                      item_count = crawler.settings.getint('MYEXT_ITEMCOUNT', 1000)
              
                      # instantiate the extension object
                      ext = cls(item_count)
              
                      # connect the extension object to signals
                      crawler.signals.connect(ext.spider_opened, signal=signals.spider_opened)
                      crawler.signals.connect(ext.spider_closed, signal=signals.spider_closed)
                      crawler.signals.connect(ext.item_scraped, signal=signals.item_scraped)
              
                      # return the extension object
                      return ext
              
                  def spider_opened(self, spider):
                      spider.log("opened spider %s" % spider.name)
              
                  def spider_closed(self, spider):
                      spider.log("closed spider %s" % spider.name)
              
                  def item_scraped(self, item, spider):
                      self.items_scraped += 1
                      if self.items_scraped == self.item_count:
                          spider.log("scraped %d items, resetting counter" % self.items_scraped)
            self.items_scraped = 0
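
To try it, you would enable the extension through your project settings. The module path below is hypothetical; use wherever you placed the class:

# settings.py (module path is hypothetical)
MYEXT_ENABLED = True
MYEXT_ITEMCOUNT = 500  # overrides the default of 1000

EXTENSIONS = {
    'myproject.extensions.SpiderOpenCloseLogging': 500,
}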
              

              Built-in extensions reference

              General purpose extensions

              Log Stats extension

              class scrapy.contrib.logstats.LogStats

              Log basic stats like crawled pages and scraped items.

              Core Stats extension

              class scrapy.contrib.corestats.CoreStats

              Enable the collection of core statistics, provided the stats collection is enabled (see Stats Collection).

              Web service extension

              class scrapy.webservice.WebService

See the Web Service documentation.

              Telnet console extension

              class scrapy.telnet.TelnetConsole

              Provides a telnet console for getting into a Python interpreter inside the currently running Scrapy process, which can be very useful for debugging.

The telnet console must be enabled by the TELNETCONSOLE_ENABLED setting, and the server will listen on the port specified in TELNETCONSOLE_PORT.

              Memory usage extension

              class scrapy.contrib.memusage.MemoryUsage

              Note

This extension does not work on Windows.

              Monitors the memory used by the Scrapy process that runs the spider and:

1. sends a notification e-mail when it exceeds a certain value
2. closes the spider when it exceeds a certain value

              The notification e-mails can be triggered when a certain warning value is reached (MEMUSAGE_WARNING_MB) and when the maximum value is reached (MEMUSAGE_LIMIT_MB) which will also cause the spider to be closed and the Scrapy process to be terminated.

This extension is enabled by the MEMUSAGE_ENABLED setting and can be configured with the MEMUSAGE_* settings, as sketched below.
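
A sketch of a typical configuration (the values are illustrative):

MEMUSAGE_ENABLED = True
MEMUSAGE_LIMIT_MB = 2048       # close the spider past this limit
MEMUSAGE_WARNING_MB = 1536     # send a warning e-mail past this value
MEMUSAGE_NOTIFY_MAIL = ['you@example.com']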

              Memory debugger extension

              class scrapy.contrib.memdebug.MemoryDebugger

              An extension for debugging memory usage. It collects information about:

              To enable this extension, turn on the MEMDEBUG_ENABLED setting. The info will be stored in the stats.

              Close spider extension

              class scrapy.contrib.closespider.CloseSpider

              Closes a spider automatically when some conditions are met, using a specific closing reason for each condition.

              The conditions for closing a spider can be configured through the following settings:

              CLOSESPIDER_TIMEOUT

              Default: 0

An integer which specifies a number of seconds. If the spider remains open for more than that number of seconds, it will be automatically closed with the reason closespider_timeout. If zero (or not set), spiders won’t be closed by timeout.

              CLOSESPIDER_ITEMCOUNT

              Default: 0

An integer which specifies a number of items. If the spider scrapes more than that number of items and those items are passed through the item pipeline, the spider will be closed with the reason closespider_itemcount. If zero (or not set), spiders won’t be closed by number of passed items.

              CLOSESPIDER_PAGECOUNT

              New in version 0.11.

              Default: 0

An integer which specifies the maximum number of responses to crawl. If the spider crawls more than that, it will be closed with the reason closespider_pagecount. If zero (or not set), spiders won’t be closed by number of crawled responses.

              CLOSESPIDER_ERRORCOUNT

              New in version 0.11.

              Default: 0

An integer which specifies the maximum number of errors to receive before closing the spider. If the spider generates more than that number of errors, it will be closed with the reason closespider_errorcount. If zero (or not set), spiders won’t be closed by number of errors.
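
For instance, a settings sketch that stops a crawl after one hour, or earlier if ten errors occur (the values are illustrative):

CLOSESPIDER_TIMEOUT = 3600   # seconds; closes with reason closespider_timeout
CLOSESPIDER_ERRORCOUNT = 10  # closes with reason closespider_errorcount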

              StatsMailer extension

              class scrapy.contrib.statsmailer.StatsMailer

              This simple extension can be used to send a notification e-mail every time a domain has finished scraping, including the Scrapy stats collected. The email will be sent to all recipients specified in the STATSMAILER_RCPTS setting.
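
For example, in your project settings (the address is illustrative):

STATSMAILER_RCPTS = ['scrapy@example.com']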

              Debugging extensions

              Stack trace dump extension

              class scrapy.contrib.debug.StackTraceDump

              Dumps information about the running process when a SIGQUIT or SIGUSR2 signal is received. The information dumped is the following:

              1. engine status (using scrapy.utils.engine.get_engine_status())
              2. live references (see Debugging memory leaks with trackref)
              3. stack trace of all threads

After the stack trace and engine status are dumped, the Scrapy process continues running normally.

              This extension only works on POSIX-compliant platforms (ie. not Windows), because the SIGQUIT and SIGUSR2 signals are not available on Windows.

              There are at least two ways to send Scrapy the SIGQUIT signal:

1. By pressing Ctrl-\ while a Scrapy process is running (Linux only?)

              2. By running this command (assuming <pid> is the process id of the Scrapy process):

                kill -QUIT <pid>

              Debugger extension

              class scrapy.contrib.debug.Debugger

              Invokes a Python debugger inside a running Scrapy process when a SIGUSR2 signal is received. After the debugger is exited, the Scrapy process continues running normally.

              For more info see Debugging in Python.

              This extension only works on POSIX-compliant platforms (ie. not Windows).
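
As with the stack trace dump above, you can trigger it by sending the signal (assuming <pid> is the process id of the Scrapy process):

  kill -USR2 <pid>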

DjangoItem — Scrapy 0.22.0 documentation

              DjangoItem

DjangoItem is a class of item that gets its fields definition from a Django model: you simply create a DjangoItem and specify which Django model it relates to.

Besides getting the model fields defined on your item, DjangoItem provides a method to create and populate a Django model instance with the item data.

              Using DjangoItem

DjangoItem works much like ModelForms in Django: you create a subclass and define its django_model attribute to be a valid Django model. With this you will get an item with a field for each Django model field.

In addition, you can define fields that aren’t present in the model and even override fields that are present in the model, by defining them in the item.

              Let’s see some examples:

              Creating a Django model for the examples:

              from django.db import models
              
              class Person(models.Model):
                  name = models.CharField(max_length=255)
                  age = models.IntegerField()
              

              Defining a basic DjangoItem:

              from scrapy.contrib.djangoitem import DjangoItem
              
              class PersonItem(DjangoItem):
                  django_model = Person
              

DjangoItem works just like Item:

              >>> p = PersonItem()
              >>> p['name'] = 'John'
              >>> p['age'] = '22'
              

              To obtain the Django model from the item, we call the extra method save() of the DjangoItem:

              >>> person = p.save()
              >>> person.name
              'John'
              >>> person.age
              '22'
              >>> person.id
              1
              

The model is saved as soon as we call save(). To obtain an unsaved model instance instead, call it with commit=False:

              >>> person = p.save(commit=False)
              >>> person.name
              'John'
              >>> person.age
              '22'
              >>> person.id
              None
              

              As said before, we can add other fields to the item:

from scrapy.item import Field

class PersonItem(DjangoItem):
                  django_model = Person
                  sex = Field()
              
              >>> p = PersonItem()
              >>> p['name'] = 'John'
              >>> p['age'] = '22'
              >>> p['sex'] = 'M'
              

              Note

              fields added to the item won’t be taken into account when doing a save()

And we can override the fields of the model with our own:

              class PersonItem(DjangoItem):
                  django_model = Person
                  name = Field(default='No Name')
              

              This is useful to provide properties to the field, like a default or any other property that your project uses.

              DjangoItem caveats

DjangoItem is a rather convenient way to integrate Scrapy projects with Django models, but bear in mind that the Django ORM may not scale well if you scrape a lot of items (ie. millions) with Scrapy. This is because a relational backend is often not a good choice for a write intensive application (such as a web crawler), especially if the database is highly normalized and has many indices.

              Django settings set up

To use the Django models outside the Django application you need to set up the DJANGO_SETTINGS_MODULE environment variable and, in most cases, modify the PYTHONPATH environment variable to be able to import the settings module.

There are many ways to do this depending on your use case and preferences. One of the simplest is detailed below.

              Suppose your Django project is named mysite, is located in the path /home/projects/mysite and you have created an app myapp with the model Person. That means your directory structure is something like this:

              /home/projects/mysite
              ├── manage.py
              ├── myapp
              │   ├── __init__.py
              │   ├── models.py
              │   ├── tests.py
              │   └── views.py
              └── mysite
                  ├── __init__.py
                  ├── settings.py
                  ├── urls.py
                  └── wsgi.py

              Then you need to add /home/projects/mysite to the PYTHONPATH environment variable and set up the environment variable DJANGO_SETTINGS_MODULE to mysite.settings. That can be done in your Scrapy’s settings file by adding the lines below:

              import sys
              sys.path.append('/home/projects/mysite')
              
              import os
              os.environ['DJANGO_SETTINGS_MODULE'] = 'mysite.settings'
              

Notice that we modify the sys.path variable instead of the PYTHONPATH environment variable as we are already within the Python runtime. If everything is right, you should be able to start the scrapy shell command and import the model Person (i.e. from myapp.models import Person).
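
For example, a quick sanity check from the shell (the query result is illustrative):

$ scrapy shell
>>> from myapp.models import Person
>>> Person.objects.count()
0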

Scrapy shell — Scrapy 0.22.0 documentation

              Scrapy shell

              The Scrapy shell is an interactive shell where you can try and debug your scraping code very quickly, without having to run the spider. It’s meant to be used for testing data extraction code, but you can actually use it for testing any kind of code as it is also a regular Python shell.

The shell is used for testing XPath or CSS expressions and seeing how they work and what data they extract from the web pages you’re trying to scrape. It allows you to interactively test your expressions while you’re writing your spider, without having to run the spider to test every change.

              Once you get familiarized with the Scrapy shell, you’ll see that it’s an invaluable tool for developing and debugging your spiders.

              If you have IPython installed, the Scrapy shell will use it (instead of the standard Python console). The IPython console is much more powerful and provides smart auto-completion and colorized output, among other things.

We highly recommend you install IPython, especially if you’re working on Unix systems (where IPython excels). See the IPython installation guide for more info.

              Launch the shell

              To launch the Scrapy shell you can use the shell command like this:

              scrapy shell <url>

Where <url> is the URL you want to scrape.

              Using the shell

              The Scrapy shell is just a regular Python console (or IPython console if you have it available) which provides some additional shortcut functions for convenience.

              Available Shortcuts

• shelp() - print help with the list of available objects and shortcuts
              • fetch(request_or_url) - fetch a new response from the given request or URL and update all related objects accordingly.
• view(response) - open the given response in your local web browser, for inspection. This will add a <base> tag to the response body in order for external links (such as images and style sheets) to display properly. Note, however, that this will create a temporary file on your computer, which won’t be removed automatically.

              Available Scrapy objects

              The Scrapy shell automatically creates some convenient objects from the downloaded page, like the Response object and the Selector objects (for both HTML and XML content).

              Those objects are:

              • spider - the Spider which is known to handle the URL, or a Spider object if there is no spider found for the current URL
              • request - a Request object of the last fetched page. You can modify this request using replace() or fetch a new request (without leaving the shell) using the fetch shortcut.
              • response - a Response object containing the last fetched page
              • sel - a Selector object constructed with the last response fetched
              • settings - the current Scrapy settings

              Example of shell session

Here’s an example of a typical shell session where we start by scraping the http://scrapy.org page, and then proceed to scrape the http://slashdot.org page. Finally, we modify the (Slashdot) request method to POST and re-fetch it, getting an HTTP 405 (method not allowed) error. We end the session by typing Ctrl-D (in Unix systems) or Ctrl-Z in Windows.

              Keep in mind that the data extracted here may not be the same when you try it, as those pages are not static and could have changed by the time you test this. The only purpose of this example is to get you familiarized with how the Scrapy shell works.

              First, we launch the shell:

              scrapy shell 'http://scrapy.org' --nolog

              Then, the shell fetches the URL (using the Scrapy downloader) and prints the list of available objects and useful shortcuts (you’ll notice that these lines all start with the [s] prefix):

              [s] Available objects
              [s]   sel       <Selector (http://scrapy.org) xpath=None>
              [s]   item      Item()
              [s]   request   <http://scrapy.org>
              [s]   response  <http://scrapy.org>
              [s]   settings  <Settings 'mybot.settings'>
              [s]   spider    <Spider 'default' at 0x2bed9d0>
              [s] Useful shortcuts:
              [s]   shelp()           Prints this help.
              [s]   fetch(req_or_url) Fetch a new request or URL and update objects
              [s]   view(response)    View response in a browser
              
              >>>

After that, we can start playing with the objects:

              >>> sel.xpath("//h2/text()").extract()[0]
              u'Welcome to Scrapy'
              
              >>> fetch("http://slashdot.org")
              [s] Available Scrapy objects:
              [s]   sel        <Selector (http://slashdot.org) xpath=None>
              [s]   item       JobItem()
              [s]   request    <GET http://slashdot.org>
              [s]   response   <200 http://slashdot.org>
              [s]   settings   <Settings 'jobsbot.settings'>
              [s]   spider     <Spider 'default' at 0x3c44a10>
              [s] Useful shortcuts:
              [s]   shelp()           Shell help (print this help)
              [s]   fetch(req_or_url) Fetch request (or URL) and update local objects
              [s]   view(response)    View response in a browser
              
              >>> sel.xpath("//h2/text()").extract()
              [u'News for nerds, stuff that matters']
              
              >>> request = request.replace(method="POST")
              
              >>> fetch(request)
              2009-04-03 00:57:39-0300 [default] ERROR: Downloading <http://slashdot.org> from <None>: 405 Method Not Allowed
              
              >>>
              

              Invoking the shell from spiders to inspect responses

Sometimes you want to inspect the responses that are being processed at a certain point of your spider, if only to check that the response you expect is getting there.

              This can be achieved by using the scrapy.shell.inspect_response function.

              Here’s an example of how you would call it from your spider:

              class MySpider(Spider):
                  ...
              
                  def parse(self, response):
                      if response.url == 'http://www.example.com/products.php':
                          from scrapy.shell import inspect_response
                          inspect_response(response)
              
                      # ... your parsing code ..
              

              When you run the spider, you will get something similar to this:

              2009-08-27 19:15:25-0300 [example.com] DEBUG: Crawled <http://www.example.com/> (referer: <None>)
              2009-08-27 19:15:26-0300 [example.com] DEBUG: Crawled <http://www.example.com/products.php> (referer: <http://www.example.com/>)
              [s] Available objects
              [s]   sel       <Selector (http://www.example.com/products.php) xpath=None>
              ...
              
              >>> response.url
              'http://www.example.com/products.php'

              Then, you can check if the extraction code is working:

              >>> sel.xpath('//h1')
              []
              

              Nope, it doesn’t. So you can open the response in your web browser and see if it’s the response you were expecting:

              >>> view(response)
              >>>
              

              Finally you hit Ctrl-D (or Ctrl-Z in Windows) to exit the shell and resume the crawling:

              >>> ^D
              2009-08-27 19:15:25-0300 [example.com] DEBUG: Crawled <http://www.example.com/product.php?id=1> (referer: <None>)
              2009-08-27 19:15:25-0300 [example.com] DEBUG: Crawled <http://www.example.com/product.php?id=2> (referer: <None>)
              # ...
              

              Note that you can’t use the fetch shortcut here since the Scrapy engine is blocked by the shell. However, after you leave the shell, the spider will continue crawling where it stopped, as shown above.

Debugging memory leaks — Scrapy 0.22.0 documentation

              Debugging memory leaks

              In Scrapy, objects such as Requests, Responses and Items have a finite lifetime: they are created, used for a while, and finally destroyed.

              From all those objects, the Request is probably the one with the longest lifetime, as it stays waiting in the Scheduler queue until it’s time to process it. For more info see Architecture overview.

              As these Scrapy objects have a (rather long) lifetime, there is always the risk of accumulating them in memory without releasing them properly and thus causing what is known as a “memory leak”.

To help debugging memory leaks, Scrapy provides a built-in mechanism for tracking object references called trackref, and you can also use a third-party library called Guppy for more advanced memory debugging (see below for more info). Both mechanisms must be used from the Telnet Console.

              Common causes of memory leaks

It happens quite often (sometimes by accident, sometimes on purpose) that the Scrapy developer passes objects referenced in Requests (for example, using the meta attribute or the request callback function), which effectively ties the lifetime of those referenced objects to the lifetime of the Request. This is, by far, the most common cause of memory leaks in Scrapy projects, and a quite difficult one to debug for newcomers.

In big projects, the spiders are typically written by different people and some of those spiders could be “leaking”, affecting the rest of the (well-written) spiders when they run concurrently, which, in turn, affects the whole crawling process.

At the same time, it’s hard to avoid the causes of these leaks without restricting the power of the framework, so we have decided not to restrict the functionality but to provide useful tools for debugging these leaks, which quite often boil down to answering one question: which spider is leaking?

              The leak could also come from a custom middleware, pipeline or extension that you have written, if you are not releasing the (previously allocated) resources properly. For example, if you’re allocating resources on spider_opened but not releasing them on spider_closed.

              Debugging memory leaks with trackref

trackref is a module provided by Scrapy to debug the most common cases of memory leaks. It basically tracks the references to all live Request, Response, Item and Selector objects.

              You can enter the telnet console and inspect how many objects (of the classes mentioned above) are currently alive using the prefs() function which is an alias to the print_live_refs() function:

              telnet localhost 6023
              
              >>> prefs()
              Live References
              
              ExampleSpider                       1   oldest: 15s ago
              HtmlResponse                       10   oldest: 1s ago
              Selector                            2   oldest: 0s ago
              FormRequest                       878   oldest: 7s ago

              As you can see, that report also shows the “age” of the oldest object in each class.

If you do have leaks, chances are you can figure out which spider is leaking by looking at the oldest request or response. You can get the oldest object of each class using the get_oldest() function, like this (from the telnet console):
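
>>> from scrapy.utils.trackref import get_oldest
>>> r = get_oldest('HtmlResponse')
>>> r.url  # an illustrative URL; yours will differ
'http://www.example.com/some/page.html'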

              Which objects are tracked?

The objects tracked by trackref are all from these classes (and all their subclasses):

              • scrapy.http.Request
              • scrapy.http.Response
              • scrapy.item.Item
              • scrapy.selector.Selector
              • scrapy.spider.Spider

              A real example

Let’s see a concrete example of a hypothetical case of memory leaks.

              Suppose we have some spider with a line similar to this one:

              return Request("http://www.somenastyspider.com/product.php?pid=%d" % product_id,
                  callback=self.parse, meta={referer: response}")

That line is passing a response reference inside a request, which effectively ties the response’s lifetime to the request’s, and that would definitely cause memory leaks.

Let’s see how we can discover which one is the nasty spider (without knowing it a priori, of course) by using the trackref tool.

After the crawler has been running for a few minutes and we notice its memory usage has grown a lot, we can enter its telnet console and check the live references:

              >>> prefs()
              Live References
              
              SomenastySpider                     1   oldest: 15s ago
              HtmlResponse                     3890   oldest: 265s ago
              Selector                            2   oldest: 0s ago
              Request                          3878   oldest: 250s ago
              

              The fact that there are so many live responses (and that they’re so old) is definitely suspicious, as responses should have a relatively short lifetime compared to Requests. So let’s check the oldest response:

              >>> from scrapy.utils.trackref import get_oldest
              >>> r = get_oldest('HtmlResponse')
              >>> r.url
              'http://www.somenastyspider.com/product.php?pid=123'
              

              There it is. By looking at the URL of the oldest response we can see it belongs to the somenastyspider.com spider. We can now go and check the code of that spider to discover the nasty line that is generating the leaks (passing response references inside requests).

              If you want to iterate over all objects, instead of getting the oldest one, you can use the iter_all() function:

              >>> from scrapy.utils.trackref import iter_all
              >>> [r.url for r in iter_all('HtmlResponse')]
              ['http://www.somenastyspider.com/product.php?pid=123',
               'http://www.somenastyspider.com/product.php?pid=584',
              ...
              

              Too many spiders?

If your project has too many spiders, the output of prefs() can be difficult to read. For this reason, that function has an ignore argument which can be used to ignore a particular class (and all its subclasses). For example, using:

              >>> from scrapy.spider import Spider
              >>> prefs(ignore=Spider)
              

won’t show any live references to spiders.

              scrapy.utils.trackref module

              Here are the functions available in the trackref module.

              class scrapy.utils.trackref.object_ref

              Inherit from this class (instead of object) if you want to track live instances with the trackref module.
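
A minimal sketch (the class name is hypothetical):

from scrapy.utils.trackref import object_ref

class CrawlState(object_ref):
    # instances of this class will now appear in prefs() reports
    pass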

              scrapy.utils.trackref.print_live_refs(class_name, ignore=NoneType)

              Print a report of live references, grouped by class name.

              Parameters:ignore (class or classes tuple) – if given, all objects from the specified class (or tuple of classes) will be ignored.
              scrapy.utils.trackref.get_oldest(class_name)

              Return the oldest object alive with the given class name, or None if none is found. Use print_live_refs() first to get a list of all tracked live objects per class name.

              scrapy.utils.trackref.iter_all(class_name)

              Return an iterator over all objects alive with the given class name, or None if none is found. Use print_live_refs() first to get a list of all tracked live objects per class name.

              Debugging memory leaks with Guppy

              trackref provides a very convenient mechanism for tracking down memory leaks, but it only keeps track of the objects that are more likely to cause memory leaks (Requests, Responses, Items, and Selectors). However, there are other cases where the memory leaks could come from other (more or less obscure) objects. If this is your case, and you can’t find your leaks using trackref, you still have another resource: the Guppy library.

              If you use setuptools, you can install Guppy with the following command:

              easy_install guppy

              The telnet console also comes with a built-in shortcut (hpy) for accessing Guppy heap objects. Here’s an example to view all Python objects available in the heap using Guppy:

              >>> x = hpy.heap()
              >>> x.bytype
              Partition of a set of 297033 objects. Total size = 52587824 bytes.
               Index  Count   %     Size   % Cumulative  % Type
                   0  22307   8 16423880  31  16423880  31 dict
                   1 122285  41 12441544  24  28865424  55 str
                   2  68346  23  5966696  11  34832120  66 tuple
                   3    227   0  5836528  11  40668648  77 unicode
                   4   2461   1  2222272   4  42890920  82 type
                   5  16870   6  2024400   4  44915320  85 function
                   6  13949   5  1673880   3  46589200  89 types.CodeType
                   7  13422   5  1653104   3  48242304  92 list
                   8   3735   1  1173680   2  49415984  94 _sre.SRE_Pattern
                   9   1209   0   456936   1  49872920  95 scrapy.http.headers.Headers
              <1676 more rows. Type e.g. '_.more' to view.>
              

              You can see that most space is used by dicts. Then, if you want to see from which attribute those dicts are referenced, you could do:

              >>> x.bytype[0].byvia
              Partition of a set of 22307 objects. Total size = 16423880 bytes.
               Index  Count   %     Size   % Cumulative  % Referred Via:
                   0  10982  49  9416336  57   9416336  57 '.__dict__'
                   1   1820   8  2681504  16  12097840  74 '.__dict__', '.func_globals'
                   2   3097  14  1122904   7  13220744  80
                   3    990   4   277200   2  13497944  82 "['cookies']"
                   4    987   4   276360   2  13774304  84 "['cache']"
                   5    985   4   275800   2  14050104  86 "['meta']"
                   6    897   4   251160   2  14301264  87 '[2]'
                   7      1   0   196888   1  14498152  88 "['moduleDict']", "['modules']"
                   8    672   3   188160   1  14686312  89 "['cb_kwargs']"
                   9     27   0   155016   1  14841328  90 '[1]'
              <333 more rows. Type e.g. '_.more' to view.>
              

              As you can see, the Guppy module is very powerful but also requires some deep knowledge about Python internals. For more info about Guppy, refer to the Guppy documentation.

              Leaks without leaks

              Sometimes, you may notice that the memory usage of your Scrapy process will only increase, but never decrease. Unfortunately, this could happen even though neither Scrapy nor your project are leaking memory. This is due to a (not so well) known problem of Python, which may not return released memory to the operating system in some cases. For more information on this issue see:

              The improvements proposed by Evan Jones, which are detailed in this paper, got merged in Python 2.5, but this only reduces the problem, it doesn’t fix it completely. To quote the paper:

              Unfortunately, this patch can only free an arena if there are no more objects allocated in it anymore. This means that fragmentation is a large issue. An application could have many megabytes of free memory, scattered throughout all the arenas, but it will be unable to free any of it. This is a problem experienced by all memory allocators. The only way to solve it is to move to a compacting garbage collector, which is able to move objects in memory. This would require significant changes to the Python interpreter.

              This problem will be fixed in future Scrapy releases, where we plan to adopt a new process model and run spiders in a pool of recyclable sub-processes.

Item Loaders — Scrapy 0.22.0 documentation

              Item Loaders

              Item Loaders provide a convenient mechanism for populating scraped Items. Even though Items can be populated using their own dictionary-like API, the Item Loaders provide a much more convenient API for populating them from a scraping process, by automating some common tasks like parsing the raw extracted data before assigning it.

              In other words, Items provide the container of scraped data, while Item Loaders provide the mechanism for populating that container.

              Item Loaders are designed to provide a flexible, efficient and easy mechanism for extending and overriding different field parsing rules, either by spider, or by source format (HTML, XML, etc) without becoming a nightmare to maintain.

              Using Item Loaders to populate items

To use an Item Loader, you must first instantiate it. You can either instantiate it with a dict-like object (e.g. Item or dict) or without one, in which case an Item is automatically instantiated in the Item Loader constructor using the Item class specified in the ItemLoader.default_item_class attribute.

              Then, you start collecting values into the Item Loader, typically using Selectors. You can add more than one value to the same item field; the Item Loader will know how to “join” those values later using a proper processing function.

              Here is a typical Item Loader usage in a Spider, using the Product item declared in the Items chapter:

              from scrapy.contrib.loader import ItemLoader
              from myproject.items import Product
              
              def parse(self, response):
                  l = ItemLoader(item=Product(), response=response)
                  l.add_xpath('name', '//div[@class="product_name"]')
                  l.add_xpath('name', '//div[@class="product_title"]')
                  l.add_xpath('price', '//p[@id="price"]')
    l.add_css('stock', 'p#stock')
                  l.add_value('last_updated', 'today') # you can also use literal values
                  return l.load_item()
              

              By quickly looking at that code, we can see the name field is being extracted from two different XPath locations in the page:

              1. //div[@class="product_name"]
              2. //div[@class="product_title"]

              In other words, data is being collected by extracting it from two XPath locations, using the add_xpath() method. This is the data that will be assigned to the name field later.

Afterwards, similar calls are used for the price and stock fields (the latter using a CSS selector with the add_css() method), and finally the last_updated field is populated directly with a literal value (today) using a different method: add_value().

              Finally, when all data is collected, the ItemLoader.load_item() method is called which actually populates and returns the item populated with the data previously extracted and collected with the add_xpath(), add_css(), and add_value() calls.

              Input and Output processors

              An Item Loader contains one input processor and one output processor for each (item) field. The input processor processes the extracted data as soon as it’s received (through the add_xpath(), add_css() or add_value() methods) and the result of the input processor is collected and kept inside the ItemLoader. After collecting all data, the ItemLoader.load_item() method is called to populate and get the populated Item object. That’s when the output processor is called with the data previously collected (and processed using the input processor). The result of the output processor is the final value that gets assigned to the item.

              Let’s see an example to illustrate how the input and output processors are called for a particular field (the same applies for any other field):

              l = ItemLoader(Product(), some_selector)
              l.add_xpath('name', xpath1) # (1)
              l.add_xpath('name', xpath2) # (2)
              l.add_css('name', css) # (3)
              l.add_value('name', 'test') # (4)
              return l.load_item() # (5)
              

              So what happens is:

              1. Data from xpath1 is extracted, and passed through the input processor of the name field. The result of the input processor is collected and kept in the Item Loader (but not yet assigned to the item).
              2. Data from xpath2 is extracted, and passed through the same input processor used in (1). The result of the input processor is appended to the data collected in (1) (if any).
              3. This case is similar to the previous ones, except that the data is extracted from the css CSS selector, and passed through the same input processor used in (1) and (2). The result of the input processor is appended to the data collected in (1) and (2) (if any).
4. This case is also similar to the previous ones, except that the value to be collected is assigned directly, instead of being extracted from an XPath expression or a CSS selector. However, the value is still passed through the input processors. In this case, since the value is not iterable, it is converted to an iterable of a single element before passing it to the input processor, because input processors always receive iterables.
              5. The data collected in steps (1), (2), (3) and (4) is passed through the output processor of the name field. The result of the output processor is the value assigned to the name field in the item.

              It’s worth noticing that processors are just callable objects, which are called with the data to be parsed, and return a parsed value. So you can use any function as input or output processor. The only requirement is that they must accept one (and only one) positional argument, which will be an iterator.
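
For instance, a plain function like the following sketch (strip_all is our name, not a built-in processor) could serve as an input or output processor, and could then be attached to a field as, e.g., name_in = strip_all in an Item Loader declaration:

def strip_all(values):
    # receives the iterable of collected values, returns the parsed result
    return [v.strip() for v in values]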

              Note

              Both input and output processors must receive an iterator as their first argument. The output of those functions can be anything. The result of input processors will be appended to an internal list (in the Loader) containing the collected values (for that field). The result of the output processors is the value that will be finally assigned to the item.

              The other thing you need to keep in mind is that the values returned by input processors are collected internally (in lists) and then passed to output processors to populate the fields.

              Last, but not least, Scrapy comes with some commonly used processors built-in for convenience.

              Declaring Item Loaders

              Item Loaders are declared like Items, by using a class definition syntax. Here is an example:

              from scrapy.contrib.loader import ItemLoader
              from scrapy.contrib.loader.processor import TakeFirst, MapCompose, Join
              
              class ProductLoader(ItemLoader):
              
                  default_output_processor = TakeFirst()
              
                  name_in = MapCompose(unicode.title)
                  name_out = Join()
              
                  price_in = MapCompose(unicode.strip)
              
                  # ...
              

As you can see, input processors are declared using the _in suffix while output processors are declared using the _out suffix. And you can also declare default input/output processors using the ItemLoader.default_input_processor and ItemLoader.default_output_processor attributes.

              Declaring Input and Output Processors

              As seen in the previous section, input and output processors can be declared in the Item Loader definition, and it’s very common to declare input processors this way. However, there is one more place where you can specify the input and output processors to use: in the Item Field metadata. Here is an example:

              from scrapy.item import Item, Field
              from scrapy.contrib.loader.processor import MapCompose, Join, TakeFirst
              
              from scrapy.utils.markup import remove_entities
              from myproject.utils import filter_prices
              
              class Product(Item):
                  name = Field(
                      input_processor=MapCompose(remove_entities),
                      output_processor=Join(),
                  )
                  price = Field(
                      default=0,
                      input_processor=MapCompose(remove_entities, filter_prices),
                      output_processor=TakeFirst(),
                  )
              

              The precedence order, for both input and output processors, is as follows:

1. Item Loader field-specific attributes: field_in and field_out (highest precedence)
2. Field metadata (input_processor and output_processor keys)
3. Item Loader defaults: ItemLoader.default_input_processor and ItemLoader.default_output_processor (lowest precedence)

              See also: Reusing and extending Item Loaders.
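
As a sketch of how this precedence plays out (the names are illustrative), the field-specific name_out attribute below wins over both the field metadata and the loader default:

from scrapy.item import Item, Field
from scrapy.contrib.loader import ItemLoader
from scrapy.contrib.loader.processor import Join, TakeFirst

class Product(Item):
    name = Field(output_processor=Join())   # (2) field metadata

class ProductLoader(ItemLoader):
    default_item_class = Product
    default_output_processor = TakeFirst()  # (3) loader default
    name_out = Join(', ')                   # (1) field-specific attribute: this one runs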

              Item Loader Context

The Item Loader Context is a dict of arbitrary key/values which is shared among all input and output processors in the Item Loader. It can be passed when declaring, instantiating or using an Item Loader, and it is used to modify the behaviour of the input/output processors.

              For example, suppose you have a function parse_length which receives a text value and extracts a length from it:

              def parse_length(text, loader_context):
                  unit = loader_context.get('unit', 'm')
                  # ... length parsing code goes here ...
                  return parsed_length
              

By accepting a loader_context argument the function is explicitly telling the Item Loader that it is able to receive an Item Loader context, so the Item Loader passes the currently active context when calling it, and the processor function (parse_length in this case) can thus use it.

              There are several ways to modify Item Loader context values:

              1. By modifying the currently active Item Loader context (context attribute):

                loader = ItemLoader(product)
                loader.context['unit'] = 'cm'
                
              2. On Item Loader instantiation (the keyword arguments of Item Loader constructor are stored in the Item Loader context):

                loader = ItemLoader(product, unit='cm')
                
3. On Item Loader declaration, for those input/output processors that support instantiating them with an Item Loader context. MapCompose is one of them:

                class ProductLoader(ItemLoader):
                    length_out = MapCompose(parse_length, unit='cm')
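
Putting these together, here is a minimal sketch; the length-parsing logic and the product variable are assumptions made for illustration:

from scrapy.contrib.loader import ItemLoader

def parse_length(text, loader_context):
    unit = loader_context.get('unit', 'm')
    value = float(text.split()[0])  # assumes input like u'100 cm'
    return value / 100.0 if unit == 'cm' else value

loader = ItemLoader(product)   # `product` is an existing item instance (assumption)
loader.context['unit'] = 'cm'  # same effect as ItemLoader(product, unit='cm')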
                

              ItemLoader objects

              class scrapy.contrib.loader.ItemLoader([item, selector, response], **kwargs)

              Return a new Item Loader for populating the given Item. If no item is given, one is instantiated automatically using the class in default_item_class.

When instantiated with a selector or a response parameter, the ItemLoader class provides convenient mechanisms for extracting data from web pages using selectors.

Parameters:
• item (Item object) – the item instance to populate; if omitted, one is instantiated automatically using the class in default_item_class.
• selector (Selector object) – the selector to extract data from.
• response (Response object) – the response used to construct the selector using the default_selector_class; ignored if the selector argument is given.

              The item, selector, response and the remaining keyword arguments are assigned to the Loader context (accessible through the context attribute).
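
For example, a loader is typically constructed inside a spider callback from the response; a minimal sketch (the Product item, the spider name and the XPath are assumptions):

from scrapy.contrib.loader import ItemLoader
from scrapy.spider import Spider
from myproject.items import Product  # hypothetical project item

class ProductSpider(Spider):
    name = 'product'  # illustrative

    def parse(self, response):
        loader = ItemLoader(item=Product(), response=response)
        loader.add_xpath('name', '//p[@class="product-name"]/text()')
        return loader.load_item()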

              ItemLoader instances have the following methods:

              get_value(value, *processors, **kwargs)

              Process the given value by the given processors and keyword arguments.

              Available keyword arguments:

Parameters:
• re (str or compiled regex) – a regular expression to use for extracting data from the given value using the extract_regex() method, applied before processors

              Examples:

              >>> from scrapy.contrib.loader.processor import TakeFirst
              >>> loader.get_value(u'name: foo', TakeFirst(), unicode.upper, re='name: (.+)')
'FOO'
              
              add_value(field_name, value, *processors, **kwargs)

              Process and then add the given value for the given field.

              The value is first passed through get_value() by giving the processors and kwargs, and then passed through the field input processor and its result appended to the data collected for that field. If the field already contains collected data, the new data is added.

The given field_name can be None, in which case values for multiple fields may be added. In that case, the processed value should be a dict with field names mapped to values.

              Examples:

              loader.add_value('name', u'Color TV')
              loader.add_value('colours', [u'white', u'blue'])
              loader.add_value('length', u'100')
              loader.add_value('name', u'name: foo', TakeFirst(), re='name: (.+)')
              loader.add_value(None, {'name': u'foo', 'sex': u'male'})
              
              replace_value(field_name, value)

              Similar to add_value() but replaces the collected data with the new value instead of adding it.

              get_xpath(xpath, *processors, **kwargs)

              Similar to ItemLoader.get_value() but receives an XPath instead of a value, which is used to extract a list of unicode strings from the selector associated with this ItemLoader.

              Parameters:
              • xpath (str) – the XPath to extract data from
              • re (str or compiled regex) – a regular expression to use for extracting data from the selected XPath region

              Examples:

              # HTML snippet: <p class="product-name">Color TV</p>
              loader.get_xpath('//p[@class="product-name"]')
              # HTML snippet: <p id="price">the price is $1200</p>
              loader.get_xpath('//p[@id="price"]', TakeFirst(), re='the price is (.*)')
              
              add_xpath(field_name, xpath, *processors, **kwargs)

              Similar to ItemLoader.add_value() but receives an XPath instead of a value, which is used to extract a list of unicode strings from the selector associated with this ItemLoader.

              See get_xpath() for kwargs.

Parameters:
• xpath (str) – the XPath to extract data from

              Examples:

              # HTML snippet: <p class="product-name">Color TV</p>
              loader.add_xpath('name', '//p[@class="product-name"]')
              # HTML snippet: <p id="price">the price is $1200</p>
              loader.add_xpath('price', '//p[@id="price"]', re='the price is (.*)')
              
              replace_xpath(field_name, xpath, *processors, **kwargs)

              Similar to add_xpath() but replaces collected data instead of adding it.

              get_css(css, *processors, **kwargs)

              Similar to ItemLoader.get_value() but receives a CSS selector instead of a value, which is used to extract a list of unicode strings from the selector associated with this ItemLoader.

              Parameters:
              • css (str) – the CSS selector to extract data from
              • re (str or compiled regex) – a regular expression to use for extracting data from the selected CSS region

              Examples:

              # HTML snippet: <p class="product-name">Color TV</p>
              loader.get_css('p.product-name')
              # HTML snippet: <p id="price">the price is $1200</p>
              loader.get_css('p#price', TakeFirst(), re='the price is (.*)')
              
              add_css(field_name, css, *processors, **kwargs)

              Similar to ItemLoader.add_value() but receives a CSS selector instead of a value, which is used to extract a list of unicode strings from the selector associated with this ItemLoader.

              See get_css() for kwargs.

Parameters:
• css (str) – the CSS selector to extract data from

              Examples:

              # HTML snippet: <p class="product-name">Color TV</p>
              loader.add_css('name', 'p.product-name')
              # HTML snippet: <p id="price">the price is $1200</p>
              loader.add_css('price', 'p#price', re='the price is (.*)')
              
              replace_css(field_name, css, *processors, **kwargs)

              Similar to add_css() but replaces collected data instead of adding it.

              load_item()

              Populate the item with the data collected so far, and return it. The data collected is first passed through the output processors to get the final value to assign to each item field.

              get_collected_values(field_name)

              Return the collected values for the given field.

              get_output_value(field_name)

              Return the collected values parsed using the output processor, for the given field. This method doesn’t populate or modify the item at all.

              get_input_processor(field_name)

              Return the input processor for the given field.

              get_output_processor(field_name)

              Return the output processor for the given field.

              ItemLoader instances have the following attributes:

              item

              The Item object being parsed by this Item Loader.

              context

              The currently active Context of this Item Loader.

              default_item_class

              An Item class (or factory), used to instantiate items when not given in the constructor.

              default_input_processor

              The default input processor to use for those fields which don’t specify one.

              default_output_processor

              The default output processor to use for those fields which don’t specify one.

              default_selector_class

              The class used to construct the selector of this ItemLoader, if only a response is given in the constructor. If a selector is given in the constructor this attribute is ignored. This attribute is sometimes overridden in subclasses.

              selector

              The Selector object to extract data from. It’s either the selector given in the constructor or one created from the response given in the constructor using the default_selector_class. This attribute is meant to be read-only.

              Reusing and extending Item Loaders

As your project grows bigger and acquires more and more spiders, maintenance becomes a fundamental problem, especially when you have to deal with many different parsing rules for each spider, with a lot of exceptions, but also want to reuse the common processors.

              Item Loaders are designed to ease the maintenance burden of parsing rules, without losing flexibility and, at the same time, providing a convenient mechanism for extending and overriding them. For this reason Item Loaders support traditional Python class inheritance for dealing with differences of specific spiders (or groups of spiders).

Suppose, for example, that some particular site encloses its product names in three dashes (ie. ---Plasma TV---) and you don’t want to end up scraping those dashes in the final product names.

              Here’s how you can remove those dashes by reusing and extending the default Product Item Loader (ProductLoader):

              from scrapy.contrib.loader.processor import MapCompose
              from myproject.ItemLoaders import ProductLoader
              
              def strip_dashes(x):
                  return x.strip('-')
              
              class SiteSpecificLoader(ProductLoader):
                  name_in = MapCompose(strip_dashes, ProductLoader.name_in)
              

              Another case where extending Item Loaders can be very helpful is when you have multiple source formats, for example XML and HTML. In the XML version you may want to remove CDATA occurrences. Here’s an example of how to do it:

              from scrapy.contrib.loader.processor import MapCompose
              from myproject.ItemLoaders import ProductLoader
              from myproject.utils.xml import remove_cdata
              
              class XmlProductLoader(ProductLoader):
                  name_in = MapCompose(remove_cdata, ProductLoader.name_in)
              

              And that’s how you typically extend input processors.

              As for output processors, it is more common to declare them in the field metadata, as they usually depend only on the field and not on each specific site parsing rule (as input processors do). See also: Declaring Input and Output Processors.

              There are many other possible ways to extend, inherit and override your Item Loaders, and different Item Loaders hierarchies may fit better for different projects. Scrapy only provides the mechanism; it doesn’t impose any specific organization of your Loaders collection - that’s up to you and your project’s needs.

              Available built-in processors

Even though you can use any callable function as input and output processors, Scrapy provides some commonly used processors, which are described below. Some of them, like MapCompose (which is typically used as an input processor), compose the output of several functions executed in order to produce the final parsed value.

              Here is a list of all built-in processors:

              class scrapy.contrib.loader.processor.Identity

The simplest processor, which doesn’t do anything. It returns the original values unchanged. It doesn’t receive any constructor arguments, nor does it accept Loader contexts.

              Example:

              >>> from scrapy.contrib.loader.processor import Identity
              >>> proc = Identity()
              >>> proc(['one', 'two', 'three'])
              ['one', 'two', 'three']
              
              class scrapy.contrib.loader.processor.TakeFirst

Returns the first non-null/non-empty value from the values received, so it’s typically used as an output processor for single-valued fields. It doesn’t receive any constructor arguments, nor does it accept Loader contexts.

              Example:

              >>> from scrapy.contrib.loader.processor import TakeFirst
              >>> proc = TakeFirst()
              >>> proc(['', 'one', 'two', 'three'])
              'one'
              
              class scrapy.contrib.loader.processor.Join(separator=u' ')

              Returns the values joined with the separator given in the constructor, which defaults to u' '. It doesn’t accept Loader contexts.

              When using the default separator, this processor is equivalent to the function: u' '.join

              Examples:

              >>> from scrapy.contrib.loader.processor import Join
              >>> proc = Join()
              >>> proc(['one', 'two', 'three'])
              u'one two three'
              >>> proc = Join('<br>')
              >>> proc(['one', 'two', 'three'])
              u'one<br>two<br>three'
              
              class scrapy.contrib.loader.processor.Compose(*functions, **default_loader_context)

              A processor which is constructed from the composition of the given functions. This means that each input value of this processor is passed to the first function, and the result of that function is passed to the second function, and so on, until the last function returns the output value of this processor.

By default, processing stops on a None value. This behaviour can be changed by passing the keyword argument stop_on_none=False.

              Example:

              >>> from scrapy.contrib.loader.processor import Compose
              >>> proc = Compose(lambda v: v[0], str.upper)
              >>> proc(['hello', 'world'])
              'HELLO'
              

              Each function can optionally receive a loader_context parameter. For those which do, this processor will pass the currently active Loader context through that parameter.

The keyword arguments passed in the constructor are used as the default Loader context values passed to each function call. However, the final Loader context values passed to functions are overridden with the currently active Loader context accessible through the ItemLoader.context attribute.

              class scrapy.contrib.loader.processor.MapCompose(*functions, **default_loader_context)

              A processor which is constructed from the composition of the given functions, similar to the Compose processor. The difference with this processor is the way internal results are passed among functions, which is as follows:

              The input value of this processor is iterated and each element is passed to the first function, and the result of that function (for each element) is concatenated to construct a new iterable, which is then passed to the second function, and so on, until the last function is applied for each value of the list of values collected so far. The output values of the last function are concatenated together to produce the output of this processor.

              Each particular function can return a value or a list of values, which is flattened with the list of values returned by the same function applied to the other input values. The functions can also return None in which case the output of that function is ignored for further processing over the chain.

              This processor provides a convenient way to compose functions that only work with single values (instead of iterables). For this reason the MapCompose processor is typically used as input processor, since data is often extracted using the extract() method of selectors, which returns a list of unicode strings.

              The example below should clarify how it works:

              >>> def filter_world(x):
              ...     return None if x == 'world' else x
              ...
              >>> from scrapy.contrib.loader.processor import MapCompose
              >>> proc = MapCompose(filter_world, unicode.upper)
              >>> proc([u'hello', u'world', u'this', u'is', u'scrapy'])
[u'HELLO', u'THIS', u'IS', u'SCRAPY']
              

              As with the Compose processor, functions can receive Loader contexts, and constructor keyword arguments are used as default context values. See Compose processor for more info.

scrapy-0.22/topics/autothrottle.html AutoThrottle extension — Scrapy 0.22.0 documentation

              AutoThrottle extension

              This is an extension for automatically throttling crawling speed based on load of both the Scrapy server and the website you are crawling.

              Design goals

              1. be nicer to sites instead of using default download delay of zero
2. automatically adjust Scrapy to the optimum crawling speed, so the user doesn’t have to tune the download delays and concurrent requests to find the optimum values; the user only needs to specify the maximum number of concurrent requests to allow, and the extension does the rest.

              How it works

              In Scrapy, the download latency is measured as the time elapsed between establishing the TCP connection and receiving the HTTP headers.

              Note that these latencies are very hard to measure accurately in a cooperative multitasking environment because Scrapy may be busy processing a spider callback, for example, and unable to attend downloads. However, these latencies should still give a reasonable estimate of how busy Scrapy (and ultimately, the server) is, and this extension builds on that premise.

              Throttling algorithm

              This adjusts download delays and concurrency based on the following rules:

              1. spiders always start with one concurrent request and a download delay of AUTOTHROTTLE_START_DELAY
              2. when a response is received, the download delay is adjusted to the average of previous download delay and the latency of the response.
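
As a rough worked illustration of rule (2) (this is not the extension's actual code):

current_delay = 5.0                          # e.g. AUTOTHROTTLE_START_DELAY
latency = 0.6                                # measured latency of the response
new_delay = (current_delay + latency) / 2.0  # -> 2.8 seconds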

              Note

              The AutoThrottle extension honours the standard Scrapy settings for concurrency and delay. This means that it will never set a download delay lower than DOWNLOAD_DELAY or a concurrency higher than CONCURRENT_REQUESTS_PER_DOMAIN (or CONCURRENT_REQUESTS_PER_IP, depending on which one you use).

              Settings

The settings used to control the AutoThrottle extension are listed below. For more information see Throttling algorithm.

              AUTOTHROTTLE_ENABLED

              Default: False

              Enables the AutoThrottle extension.

              AUTOTHROTTLE_START_DELAY

              Default: 5.0

              The initial download delay (in seconds).

              AUTOTHROTTLE_MAX_DELAY

              Default: 60.0

              The maximum download delay (in seconds) to be set in case of high latencies.

              AUTOTHROTTLE_DEBUG

              Default: False

              Enable AutoThrottle debug mode which will display stats on every response received, so you can see how the throttling parameters are being adjusted in real time.
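
For example, to turn the extension on you might add something like this to your project’s settings.py (the numeric values simply restate the defaults):

AUTOTHROTTLE_ENABLED = True
AUTOTHROTTLE_START_DELAY = 5.0
AUTOTHROTTLE_MAX_DELAY = 60.0
AUTOTHROTTLE_DEBUG = False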

scrapy-0.22/topics/logging.html Logging — Scrapy 0.22.0 documentation

              Logging

              Scrapy provides a logging facility which can be used through the scrapy.log module. The current underlying implementation uses Twisted logging but this may change in the future.

              The logging service must be explicitly started through the scrapy.log.start() function.
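
For example, a minimal sketch (see the scrapy.log module reference below for the accepted arguments):

from scrapy import log
log.start(loglevel=log.DEBUG)  # start logging; output falls back to LOG_FILE or standard error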

              Log levels

              Scrapy provides 5 logging levels:

              1. CRITICAL - for critical errors
              2. ERROR - for regular errors
              3. WARNING - for warning messages
              4. INFO - for informational messages
              5. DEBUG - for debugging messages

              How to set the log level

You can set the log level using the --loglevel/-L command line option, or using the LOG_LEVEL setting.

              How to log messages

              Here’s a quick example of how to log a message using the WARNING level:

              from scrapy import log
              log.msg("This is a warning", level=log.WARNING)
              

              Logging from Spiders

              The recommended way to log from spiders is by using the Spider log() method, which already populates the spider argument of the scrapy.log.msg() function. The other arguments are passed directly to the msg() function.
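
A minimal sketch of logging from a spider callback (the spider name and message are illustrative):

from scrapy import log
from scrapy.spider import Spider

class MySpider(Spider):
    name = 'example'

    def parse(self, response):
        self.log("Parsed %s" % response.url, level=log.INFO)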

              scrapy.log module

              scrapy.log.start(logfile=None, loglevel=None, logstdout=None)

              Start the logging facility. This must be called before actually logging any messages. Otherwise, messages logged before this call will get lost.

              Parameters:
              • logfile (str) – the file path to use for logging output. If omitted, the LOG_FILE setting will be used. If both are None, the log will be sent to standard error.
              • loglevel – the minimum logging level to log. Available values are: CRITICAL, ERROR, WARNING, INFO and DEBUG.
              • logstdout (boolean) – if True, all standard output (and error) of your application will be logged instead. For example if you “print ‘hello’” it will appear in the Scrapy log. If omitted, the LOG_STDOUT setting will be used.
              scrapy.log.msg(message, level=INFO, spider=None)

              Log a message

              Parameters:
              • message (str) – the message to log
              • level – the log level for this message. See Log levels.
              • spider (Spider object) – the spider to use for logging this message. This parameter should always be used when logging things related to a particular spider.
              scrapy.log.CRITICAL

              Log level for critical errors

              scrapy.log.ERROR

              Log level for errors

              scrapy.log.WARNING

              Log level for warnings

              scrapy.log.INFO

              Log level for informational messages (recommended level for production deployments)

              scrapy.log.DEBUG

              Log level for debugging messages (recommended level for development)

              Logging settings

These settings can be used to configure the logging: LOG_ENABLED, LOG_ENCODING, LOG_FILE, LOG_LEVEL and LOG_STDOUT.

scrapy-0.22/topics/selectors.html Selectors — Scrapy 0.22.0 documentation

              Selectors

              When you’re scraping web pages, the most common task you need to perform is to extract data from the HTML source. There are several libraries available to achieve this:

              • BeautifulSoup is a very popular screen scraping library among Python programmers which constructs a Python object based on the structure of the HTML code and also deals with bad markup reasonably well, but it has one drawback: it’s slow.
• lxml is an XML parsing library (which also parses HTML) with a pythonic API based on ElementTree (which is not part of the Python standard library).

Scrapy comes with its own mechanism for extracting data: selectors, so called because they “select” certain parts of the HTML document specified either by XPath or CSS expressions.

              XPath is a language for selecting nodes in XML documents, which can also be used with HTML. CSS is a language for applying styles to HTML documents. It defines selectors to associate those styles with specific HTML elements.

              Scrapy selectors are built over the lxml library, which means they’re very similar in speed and parsing accuracy.

              This page explains how selectors work and describes their API which is very small and simple, unlike the lxml API which is much bigger because the lxml library can be used for many other tasks, besides selecting markup documents.

              For a complete reference of the selectors API see Selector reference

              Using selectors

              Constructing selectors

Scrapy selectors are instances of the Selector class, constructed by passing a Response object as the first argument; the response’s body is what they’re going to be “selecting”:

              from scrapy.spider import Spider
              from scrapy.selector import Selector
              
              class MySpider(Spider):
                  # ...
                  def parse(self, response):
                      sel = Selector(response)
                      # Using XPath query
                      print sel.xpath('//p')
                      # Using CSS query
                      print sel.css('p')
                      # Nesting queries
                      print sel.xpath('//div[@foo="bar"]').css('span#bold')
              

              Using selectors

To explain how to use the selectors we’ll use the Scrapy shell (which provides interactive testing) and an example page located on the Scrapy documentation server: http://doc.scrapy.org/en/latest/_static/selectors-sample1.html

              Here’s its HTML code:

              <html>
               <head>
                <base href='http://example.com/' />
                <title>Example website</title>
               </head>
               <body>
                <div id='images'>
                 <a href='image1.html'>Name: My image 1 <br /><img src='image1_thumb.jpg' /></a>
                 <a href='image2.html'>Name: My image 2 <br /><img src='image2_thumb.jpg' /></a>
                 <a href='image3.html'>Name: My image 3 <br /><img src='image3_thumb.jpg' /></a>
                 <a href='image4.html'>Name: My image 4 <br /><img src='image4_thumb.jpg' /></a>
                 <a href='image5.html'>Name: My image 5 <br /><img src='image5_thumb.jpg' /></a>
                </div>
               </body>
              </html>
              

              First, let’s open the shell:

              scrapy shell http://doc.scrapy.org/en/latest/_static/selectors-sample1.html
              

Then, after the shell loads, you’ll have a selector already instantiated and ready to use in the sel shell variable.

              Since we’re dealing with HTML, the selector will automatically use an HTML parser.

              So, by looking at the HTML code of that page, let’s construct an XPath (using an HTML selector) for selecting the text inside the title tag:

              >>> sel.xpath('//title/text()')
              [<Selector (text) xpath=//title/text()>]
              

As you can see, the .xpath() method returns a SelectorList instance, which is a list of new selectors. This API can be used for quickly selecting nested data.

              To actually extract the textual data, you must call the selector .extract() method, as follows:

              >>> sel.xpath('//title/text()').extract()
              [u'Example website']
              

              Notice that CSS selectors can select text or attribute nodes using CSS3 pseudo-elements:

              >>> sel.css('title::text').extract()
              [u'Example website']
              

              Now we’re going to get the base URL and some image links:

              >>> sel.xpath('//base/@href').extract()
              [u'http://example.com/']
              
              >>> sel.css('base::attr(href)').extract()
              [u'http://example.com/']
              
              >>> sel.xpath('//a[contains(@href, "image")]/@href').extract()
              [u'image1.html',
               u'image2.html',
               u'image3.html',
               u'image4.html',
               u'image5.html']
              
              >>> sel.css('a[href*=image]::attr(href)').extract()
              [u'image1.html',
               u'image2.html',
               u'image3.html',
               u'image4.html',
               u'image5.html']
              
              >>> sel.xpath('//a[contains(@href, "image")]/img/@src').extract()
              [u'image1_thumb.jpg',
               u'image2_thumb.jpg',
               u'image3_thumb.jpg',
               u'image4_thumb.jpg',
               u'image5_thumb.jpg']
              
              >>> sel.css('a[href*=image] img::attr(src)').extract()
              [u'image1_thumb.jpg',
               u'image2_thumb.jpg',
               u'image3_thumb.jpg',
               u'image4_thumb.jpg',
               u'image5_thumb.jpg']
              

              Nesting selectors

The selection methods (.xpath() or .css()) return a list of selectors of the same type, so you can call the selection methods on those selectors too. Here’s an example:

              >>> links = sel.xpath('//a[contains(@href, "image")]')
              >>> links.extract()
              [u'<a href="image1.html">Name: My image 1 <br><img src="image1_thumb.jpg"></a>',
               u'<a href="image2.html">Name: My image 2 <br><img src="image2_thumb.jpg"></a>',
               u'<a href="image3.html">Name: My image 3 <br><img src="image3_thumb.jpg"></a>',
               u'<a href="image4.html">Name: My image 4 <br><img src="image4_thumb.jpg"></a>',
               u'<a href="image5.html">Name: My image 5 <br><img src="image5_thumb.jpg"></a>']
              
              >>> for index, link in enumerate(links):
                      args = (index, link.xpath('@href').extract(), link.xpath('img/@src').extract())
                      print 'Link number %d points to url %s and image %s' % args
              
              Link number 0 points to url [u'image1.html'] and image [u'image1_thumb.jpg']
              Link number 1 points to url [u'image2.html'] and image [u'image2_thumb.jpg']
              Link number 2 points to url [u'image3.html'] and image [u'image3_thumb.jpg']
              Link number 3 points to url [u'image4.html'] and image [u'image4_thumb.jpg']
              Link number 4 points to url [u'image5.html'] and image [u'image5_thumb.jpg']
              

              Using selectors with regular expressions

Selectors also have a .re() method for extracting data using regular expressions. However, unlike the .xpath() or .css() methods, the .re() method returns a list of unicode strings, so you can’t construct nested .re() calls.

Here’s an example used to extract image names from the HTML code above:

              >>> sel.xpath('//a[contains(@href, "image")]/text()').re(r'Name:\s*(.*)')
              [u'My image 1',
               u'My image 2',
               u'My image 3',
               u'My image 4',
               u'My image 5']
              

              Working with relative XPaths

              Keep in mind that if you are nesting selectors and use an XPath that starts with /, that XPath will be absolute to the document and not relative to the Selector you’re calling it from.

              For example, suppose you want to extract all <p> elements inside <div> elements. First, you would get all <div> elements:

              >>> divs = sel.xpath('//div')
              

              At first, you may be tempted to use the following approach, which is wrong, as it actually extracts all <p> elements from the document, not only those inside <div> elements:

>>> for p in divs.xpath('//p'):  # this is wrong - gets all <p> from the whole document
...     print p.extract()
              

              This is the proper way to do it (note the dot prefixing the .//p XPath):

>>> for p in divs.xpath('.//p'):  # extracts all <p> inside
...     print p.extract()
              

              Another common case would be to extract all direct <p> children:

>>> for p in divs.xpath('p'):
...     print p.extract()
              

              For more details about relative XPaths see the Location Paths section in the XPath specification.

              Using EXSLT extensions

              Being built atop lxml, Scrapy selectors also support some EXSLT extensions and come with these pre-registered namespaces to use in XPath expressions:

prefix   namespace                               usage
re       http://exslt.org/regular-expressions    regular expressions
set      http://exslt.org/sets                   set manipulation

              Regular expressions

              The test() function for example can prove quite useful when XPath’s starts-with() or contains() are not sufficient.

Example selecting links in list items with a “class” attribute ending with a digit:

              >>> doc = """
              ... <div>
              ...     <ul>
              ...         <li class="item-0"><a href="link1.html">first item</a></li>
              ...         <li class="item-1"><a href="link2.html">second item</a></li>
              ...         <li class="item-inactive"><a href="link3.html">third item</a></li>
              ...         <li class="item-1"><a href="link4.html">fourth item</a></li>
              ...         <li class="item-0"><a href="link5.html">fifth item</a></li>
              ...     </ul>
              ... </div>
              ... """
              >>> sel = Selector(text=doc, type="html")
              >>> sel.xpath('//li//@href').extract()
              [u'link1.html', u'link2.html', u'link3.html', u'link4.html', u'link5.html']
              >>> sel.xpath('//li[re:test(@class, "item-\d$")]//@href').extract()
              [u'link1.html', u'link2.html', u'link4.html', u'link5.html']
              >>>
              

              Warning

The C library libxslt doesn’t natively support EXSLT regular expressions, so lxml‘s implementation uses hooks to Python’s re module. Thus, using regexp functions in your XPath expressions may add a small performance penalty.

              Set operations

              These can be handy for excluding parts of a document tree before extracting text elements for example.

              Example extracting microdata (sample content taken from http://schema.org/Product) with groups of itemscopes and corresponding itemprops:

              >>> doc = """
              ... <div itemscope itemtype="http://schema.org/Product">
              ...   <span itemprop="name">Kenmore White 17" Microwave</span>
              ...   <img src="kenmore-microwave-17in.jpg" alt='Kenmore 17" Microwave' />
              ...   <div itemprop="aggregateRating"
              ...     itemscope itemtype="http://schema.org/AggregateRating">
              ...    Rated <span itemprop="ratingValue">3.5</span>/5
              ...    based on <span itemprop="reviewCount">11</span> customer reviews
              ...   </div>
              ...
              ...   <div itemprop="offers" itemscope itemtype="http://schema.org/Offer">
              ...     <span itemprop="price">$55.00</span>
              ...     <link itemprop="availability" href="http://schema.org/InStock" />In stock
              ...   </div>
              ...
              ...   Product description:
              ...   <span itemprop="description">0.7 cubic feet countertop microwave.
              ...   Has six preset cooking categories and convenience features like
              ...   Add-A-Minute and Child Lock.</span>
              ...
              ...   Customer reviews:
              ...
              ...   <div itemprop="review" itemscope itemtype="http://schema.org/Review">
              ...     <span itemprop="name">Not a happy camper</span> -
              ...     by <span itemprop="author">Ellie</span>,
              ...     <meta itemprop="datePublished" content="2011-04-01">April 1, 2011
              ...     <div itemprop="reviewRating" itemscope itemtype="http://schema.org/Rating">
              ...       <meta itemprop="worstRating" content = "1">
              ...       <span itemprop="ratingValue">1</span>/
              ...       <span itemprop="bestRating">5</span>stars
              ...     </div>
              ...     <span itemprop="description">The lamp burned out and now I have to replace
              ...     it. </span>
              ...   </div>
              ...
              ...   <div itemprop="review" itemscope itemtype="http://schema.org/Review">
              ...     <span itemprop="name">Value purchase</span> -
              ...     by <span itemprop="author">Lucas</span>,
              ...     <meta itemprop="datePublished" content="2011-03-25">March 25, 2011
              ...     <div itemprop="reviewRating" itemscope itemtype="http://schema.org/Rating">
              ...       <meta itemprop="worstRating" content = "1"/>
              ...       <span itemprop="ratingValue">4</span>/
              ...       <span itemprop="bestRating">5</span>stars
              ...     </div>
              ...     <span itemprop="description">Great microwave for the price. It is small and
              ...     fits in my apartment.</span>
              ...   </div>
              ...   ...
              ... </div>
              ... """
>>> sel = Selector(text=doc, type="html")
              >>> for scope in sel.xpath('//div[@itemscope]'):
              ...     print "current scope:", scope.xpath('@itemtype').extract()
              ...     props = scope.xpath('''
              ...                 set:difference(./descendant::*/@itemprop,
              ...                                .//*[@itemscope]/*/@itemprop)''')
              ...     print "    properties:", props.extract()
              ...     print
              ...
              current scope: [u'http://schema.org/Product']
                  properties: [u'name', u'aggregateRating', u'offers', u'description', u'review', u'review']
              
              current scope: [u'http://schema.org/AggregateRating']
                  properties: [u'ratingValue', u'reviewCount']
              
              current scope: [u'http://schema.org/Offer']
                  properties: [u'price', u'availability']
              
              current scope: [u'http://schema.org/Review']
                  properties: [u'name', u'author', u'datePublished', u'reviewRating', u'description']
              
              current scope: [u'http://schema.org/Rating']
                  properties: [u'worstRating', u'ratingValue', u'bestRating']
              
              current scope: [u'http://schema.org/Review']
                  properties: [u'name', u'author', u'datePublished', u'reviewRating', u'description']
              
              current scope: [u'http://schema.org/Rating']
                  properties: [u'worstRating', u'ratingValue', u'bestRating']
              
              >>>
              

              Here we first iterate over itemscope elements, and for each one, we look for all itemprops elements and exclude those that are themselves inside another itemscope.

              Built-in Selectors reference

              class scrapy.selector.Selector(response=None, text=None, type=None)

              An instance of Selector is a wrapper over response to select certain parts of its content.

              response is a HtmlResponse or XmlResponse object that will be used for selecting and extracting data.

              text is a unicode string or utf-8 encoded text for cases when a response isn’t available. Using text and response together is undefined behavior.

              type defines the selector type, it can be "html", "xml" or None (default).

              If type is None, the selector automatically chooses the best type based on response type (see below), or defaults to "html" in case it is used together with text.

If type is None and a response is passed, the selector type is inferred from the response type as follows: "html" for HtmlResponse type, "xml" for XmlResponse type, and "html" for anything else.

Otherwise, if type is set, the selector type will be forced and no detection will occur.

              xpath(query)

              Find nodes matching the xpath query and return the result as a SelectorList instance with all elements flattened. List elements implement Selector interface too.

query is a string containing the XPath query to apply.

              css(query)

              Apply the given CSS selector and return a SelectorList instance.

              query is a string containing the CSS selector to apply.

In the background, CSS queries are translated into XPath queries using the cssselect library and run with the .xpath() method.
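
For instance (compare the title examples earlier on this page), the following two queries select the same nodes:

sel.css('title::text').extract()       # [u'Example website']
sel.xpath('//title/text()').extract()  # same result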

              extract()

              Serialize and return the matched nodes as a list of unicode strings. Percent encoded content is unquoted.

              re(regex)

              Apply the given regex and return a list of unicode strings with the matches.

              regex can be either a compiled regular expression or a string which will be compiled to a regular expression using re.compile(regex)

              register_namespace(prefix, uri)

              Register the given namespace to be used in this Selector. Without registering namespaces you can’t select or extract data from non-standard namespaces. See examples below.

              remove_namespaces()

Remove all namespaces, allowing traversal of the document using namespace-less xpaths. See example below.

              __nonzero__()

              Returns True if there is any real content selected or False otherwise. In other words, the boolean value of a Selector is given by the contents it selects.

              SelectorList objects

              class scrapy.selector.SelectorList

The SelectorList class is a subclass of the builtin list class, which provides a few additional methods.

              xpath(query)

              Call the .xpath() method for each element in this list and return their results flattened as another SelectorList.

              query is the same argument as the one in Selector.xpath()

              css(query)

              Call the .css() method for each element in this list and return their results flattened as another SelectorList.

              query is the same argument as the one in Selector.css()

              extract()

Call the .extract() method for each element in this list and return their results flattened, as a list of unicode strings.

              re()

Call the .re() method for each element in this list and return their results flattened, as a list of unicode strings.

              __nonzero__()

Returns True if the list is not empty, False otherwise.

              Selector examples on HTML response

Here’s a couple of Selector examples to illustrate several concepts. In all cases, we assume there is already a Selector instantiated with an HtmlResponse object like this:

              sel = Selector(html_response)
              
1. Select all <h1> elements from an HTML response body, returning a list of Selector objects (ie. a SelectorList object):

                sel.xpath("//h1")
                
2. Extract the text of all <h1> elements from an HTML response body, returning a list of unicode strings:

                sel.xpath("//h1").extract()         # this includes the h1 tag
                sel.xpath("//h1/text()").extract()  # this excludes the h1 tag
                
              3. Iterate over all <p> tags and print their class attribute:

                for node in sel.xpath("//p"):
    print node.xpath("@class").extract()

              Selector examples on XML response

Here’s a couple of examples to illustrate several concepts. In both cases we assume there is already a Selector instantiated with an XmlResponse object like this:

              sel = Selector(xml_response)
              
1. Select all <product> elements from an XML response body, returning a list of Selector objects (ie. a SelectorList object):

                sel.xpath("//product")
                
              2. Extract all prices from a Google Base XML feed which requires registering a namespace:

                sel.register_namespace("g", "http://base.google.com/ns/1.0")
                sel.xpath("//g:price").extract()
                

              Removing namespaces

When dealing with scraping projects, it is often quite convenient to get rid of namespaces altogether and just work with element names, to write simpler and more convenient XPaths. You can use the Selector.remove_namespaces() method for that.

Let’s show an example that illustrates this with the GitHub blog atom feed.

              First, we open the shell with the url we want to scrape:

              $ scrapy shell https://github.com/blog.atom

              Once in the shell we can try selecting all <link> objects and see that it doesn’t work (because the Atom XML namespace is obfuscating those nodes):

              >>> sel.xpath("//link")
              []
              

              But once we call the Selector.remove_namespaces() method, all nodes can be accessed directly by their names:

              >>> sel.remove_namespaces()
              >>> sel.xpath("//link")
              [<Selector xpath='//link' data=u'<link xmlns="http://www.w3.org/2005/Atom'>,
               <Selector xpath='//link' data=u'<link xmlns="http://www.w3.org/2005/Atom'>,
               ...
              

If you wonder why the namespace removal procedure isn’t always called by default, instead of having to call it manually, this is because of two reasons which, in order of relevance, are:

1. Removing namespaces requires iterating over and modifying all nodes in the document, which is a reasonably expensive operation to perform for all documents crawled by Scrapy
              2. There could be some cases where using namespaces is actually required, in case some element names clash between namespaces. These cases are very rare though.

scrapy-0.22/topics/settings.html Settings — Scrapy 0.22.0 documentation

              Settings

The Scrapy settings allow you to customize the behaviour of all Scrapy components, including the core, extensions, pipelines and spiders themselves.

              The infrastructure of the settings provides a global namespace of key-value mappings that the code can use to pull configuration values from. The settings can be populated through different mechanisms, which are described below.

              The settings are also the mechanism for selecting the currently active Scrapy project (in case you have many).

              For a list of available built-in settings see: Built-in settings reference.

              Designating the settings

              When you use Scrapy, you have to tell it which settings you’re using. You can do this by using an environment variable, SCRAPY_SETTINGS_MODULE.

              The value of SCRAPY_SETTINGS_MODULE should be in Python path syntax, e.g. myproject.settings. Note that the settings module should be on the Python import search path.
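
For example, in a POSIX shell:

export SCRAPY_SETTINGS_MODULE=myproject.settings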

              Populating the settings

Settings can be populated using different mechanisms, each of which has a different precedence. Here is the list of them in decreasing order of precedence:

1. Global overrides (highest precedence)
2. Project settings module
3. Default settings per-command
4. Default global settings (lowest precedence)

              These mechanisms are described in more detail below.

              1. Global overrides

Global overrides are the ones that take highest precedence, and are usually populated by command-line options. You can also override one (or more) settings from the command line using the -s (or --set) command line option.

              For more information see the overrides Settings attribute.

              Example:

              scrapy crawl myspider -s LOG_FILE=scrapy.log
              

              2. Project settings module

The project settings module is the standard configuration file for your Scrapy project. It’s where most of your custom settings will be populated. For example: myproject.settings.

              3. Default settings per-command

              Each Scrapy tool command can have its own default settings, which override the global default settings. Those custom command settings are specified in the default_settings attribute of the command class.

              4. Default global settings

              The global defaults are located in the scrapy.settings.default_settings module and documented in the Built-in settings reference section.

              How to access settings

Settings can be accessed through the scrapy.crawler.Crawler.settings attribute of the Crawler that is passed to the from_crawler method in extensions and middlewares:

              class MyExtension(object):
              
                  @classmethod
                  def from_crawler(cls, crawler):
                      settings = crawler.settings
                      if settings['LOG_ENABLED']:
                          print "log is enabled!"
              

In other words, settings can be accessed like a dict, but it’s usually preferred to extract the setting in the format you need it, to avoid type errors. In order to do that you’ll have to use one of the methods provided by the Settings API.
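
For instance, a minimal sketch using the typed getters of the Settings API, building on the from_crawler example above:

settings = crawler.settings                       # inside from_crawler, as above
if settings.getbool('LOG_ENABLED'):               # raw value coerced to a boolean
    delay = settings.getfloat('DOWNLOAD_DELAY')   # raw value coerced to a float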

              Rationale for setting names

              Setting names are usually prefixed with the component that they configure. For example, proper setting names for a fictional robots.txt extension would be ROBOTSTXT_ENABLED, ROBOTSTXT_OBEY, ROBOTSTXT_CACHEDIR, etc.

              Built-in settings reference

              Here’s a list of all available Scrapy settings, in alphabetical order, along with their default values and the scope where they apply.

              The scope, where available, shows where the setting is being used, if it’s tied to any particular component. In that case the module of that component will be shown, typically an extension, middleware or pipeline. It also means that the component must be enabled in order for the setting to have any effect.

              AWS_ACCESS_KEY_ID

              Default: None

              The AWS access key used by code that requires access to Amazon Web services, such as the S3 feed storage backend.

              AWS_SECRET_ACCESS_KEY

              Default: None

              The AWS secret key used by code that requires access to Amazon Web services, such as the S3 feed storage backend.

              BOT_NAME

              Default: 'scrapybot'

              The name of the bot implemented by this Scrapy project (also known as the project name). This will be used to construct the User-Agent by default, and also for logging.

              It’s automatically populated with your project name when you create your project with the startproject command.

              CONCURRENT_ITEMS

              Default: 100

              Maximum number of concurrent items (per response) to process in parallel in the Item Processor (also known as the Item Pipeline).

              CONCURRENT_REQUESTS

              Default: 16

              The maximum number of concurrent (ie. simultaneous) requests that will be performed by the Scrapy downloader.

              CONCURRENT_REQUESTS_PER_DOMAIN

              Default: 8

              The maximum number of concurrent (ie. simultaneous) requests that will be performed to any single domain.

              CONCURRENT_REQUESTS_PER_IP

              Default: 0

              The maximum number of concurrent (ie. simultaneous) requests that will be performed to any single IP. If non-zero, the CONCURRENT_REQUESTS_PER_DOMAIN setting is ignored, and this one is used instead. In other words, concurrency limits will be applied per IP, not per domain.

              This setting also affects DOWNLOAD_DELAY: if CONCURRENT_REQUESTS_PER_IP is non-zero, download delay is enforced per IP, not per domain.

              DEFAULT_ITEM_CLASS

              Default: 'scrapy.item.Item'

The default class that will be used for instantiating items in the Scrapy shell.

              DEFAULT_REQUEST_HEADERS

              Default:

              {
                  'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
                  'Accept-Language': 'en',
              }
              

              The default headers used for Scrapy HTTP Requests. They’re populated in the DefaultHeadersMiddleware.

              DEPTH_LIMIT

              Default: 0

              The maximum depth that will be allowed to crawl for any site. If zero, no limit will be imposed.

              DEPTH_PRIORITY

              Default: 0

              An integer that is used to adjust the request priority based on its depth.

              If zero, no priority adjustment is made from depth.

              DEPTH_STATS

              Default: True

              Whether to collect maximum depth stats.

              DEPTH_STATS_VERBOSE

              Default: False

              Whether to collect verbose depth stats. If this is enabled, the number of requests for each depth is collected in the stats.

              DNSCACHE_ENABLED

              Default: True

              Whether to enable DNS in-memory cache.

              DOWNLOADER_DEBUG

              Default: False

              Whether to enable the Downloader debugging mode.

              DOWNLOADER_MIDDLEWARES

Default: {}

              A dict containing the downloader middlewares enabled in your project, and their orders. For more info see Activating a downloader middleware.

              DOWNLOADER_MIDDLEWARES_BASE

              Default:

              {
                  'scrapy.contrib.downloadermiddleware.robotstxt.RobotsTxtMiddleware': 100,
                  'scrapy.contrib.downloadermiddleware.httpauth.HttpAuthMiddleware': 300,
                  'scrapy.contrib.downloadermiddleware.downloadtimeout.DownloadTimeoutMiddleware': 350,
                  'scrapy.contrib.downloadermiddleware.useragent.UserAgentMiddleware': 400,
                  'scrapy.contrib.downloadermiddleware.retry.RetryMiddleware': 500,
                  'scrapy.contrib.downloadermiddleware.defaultheaders.DefaultHeadersMiddleware': 550,
                  'scrapy.contrib.downloadermiddleware.redirect.MetaRefreshMiddleware': 580,
                  'scrapy.contrib.downloadermiddleware.httpcompression.HttpCompressionMiddleware': 590,
                  'scrapy.contrib.downloadermiddleware.redirect.RedirectMiddleware': 600,
                  'scrapy.contrib.downloadermiddleware.cookies.CookiesMiddleware': 700,
                  'scrapy.contrib.downloadermiddleware.httpproxy.HttpProxyMiddleware': 750,
                  'scrapy.contrib.downloadermiddleware.chunked.ChunkedTransferMiddleware': 830,
                  'scrapy.contrib.downloadermiddleware.stats.DownloaderStats': 850,
                  'scrapy.contrib.downloadermiddleware.httpcache.HttpCacheMiddleware': 900,
              }
              

A dict containing the downloader middlewares enabled by default in Scrapy. You should never modify this setting in your project; modify DOWNLOADER_MIDDLEWARES instead. For more info see Activating a downloader middleware.

              DOWNLOADER_STATS

              Default: True

              Whether to enable downloader stats collection.

              DOWNLOAD_DELAY

              Default: 0

              The amount of time (in secs) that the downloader should wait before downloading consecutive pages from the same website. This can be used to throttle the crawling speed to avoid hitting servers too hard. Decimal numbers are supported. Example:

              DOWNLOAD_DELAY = 0.25    # 250 ms of delay
              

              This setting is also affected by the RANDOMIZE_DOWNLOAD_DELAY setting (which is enabled by default). By default, Scrapy doesn’t wait a fixed amount of time between requests, but uses a random interval between 0.5 and 1.5 * DOWNLOAD_DELAY.

When CONCURRENT_REQUESTS_PER_IP is non-zero, delays are enforced per IP address instead of per domain.

              You can also change this setting per spider by setting download_delay spider attribute.

              DOWNLOAD_HANDLERS

              Default: {}

              A dict containing the request downloader handlers enabled in your project. See DOWNLOAD_HANDLERS_BASE for example format.

              DOWNLOAD_HANDLERS_BASE

              Default:

              {
                  'file': 'scrapy.core.downloader.handlers.file.FileDownloadHandler',
                  'http': 'scrapy.core.downloader.handlers.http.HttpDownloadHandler',
                  'https': 'scrapy.core.downloader.handlers.http.HttpDownloadHandler',
                  's3': 'scrapy.core.downloader.handlers.s3.S3DownloadHandler',
              }
              

A dict containing the request download handlers enabled by default in Scrapy. You should never modify this setting in your project; modify DOWNLOAD_HANDLERS instead.

              DOWNLOAD_TIMEOUT

              Default: 180

              The amount of time (in secs) that the downloader will wait before timing out.

              DUPEFILTER_CLASS

              Default: 'scrapy.dupefilter.RFPDupeFilter'

              The class used to detect and filter duplicate requests.

              The default (RFPDupeFilter) filters based on request fingerprint using the scrapy.utils.request.request_fingerprint function.
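
A minimal sketch of a custom filter, assuming the request_seen() interface of RFPDupeFilter in this release (the class name and normalization rule below are hypothetical):

from scrapy.dupefilter import RFPDupeFilter

class CaseInsensitiveDupeFilter(RFPDupeFilter):
    # Hypothetical example: lowercase URLs before fingerprinting, so
    # http://example.com/A and http://example.com/a count as duplicates.
    def request_seen(self, request):
        request = request.replace(url=request.url.lower())
        return super(CaseInsensitiveDupeFilter, self).request_seen(request)

Point DUPEFILTER_CLASS to such a class (e.g. 'myproject.dupefilters.CaseInsensitiveDupeFilter') to enable it.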

              EDITOR

              Default: depends on the environment

              The editor to use for editing spiders with the edit command. It defaults to the EDITOR environment variable, if set. Otherwise, it defaults to vi (on Unix systems) or the IDLE editor (on Windows).

              EXTENSIONS

Default: {}

              A dict containing the extensions enabled in your project, and their orders.

              EXTENSIONS_BASE

              Default:

              {
                  'scrapy.contrib.corestats.CoreStats': 0,
                  'scrapy.webservice.WebService': 0,
                  'scrapy.telnet.TelnetConsole': 0,
                  'scrapy.contrib.memusage.MemoryUsage': 0,
                  'scrapy.contrib.memdebug.MemoryDebugger': 0,
                  'scrapy.contrib.closespider.CloseSpider': 0,
                  'scrapy.contrib.feedexport.FeedExporter': 0,
                  'scrapy.contrib.logstats.LogStats': 0,
                  'scrapy.contrib.spiderstate.SpiderState': 0,
                  'scrapy.contrib.throttle.AutoThrottle': 0,
              }
              

              The list of available extensions. Keep in mind that some of them need to be enabled through a setting. By default, this setting contains all stable built-in extensions.

For more information see the extensions user guide and the list of available extensions.

              ITEM_PIPELINES

              Default: {}

A dict containing the item pipelines to use, and their orders. The dict is empty by default. Order values are arbitrary, but it’s customary to define them in the 0-1000 range.

              Lists are supported in ITEM_PIPELINES for backwards compatibility, but they are deprecated.

              Example:

              ITEM_PIPELINES = {
                  'mybot.pipeline.validate.ValidateMyItem': 300,
                  'mybot.pipeline.validate.StoreMyItem': 800,
              }
              

              ITEM_PIPELINES_BASE

              Default: {}

A dict containing the pipelines enabled by default in Scrapy. You should never modify this setting in your project; modify ITEM_PIPELINES instead.

              LOG_ENABLED

              Default: True

              Whether to enable logging.

              LOG_ENCODING

              Default: 'utf-8'

              The encoding to use for logging.

              LOG_FILE

              Default: None

              File name to use for logging output. If None, standard error will be used.

              LOG_LEVEL

              Default: 'DEBUG'

              Minimum level to log. Available levels are: CRITICAL, ERROR, WARNING, INFO, DEBUG. For more info see Logging.

              LOG_STDOUT

              Default: False

If True, all standard output (and error) of your process will be redirected to the log. For example, if you print 'hello', it will appear in the Scrapy log.

              MEMDEBUG_ENABLED

              Default: False

              Whether to enable memory debugging.

              MEMDEBUG_NOTIFY

              Default: []

When memory debugging is enabled, a memory report will be sent to the specified addresses if this setting is not empty; otherwise, the report will be written to the log.

              Example:

              MEMDEBUG_NOTIFY = ['user@example.com']
              

              MEMUSAGE_ENABLED

              Default: False

              Scope: scrapy.contrib.memusage

Whether to enable the memory usage extension, which will shut down the Scrapy process when it exceeds a memory limit, and also notify by email when that happens.

              See Memory usage extension.

              MEMUSAGE_LIMIT_MB

              Default: 0

              Scope: scrapy.contrib.memusage

              The maximum amount of memory to allow (in megabytes) before shutting down Scrapy (if MEMUSAGE_ENABLED is True). If zero, no check will be performed.

              See Memory usage extension.

              MEMUSAGE_NOTIFY_MAIL

              Default: False

              Scope: scrapy.contrib.memusage

A list of email addresses to notify when the memory limit has been reached.

              Example:

              MEMUSAGE_NOTIFY_MAIL = ['user@example.com']
              

              See Memory usage extension.

              MEMUSAGE_REPORT

              Default: False

              Scope: scrapy.contrib.memusage

              Whether to send a memory usage report after each spider has been closed.

              See Memory usage extension.

              MEMUSAGE_WARNING_MB

              Default: 0

              Scope: scrapy.contrib.memusage

              The maximum amount of memory to allow (in megabytes) before sending a warning email notifying about it. If zero, no warning will be produced.

              NEWSPIDER_MODULE

              Default: ''

The module where new spiders will be created when using the genspider command.

              Example:

              NEWSPIDER_MODULE = 'mybot.spiders_dev'
              

              RANDOMIZE_DOWNLOAD_DELAY

              Default: True

              If enabled, Scrapy will wait a random amount of time (between 0.5 and 1.5 * DOWNLOAD_DELAY) while fetching requests from the same website.

              This randomization decreases the chance of the crawler being detected (and subsequently blocked) by sites which analyze requests looking for statistically significant similarities in the time between their requests.

The randomization policy is the same one used by the wget --random-wait option.

              If DOWNLOAD_DELAY is zero (default) this option has no effect.
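
The documented behaviour amounts to the following sketch (not Scrapy’s actual code):

import random

def effective_delay(download_delay):
    # With RANDOMIZE_DOWNLOAD_DELAY enabled, each wait is drawn uniformly
    # from the interval [0.5 * DOWNLOAD_DELAY, 1.5 * DOWNLOAD_DELAY].
    return random.uniform(0.5, 1.5) * download_delay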

              REDIRECT_MAX_TIMES

              Default: 20

Defines the maximum number of times a request can be redirected. Beyond this maximum, the request’s response is returned as is. This is the same default value Firefox uses.

              REDIRECT_MAX_METAREFRESH_DELAY

              Default: 100

Some sites use meta-refresh to redirect to a session-expired page, so we restrict automatic redirection to a maximum delay (in seconds).

              REDIRECT_PRIORITY_ADJUST

              Default: +2

Adjust redirect request priority relative to the original request. A positive adjustment (the default) means higher priority; a negative adjustment means lower priority.

              ROBOTSTXT_OBEY

              Default: False

              Scope: scrapy.contrib.downloadermiddleware.robotstxt

If enabled, Scrapy will respect robots.txt policies. For more information see RobotsTxtMiddleware.

              SCHEDULER

              Default: 'scrapy.core.scheduler.Scheduler'

              The scheduler to use for crawling.

              SPIDER_CONTRACTS

Default: {}

A dict containing the Scrapy contracts enabled in your project, used for testing spiders. For more info see Spiders Contracts.

              SPIDER_CONTRACTS_BASE

              Default:

              {
                  'scrapy.contracts.default.UrlContract' : 1,
                  'scrapy.contracts.default.ReturnsContract': 2,
                  'scrapy.contracts.default.ScrapesContract': 3,
              }
              

A dict containing the Scrapy contracts enabled by default in Scrapy. You should never modify this setting in your project; modify SPIDER_CONTRACTS instead. For more info see Spiders Contracts.

              SPIDER_MIDDLEWARES

Default: {}

              A dict containing the spider middlewares enabled in your project, and their orders. For more info see Activating a spider middleware.

              SPIDER_MIDDLEWARES_BASE

              Default:

              {
                  'scrapy.contrib.spidermiddleware.httperror.HttpErrorMiddleware': 50,
                  'scrapy.contrib.spidermiddleware.offsite.OffsiteMiddleware': 500,
                  'scrapy.contrib.spidermiddleware.referer.RefererMiddleware': 700,
                  'scrapy.contrib.spidermiddleware.urllength.UrlLengthMiddleware': 800,
                  'scrapy.contrib.spidermiddleware.depth.DepthMiddleware': 900,
              }
              

A dict containing the spider middlewares enabled by default in Scrapy. You should never modify this setting in your project; modify SPIDER_MIDDLEWARES instead. For more info see Activating a spider middleware.

              SPIDER_MODULES

              Default: []

              A list of modules where Scrapy will look for spiders.

              Example:

              SPIDER_MODULES = ['mybot.spiders_prod', 'mybot.spiders_dev']
              

              STATS_CLASS

              Default: 'scrapy.statscol.MemoryStatsCollector'

The class to use for collecting stats, which must implement the Stats Collector API.

              STATS_DUMP

              Default: True

              Dump the Scrapy stats (to the Scrapy log) once the spider finishes.

              For more info see: Stats Collection.

              STATSMAILER_RCPTS

              Default: [] (empty list)

A list of email addresses to send Scrapy stats to after spiders finish scraping. See StatsMailer for more info.

              TELNETCONSOLE_ENABLED

              Default: True

              A boolean which specifies if the telnet console will be enabled (provided its extension is also enabled).

              TELNETCONSOLE_PORT

              Default: [6023, 6073]

              The port range to use for the telnet console. If set to None or 0, a dynamically assigned port is used. For more info see Telnet Console.

              TEMPLATES_DIR

              Default: templates dir inside scrapy module

The directory where to look for templates when creating new projects with the startproject command.

              URLLENGTH_LIMIT

              Default: 2083

              Scope: contrib.spidermiddleware.urllength

              The maximum URL length to allow for crawled URLs. For more information about the default value for this setting see: http://www.boutell.com/newfaq/misc/urllength.html

              USER_AGENT

              Default: "Scrapy/VERSION (+http://scrapy.org)"

              The default User-Agent to use when crawling, unless overridden.
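
Example (the value is illustrative):

USER_AGENT = 'mybot/1.0 (+http://www.example.com/mybot)'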


              Downloading Item Images

              Scrapy provides an item pipeline for downloading images attached to a particular item, for example, when you scrape products and also want to download their images locally.

              This pipeline, called the Images Pipeline and implemented in the ImagesPipeline class, provides a convenient way for downloading and storing images locally with some additional features:

              • Convert all downloaded images to a common format (JPG) and mode (RGB)
              • Avoid re-downloading images which were downloaded recently
              • Thumbnail generation
              • Check images width/height to make sure they meet a minimum constraint

              This pipeline also keeps an internal queue of those images which are currently being scheduled for download, and connects those items that arrive containing the same image, to that queue. This avoids downloading the same image more than once when it’s shared by several items.

Pillow is used for thumbnailing and normalizing images to JPEG/RGB format, so you need to install this library in order to use the images pipeline. Python Imaging Library (PIL) should also work in most cases, but it is known to cause trouble in some setups, so we recommend using Pillow instead of PIL.

              Using the Images Pipeline

The typical workflow, when using the ImagesPipeline, goes like this:

1. In a Spider, you scrape an item and put the URLs of its images into an image_urls field.
              2. The item is returned from the spider and goes to the item pipeline.
3. When the item reaches the ImagesPipeline, the URLs in the image_urls field are scheduled for download using the standard Scrapy scheduler and downloader (which means the scheduler and downloader middlewares are reused), but with a higher priority, processing them before other pages are scraped. The item remains “locked” at that particular pipeline stage until the images have finished downloading (or fail for some reason).
4. When the images are downloaded, another field (images) will be populated with the results. This field will contain a list of dicts with information about the images downloaded, such as the downloaded path, the original scraped url (taken from the image_urls field), and the image checksum. The images in the list of the images field will retain the same order as the original image_urls field. If an image fails to download, an error will be logged and the image won’t be present in the images field.

              Usage example

              In order to use the image pipeline you just need to enable it and define an item with the image_urls and images fields:

from scrapy.item import Item, Field
              
              class MyItem(Item):
              
                  # ... other item fields ...
                  image_urls = Field()
                  images = Field()
              

              If you need something more complex and want to override the custom images pipeline behaviour, see Implementing your custom Images Pipeline.

              Enabling your Images Pipeline

              To enable your images pipeline you must first add it to your project ITEM_PIPELINES setting:

              ITEM_PIPELINES = {'scrapy.contrib.pipeline.images.ImagesPipeline': 1}
              

              And set the IMAGES_STORE setting to a valid directory that will be used for storing the downloaded images. Otherwise the pipeline will remain disabled, even if you include it in the ITEM_PIPELINES setting.

              For example:

              IMAGES_STORE = '/path/to/valid/dir'
              

              Images Storage

              File system is currently the only officially supported storage, but there is also (undocumented) support for Amazon S3.

              File system storage

              The images are stored in files (one per image), using a SHA1 hash of their URLs for the file names.

              For example, the following image URL:

              http://www.example.com/image.jpg

              Whose SHA1 hash is:

              3afec3b4765f8f0a07b78f98c07b83f013567a0a

              Will be downloaded and stored in the following file:

              <IMAGES_STORE>/full/3afec3b4765f8f0a07b78f98c07b83f013567a0a.jpg

              Where:

              • <IMAGES_STORE> is the directory defined in IMAGES_STORE setting
              • full is a sub-directory to separate full images from thumbnails (if used). For more info see Thumbnail generation.
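
The path above can be reproduced with a few lines of Python; this is a sketch of the documented scheme, not the pipeline’s actual code:

import hashlib
import os

url = 'http://www.example.com/image.jpg'
image_id = hashlib.sha1(url).hexdigest()  # on Python 3, hash url.encode()
path = os.path.join('<IMAGES_STORE>', 'full', image_id + '.jpg')
# -> <IMAGES_STORE>/full/3afec3b4765f8f0a07b78f98c07b83f013567a0a.jpg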

              Additional features

              Image expiration

The Image Pipeline avoids downloading images that were downloaded recently. To adjust this retention delay, use the IMAGES_EXPIRES setting, which specifies the delay in number of days:

              # 90 days of delay for image expiration
              IMAGES_EXPIRES = 90
              

              Thumbnail generation

              The Images Pipeline can automatically create thumbnails of the downloaded images.

In order to use this feature, you must set IMAGES_THUMBS to a dictionary where the keys are the thumbnail names and the values are their dimensions.

              For example:

              IMAGES_THUMBS = {
                  'small': (50, 50),
                  'big': (270, 270),
              }
              

When you use this feature, the Images Pipeline will create thumbnails of each specified size with this format:

              <IMAGES_STORE>/thumbs/<size_name>/<image_id>.jpg

              Where:

              • <size_name> is the one specified in the IMAGES_THUMBS dictionary keys (small, big, etc)
              • <image_id> is the SHA1 hash of the image url

              Example of image files stored using small and big thumbnail names:

              <IMAGES_STORE>/full/63bbfea82b8880ed33cdb762aa11fab722a90a24.jpg
              <IMAGES_STORE>/thumbs/small/63bbfea82b8880ed33cdb762aa11fab722a90a24.jpg
              <IMAGES_STORE>/thumbs/big/63bbfea82b8880ed33cdb762aa11fab722a90a24.jpg

              The first one is the full image, as downloaded from the site.

              Filtering out small images

You can drop images which are too small by specifying the minimum allowed size in the IMAGES_MIN_HEIGHT and IMAGES_MIN_WIDTH settings.

              For example:

              IMAGES_MIN_HEIGHT = 110
              IMAGES_MIN_WIDTH = 110
              

              Note: these size constraints don’t affect thumbnail generation at all.

              By default, there are no size constraints, so all images are processed.

              Implementing your custom Images Pipeline

              Here are the methods that you should override in your custom Images Pipeline:

              class scrapy.contrib.pipeline.images.ImagesPipeline
              get_media_requests(item, info)

As seen in the workflow, the pipeline will get the URLs of the images to download from the item. In order to do this, you must override the get_media_requests() method and return a Request for each image URL:

              def get_media_requests(self, item, info):
                  for image_url in item['image_urls']:
                      yield Request(image_url)
              

              Those requests will be processed by the pipeline and, when they have finished downloading, the results will be sent to the item_completed() method, as a list of 2-element tuples. Each tuple will contain (success, image_info_or_failure) where:

              • success is a boolean which is True if the image was downloaded successfully or False if it failed for some reason
• image_info_or_failure is a dict containing the following keys (if success is True) or a Twisted Failure if there was a problem.
  • url - the url where the image was downloaded from. This is the url of the request returned from the get_media_requests() method.
  • path - the path (relative to IMAGES_STORE) where the image was stored
  • checksum - an MD5 hash of the image contents

              The list of tuples received by item_completed() is guaranteed to retain the same order of the requests returned from the get_media_requests() method.

              Here’s a typical value of the results argument:

              [(True,
                {'checksum': '2b00042f7481c7b056c4b410d28f33cf',
                 'path': 'full/7d97e98f8af710c7e7fe703abc8f639e0ee507c4.jpg',
                 'url': 'http://www.example.com/images/product1.jpg'}),
               (True,
                {'checksum': 'b9628c4ab9b595f72f280b90c4fd093d',
                 'path': 'full/1ca5879492b8fd606df1964ea3c1e2f4520f076f.jpg',
                 'url': 'http://www.example.com/images/product2.jpg'}),
               (False,
                Failure(...))]
              

By default, the get_media_requests() method returns None, which means there are no images to download for the item.

item_completed(results, item, info)

The ImagesPipeline.item_completed() method is called when all image requests for a single item have completed (either finished downloading, or failed for some reason).

              The item_completed() method must return the output that will be sent to subsequent item pipeline stages, so you must return (or drop) the item, as you would in any pipeline.

              Here is an example of the item_completed() method where we store the downloaded image paths (passed in results) in the image_paths item field, and we drop the item if it doesn’t contain any images:

              from scrapy.exceptions import DropItem
              
              def item_completed(self, results, item, info):
                  image_paths = [x['path'] for ok, x in results if ok]
                  if not image_paths:
                      raise DropItem("Item contains no images")
                  item['image_paths'] = image_paths
                  return item
              

              By default, the item_completed() method returns the item.

              Custom Images pipeline example

Here is a full example of the Images Pipeline whose methods are exemplified above:

              from scrapy.contrib.pipeline.images import ImagesPipeline
              from scrapy.exceptions import DropItem
              from scrapy.http import Request
              
              class MyImagesPipeline(ImagesPipeline):
              
                  def get_media_requests(self, item, info):
                      for image_url in item['image_urls']:
                          yield Request(image_url)
              
                  def item_completed(self, results, item, info):
                      image_paths = [x['path'] for ok, x in results if ok]
                      if not image_paths:
                          raise DropItem("Item contains no images")
                      item['image_paths'] = image_paths
                      return item
              

              Downloader Middleware

              The downloader middleware is a framework of hooks into Scrapy’s request/response processing. It’s a light, low-level system for globally altering Scrapy’s requests and responses.

              Activating a downloader middleware

              To activate a downloader middleware component, add it to the DOWNLOADER_MIDDLEWARES setting, which is a dict whose keys are the middleware class paths and their values are the middleware orders.

              Here’s an example:

              DOWNLOADER_MIDDLEWARES = {
                  'myproject.middlewares.CustomDownloaderMiddleware': 543,
              }
              

The DOWNLOADER_MIDDLEWARES setting is merged with the DOWNLOADER_MIDDLEWARES_BASE setting defined in Scrapy (and not meant to be overridden) and then sorted by order to get the final sorted list of enabled middlewares: the first middleware is the one closest to the engine and the last is the one closest to the downloader.

              To decide which order to assign to your middleware see the DOWNLOADER_MIDDLEWARES_BASE setting and pick a value according to where you want to insert the middleware. The order does matter because each middleware performs a different action and your middleware could depend on some previous (or subsequent) middleware being applied.

If you want to disable a built-in middleware (the ones defined in DOWNLOADER_MIDDLEWARES_BASE and enabled by default) you must define it in your project’s DOWNLOADER_MIDDLEWARES setting and assign None as its value. For example, if you want to disable the user-agent middleware:

              DOWNLOADER_MIDDLEWARES = {
                  'myproject.middlewares.CustomDownloaderMiddleware': 543,
                  'scrapy.contrib.downloadermiddleware.useragent.UserAgentMiddleware': None,
              }
              

              Finally, keep in mind that some middlewares may need to be enabled through a particular setting. See each middleware documentation for more info.

              Writing your own downloader middleware

              Writing your own downloader middleware is easy. Each middleware component is a single Python class that defines one or more of the following methods:

              class scrapy.contrib.downloadermiddleware.DownloaderMiddleware
              process_request(request, spider)

This method is called for each request that goes through the downloader middleware.

              process_request() should either: return None, return a Response object, return a Request object, or raise IgnoreRequest.

If it returns None, Scrapy will continue processing this request, executing all other middlewares until, finally, the appropriate downloader handler is called and the request performed (and its response downloaded).

If it returns a Response object, Scrapy won’t bother calling any other process_request() or process_exception() methods, or the appropriate download function; it’ll return that response. The process_response() methods of installed middleware are always called on every response.

              If it returns a Request object, Scrapy will stop calling process_request methods and reschedule the returned request. Once the newly returned request is performed, the appropriate middleware chain will be called on the downloaded response.

              If it raises an IgnoreRequest exception, the process_exception() methods of installed downloader middleware will be called. If none of them handle the exception, the errback function of the request (Request.errback) is called. If no code handles the raised exception, it is ignored and not logged (unlike other exceptions).

              Parameters:
              • request (Request object) – the request being processed
              • spider (Spider object) – the spider for which this request is intended
              process_response(request, response, spider)

process_response() should either: return a Response object, return a Request object or raise an IgnoreRequest exception.

              If it returns a Response (it could be the same given response, or a brand-new one), that response will continue to be processed with the process_response() of the next middleware in the chain.

              If it returns a Request object, the middleware chain is halted and the returned request is rescheduled to be downloaded in the future. This is the same behavior as if a request is returned from process_request().

              If it raises an IgnoreRequest exception, the errback function of the request (Request.errback) is called. If no code handles the raised exception, it is ignored and not logged (unlike other exceptions).

              Parameters:
• request (Request object) – the request that originated the response
              • response (Response object) – the response being processed
              • spider (Spider object) – the spider for which this response is intended
              process_exception(request, exception, spider)

Scrapy calls process_exception() when a download handler or a process_request() (from a downloader middleware) raises an exception (including an IgnoreRequest exception).

              process_exception() should return: either None, a Response object, or a Request object.

              If it returns None, Scrapy will continue processing this exception, executing any other process_exception() methods of installed middleware, until no middleware is left and the default exception handling kicks in.

              If it returns a Response object, the process_response() method chain of installed middleware is started, and Scrapy won’t bother calling any other process_exception() methods of middleware.

If it returns a Request object, the returned request is rescheduled to be downloaded in the future. This stops the execution of the process_exception() methods of the middleware chain, the same as returning a response would.

              Parameters:
• request (Request object) – the request that generated the exception
              • exception (an Exception object) – the raised exception
              • spider (Spider object) – the spider for which this request is intended
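
Putting these hooks together, a skeletal middleware might look like the following sketch (the class name and header are hypothetical):

class CustomHeaderMiddleware(object):

    def process_request(self, request, spider):
        # Tag outgoing requests; returning None lets processing continue.
        request.headers.setdefault('X-My-Header', spider.name)
        return None

    def process_response(self, request, response, spider):
        # Must return a Response or a Request, or raise IgnoreRequest.
        return response

    def process_exception(self, request, exception, spider):
        # Returning None defers to the remaining process_exception() methods.
        return None

Enable it through DOWNLOADER_MIDDLEWARES as shown above, e.g. 'myproject.middlewares.CustomHeaderMiddleware': 543.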

              Built-in downloader middleware reference

              This page describes all downloader middleware components that come with Scrapy. For information on how to use them and how to write your own downloader middleware, see the downloader middleware usage guide.

              For a list of the components enabled by default (and their orders) see the DOWNLOADER_MIDDLEWARES_BASE setting.

              CookiesMiddleware

              class scrapy.contrib.downloadermiddleware.cookies.CookiesMiddleware

This middleware enables working with sites that require cookies, such as those that use sessions. It keeps track of cookies sent by web servers, and sends them back on subsequent requests (from that spider), just like web browsers do.

              The following settings can be used to configure the cookie middleware:

              COOKIES_ENABLED

              Default: True

              Whether to enable the cookies middleware. If disabled, no cookies will be sent to web servers.

              COOKIES_DEBUG

              Default: False

If enabled, Scrapy will log all cookies sent in requests (i.e. the Cookie header) and all cookies received in responses (i.e. the Set-Cookie header).

              Here’s an example of a log with COOKIES_DEBUG enabled:

              2011-04-06 14:35:10-0300 [diningcity] INFO: Spider opened
              2011-04-06 14:35:10-0300 [diningcity] DEBUG: Sending cookies to: <GET http://www.diningcity.com/netherlands/index.html>
                      Cookie: clientlanguage_nl=en_EN
              2011-04-06 14:35:14-0300 [diningcity] DEBUG: Received cookies from: <200 http://www.diningcity.com/netherlands/index.html>
                      Set-Cookie: JSESSIONID=B~FA4DC0C496C8762AE4F1A620EAB34F38; Path=/
                      Set-Cookie: ip_isocode=US
                      Set-Cookie: clientlanguage_nl=en_EN; Expires=Thu, 07-Apr-2011 21:21:34 GMT; Path=/
              2011-04-06 14:49:50-0300 [diningcity] DEBUG: Crawled (200) <GET http://www.diningcity.com/netherlands/index.html> (referer: None)
              [...]

              DefaultHeadersMiddleware

              class scrapy.contrib.downloadermiddleware.defaultheaders.DefaultHeadersMiddleware

This middleware sets all default request headers specified in the DEFAULT_REQUEST_HEADERS setting.

              DownloadTimeoutMiddleware

              class scrapy.contrib.downloadermiddleware.downloadtimeout.DownloadTimeoutMiddleware

              This middleware sets the download timeout for requests specified in the DOWNLOAD_TIMEOUT setting.

              HttpAuthMiddleware

              class scrapy.contrib.downloadermiddleware.httpauth.HttpAuthMiddleware

              This middleware authenticates all requests generated from certain spiders using Basic access authentication (aka. HTTP auth).

              To enable HTTP authentication from certain spiders, set the http_user and http_pass attributes of those spiders.

              Example:

              from scrapy.contrib.spiders import CrawlSpider
              
              class SomeIntranetSiteSpider(CrawlSpider):
              
                  http_user = 'someuser'
                  http_pass = 'somepass'
                  name = 'intranet.example.com'
              
                  # .. rest of the spider code omitted ...
              

              HttpCacheMiddleware

              class scrapy.contrib.downloadermiddleware.httpcache.HttpCacheMiddleware

This middleware provides a low-level cache for all HTTP requests and responses. It has to be combined with a cache storage backend as well as a cache policy.

Scrapy ships with two HTTP cache storage backends:

• Filesystem storage backend
• DBM storage backend

You can change the HTTP cache storage backend with the HTTPCACHE_STORAGE setting, or you can implement your own storage backend.

Scrapy ships with two HTTP cache policies:

• Dummy policy (default)
• RFC2616 policy

You can change the HTTP cache policy with the HTTPCACHE_POLICY setting, or you can implement your own policy.

              Dummy policy (default)

              This policy has no awareness of any HTTP Cache-Control directives. Every request and its corresponding response are cached. When the same request is seen again, the response is returned without transferring anything from the Internet.

              The Dummy policy is useful for testing spiders faster (without having to wait for downloads every time) and for trying your spider offline, when an Internet connection is not available. The goal is to be able to “replay” a spider run exactly as it ran before.

In order to use this policy, set:

HTTPCACHE_POLICY = 'scrapy.contrib.httpcache.DummyPolicy'

              RFC2616 policy

This policy provides an RFC2616-compliant HTTP cache, i.e. with HTTP Cache-Control awareness, aimed at production and used in continuous runs to avoid downloading unmodified data (to save bandwidth and speed up crawls).

What is implemented:

              • Do not attempt to store responses/requests with no-store cache-control directive set
              • Do not serve responses from cache if no-cache cache-control directive is set even for fresh responses
              • Compute freshness lifetime from max-age cache-control directive
              • Compute freshness lifetime from Expires response header
              • Compute freshness lifetime from Last-Modified response header (heuristic used by Firefox)
              • Compute current age from Age response header
              • Compute current age from Date header
              • Revalidate stale responses based on Last-Modified response header
              • Revalidate stale responses based on ETag response header
              • Set Date header for any received response missing it

What is missing:

In order to use this policy, set:

HTTPCACHE_POLICY = 'scrapy.contrib.httpcache.RFC2616Policy'

              Filesystem storage backend (default)

A file system storage backend is available for the HTTP cache middleware.

In order to use this storage backend, set:

HTTPCACHE_STORAGE = 'scrapy.contrib.httpcache.FilesystemCacheStorage'

              Each request/response pair is stored in a different directory containing the following files:

              • request_body - the plain request body
              • request_headers - the request headers (in raw HTTP format)
              • response_body - the plain response body
• response_headers - the response headers (in raw HTTP format)
              • meta - some metadata of this cache resource in Python repr() format (grep-friendly format)
              • pickled_meta - the same metadata in meta but pickled for more efficient deserialization

The directory name is made from the request fingerprint (see scrapy.utils.request.request_fingerprint), and one level of subdirectories is used to avoid creating too many files in the same directory (which is inefficient in many file systems). An example directory could be:

              /path/to/cache/dir/example.com/72/72811f648e718090f041317756c03adb0ada46c7

              DBM storage backend

              New in version 0.13.

              A DBM storage backend is also available for the HTTP cache middleware.

              By default, it uses the anydbm module, but you can change it with the HTTPCACHE_DBM_MODULE setting.

In order to use this storage backend, set:

HTTPCACHE_STORAGE = 'scrapy.contrib.httpcache.DbmCacheStorage'

              HTTPCache middleware settings

              The HttpCacheMiddleware can be configured through the following settings:

              HTTPCACHE_ENABLED

              New in version 0.11.

              Default: False

              Whether the HTTP cache will be enabled.

              Changed in version 0.11: Before 0.11, HTTPCACHE_DIR was used to enable cache.

              HTTPCACHE_EXPIRATION_SECS

              Default: 0

              Expiration time for cached requests, in seconds.

              Cached requests older than this time will be re-downloaded. If zero, cached requests will never expire.

              Changed in version 0.11: Before 0.11, zero meant cached requests always expire.

              HTTPCACHE_DIR

              Default: 'httpcache'

The directory to use for storing the (low-level) HTTP cache. If empty, the HTTP cache will be disabled. If a relative path is given, it is taken relative to the project data dir. For more info see: Default structure of Scrapy projects.

              HTTPCACHE_IGNORE_HTTP_CODES

              New in version 0.10.

              Default: []

Don’t cache responses with these HTTP codes.

              HTTPCACHE_IGNORE_MISSING

              Default: False

              If enabled, requests not found in the cache will be ignored instead of downloaded.

              HTTPCACHE_IGNORE_SCHEMES

              New in version 0.10.

              Default: ['file']

              Don’t cache responses with these URI schemes.

              HTTPCACHE_STORAGE

              Default: 'scrapy.contrib.httpcache.DbmCacheStorage'

              The class which implements the cache storage backend.

              HTTPCACHE_DBM_MODULE

              New in version 0.13.

              Default: 'anydbm'

              The database module to use in the DBM storage backend. This setting is specific to the DBM backend.

              HTTPCACHE_POLICY

              New in version 0.18.

              Default: 'scrapy.contrib.httpcache.DummyPolicy'

              The class which implements the cache policy.
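
For example, a settings.py fragment enabling the cache with the filesystem backend and a one-day expiration might look like this (the values are illustrative):

HTTPCACHE_ENABLED = True
HTTPCACHE_EXPIRATION_SECS = 86400   # re-download entries older than one day
HTTPCACHE_DIR = 'httpcache'
HTTPCACHE_STORAGE = 'scrapy.contrib.httpcache.FilesystemCacheStorage'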

              HttpCompressionMiddleware

              class scrapy.contrib.downloadermiddleware.httpcompression.HttpCompressionMiddleware

              This middleware allows compressed (gzip, deflate) traffic to be sent/received from web sites.

              HttpCompressionMiddleware Settings

              COMPRESSION_ENABLED

              Default: True

              Whether the Compression middleware will be enabled.

              ChunkedTransferMiddleware

              class scrapy.contrib.downloadermiddleware.chunked.ChunkedTransferMiddleware

This middleware adds support for chunked transfer encoding.

              HttpProxyMiddleware

              New in version 0.8.

              class scrapy.contrib.downloadermiddleware.httpproxy.HttpProxyMiddleware

              This middleware sets the HTTP proxy to use for requests, by setting the proxy meta value to Request objects.

              Like the Python standard library modules urllib and urllib2, it obeys the following environment variables:

              • http_proxy
              • https_proxy
              • no_proxy
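
Besides the environment variables, the proxy can also be set per request through the proxy meta key; the proxy URL below is hypothetical:

from scrapy.http import Request

request = Request('http://www.example.com',
                  meta={'proxy': 'http://someproxy.example.com:8080'})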

              RedirectMiddleware

              class scrapy.contrib.downloadermiddleware.redirect.RedirectMiddleware

              This middleware handles redirection of requests based on response status.

The URLs which the request goes through (while being redirected) can be found in the redirect_urls Request.meta key.

              The RedirectMiddleware can be configured through the following settings (see the settings documentation for more info):

              If Request.meta contains the dont_redirect key, the request will be ignored by this middleware.
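
For example, to let a spider receive a redirect response itself instead of having it followed (the URL is illustrative):

from scrapy.http import Request

request = Request('http://www.example.com/some-page',
                  meta={'dont_redirect': True})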

              RedirectMiddleware settings

              REDIRECT_ENABLED

              New in version 0.13.

              Default: True

              Whether the Redirect middleware will be enabled.

              REDIRECT_MAX_TIMES

              Default: 20

The maximum number of redirections that will be followed for a single request.

              MetaRefreshMiddleware

              class scrapy.contrib.downloadermiddleware.redirect.MetaRefreshMiddleware

This middleware handles redirection of requests based on the meta-refresh html tag.

              The MetaRefreshMiddleware can be configured through the following settings (see the settings documentation for more info):

This middleware obeys the REDIRECT_MAX_TIMES setting, and the dont_redirect and redirect_urls request meta keys, as described for RedirectMiddleware.

              MetaRefreshMiddleware settings

              METAREFRESH_ENABLED

              New in version 0.17.

              Default: True

              Whether the Meta Refresh middleware will be enabled.

              REDIRECT_MAX_METAREFRESH_DELAY

              Default: 100

              The maximum meta-refresh delay (in seconds) to follow the redirection.

              RetryMiddleware

              class scrapy.contrib.downloadermiddleware.retry.RetryMiddleware

A middleware to retry failed requests that are potentially caused by temporary problems such as a connection timeout or HTTP 500 error.

Failed pages are collected during the scraping process and rescheduled at the end, once the spider has finished crawling all regular (non-failed) pages. Once there are no more failed pages to retry, this middleware sends a signal (retry_complete), so other extensions could connect to that signal.

              The RetryMiddleware can be configured through the following settings (see the settings documentation for more info):

              About HTTP errors to consider:

You may want to remove 400 from RETRY_HTTP_CODES if you stick to the HTTP protocol. It’s included by default because it’s a common code used to indicate server overload, which would be something we want to retry.

              If Request.meta contains the dont_retry key, the request will be ignored by this middleware.
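
For example, to disable retries for a particular request (the URL is illustrative):

from scrapy.http import Request

request = Request('http://www.example.com/no-retry',
                  meta={'dont_retry': True})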

              RetryMiddleware Settings

              RETRY_ENABLED

              New in version 0.13.

              Default: True

              Whether the Retry middleware will be enabled.

              RETRY_TIMES

              Default: 2

              Maximum number of times to retry, in addition to the first download.

              RETRY_HTTP_CODES

              Default: [500, 502, 503, 504, 400, 408]

              Which HTTP response codes to retry. Other errors (DNS lookup issues, connections lost, etc) are always retried.

              RobotsTxtMiddleware

              class scrapy.contrib.downloadermiddleware.robotstxt.RobotsTxtMiddleware

              This middleware filters out requests forbidden by the robots.txt exclusion standard.

To make sure Scrapy respects robots.txt, make sure the middleware is enabled and the ROBOTSTXT_OBEY setting is enabled.

              Warning

              Keep in mind that, if you crawl using multiple concurrent requests per domain, Scrapy could still download some forbidden pages if they were requested before the robots.txt file was downloaded. This is a known limitation of the current robots.txt middleware and will be fixed in the future.

              DownloaderStats

              class scrapy.contrib.downloadermiddleware.stats.DownloaderStats

              Middleware that stores stats of all requests, responses and exceptions that pass through it.

              To use this middleware you must enable the DOWNLOADER_STATS setting.

              UserAgentMiddleware

              class scrapy.contrib.downloadermiddleware.useragent.UserAgentMiddleware

              Middleware that allows spiders to override the default user agent.

              In order for a spider to override the default user agent, its user_agent attribute must be set.
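
For example (the agent string is illustrative):

from scrapy.contrib.spiders import CrawlSpider

class MySpider(CrawlSpider):

    name = 'example.com'
    user_agent = 'MyCrawler/1.0 (+http://www.example.com/info)'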

              AjaxCrawlMiddleware

              class scrapy.contrib.downloadermiddleware.ajaxcrawl.AjaxCrawlMiddleware

Middleware that finds ‘AJAX crawlable’ page variants based on the meta-fragment html tag. See https://developers.google.com/webmasters/ajax-crawling/docs/getting-started for more info.

              Note

Scrapy finds ‘AJAX crawlable’ pages for URLs like 'http://example.com/#!foo=bar' even without this middleware. AjaxCrawlMiddleware is necessary when the URL doesn’t contain '#!'. This is often the case for ‘index’ or ‘main’ website pages.

              AjaxCrawlMiddleware Settings

              AJAXCRAWL_ENABLED

              New in version 0.21.

              Default: False

              Whether the AjaxCrawlMiddleware will be enabled. You may want to enable it for broad crawls.

Warnings only */ div.warning { border: 1px solid #900000; background-color: #ffe9e9; } div.warning p.admonition-title { background-color: #b04040; border-bottom: 1px solid #900000; } /* Sidebars only */ div.sidebar { max-width: 30%; } div.versioninfo { margin: 1em 0 0 0; border: 1px solid #ccc; background-color: #DDEAF0; padding: 8px; line-height: 1.3em; font-size: 0.9em; } .viewcode-back { font-family: 'Lucida Grande', 'Lucida Sans Unicode', 'Geneva', 'Verdana', sans-serif; } div.viewcode-block:target { background-color: #f4debf; border-top: 1px solid #ac9; border-bottom: 1px solid #ac9; } dl { margin: 1em 0 2.5em 0; } dl dt { font-style: italic; } dl dd { color: rgb(68, 68, 68); font-size: 0.95em; } /* Highlight target when you click an internal link */ dt:target { background: #ffe080; } /* Don't highlight whole divs */ div.highlight { background: transparent; } /* But do highlight spans (so search results can be highlighted) */ span.highlight { background: #ffe080; } div.footer { background-color: #465158; color: #eeeeee; padding: 0 2em 2em 2em; clear: both; font-size: 0.8em; text-align: center; } p { margin: 0.8em 0 0.5em 0; } .section p img.math { margin: 0; } .section p img { margin: 1em 2em; } table.docutils td, table.docutils th { padding: 1px 8px 1px 5px; } /* MOBILE LAYOUT -------------------------------------------------------------- */ @media screen and (max-width: 600px) { h1, h2, h3, h4, h5 { position: relative; } ul { padding-left: 1.25em; } div.bodywrapper a.headerlink, #indices-and-tables h1 a { color: #e6e6e6; font-size: 80%; float: right; line-height: 1.8; position: absolute; right: -0.7em; visibility: inherit; } div.bodywrapper h1 a.headerlink, #indices-and-tables h1 a { line-height: 1.5; } pre { font-size: 0.7em; overflow: auto; word-wrap: break-word; white-space: pre-wrap; } div.related ul { height: 2.5em; padding: 0; text-align: left; } div.related ul li { clear: both; color: #465158; padding: 0.2em 0; } div.related ul li:last-child { border-bottom: 1px dotted #8ca1af; padding-bottom: 0.4em; margin-bottom: 1em; width: 100%; } div.related ul li a { color: #465158; padding-right: 0; } div.related ul li a:hover { background: inherit; color: inherit; } div.related ul li.right { clear: none; padding: 0.65em 0; margin-bottom: 0.5em; } div.related ul li.right a { color: #fff; padding-right: 0.8em; } div.related ul li.right a:hover { background-color: #8ca1af; } div.body { clear: both; min-width: 0; word-wrap: break-word; } div.bodywrapper { margin: 0 0 0 0; } div.sphinxsidebar { float: none; margin: 0; width: auto; } div.sphinxsidebar input[type="text"] { height: 2em; line-height: 2em; width: 70%; } div.sphinxsidebar input[type="submit"] { height: 2em; margin-left: 0.5em; width: 20%; } div.sphinxsidebar p.searchtip { background: inherit; margin-bottom: 1em; } div.sphinxsidebar ul li, div.sphinxsidebar p.topless { white-space: normal; } .bodywrapper img { display: block; margin-left: auto; margin-right: auto; max-width: 100%; } div.documentwrapper { float: none; } div.admonition, div.warning, pre, blockquote { margin-left: 0em; margin-right: 0em; } .body p img { margin: 0; } #searchbox { background: transparent; } .related:not(:first-child) li { display: none; } .related:not(:first-child) li.right { display: block; } div.footer { padding: 1em; } .rtd_doc_footer .rtd-badge { float: none; margin: 1em auto; position: static; } .rtd_doc_footer .rtd-badge.revsys-inline { margin-right: auto; margin-bottom: 2em; } table.indextable { display: block; width: auto; } .indextable tr { 
display: block; } .indextable td { display: block; padding: 0; width: auto !important; } .indextable td dt { margin: 1em 0; } ul.search { margin-left: 0.25em; } ul.search li div.context { font-size: 90%; line-height: 1.1; margin-bottom: 1; margin-left: 0; } } PKV`1D;l/l/!scrapy-0.22/_static/underscore.js// Underscore.js 1.3.1 // (c) 2009-2012 Jeremy Ashkenas, DocumentCloud Inc. // Underscore is freely distributable under the MIT license. // Portions of Underscore are inspired or borrowed from Prototype, // Oliver Steele's Functional, and John Resig's Micro-Templating. // For all details and documentation: // http://documentcloud.github.com/underscore (function(){function q(a,c,d){if(a===c)return a!==0||1/a==1/c;if(a==null||c==null)return a===c;if(a._chain)a=a._wrapped;if(c._chain)c=c._wrapped;if(a.isEqual&&b.isFunction(a.isEqual))return a.isEqual(c);if(c.isEqual&&b.isFunction(c.isEqual))return c.isEqual(a);var e=l.call(a);if(e!=l.call(c))return false;switch(e){case "[object String]":return a==String(c);case "[object Number]":return a!=+a?c!=+c:a==0?1/a==1/c:a==+c;case "[object Date]":case "[object Boolean]":return+a==+c;case "[object RegExp]":return a.source== c.source&&a.global==c.global&&a.multiline==c.multiline&&a.ignoreCase==c.ignoreCase}if(typeof a!="object"||typeof c!="object")return false;for(var f=d.length;f--;)if(d[f]==a)return true;d.push(a);var f=0,g=true;if(e=="[object Array]"){if(f=a.length,g=f==c.length)for(;f--;)if(!(g=f in a==f in c&&q(a[f],c[f],d)))break}else{if("constructor"in a!="constructor"in c||a.constructor!=c.constructor)return false;for(var h in a)if(b.has(a,h)&&(f++,!(g=b.has(c,h)&&q(a[h],c[h],d))))break;if(g){for(h in c)if(b.has(c, h)&&!f--)break;g=!f}}d.pop();return g}var r=this,G=r._,n={},k=Array.prototype,o=Object.prototype,i=k.slice,H=k.unshift,l=o.toString,I=o.hasOwnProperty,w=k.forEach,x=k.map,y=k.reduce,z=k.reduceRight,A=k.filter,B=k.every,C=k.some,p=k.indexOf,D=k.lastIndexOf,o=Array.isArray,J=Object.keys,s=Function.prototype.bind,b=function(a){return new m(a)};if(typeof exports!=="undefined"){if(typeof module!=="undefined"&&module.exports)exports=module.exports=b;exports._=b}else r._=b;b.VERSION="1.3.1";var j=b.each= b.forEach=function(a,c,d){if(a!=null)if(w&&a.forEach===w)a.forEach(c,d);else if(a.length===+a.length)for(var e=0,f=a.length;e2;a== null&&(a=[]);if(y&&a.reduce===y)return e&&(c=b.bind(c,e)),f?a.reduce(c,d):a.reduce(c);j(a,function(a,b,i){f?d=c.call(e,d,a,b,i):(d=a,f=true)});if(!f)throw new TypeError("Reduce of empty array with no initial value");return d};b.reduceRight=b.foldr=function(a,c,d,e){var f=arguments.length>2;a==null&&(a=[]);if(z&&a.reduceRight===z)return e&&(c=b.bind(c,e)),f?a.reduceRight(c,d):a.reduceRight(c);var g=b.toArray(a).reverse();e&&!f&&(c=b.bind(c,e));return f?b.reduce(g,c,d,e):b.reduce(g,c)};b.find=b.detect= function(a,c,b){var e;E(a,function(a,g,h){if(c.call(b,a,g,h))return e=a,true});return e};b.filter=b.select=function(a,c,b){var e=[];if(a==null)return e;if(A&&a.filter===A)return a.filter(c,b);j(a,function(a,g,h){c.call(b,a,g,h)&&(e[e.length]=a)});return e};b.reject=function(a,c,b){var e=[];if(a==null)return e;j(a,function(a,g,h){c.call(b,a,g,h)||(e[e.length]=a)});return e};b.every=b.all=function(a,c,b){var e=true;if(a==null)return e;if(B&&a.every===B)return a.every(c,b);j(a,function(a,g,h){if(!(e= e&&c.call(b,a,g,h)))return n});return e};var E=b.some=b.any=function(a,c,d){c||(c=b.identity);var e=false;if(a==null)return e;if(C&&a.some===C)return a.some(c,d);j(a,function(a,b,h){if(e||(e=c.call(d,a,b,h)))return 
n});return!!e};b.include=b.contains=function(a,c){var b=false;if(a==null)return b;return p&&a.indexOf===p?a.indexOf(c)!=-1:b=E(a,function(a){return a===c})};b.invoke=function(a,c){var d=i.call(arguments,2);return b.map(a,function(a){return(b.isFunction(c)?c||a:a[c]).apply(a,d)})};b.pluck= function(a,c){return b.map(a,function(a){return a[c]})};b.max=function(a,c,d){if(!c&&b.isArray(a))return Math.max.apply(Math,a);if(!c&&b.isEmpty(a))return-Infinity;var e={computed:-Infinity};j(a,function(a,b,h){b=c?c.call(d,a,b,h):a;b>=e.computed&&(e={value:a,computed:b})});return e.value};b.min=function(a,c,d){if(!c&&b.isArray(a))return Math.min.apply(Math,a);if(!c&&b.isEmpty(a))return Infinity;var e={computed:Infinity};j(a,function(a,b,h){b=c?c.call(d,a,b,h):a;bd?1:0}),"value")};b.groupBy=function(a,c){var d={},e=b.isFunction(c)?c:function(a){return a[c]};j(a,function(a,b){var c=e(a,b);(d[c]||(d[c]=[])).push(a)});return d};b.sortedIndex=function(a, c,d){d||(d=b.identity);for(var e=0,f=a.length;e>1;d(a[g])=0})})};b.difference=function(a){var c=b.flatten(i.call(arguments,1));return b.filter(a,function(a){return!b.include(c,a)})};b.zip=function(){for(var a=i.call(arguments),c=b.max(b.pluck(a,"length")),d=Array(c),e=0;e=0;d--)b=[a[d].apply(this,b)];return b[0]}}; b.after=function(a,b){return a<=0?b():function(){if(--a<1)return b.apply(this,arguments)}};b.keys=J||function(a){if(a!==Object(a))throw new TypeError("Invalid object");var c=[],d;for(d in a)b.has(a,d)&&(c[c.length]=d);return c};b.values=function(a){return b.map(a,b.identity)};b.functions=b.methods=function(a){var c=[],d;for(d in a)b.isFunction(a[d])&&c.push(d);return c.sort()};b.extend=function(a){j(i.call(arguments,1),function(b){for(var d in b)a[d]=b[d]});return a};b.defaults=function(a){j(i.call(arguments, 1),function(b){for(var d in b)a[d]==null&&(a[d]=b[d])});return a};b.clone=function(a){return!b.isObject(a)?a:b.isArray(a)?a.slice():b.extend({},a)};b.tap=function(a,b){b(a);return a};b.isEqual=function(a,b){return q(a,b,[])};b.isEmpty=function(a){if(b.isArray(a)||b.isString(a))return a.length===0;for(var c in a)if(b.has(a,c))return false;return true};b.isElement=function(a){return!!(a&&a.nodeType==1)};b.isArray=o||function(a){return l.call(a)=="[object Array]"};b.isObject=function(a){return a===Object(a)}; b.isArguments=function(a){return l.call(a)=="[object Arguments]"};if(!b.isArguments(arguments))b.isArguments=function(a){return!(!a||!b.has(a,"callee"))};b.isFunction=function(a){return l.call(a)=="[object Function]"};b.isString=function(a){return l.call(a)=="[object String]"};b.isNumber=function(a){return l.call(a)=="[object Number]"};b.isNaN=function(a){return a!==a};b.isBoolean=function(a){return a===true||a===false||l.call(a)=="[object Boolean]"};b.isDate=function(a){return l.call(a)=="[object Date]"}; b.isRegExp=function(a){return l.call(a)=="[object RegExp]"};b.isNull=function(a){return a===null};b.isUndefined=function(a){return a===void 0};b.has=function(a,b){return I.call(a,b)};b.noConflict=function(){r._=G;return this};b.identity=function(a){return a};b.times=function(a,b,d){for(var e=0;e/g,">").replace(/"/g,""").replace(/'/g,"'").replace(/\//g,"/")};b.mixin=function(a){j(b.functions(a), function(c){K(c,b[c]=a[c])})};var L=0;b.uniqueId=function(a){var b=L++;return a?a+b:b};b.templateSettings={evaluate:/<%([\s\S]+?)%>/g,interpolate:/<%=([\s\S]+?)%>/g,escape:/<%-([\s\S]+?)%>/g};var t=/.^/,u=function(a){return a.replace(/\\\\/g,"\\").replace(/\\'/g,"'")};b.template=function(a,c){var d=b.templateSettings,d="var 
__p=[],print=function(){__p.push.apply(__p,arguments);};with(obj||{}){__p.push('"+a.replace(/\\/g,"\\\\").replace(/'/g,"\\'").replace(d.escape||t,function(a,b){return"',_.escape("+ u(b)+"),'"}).replace(d.interpolate||t,function(a,b){return"',"+u(b)+",'"}).replace(d.evaluate||t,function(a,b){return"');"+u(b).replace(/[\r\n\t]/g," ")+";__p.push('"}).replace(/\r/g,"\\r").replace(/\n/g,"\\n").replace(/\t/g,"\\t")+"');}return __p.join('');",e=new Function("obj","_",d);return c?e(c,b):function(a){return e.call(this,a,b)}};b.chain=function(a){return b(a).chain()};var m=function(a){this._wrapped=a};b.prototype=m.prototype;var v=function(a,c){return c?b(a).chain():a},K=function(a,c){m.prototype[a]= function(){var a=i.call(arguments);H.call(a,this._wrapped);return v(c.apply(b,a),this._chain)}};b.mixin(b);j("pop,push,reverse,shift,sort,splice,unshift".split(","),function(a){var b=k[a];m.prototype[a]=function(){var d=this._wrapped;b.apply(d,arguments);var e=d.length;(a=="shift"||a=="splice")&&e===0&&delete d[0];return v(d,this._chain)}});j(["concat","join","slice"],function(a){var b=k[a];m.prototype[a]=function(){return v(b.apply(this._wrapped,arguments),this._chain)}});m.prototype.chain=function(){this._chain= true;return this};m.prototype.value=function(){return this._wrapped}}).call(this); PK&o1D{&scrapy-0.22/_static/readthedocs-ext.js // User's analytics code. var _gaq = _gaq || []; _gaq.push(['_setAccount', 'UA-10231918-2']); _gaq.push(['_trackPageview']); (function() { var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true; ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js'; var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s); })(); PKT`1Duº55*scrapy-0.22/_static/selectors-sample1.html Example website PKV`1D<>#scrapy-0.22/_static/ajax-loader.gifGIF89aU|NU|l!Created with ajaxload.info! ! NETSCAPE2.0,30Ikc:Nf E1º.`q-[9ݦ9 JkH! ,4N!  DqBQT`1 `LE[|ua C%$*! ,62#+AȐ̔V/cNIBap ̳ƨ+Y2d! ,3b%+2V_ ! 1DaFbR]=08,Ȥr9L! ,2r'+JdL &v`\bThYB)@<&,ȤR! ,3 9tڞ0!.BW1  sa50 m)J! ,2 ٜU]qp`a4AF0` @1Α! ,20IeBԜ) q10ʰPaVڥ ub[;PKV`1DPu u scrapy-0.22/_static/comment.pngPNG  IHDRa OiCCPPhotoshop ICC profilexڝSgTS=BKKoR RB&*! J!QEEȠQ, !{kּ> H3Q5 B.@ $pd!s#~<<+"x M0B\t8K@zB@F&S`cbP-`'{[! eDh;VEX0fK9-0IWfH  0Q){`##xFW<+*x<$9E[-qWW.(I+6aa@.y24x6_-"bbϫp@t~,/;m%h^ uf@Wp~<5j>{-]cK'Xto(hw?G%fIq^D$.Tʳ?D*A, `6B$BB dr`)B(Ͱ*`/@4Qhp.U=pa( Aa!ڈbX#!H$ ɈQ"K5H1RT UH=r9\F;2G1Q= C7F dt1r=6Ыhڏ>C03l0.B8, c˱" VcϱwE 6wB aAHXLXNH $4 7 Q'"K&b21XH,#/{C7$C2'ITFnR#,4H#dk9, +ȅ3![ b@qS(RjJ4e2AURݨT5ZBRQ4u9̓IKhhitݕNWGw Ljg(gwLӋT071oUX**| J&*/Tު UUT^S}FU3S ԖUPSSg;goT?~YYLOCQ_ cx,!k u5&|v*=9C3J3WRf?qtN (~))4L1e\kXHQG6EYAJ'\'GgSSݧ M=:.kDwn^Loy}/TmG X $ <5qo</QC]@Caaᄑ.ȽJtq]zۯ6iܟ4)Y3sCQ? 
0k߬~OCOg#/c/Wװwa>>r><72Y_7ȷOo_C#dz%gA[z|!?:eAAA!h쐭!ΑiP~aa~ 'W?pX15wCsDDDޛg1O9-J5*>.j<74?.fYXXIlK9.*6nl {/]py.,:@LN8A*%w% yg"/6шC\*NH*Mz쑼5y$3,幄'L Lݛ:v m2=:1qB!Mggfvˬen/kY- BTZ(*geWf͉9+̳ې7ᒶKW-X潬j9(xoʿܔĹdff-[n ڴ VE/(ۻCɾUUMfeI?m]Nmq#׹=TR+Gw- 6 U#pDy  :v{vg/jBFS[b[O>zG499?rCd&ˮ/~јѡ򗓿m|x31^VwwO| (hSЧc3-bKGD pHYs  tIME 1;VIDAT8ukU?sg4h`G1 RQܸp%Bn"bЍXJ .4V iZ##T;m!4bP~7r>ιbwc;m;oӍAΆ ζZ^/|s{;yR=9(rtVoG1w#_ө{*E&!(LVuoᲵ‘D PG4 :&~*ݳreu: S-,U^E&JY[P!RB ŖޞʖR@_ȐdBfNvHf"2T]R j'B1ddAak/DIJD D2H&L`&L $Ex,6|~_\P $MH`I=@Z||ttvgcЕWTZ'3rje"ܵx9W> mb|byfFRx{w%DZC$wdցHmWnta(M<~;9]C/_;Տ#}o`zSڷ_>:;x컓?yݩ|}~wam-/7=0S5RP"*֯ IENDB`PKV`1Dhkkscrapy-0.22/_static/down.pngPNG  IHDRasRGBbKGDC pHYs B(xtIME"U{IDAT8ҡNCAJ, ++@4>/U^,~T&3M^^^PM6ٹs*RJa)eG*W<"F Fg78G>q OIp:sAj5GنyD^+yU:p_%G@D|aOs(yM,"msx:.b@D|`Vٟ۲иeKſ/G!IENDB`PKV`1D+0scrapy-0.22/_static/file.pngPNG  IHDRabKGD pHYs  tIME  )TIDAT8˭J@Ir('[ "&xYZ X0!i|_@tD] #xjv YNaEi(əy@D&`6PZk$)5%"z.NA#Aba`Vs_3c,2mj [klvy|!Iմy;v "߮a?A7`c^nk?Bg}TЙD# "RD1yER*6MJ3K_Ut8F~IENDB`PKV`1D[{gtt"scrapy-0.22/_static/up-pressed.pngPNG  IHDRasRGBbKGDC pHYs B(xtIME ,ZeIDAT8͓jA*WKk-,By@- و/`cXYh!6jf GrOlXvvfk2!p!GOOԲ &zf 6|M~%`]* ΛM]K ZĆ1Er%ȶcm1`= 0 && !jQuery(node.parentNode).hasClass(className)) { var span = document.createElement("span"); span.className = className; span.appendChild(document.createTextNode(val.substr(pos, text.length))); node.parentNode.insertBefore(span, node.parentNode.insertBefore( document.createTextNode(val.substr(pos + text.length)), node.nextSibling)); node.nodeValue = val.substr(0, pos); } } else if (!jQuery(node).is("button, select, textarea")) { jQuery.each(node.childNodes, function() { highlight(this); }); } } return this.each(function() { highlight(this); }); }; /** * Small JavaScript module for the documentation. */ var Documentation = { init : function() { this.fixFirefoxAnchorBug(); this.highlightSearchWords(); this.initIndexTable(); }, /** * i18n support */ TRANSLATIONS : {}, PLURAL_EXPR : function(n) { return n == 1 ? 0 : 1; }, LOCALE : 'unknown', // gettext and ngettext don't access this so that the functions // can safely bound to a different name (_ = Documentation.gettext) gettext : function(string) { var translated = Documentation.TRANSLATIONS[string]; if (typeof translated == 'undefined') return string; return (typeof translated == 'string') ? translated : translated[0]; }, ngettext : function(singular, plural, n) { var translated = Documentation.TRANSLATIONS[singular]; if (typeof translated == 'undefined') return (n == 1) ? singular : plural; return translated[Documentation.PLURALEXPR(n)]; }, addTranslations : function(catalog) { for (var key in catalog.messages) this.TRANSLATIONS[key] = catalog.messages[key]; this.PLURAL_EXPR = new Function('n', 'return +(' + catalog.plural_expr + ')'); this.LOCALE = catalog.locale; }, /** * add context elements like header anchor links */ addContextElements : function() { $('div[id] > :header:first').each(function() { $('\u00B6'). attr('href', '#' + this.id). attr('title', _('Permalink to this headline')). appendTo(this); }); $('dt[id]').each(function() { $('\u00B6'). attr('href', '#' + this.id). attr('title', _('Permalink to this definition')). 
appendTo(this); }); }, /** * workaround a firefox stupidity */ fixFirefoxAnchorBug : function() { if (document.location.hash && $.browser.mozilla) window.setTimeout(function() { document.location.href += ''; }, 10); }, /** * highlight the search words provided in the url in the text */ highlightSearchWords : function() { var params = $.getQueryParameters(); var terms = (params.highlight) ? params.highlight[0].split(/\s+/) : []; if (terms.length) { var body = $('div.body'); window.setTimeout(function() { $.each(terms, function() { body.highlightText(this.toLowerCase(), 'highlighted'); }); }, 10); $('') .appendTo($('#searchbox')); } }, /** * init the domain index toggle buttons */ initIndexTable : function() { var togglers = $('img.toggler').click(function() { var src = $(this).attr('src'); var idnum = $(this).attr('id').substr(7); $('tr.cg-' + idnum).toggle(); if (src.substr(-9) == 'minus.png') $(this).attr('src', src.substr(0, src.length-9) + 'plus.png'); else $(this).attr('src', src.substr(0, src.length-8) + 'minus.png'); }).css('display', ''); if (DOCUMENTATION_OPTIONS.COLLAPSE_INDEX) { togglers.click(); } }, /** * helper function to hide the search marks again */ hideSearchWords : function() { $('#searchbox .highlight-link').fadeOut(300); $('span.highlighted').removeClass('highlighted'); }, /** * make the url absolute */ makeURL : function(relativeURL) { return DOCUMENTATION_OPTIONS.URL_ROOT + '/' + relativeURL; }, /** * get the current relative url */ getCurrentURL : function() { var path = document.location.pathname; var parts = path.split(/\//); $.each(DOCUMENTATION_OPTIONS.URL_ROOT.split(/\//), function() { if (this == '..') parts.pop(); }); var url = parts.join('/'); return path.substring(url.lastIndexOf('/') + 1, path.length - 1); } }; // quick alias for translations _ = Documentation.gettext; $(document).ready(function() { Documentation.init(); }); PK&o1D(xEE"scrapy-0.22/_static/searchtools.js/* * searchtools.js_t * ~~~~~~~~~~~~~~~~ * * Sphinx JavaScript utilties for the full-text search. * * :copyright: Copyright 2007-2013 by the Sphinx team, see AUTHORS. * :license: BSD, see LICENSE for details. * */ /** * Porter Stemmer */ var Stemmer = function() { var step2list = { ational: 'ate', tional: 'tion', enci: 'ence', anci: 'ance', izer: 'ize', bli: 'ble', alli: 'al', entli: 'ent', eli: 'e', ousli: 'ous', ization: 'ize', ation: 'ate', ator: 'ate', alism: 'al', iveness: 'ive', fulness: 'ful', ousness: 'ous', aliti: 'al', iviti: 'ive', biliti: 'ble', logi: 'log' }; var step3list = { icate: 'ic', ative: '', alize: 'al', iciti: 'ic', ical: 'ic', ful: '', ness: '' }; var c = "[^aeiou]"; // consonant var v = "[aeiouy]"; // vowel var C = c + "[^aeiouy]*"; // consonant sequence var V = v + "[aeiou]*"; // vowel sequence var mgr0 = "^(" + C + ")?" + V + C; // [C]VC... is m>0 var meq1 = "^(" + C + ")?" + V + C + "(" + V + ")?$"; // [C]VC[V] is m=1 var mgr1 = "^(" + C + ")?" + V + C + V + C; // [C]VCVC... is m>1 var s_v = "^(" + C + ")?" 
+ v; // vowel in stem this.stemWord = function (w) { var stem; var suffix; var firstch; var origword = w; if (w.length < 3) return w; var re; var re2; var re3; var re4; firstch = w.substr(0,1); if (firstch == "y") w = firstch.toUpperCase() + w.substr(1); // Step 1a re = /^(.+?)(ss|i)es$/; re2 = /^(.+?)([^s])s$/; if (re.test(w)) w = w.replace(re,"$1$2"); else if (re2.test(w)) w = w.replace(re2,"$1$2"); // Step 1b re = /^(.+?)eed$/; re2 = /^(.+?)(ed|ing)$/; if (re.test(w)) { var fp = re.exec(w); re = new RegExp(mgr0); if (re.test(fp[1])) { re = /.$/; w = w.replace(re,""); } } else if (re2.test(w)) { var fp = re2.exec(w); stem = fp[1]; re2 = new RegExp(s_v); if (re2.test(stem)) { w = stem; re2 = /(at|bl|iz)$/; re3 = new RegExp("([^aeiouylsz])\\1$"); re4 = new RegExp("^" + C + v + "[^aeiouwxy]$"); if (re2.test(w)) w = w + "e"; else if (re3.test(w)) { re = /.$/; w = w.replace(re,""); } else if (re4.test(w)) w = w + "e"; } } // Step 1c re = /^(.+?)y$/; if (re.test(w)) { var fp = re.exec(w); stem = fp[1]; re = new RegExp(s_v); if (re.test(stem)) w = stem + "i"; } // Step 2 re = /^(.+?)(ational|tional|enci|anci|izer|bli|alli|entli|eli|ousli|ization|ation|ator|alism|iveness|fulness|ousness|aliti|iviti|biliti|logi)$/; if (re.test(w)) { var fp = re.exec(w); stem = fp[1]; suffix = fp[2]; re = new RegExp(mgr0); if (re.test(stem)) w = stem + step2list[suffix]; } // Step 3 re = /^(.+?)(icate|ative|alize|iciti|ical|ful|ness)$/; if (re.test(w)) { var fp = re.exec(w); stem = fp[1]; suffix = fp[2]; re = new RegExp(mgr0); if (re.test(stem)) w = stem + step3list[suffix]; } // Step 4 re = /^(.+?)(al|ance|ence|er|ic|able|ible|ant|ement|ment|ent|ou|ism|ate|iti|ous|ive|ize)$/; re2 = /^(.+?)(s|t)(ion)$/; if (re.test(w)) { var fp = re.exec(w); stem = fp[1]; re = new RegExp(mgr1); if (re.test(stem)) w = stem; } else if (re2.test(w)) { var fp = re2.exec(w); stem = fp[1] + fp[2]; re2 = new RegExp(mgr1); if (re2.test(stem)) w = stem; } // Step 5 re = /^(.+?)e$/; if (re.test(w)) { var fp = re.exec(w); stem = fp[1]; re = new RegExp(mgr1); re2 = new RegExp(meq1); re3 = new RegExp("^" + C + v + "[^aeiouwxy]$"); if (re.test(stem) || (re2.test(stem) && !(re3.test(stem)))) w = stem; } re = /ll$/; re2 = new RegExp(mgr1); if (re.test(w) && re2.test(w)) { re = /.$/; w = w.replace(re,""); } // and turn initial Y back to y if (firstch == "y") w = firstch.toLowerCase() + w.substr(1); return w; } } /** * Simple result scoring code. */ var Scorer = { // Implement the following function to further tweak the score for each result // The function takes a result array [filename, title, anchor, descr, score] // and returns the new score. /* score: function(result) { return result[4]; }, */ // query matches the full name of an object objNameMatch: 11, // or matches in the last dotted part of the object name objPartialMatch: 6, // Additive scores depending on the priority of the object objPrio: {0: 15, // used to be importantResults 1: 5, // used to be objectResults 2: -5}, // used to be unimportantResults // Used when the priority is not in the mapping. 
objPrioDefault: 0, // query found in title title: 15, // query found in terms term: 5 }; /** * Search Module */ var Search = { _index : null, _queued_query : null, _pulse_status : -1, init : function() { var params = $.getQueryParameters(); if (params.q) { var query = params.q[0]; $('input[name="q"]')[0].value = query; this.performSearch(query); } }, loadIndex : function(url) { $.ajax({type: "GET", url: url, data: null, dataType: "script", cache: true, complete: function(jqxhr, textstatus) { if (textstatus != "success") { document.getElementById("searchindexloader").src = url; } }}); }, setIndex : function(index) { var q; this._index = index; if ((q = this._queued_query) !== null) { this._queued_query = null; Search.query(q); } }, hasIndex : function() { return this._index !== null; }, deferQuery : function(query) { this._queued_query = query; }, stopPulse : function() { this._pulse_status = 0; }, startPulse : function() { if (this._pulse_status >= 0) return; function pulse() { var i; Search._pulse_status = (Search._pulse_status + 1) % 4; var dotString = ''; for (i = 0; i < Search._pulse_status; i++) dotString += '.'; Search.dots.text(dotString); if (Search._pulse_status > -1) window.setTimeout(pulse, 500); } pulse(); }, /** * perform a search for something (or wait until index is loaded) */ performSearch : function(query) { // create the required interface elements this.out = $('#search-results'); this.title = $('

              ' + _('Searching') + '

              ').appendTo(this.out); this.dots = $('').appendTo(this.title); this.status = $('

              ').appendTo(this.out); this.output = $('