- Fixed docs build.
- Fixed Scheduler not being compatible with BaseDupeFilter (#294).
- Added pre-commit hooks.
- Switched to Python 3.12 as default build version.
- Fixed request fingerprint method.
- Fixed support for Scrapy 2.6+.
- Fixed tox tests and GitHub workflow.
- Deprecated REDIS_START_URLS_BATCH_SIZE.
- Move docs to GitHub Wiki
- Update tox and support dynamic tests
- Update support for json data
- Refactor max idle time
- Add support for Python 3.7 to 3.10
- Deprecate Python 2.x support
- Fix RedisStatsCollector._get_key()
- Fix redis-py dependency version
- Added maximum idle waiting time MAX_IDLE_TIME_BEFORE_CLOSE
- Fixes datetime parse error for redis-py 3.x.
- Add support for stats extensions.
- Fixes datetime parse error for redis-py 3.x.
- Add support for stats extensions.
- Unreleased.
- Fixed automated release due to not matching registered email.
- Fixes bad formatting in logging message.
- Fixes wrong message on dupefilter duplicates.
- Fixed typo in default settings.
- Fixed data decoding in Python 3.x.
- Added REDIS_ENCODING setting (default utf-8).
- Default to CONCURRENT_REQUESTS value for REDIS_START_URLS_BATCH_SIZE.
- Renamed queue classes to a proper naming convention (backwards compatible).
- Added REDIS_START_URLS_KEY setting.
- Fixed spider method from_crawler signature.
- Support redis_cls parameter in REDIS_PARAMS setting.
- Python 3.x compatibility fixed.
- Added SCHEDULER_SERIALIZER setting.
- Backwards incompatible change: Require explicit DUPEFILTER_CLASS setting.
- Added SCHEDULER_FLUSH_ON_START setting.
- Added REDIS_START_URLS_AS_SET setting.
- Added REDIS_ITEMS_KEY setting.
- Added REDIS_ITEMS_SERIALIZER setting.
- Added REDIS_PARAMS setting.
- Added REDIS_START_URLS_BATCH_SIZE spider attribute to read start urls in batches.
- Added RedisCrawlSpider.
- Updated code to be compatible with Scrapy 1.0.
- Added -a domain=... option for example spiders.
- Added REDIS_URL setting to support Redis connection string.
- Added SCHEDULER_IDLE_BEFORE_CLOSE setting to prevent the spider closing too quickly when the queue is empty. The default value is zero, which keeps the previous behavior.
- Preemptively schedule requests when an item is scraped.
- This version is the latest release compatible with Scrapy 0.24.x.
- Added RedisSpider and RedisMixin classes as building blocks for spiders to be fed through a redis queue.
- Added redis queue stats.
- Let the encoder handle the item as it comes instead of converting it to a dict.
- Added support for different queue classes.
- Changed requests serialization from marshal to cPickle.
- Improved backward compatibility.
- Added example project.
- First release on PyPI.
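
Several entries above introduce settings by name only. As a rough illustration, a project's settings.py might combine some of them as sketched below. All values are assumptions chosen for the example, not the library's defaults; the DUPEFILTER_CLASS path and the `%(name)s` / `%(spider)s` key templates are recalled from the project's README and should be checked against the installed version.

```python
# settings.py sketch: illustrative values only, NOT the library defaults.

# Connection string for Redis (assumes a local instance on the default port).
REDIS_URL = "redis://localhost:6379/0"

# Encoding used when decoding data read from Redis.
REDIS_ENCODING = "utf-8"

# The dupefilter must be set explicitly (see the backwards incompatible change).
# Class path is an assumption; verify it against your installed version.
DUPEFILTER_CLASS = "scrapy_redis.dupefilter.RFPDupeFilter"

# Keep the existing queue when the spider starts.
SCHEDULER_FLUSH_ON_START = False

# Seconds to wait on an empty queue before closing; zero keeps the old behavior.
SCHEDULER_IDLE_BEFORE_CLOSE = 10

# Key templates (assumed format): expanded with the spider name at runtime.
REDIS_START_URLS_KEY = "%(name)s:start_urls"
REDIS_ITEMS_KEY = "%(spider)s:items"

# Read start URLs from a Redis list rather than a set.
REDIS_START_URLS_AS_SET = False

# Standard Scrapy setting; the start URLs batch size defaults to this value.
CONCURRENT_REQUESTS = 16

# How a key template expands for a hypothetical spider named "myspider".
start_urls_key = REDIS_START_URLS_KEY % {"name": "myspider"}
items_key = REDIS_ITEMS_KEY % {"spider": "myspider"}
```

With these values, a spider named "myspider" would read its start URLs from the `myspider:start_urls` key and push scraped items to `myspider:items`, assuming the key templates are expanded as shown.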