Ran into this problem as well and wanted to document my findings.
This appears to be caused by a chain of mismatched assumptions between spidermon, scrapy and scrapyd. Spidermon's LocalStorageStatsHistoryCollector uses the data_path function from scrapy.utils.project to build the path where it stores stats history. However, data_path requires a scrapy.cfg file in your working directory or one of its parent directories. If you deploy via scrapyd-deploy, your local scrapy.cfg is never copied to the server (not even inside the deployed egg file). As a result scrapy raises an error, spidermon doesn't handle it gracefully, and your spider is killed (see screenshot above).
The only workaround I've found is to add a dummy scrapy.cfg to your working directory (kudos to a suggestion in a related scrapy issue from 8 years ago: scrapy/scrapy#1581 (comment)).
If you want the stats history to be stored somewhere else, it appears you can use the completely undocumented datadir section in your otherwise dummy scrapy.cfg (the one on your server, not the one in your project, which doesn't get deployed):
[datadir]
default = /path/to/somewhere/
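To make the lookup described above concrete, here is a rough sketch of how I understand it to behave. This is not scrapy's actual code: the function names find_closest_cfg and resolve_data_dir are hypothetical, and the fallback to a .scrapy directory next to scrapy.cfg is my reading of the behaviour rather than something documented.

```python
import configparser
from pathlib import Path
from typing import Optional


def find_closest_cfg(start: Path) -> Optional[Path]:
    """Walk from `start` up to the filesystem root looking for scrapy.cfg."""
    start = start.resolve()
    for directory in (start, *start.parents):
        candidate = directory / "scrapy.cfg"
        if candidate.is_file():
            return candidate
    return None


def resolve_data_dir(start: Path, project: str = "default") -> Path:
    """Approximate the lookup: [datadir] entry first, else <cfg dir>/.scrapy."""
    cfg_path = find_closest_cfg(start)
    if cfg_path is None:
        # Roughly the situation on a scrapyd server without a dummy scrapy.cfg.
        raise FileNotFoundError(f"no scrapy.cfg found at or above {start}")
    parser = configparser.ConfigParser()
    parser.read(cfg_path)
    if parser.has_option("datadir", project):
        return Path(parser.get("datadir", project))
    # An empty (dummy) scrapy.cfg ends up here.
    return cfg_path.parent / ".scrapy"
```

With an empty dummy scrapy.cfg this falls through to the .scrapy directory beside the file; with the [datadir] section above it returns /path/to/somewhere/ instead.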
You might alternatively be able to deploy your project's scrapy.cfg by modifying the setup.py that scrapyd-deploy generates. I have not tried that approach.
Perhaps spidermon should use a different, less obscure mechanism for choosing a data path? Or, at the very least, it could degrade more gracefully by disabling stats history and logging a warning.
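As a sketch of what "degrade gracefully" could look like (this is not spidermon's code; stats_history_dir and the resolve callable are hypothetical names used only for illustration), the path lookup could be wrapped so that a failure disables stats history instead of killing the spider:

```python
import logging
from typing import Callable, Optional

logger = logging.getLogger(__name__)


def stats_history_dir(resolve: Callable[[], str]) -> Optional[str]:
    """Return the stats-history directory, or None if it cannot be resolved."""
    try:
        return resolve()
    except Exception as exc:
        # Assumption: a real patch would catch the specific exception scrapy
        # raises here rather than a bare Exception.
        logger.warning("Stats history disabled: %s", exc)
        return None
```

The caller would then skip loading and saving stats history whenever None comes back.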
I got an error message like this when deploying a scrapy project to scrapyd, even when scrapy.cfg is included in the egg file
I have deployed a scrapy project to scrapyd, but I think there is a problem with spidermon, because without scrapyd it works fine.