Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
support AnyResponse #161
support AnyResponse #161
Changes from 13 commits
809e179
2f5c69d
b4a79b0
5c83d22
202fa1b
5f1d104
3cb2290
85f355d
297eb57
1e19ef3
137d146
9322622
ffa345f
7043c55
3fec1b8
b341976
60f5c2b
2c5e506
6c1e043
571ee63
a51f961
9da17b4
d389893
df32f14
382dced
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
Check warning on line 8 in scrapy_zyte_api/providers.py
scrapy_zyte_api/providers.py#L8
Check warning on line 27 in scrapy_zyte_api/providers.py
scrapy_zyte_api/providers.py#L27
Check warning on line 59 in scrapy_zyte_api/providers.py
scrapy_zyte_api/providers.py#L57-L59
Check warning on line 67 in scrapy_zyte_api/providers.py
scrapy_zyte_api/providers.py#L67
Check warning on line 69 in scrapy_zyte_api/providers.py
scrapy_zyte_api/providers.py#L69
Check warning on line 76 in scrapy_zyte_api/providers.py
scrapy_zyte_api/providers.py#L75-L76
Check warning on line 83 in scrapy_zyte_api/providers.py
scrapy_zyte_api/providers.py#L79-L83
Check warning on line 128 in scrapy_zyte_api/providers.py
scrapy_zyte_api/providers.py#L128
Check warning on line 135 in scrapy_zyte_api/providers.py
scrapy_zyte_api/providers.py#L135
Check warning on line 143 in scrapy_zyte_api/providers.py
scrapy_zyte_api/providers.py#L140-L143
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To check my understanding: the logic here is that if browserHtml is not requested, but a data type is requested, and there is AnyResponse, then we switch the extractFrom from default to httpResponseBody for this data type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'd be, if:
item_type
(e.g. Product, ProductNavigation, etc), andthen we use
httpResponseBody
as the extraction source.Check warning on line 153 in scrapy_zyte_api/providers.py
scrapy_zyte_api/providers.py#L145-L153
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TBH, this logic is the least clear to me. Could you please add a comment, to explain a bit how it works?
scrapy-poet integration ignores default parameters
. But it seems here they are applied, and ZYTE_API_PROVIDER_PARAMS are ignored instead? Or is it not happening because of some reason?del http_request_params["url"]
is mysterious to me :) Why delete the url? Are there other parameters which need to be deleted?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the idea here is to handle cookies, headers, etc. in a more consistent way, as compared to just setting httpRequestBody and httpRequestHeaders to True, without invoking ParamParser?
What are the actual differences? What breaks if ParamParser is not used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point. I forgot about this and so using the
ParamUser
was a way to make handling the headers consistent across the requests. I can't say for the actual differences in practice. For now, we can go with the simplest approach of settinghttpResponseBody
andhttpResponseHeaders
to True. b341976There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@BurnzZ simply setting httpResponseBody/Headers looks way easier to understand, as it's similar to how all other parameters are handled.
@Gallaecio what do you think about this? Do you see any edge cases with using ParamParser vs just requesting a httpResponseBody/Headers? Any concerns about using the parameters directly, without ParamParser?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only case I can think of, and would have been a problem already with the pre-existing code, is the case where a cookie included in the request is necessary to get the right content, or some actions are necessary for extraction to work properly, and the server side cannot (yet) inject those automatically.
Still, it may be better to keep things simple for now, and figure out how we wish to solve these issues when we get to that. Because even if we decide to use ParamParser, things are more complicated: it should only be used if automatic parameter parsing is being used, if (raw) zyte_api is used for the source request then some parameters may also need to be copied from there…
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point about the necessary cookies. Fortunately, I haven't encountered this yet in my experiments.
+1 on keeping things simple for now.
Check warning on line 157 in scrapy_zyte_api/providers.py
scrapy_zyte_api/providers.py#L156-L157
Check warning on line 180 in scrapy_zyte_api/providers.py
scrapy_zyte_api/providers.py#L180
Check warning on line 182 in scrapy_zyte_api/providers.py
scrapy_zyte_api/providers.py#L182
Check warning on line 188 in scrapy_zyte_api/providers.py
scrapy_zyte_api/providers.py#L187-L188
Check warning on line 191 in scrapy_zyte_api/providers.py
scrapy_zyte_api/providers.py#L190-L191
Check warning on line 194 in scrapy_zyte_api/providers.py
scrapy_zyte_api/providers.py#L193-L194
Check warning on line 202 in scrapy_zyte_api/providers.py
scrapy_zyte_api/providers.py#L202
Check warning on line 206 in scrapy_zyte_api/providers.py
scrapy_zyte_api/providers.py#L206
Check warning on line 219 in scrapy_zyte_api/providers.py
scrapy_zyte_api/providers.py#L217-L219
Check warning on line 228 in scrapy_zyte_api/providers.py
scrapy_zyte_api/providers.py#L228