-
Notifications
You must be signed in to change notification settings - Fork 115
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Following too many Redirects Is Problematic #1
Comments
Per the spec, these changes make sense. In fact, based on my reading of the spec, only a single |
In my data set, we see that "www." is routinely stripped (for the domain presented in the domain field), so this change would, while ensuring greater literal spec compliance, result in a poorer practical outcome. Perhaps a quick revision to the spec to also permit "domain.com" -> "www.domain.com" redirects is the cleanest solution. |
I suggest we only allow two styles of redirect (made this same comment in the slack channel):
The number of redirects allowed should probably be limited as well. |
We implemented the redirect limitation using requests.session() and it worked great. |
Test case results in 4-5 redirects due to paywall. This misconfiguration should be handle elegantly, with a clear max on redirects, rather than failing to parse paywall text.
The text was updated successfully, but these errors were encountered: