-
-
Notifications
You must be signed in to change notification settings - Fork 34
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix broken link bug for article's name that includes slash #822
base: main
Are you sure you want to change the base?
Fix broken link bug for article's name that includes slash #822
Conversation
This doesn't work. For this URL:
I get:
Which seems to indicate that the value being passed from the API is None, which supports my assumption in #817 (comment) that the API is unable to locate the page. However, the fact that it got "further" down the stack seems to support your idea that it was a routing issue. By the way, you don't need a toolforge account to test this PR. If you run the dev Docker compose you will get a development database that contains data, and you can test a similar article to the bug here: http://localhost:5173/#/project/Aesthetics/articles?quality=Project-Class&importance=NA-Class |
With the original code, I tested article with non-slashed link and it still shows that NoneType error. Thats why I think I need API key or maybe I miss setting up something for development. In the Try to show all databases in the docker container wp1bot-db-dev and there is Can you try running the original code, navigate to http://localhost:5173/#/project/Alexandra%20Stan/articles?quality=Assessed-Class, and click any link that has no slash? I clicked the first link and it shows NoneType error. |
Yes you're right, I get the NoneType error on all timestamp links, regardless if they have a slash or not. You still don't need toolforge credentials though. It turns out this comment is wrong: https://github.com/openzim/wp1/blob/main/wp1/credentials.py.example#L63 The API credentials are needed for this redirect logic to work. All you need is a Wikipedia user name and password though: credentials.py
|
Anyways, please refer to #817 (comment). I would prefer a solution that uses non "pretty" URL parameters, rather than double encoding. |
i changed double-encoding url to url with query parameters. Tested with a lot of broken links and non-broken links and they work. There is this link that is not found: http://localhost:5000/v1/articles/redirect?name=Mami%20(Alexandra%20Stan%20song)×tamp=2018-02-23T21%3A51%3A09Z, but I tested it: it enters the article router and exit with 404 status. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Yes, the article has been redirected since the last scrape. This is expected. (https://en.wikipedia.org/wiki/Mami_(song)) |
@audiodude Hello this pull request still fails due to test case not being re-implemented. Do you want me to make a minor change in the test case for ArticleTest? I see that we just need to edit the link in the function |
I think you need to include this in the README file. Set up code is not too bad but the problem is credentials.py and sometimes I dont know if I miss any credentials that are essential for development, unless after I mention you about this. |
Yes, fixing/writing tests is part of a valid PR.
Absolutely right. As I mentioned, the comment is wrong and needs to be updated. |
Test cases are fixed and properly tested in local environment. |
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #822 +/- ##
=======================================
Coverage 91.57% 91.58%
=======================================
Files 66 66
Lines 3597 3600 +3
=======================================
+ Hits 3294 3297 +3
Misses 303 303 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
In frontend, double encode the article name. In backend, decode the encoded article name (Flask decode the first time, and we decode it the second time to get original name).
change route from <name>/<timestamp>/redirect -> /redirect?name=×tamp=
6e4e883
to
fc75f33
Compare
Issue: Broken Date Links in Articles having name that contains slash
Fixes #817
Changes
In frontend, double encode the article name.
In backend, decode the encoded article name (Flask decode the first time, and we decode it the second time to get original name).