-
Notifications
You must be signed in to change notification settings - Fork 662
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Outreachy Round 27] Make RevisionScoreImporter
import data from both LiftWing API and reference-counter API.
#5594
Conversation
…unexpected error, the return value is a hash with fields: wp10, features, deleted, and prediction.
…case of unexpected error, the return value is a hash with num_ref field.
…ftWingApi/ReferenceCounterApi. This RevisionScoreApiHandler API takes care of the logic to decide which APIs are available for a given wiki, hits the APIS, and to merge the responses.
Failure doesn't seem to be related to my changes:
|
RevisionScoreImporter
import data from both LiftWing API and reference-counter API.RevisionScoreImporter
import data from both LiftWing API and reference-counter API.
@ragesoss @Aminehassou this PR is ready for review. Feel free to review whenever you have the chance! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One idea to evaluate before deploying this is to make the process of starting to use the new reference-counter API more progressive. One easy way to do this could be to temporarily modify the valid_wiki? definition for the ReferenceCounterApi. This way we could enable this new API only for some specific wikis, for example es.wikipedia. That way we can monitor the new process taking less risk. After the updates of some courses, if everything works as expected, we can enable the ReferenceCounterApi for every wiki except for wikidata.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that is necessary. We can deploy this to https://dashboard-testing.wikiedu.org/ to try it out in a production-like environment and make sure everything seems to be working.
… retries the same request up to 5 times if it raises an error.
Note about why I added a retry strategy: For the quality report, I went through the entire process of populating the features and features_previous fields using both APIs. I observed that the most frequent API error was "unable to solve domain," a transient issue that typically resolves itself upon retrying the request. In the previous implementation, when faced with this temporary error, the features and features_previous fields were set to nil, with the expectation that they would likely be populated in the next automatic statistics update task. However, in the new implementation, if this temporary error occurs, the features and features_previous fields won't be set to nil. For instance, it might fail the request to Toolforge but succeed with LiftWing. Consequently, these fields won't be populated in the subsequent UpdateCourseStats periodic task run. To address this, I introduced a straightforward retry strategy. This ensures a higher likelihood of successfully populating the fields, even in the presence of temporary errors. |
@Aminehassou have you looked at this already, and if not would you like to? (If you don't want to, that's fine, just want to make sure you have a chance to review it if you were planning to.) |
You can go ahead and review it, I've taken a look at it and it looks good! |
@Aminehassou thanks! it looks good to me as well, I'm going to merge and deploy it to staging so we can put it through a few paces before we go live. |
What this PR does
This PR is part of the "Improve how Wiki Education Dashboard counts references added" project (read issue #5547).
The purpose of this PR is to modify the existing
RevisionScoreImporter
class to not only use the Lift Wing API, but also to hit the new reference-counter one. Notice that these APIs are not available for every wiki:To abstract the logic related to deciding which API/APIs hit to get revisions data for a given wiki, this PR adds a new
RevisionScoreApiHandler
intermediate class that deal with it.When merging and deploying this PR,
features
andfeatures_previous
fields inrevisions
table will store a hash with not only the data from Lift Wing API (if any), but also anum_ref
key with the number of references got from the reference-counter API (if applicable).Notice that even though this PR populates the
features
andfeatures_previous
fields withnum_ref
data, this it not used when calculating the references added inRevision.references_added method
. So no change should be visible from the user's point of view.Open questions and concerns
One idea to evaluate before deploying this is to make the process of starting to use the new reference-counter API more progressive. One easy way to do this could be to temporarily modify the
valid_wiki?
definition for theReferenceCounterApi
. This way we could enable this new API only for some specific wikis, for example es.wikipedia. That way we can monitor the new process taking less risk. After the updates of some courses, if everything works as expected, we can enable theReferenceCounterApi
for every wiki except for wikidata.