[Outreachy Round 27] Stop storing liftwing features for non-wikidata wikis. #5682

@gabina gabina commented Feb 28, 2024

What this PR does

This PR is part of the "Improve how Wiki Education Dashboard counts references added" project (read issue #5547).

Before this PR, wikis supported by both the LiftWing and reference-counter APIs had both API responses stored in the features/features_previous fields.
After this PR, only wikidata wikis (which are supported only by the LiftWing API) have LiftWing features stored in the features/features_previous fields. Non-wikidata wikis, such as fr.wikipedia or es.wiktionary, store only the reference-counter response in those fields.
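
The storage rule described above could be sketched roughly as follows. This is only an illustration, not the Dashboard's actual code: the method name `features_to_store`, the `project` strings, and the keys in the example hashes are all assumptions made up for this sketch.

```ruby
# Hypothetical sketch of the storage rule after this PR.
# Method name, arguments, and hash keys are assumptions, not the real Dashboard API.
def features_to_store(project:, liftwing_features:, reference_counter_features:)
  if project == 'wikidata'
    # wikidata is not supported by reference-counter, so LiftWing features are kept.
    liftwing_features
  else
    # Other supported wikis store only the reference-counter response.
    reference_counter_features
  end
end

# Placeholder payloads; the real API responses have different keys.
liftwing = { 'liftwing_example_feature' => 2 }
ref_counter = { 'reference_counter_example' => 3 }

wikidata_features = features_to_store(
  project: 'wikidata',
  liftwing_features: liftwing,
  reference_counter_features: ref_counter
)
wikipedia_features = features_to_store(
  project: 'wikipedia',
  liftwing_features: liftwing,
  reference_counter_features: ref_counter
)
```

With this rule, each revision's features field holds the response of exactly one API, so a nil value can only mean that API's request did not succeed.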

Context on why we're changing this
The revision score importing process is conducted through automatic jobs as part of the course updates. The RevisionScoreImporter class is responsible for selecting "unscored" revisions and querying the API to populate the revision score fields (such as features, wp10, etc.). It determines if a revision is "unscored" by checking if the features or features_previous fields are nil.
As I understand it, before the reference-counter API was introduced, a failed LiftWing API query left the features field nil. The RevisionScoreImporter would therefore try to populate that field again during the next run, because the revision was still considered "unscored".
Now, with the introduction of two APIs (LiftWing and reference-counter), and storing values from both in the features field, this behavior has changed. For instance, if a LiftWing API request fails unexpectedly, the features field will not be nil because it will contain the response from the reference-counter API. As a result, during the subsequent course update run, the revision score importer will not attempt to query the LiftWing API again to complete the features field.
Although we have implemented a retry strategy when querying APIs, prolonged downtime of one API could lead to many revisions remaining without complete data.
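
The failure mode described in this section can be reproduced with a small stand-alone sketch. The `Revision` struct and the `unscored?` helper below are hypothetical stand-ins that model only the nil check described above; they are not the RevisionScoreImporter's real implementation, and the hash keys are placeholders.

```ruby
# Hypothetical stand-in for a revision record; only the two fields
# mentioned in the description are modelled.
Revision = Struct.new(:features, :features_previous)

# Mirrors the described rule: a revision counts as "unscored"
# while features or features_previous is nil.
def unscored?(revision)
  revision.features.nil? || revision.features_previous.nil?
end

# One API writing to the field: the LiftWing request failed, nothing was stored,
# so the revision is picked up again on the next update.
liftwing_failed = Revision.new(nil, nil)

# Two APIs writing to the same field: the LiftWing request failed, but the
# reference-counter response was stored, so the field is no longer nil.
partially_scored = Revision.new(
  { 'reference_counter_example' => 3 },
  { 'reference_counter_example' => 1 }
)

puts unscored?(liftwing_failed)   # prints "true"
puts unscored?(partially_scored)  # prints "false"
```

The second revision is never retried even though its LiftWing data is missing, which is the gap this PR closes by storing a single API's response per wiki.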

Open questions and concerns

LiftWing features are still stored in the features/features_previous fields for wikidata wikis because they're not supported by the new reference-counter API, so references have to be calculated from LiftWing features. However, I'm not sure whether wikidata uses a completely different approach to count references. If it does, wikidata may not need features at all.

…wikidata wikis. We don't need it anymore since non-wikidata wikis use only reference-counter features.
@gabina gabina marked this pull request as draft February 28, 2024 00:52
…tead of the RevisionScoreImporter) to get the liftwing features and predictions
Comment on lines +10 to +13
revision_data = LiftWingApi.new(@wiki).get_revision_data([@rev_id])[@rev_id.to_s]
@feedback = RevisionFeedbackService.new(revision_data['features']).feedback
@user_feedback = Assignment.find(params['assignment_id']).assignment_suggestions
- @rating = revision_data[:rating]
+ @rating = revision_data['prediction']

I think this is the easiest way to fix that for now. Instead of going through the RevisionScoreImporter, the controller now calls the LiftWingApi directly. This is because the RevisionScoreImporter no longer returns the LiftWing features or predictions for en.wikipedia (and this controller only works for en.wikipedia).

I can still create an issue to move this to the frontend if we think that's better.

@gabina gabina marked this pull request as ready for review February 28, 2024 14:08

gabina commented Feb 28, 2024

@ragesoss this PR is ready for review. The failing spec is not related to these changes.

@gabina gabina changed the title [WIP] [Outreachy Round 27] Stop storing liftwing features for non-wikidata wikis. [Outreachy Round 27] Stop storing liftwing features for non-wikidata wikis. Feb 28, 2024
@ragesoss ragesoss merged commit 8efe822 into WikiEducationFoundation:master Mar 11, 2024
1 check passed
@gabina gabina deleted the 5547-outreachy-round-27-stop-storing-liftwing-features-for-non-wikidata-wikis branch March 12, 2024 20:30