Our sources have a primary_language field on them. We should fill that in regularly by doing something like pulling 1000 random stories from the last year and taking the most common detected language. Perhaps we shouldn't do this for sources that have fewer than 100 articles, because that's an indicator that we don't get regular data from them? This should probably be a cron-based task run once every few months. It's a simple data science script, so I'm logging it here, but it's probably best implemented as a Django management command like update-stories-per-week.
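A rough sketch of the selection logic, kept separate from Django so it could be called from a management command later. The function name, parameter names, and the assumption that each story's detected language arrives as a short code string (or None when detection failed) are all hypothetical. Fetching the last year of stories for a source is left out because it depends on how stories are stored.

```python
import random
from collections import Counter
from typing import Iterable, Optional

MIN_ARTICLES = 100   # from the issue: skip sources with fewer stories than this
SAMPLE_SIZE = 1000   # from the issue: number of random stories to sample


def pick_primary_language(
    story_languages: Iterable[Optional[str]],
    min_articles: int = MIN_ARTICLES,
    sample_size: int = SAMPLE_SIZE,
    rng: Optional[random.Random] = None,
) -> Optional[str]:
    """Return the most common detected language in a random sample.

    Hypothetical helper: `story_languages` is assumed to hold one detected
    language code per story from the last year. Returns None when the source
    has too few stories or no story in the sample has a detected language.
    """
    stories = list(story_languages)
    if len(stories) < min_articles:
        return None  # too little data to trust the result

    rng = rng or random.Random()
    # Sample without replacement, capped at the number of stories available.
    sample = rng.sample(stories, min(sample_size, len(stories)))

    # Ignore stories where language detection produced nothing.
    counts = Counter(lang for lang in sample if lang)
    if not counts:
        return None
    return counts.most_common(1)[0][0]
```

A management command could loop over sources, call this with each source's story languages, and only write primary_language when the result is not None, so sparse sources keep whatever value they already have.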