add script to automatically update source primary_language #11

Closed
@rahulbot

Description

Our sources have a primary_language field. We should populate it regularly, for example by pulling 1000 random stories from the last year for each source and taking the most common detected language. We should probably skip sources with fewer than 100 articles, since that suggests we don't get regular data from them. This should probably be a cron-based task that runs once every few months. It's a simple data science script, so I'm logging it here, but it is probably best implemented as a Django management command like update-stories-per-week.
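
A minimal sketch of the core logic, to make the proposal concrete. It assumes the caller has already fetched the detected language of each of a source's stories from the last year. The function name pick_primary_language, its parameters, and the constants are hypothetical, not existing code in this project. It is written as a plain function so it could be called from a Django management command's handle() method or from a standalone script.

```python
import random
from collections import Counter
from typing import Iterable, Optional

SAMPLE_SIZE = 1000  # stories to sample per source (figure suggested in this issue)
MIN_STORIES = 100   # sources with fewer stories than this are skipped


def pick_primary_language(
    story_languages: Iterable[Optional[str]],
    sample_size: int = SAMPLE_SIZE,
    min_stories: int = MIN_STORIES,
    rng: Optional[random.Random] = None,
) -> Optional[str]:
    """Return the most common detected language in a random sample.

    story_languages holds one detected language code (or None) per story
    from the last year. Returns None when the source has too few stories
    or when no sampled story has a detected language.
    """
    languages = list(story_languages)
    if len(languages) < min_stories:
        return None  # too little data to trust; leave the field untouched

    rng = rng or random.Random()
    # Sample without replacement, capped at the number of stories available.
    sample = rng.sample(languages, min(sample_size, len(languages)))

    # Ignore stories where language detection produced nothing.
    counts = Counter(lang for lang in sample if lang)
    if not counts:
        return None
    return counts.most_common(1)[0][0]
```

A caller would set primary_language on the source only when the function returns a non-None value, so low-volume sources keep whatever value they already have.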
