some copyedits #14

Open: wants to merge 1 commit into base: master
5 changes: 2 additions & 3 deletions README.md
@@ -5,16 +5,15 @@ This is the Cocytus project for tracking citations on Wikipedia.
We are changing a __diff stream__ into a __citation delta__ stream.

+ we use the [recent changes stream](https://wikitech.wikimedia.org/wiki/RCStream)
-+ to make queue of diffs to be inspected
++ to make a queue of diffs to be inspected
+ Keep a database table of the latest version we have seen so far
-+ call the wikimedia api to fetch the diff text
++ call the Wikimedia API to fetch the diff text
+ see if the diff involves adding or removing a DOI or other citation identifier
+ we do this in three different ways:
1. run a regex on the diff text
1. look for `doi=xxx` patterns in templates
1. look for `[[doi:xxx]]` external links.
-+ then for each change found we push that change to crossref in `crossref_push.py` with information
++ then, for each change found, we push that change to crossref in `crossref_push.py` with information
+ Identifier, delta direction, provenance page, page metadata
+ which can be viewed at [crossreflabs](http://events.labs.crossref.org/events/types/WikipediaCitation)
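
The three detection steps listed above, and the resulting "delta direction", could be sketched roughly as follows. This is an illustration only, not the project's actual code: the regular expressions and the names `extract_dois` and `citation_delta` are assumptions made for this example, and the DOI pattern is a deliberately loose approximation.

```python
import re

# Assumed, loose DOI shape: "10.", a registrant number, "/", then a suffix.
DOI_RE = re.compile(r'10\.\d{4,9}/[^\s|\]}<>"]+')
# Assumed template form: "|doi=xxx" inside a citation template.
TEMPLATE_RE = re.compile(r'\|\s*doi\s*=\s*([^\s|}]+)', re.IGNORECASE)
# Assumed link form: "[[doi:xxx]]".
LINK_RE = re.compile(r'\[\[doi:([^\]|\s]+)', re.IGNORECASE)


def extract_dois(text):
    """Collect DOI-like strings found by any of the three patterns."""
    found = set(DOI_RE.findall(text))
    found.update(TEMPLATE_RE.findall(text))
    found.update(LINK_RE.findall(text))
    # Trailing sentence punctuation is unlikely to belong to the identifier.
    return {doi.rstrip('.,;') for doi in found}


def citation_delta(old_text, new_text):
    """Compare two revisions and report which DOIs were added or removed."""
    old, new = extract_dois(old_text), extract_dois(new_text)
    return {"added": sorted(new - old), "removed": sorted(old - new)}


if __name__ == "__main__":
    before = "{{cite journal |doi=10.1000/abc123 |title=X}}"
    after = "See [[doi:10.1000/xyz789]] for details."
    print(citation_delta(before, after))
```

Comparing the sets of identifiers from the old and new revisions, rather than scanning only the changed lines, means a DOI that merely moves within the page is not reported as a change.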
