Klortho edited this page Sep 7, 2014 · 1 revision

Scripts

  • Input: the CC BY-licensed articles from the Open Access Subset of PubMed Central, perhaps reusing existing code from a related project, the Open Access Media Importer.

    • Source: PMC's FTP service for their OA subset.

    • Format: The OA subset contains articles in NLM Archiving and Interchange DTD versions 2.0 and up (2.0, 2.1, 2.2, 2.3, and 3.0), as well as the draft version of NISO JATS, which is 0.4.
      Documentation for all of these is on the JATS Homepage. Versions up to and including 2.3 are all backwards-compatible, and NLM 3.0 is essentially the same as NISO JATS 0.4 / 1.0, so there are only two real formats to consider: NLM 2.3 and NLM 3.0.

  • Conversion: convert the article XML to MediaWiki XML.

  • Upload to Wikisource or Wikimedia Commons (to be decided). The upload should make use of existing bot frameworks like (Python) Pywikipediabot or (Perl) MediaWiki::Bot.
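
Because the format note above reduces the supported versions to two groups, a conversion script could start by deciding which group an article belongs to. The following is only a minimal sketch, not part of any existing script: it assumes the version is available in the `dtd-version` attribute of the root `<article>` element, that versions 2.0 through 2.3 can be handled together, and that 3.0 can be handled together with NISO 0.4 / 1.0. The function name `classify_format` and the labels it returns are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: groupings follow the compatibility notes in the Format bullet.
NLM_2_FAMILY = {"2.0", "2.1", "2.2", "2.3"}
NLM_3_FAMILY = {"3.0", "0.4", "1.0"}


def classify_format(xml_text):
    """Return a hypothetical label for the format family of one article."""
    root = ET.fromstring(xml_text)
    # Assumption: the version is declared on the root <article> element.
    version = root.get("dtd-version")
    if version in NLM_2_FAMILY:
        return "nlm-2.3"
    if version in NLM_3_FAMILY:
        return "nlm-3.0"
    # Missing or unrecognized version: leave the decision to the caller.
    return "unknown"


if __name__ == "__main__":
    sample = '<article dtd-version="2.1"><front/></article>'
    print(classify_format(sample))
```

Real articles from the FTP service may reference an external DTD and use named entities, which the standard-library parser does not resolve, so a production script would likely need a DTD-aware parser.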
