This repository has been archived by the owner on Jul 5, 2019. It is now read-only.
Vot Z edited this page May 12, 2018 · 7 revisions

Welcome to the Citation Gecko wiki!

Starting the app

  1. Clone the repo

  2. In the repo root directory, run npm install

  3. Run node server.js

  4. The application is running at http://localhost:3000

Config / credentials

You must provide an API key for the Microsoft Academic service. To do so, set the API_KEY_MICROSOFT environment variable when running the application.

Example (Linux/Mac):

API_KEY_MICROSOFT=qwerty1234567890 node server.js

Example (Windows):

set API_KEY_MICROSOFT=qwerty1234567890
node server.js

You will also need to have your AWS keys specified in ~/.aws/credentials in order to connect to DynamoDB.
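
As a minimal sketch of how the API key setting described above might be read at startup: the helper name readMicrosoftApiKey is hypothetical and is not taken from server.js. The only name assumed from this page is the API_KEY_MICROSOFT environment variable.

```javascript
// Hypothetical helper, not the project's actual code.
// Returns the Microsoft Academic API key from the environment,
// or throws early so a missing key is noticed at startup.
function readMicrosoftApiKey(env = process.env) {
  const key = env.API_KEY_MICROSOFT;
  if (typeof key !== "string" || key.trim() === "") {
    throw new Error("API_KEY_MICROSOFT is not set");
  }
  return key.trim();
}
```

Failing fast like this is a design choice for the sketch; the real application may handle a missing key differently.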

API

Citation Gecko exposes an API that serves the data used to drive the visualisations.

Currently exposed endpoints:

  • GET /api/v1/getCitedBy?doi={articleDOI}
  • GET /api/v1/query/oadoi?doi={articleDOI}
  • POST /api/v1/query/microsoft/search (proxies the request to the MAG service)
  • POST /api/v1/query/occ/sparql (proxies the request to the OCC SPARQL service)
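
DOIs contain a slash, so they should be URL-encoded when passed in the doi query parameter. The sketch below builds a request URL for the getCitedBy endpoint listed above. The function name citedByUrl and the example DOI are hypothetical; the base URL assumes the local setup from "Starting the app".

```javascript
// Hypothetical helper: builds the URL for GET /api/v1/getCitedBy.
// URLSearchParams percent-encodes the DOI (including its "/").
function citedByUrl(doi, baseUrl = "http://localhost:3000") {
  const url = new URL("/api/v1/getCitedBy", baseUrl);
  url.searchParams.set("doi", doi);
  return url.toString();
}

// Example with a made-up DOI:
const exampleUrl = citedByUrl("10.1000/xyz123");
```

The resulting string can be passed to any HTTP client. The shape of the response is not documented on this page, so it is not shown here.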

Data

The data used here is based on a dump from CrossRef.

That data is loaded into a DynamoDB table with two attributes, citeTo and citeFrom, which makes it queryable.

Each row records one citation: citeFrom is the citing document and citeTo is the document being cited.

This data is accessible via the /api/v1/getCitedBy endpoint.
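
To illustrate the citeTo / citeFrom relationship, here is a small in-memory sketch. It does not talk to DynamoDB; the sample rows, DOIs, and the function name citingDocuments are made up for illustration only.

```javascript
// Hypothetical sample rows: each row is one citation,
// where citeFrom is the citing document and citeTo is the cited one.
const sampleRows = [
  { citeFrom: "10.1000/a", citeTo: "10.1000/c" },
  { citeFrom: "10.1000/b", citeTo: "10.1000/c" },
  { citeFrom: "10.1000/c", citeTo: "10.1000/d" },
];

// Returns the identifiers of all documents that cite the given one.
function citingDocuments(rows, doi) {
  return rows.filter((row) => row.citeTo === doi).map((row) => row.citeFrom);
}

const citingC = citingDocuments(sampleRows, "10.1000/c");
```

Here citingC holds the two documents whose rows have citeTo equal to "10.1000/c", which is conceptually what a "cited by" lookup returns.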

Ingest process

The ingest process depends on a TSV file containing a CrossRef data dump (provided as part of the eLife Innovation Sprint 2018). Going forward, it will use a file of incremental changes and run as a cron job.

The ingest process is located in its own repository: https://github.com/CitationGecko/crossref-importer

The process splits the data to be imported into batches of 25 rows so that each request sends as many rows as AWS allows (25 items is the limit for a single DynamoDB BatchWriteItem request).
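
The batching step can be sketched as follows. This is not the code from the crossref-importer repository; the function name toBatches and the constant BATCH_SIZE are hypothetical, and only the batch size of 25 comes from the description above.

```javascript
// Batch size taken from the text above.
const BATCH_SIZE = 25;

// Hypothetical helper: splits an array of rows into consecutive
// batches of at most `size` rows, preserving order.
function toBatches(rows, size = BATCH_SIZE) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const batches = [];
  for (let start = 0; start < rows.length; start += size) {
    batches.push(rows.slice(start, start + size));
  }
  return batches;
}
```

For example, 60 rows would be split into two full batches of 25 and a final batch of 10, each of which could then be sent as one write request.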
