Home
Welcome to the Citation Gecko wiki!
- Clone the repo
- Inside the repo root directory, run: npm install
- Run: node server.js
- The application is now running at http://localhost:3000
You have to provide an API key for the Microsoft Academic service.
You can do that by setting the environment variable API_KEY_MICROSOFT when running the application.
Example (Linux/Mac):
API_KEY_MICROSOFT=qwerty1234567890 node server.js
Example (Windows):
set API_KEY_MICROSOFT=qwerty1234567890
node server.js
You will also need to have your AWS keys specified in ~/.aws/credentials
in order to connect to DynamoDB.
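A missing key is easiest to catch at startup. The following is a minimal sketch of such a check, not code taken from the Citation Gecko server; the helper name requireEnv is hypothetical, and only the variable name API_KEY_MICROSOFT comes from the instructions above.

```javascript
// Hypothetical helper (not part of Citation Gecko): read a required
// setting from an environment-like object and fail fast if it is absent.
function requireEnv(name, env = process.env) {
  const value = env[name];
  if (value === undefined || value === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Usage at startup (throws if the variable was not set):
// const apiKey = requireEnv("API_KEY_MICROSOFT");
```

Passing the environment as a parameter keeps the check easy to exercise without touching the real process environment.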
Citation Gecko exposes an API that serves the data used to drive the visualisations.
Currently visible endpoints:
- GET /api/v1/getCitedBy?doi={articleDOI}
- GET /api/v1/query/oadoi?doi={articleDOI}
- POST /api/v1/query/microsoft/search (proxies the request to the MAG service)
- POST /api/v1/query/occ/sparql (proxies the request to the OCC SPARQL service)
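As an illustration of calling the first endpoint, here is a small sketch that builds the request URL. The helper name buildCitedByUrl and the DOI "10.1000/example" are placeholders; the default base URL assumes the local setup described above, and the shape of the response is not documented here, so the sketch stops at the request.

```javascript
// Hypothetical helper: build the request URL for the getCitedBy endpoint.
// baseUrl defaults to the local development address.
function buildCitedByUrl(doi, baseUrl = "http://localhost:3000") {
  const url = new URL("/api/v1/getCitedBy", baseUrl);
  // searchParams percent-encodes characters such as "/" in the DOI.
  url.searchParams.set("doi", doi);
  return url.toString();
}

// Usage (needs a running server; global fetch is available in Node 18+):
// const response = await fetch(buildCitedByUrl("10.1000/example"));
// console.log(response.status);
```

Going through URL and searchParams rather than string concatenation avoids malformed requests, since DOIs always contain a slash.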
The data used here is based on a dump from CrossRef.
That data is then inserted into a DynamoDB table with two columns: citeTo and citeFrom. Importing it into DynamoDB makes the data queryable.
Each row records one citation: citeFrom is the document that cites the citeTo document.
This data is accessible via the /api/v1/getCitedBy endpoint.
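To make the direction of the two columns concrete, here is an in-memory sketch of a "cited by" lookup. It only illustrates the data model: the sample rows and identifiers are invented, and the real service queries DynamoDB rather than filtering an array.

```javascript
// Invented sample rows; the identifiers are placeholders, not real DOIs.
const sampleRows = [
  { citeFrom: "doi:A", citeTo: "doi:C" },
  { citeFrom: "doi:B", citeTo: "doi:C" },
  { citeFrom: "doi:C", citeTo: "doi:D" },
];

// Return the citeFrom value of every row whose citeTo matches `doi`,
// i.e. every document that cites `doi`.
function citedBy(rows, doi) {
  return rows.filter((row) => row.citeTo === doi).map((row) => row.citeFrom);
}

// doi:C is cited by doi:A and doi:B.
console.log(citedBy(sampleRows, "doi:C"));
```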
The ingest process depends on a TSV file containing a CrossRef data dump (provided as part of the eLife Innovation Sprint 2018). Going forward, it will use a file of incremental changes and be run by a cron job.
The ingest process is located in its own repository: https://github.com/CitationGecko/crossref-importer
The process splits the data to be imported into batches of 25 rows, so that each request sends as many rows as AWS will allow at once.
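The batching step can be sketched as follows. This is an illustrative chunking function, not the code from the crossref-importer repository; the names BATCH_SIZE and chunk are assumptions, and only the batch size of 25 comes from the description above.

```javascript
// Batch size taken from the description above.
const BATCH_SIZE = 25;

// Hypothetical helper: split `items` into consecutive batches of at
// most `size` elements, preserving order. The last batch may be smaller.
function chunk(items, size = BATCH_SIZE) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Usage: 60 rows produce batches of 25, 25 and 10 rows.
// const batches = chunk(rows);
```

Each resulting batch would then be sent to DynamoDB as one write request.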