Developer Notes for Data Publishers

Nick Evans edited this page Aug 2, 2016 · 25 revisions

Are you new here?

The TLDR below is the incredibly short version of the detailed guidance available in the steps at the bottom of this page. Please use the detailed guidance if you've not done this before.

Technical TLDR of "Creating a Dataset Site" Guide

  1. A user in the data publishing organisation forks this repo.
  2. The user gives the "openactive-bot" user write access to their fork (so that the fork can receive updates).
  3. The user renames the forked repo to "orgname.github.io", where orgname is the name of their organisation, to activate GitHub Pages.
  4. The user deletes the DELETEME file to force GitHub Pages to update its hosted files.
  5. The user opens {github-pages-url}/generator.html in a browser.
  6. The generator initialises using the state from metadata.json, then generates index.html and metadata.json from the contents of the fields on the generator form and template.html.
  7. The user updates the fields in the generator, then copies the generated index.html and metadata.json and uses the GitHub interface to update these files in the fork.
  8. The user then uses the GitHub interface to replace bg.jpg (background image) and logo.png (logo) in images/*.
  9. The user creates a CNAME file to make the Dataset Site available at a subdomain of the website where the data originated.
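
As a minimal sketch of step 9, assuming you are working in a local clone of your fork: GitHub Pages reads the custom domain from a file named CNAME in the repo root, containing only the domain name. The subdomain data.example.org below is a placeholder, not a real Dataset Site; substitute a subdomain you control. A DNS CNAME record pointing that subdomain at orgname.github.io is also needed and is not shown here.

```shell
# Hypothetical subdomain: replace with the subdomain of your own website.
DOMAIN="data.example.org"

# The CNAME file holds the bare domain on a single line (no http:// prefix, no path).
printf '%s\n' "$DOMAIN" > CNAME

# Show the result before committing it to the fork.
cat CNAME
```

The file can equally be created through the GitHub interface; the content is the same single line.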

Customisation

Local config files: index.html, metadata.json, images/*, CNAME

Changes to the above local config files should be made inside the main forked repo for your organisation as described above, using generator.html as an aid (generator.html can be accessed at your current GitHub Pages URL, as described above).

If you spot anything that needs to be fixed or made more configurable, please fork this repo (dataset-site-generator) separately and create a pull request for your change.

Any customisation made outside of the local config files will be overwritten by openactive-bot if the bot has access to the fork. Note that openactive-bot will also regenerate index.html from metadata.json, which is the source of truth for the fork's configuration.

Openactive Bot

"openactive-bot" will automatically keep all forks of the dataset-site-generator up-to-date, if it has access to them.

The way the machine-readable metadata and SEO tags (e.g. DCAT and RDF) are presented in the pages is likely to change, to ensure they are as accessible as possible to various open data catalogues. Updates will also be made to ensure cross-browser compatibility.

For these reasons, giving openactive-bot access to your fork is recommended.

Updating your fork manually (openactive-bot disabled)

If you do not give openactive-bot write access to your fork, you will need to keep it up-to-date manually.

Files other than the local config files are frequently updated upstream in dataset-site-generator. To update your organisation's fork to the latest version, complete the following steps:

  1. Clone your fork
     git clone git@github.com:britishcycling-oa/britishcycling-oa.github.io.git
     cd britishcycling-oa.github.io
  2. Add this upstream remote, dataset-site-generator
     git remote add upstream git@github.com:openactive/dataset-site-generator.git
  3. Fetch from upstream
     git fetch upstream
  4. Pull from upstream into your local master, and deal with any merge conflicts
     git pull upstream master
  5. Push the changes back to the fork
     git push
  6. Go back to generator.html on your GitHub Pages site and regenerate your index.html and metadata.json (in case additional fields have been added) as per the Getting Started Guide, replacing these using the GitHub interface or locally.
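
To judge whether a manual update is due, it can help to count how many upstream commits your local branch is missing once the fetch step has run. The helper below is a sketch only: the function name commits_behind is hypothetical, and it assumes the remote is named upstream and the branch is master, as in the steps above.

```shell
# Print the number of commits reachable from the given upstream ref
# but not from the currently checked-out branch.
# Run `git fetch upstream` first so that the ref is up to date.
commits_behind() {
  git rev-list --count "HEAD..${1:-upstream/master}"
}

# Usage, from inside your clone:
#   git fetch upstream
#   commits_behind                  # compares against upstream/master
#   commits_behind upstream/master  # same, with the ref spelled out
```

A result of 0 means the local branch already contains everything in upstream; any other number is the count of commits that the pull step would bring in.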