This repository contains a software project that analyses astronaut data and supports the material of the "Let's Make Your Script Ready for Publication" workshop. The original software project is stored on GitLab.
- It demonstrates every important step performed in the workshop.
- Every branch shows the final results of a specific step.
- The default branch (this branch) adds further documentation and automates checking some details.
This analysis is based on publicly available astronaut data from Wikidata. It investigates aspects such as the time humans have spent in space and the age distribution of the astronauts.
The repository is organized as follows:
- data: Contains the astronauts data set retrieved from Wikidata
- code: Contains the astronaut analysis script
- results: Contains the resulting analysis plots
The data set has been generated using the following SPARQL query [1] (retrieval date: 2018-10-25).
You can also analyze a recent version of the astronaut data by replacing the data set and re-running the analysis script:
- Run the SPARQL query
- Download the resulting data formatted as JSON
- Replace the file data/astronauts.json
- Run the analysis script
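Before re-running the analysis on a freshly downloaded file, it can help to confirm that the JSON has a usable layout. The following is a minimal sketch, not part of the repository: the helper name `load_records` is hypothetical, and it assumes the download is either a flat list of records or the standard SPARQL 1.1 JSON results layout (rows nested under `results.bindings`). Which of the two the analysis script expects is not stated here, so check against the existing data/astronauts.json.

```python
import json
from pathlib import Path


def load_records(path):
    """Load a JSON data set and return it as a list of dict records.

    Hypothetical helper for illustration; the analysis script may read
    the file differently.
    """
    with Path(path).open(encoding="utf-8") as handle:
        data = json.load(handle)

    # Standard SPARQL 1.1 JSON results: rows live under results.bindings
    # and every cell is wrapped as {"type": ..., "value": ...}.
    if isinstance(data, dict) and "results" in data:
        return [
            {name: cell["value"] for name, cell in row.items()}
            for row in data["results"]["bindings"]
        ]

    # Assumed alternative: a flat list with one dict per result row.
    if isinstance(data, list):
        return data

    raise ValueError("unrecognized JSON layout")
```

Calling `load_records("data/astronauts.json")` and checking that the result is a non-empty list is a quick sanity check before replacing the file.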
The script requires Python >= 3.8 and uses the libraries pandas (BSD 3-Clause License) as well as matplotlib (Matplotlib License).
The script has been successfully tested on Windows 10 and Linux with Python 3.8.
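The requirements above can be checked programmatically before running the analysis. This is a small sketch under stated assumptions: the function name `check_environment` is hypothetical, and it only tests whether the libraries can be found, not which versions are installed.

```python
import importlib.util
import sys

REQUIRED_PYTHON = (3, 8)
REQUIRED_LIBRARIES = ("pandas", "matplotlib")


def check_environment(version=sys.version_info, libraries=REQUIRED_LIBRARIES):
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    major_minor = tuple(version[:2])
    if major_minor < REQUIRED_PYTHON:
        problems.append(
            "Python 3.8 or newer is required, found "
            f"{major_minor[0]}.{major_minor[1]}"
        )
    for name in libraries:
        # find_spec returns None when a top-level module cannot be located.
        if importlib.util.find_spec(name) is None:
            problems.append(f"missing library: {name}")
    return problems
```

Printing the returned list gives a short summary of what still needs to be installed.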
Please clone this repository and install the required dependencies as follows:
git clone ...
cd astronaut-analysis/code
pip install -r requirements.txt
You can run the script as follows:
python astronauts-analysis.py
The script processes the astronauts data set and stores the plots in the same directory. Existing result plots will be overwritten.
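Because existing plots are overwritten, you may want to keep a copy of earlier results before re-running the script. The sketch below is one possible way to do that and is not part of the repository: `backup_plots` is a hypothetical helper, and the `*.png` pattern is an assumption about the plot file format.

```python
import shutil
from pathlib import Path


def backup_plots(source_dir, backup_dir, pattern="*.png"):
    """Copy files matching pattern from source_dir into backup_dir.

    Returns the names of the copied files in sorted order.
    """
    source = Path(source_dir)
    target = Path(backup_dir)
    target.mkdir(parents=True, exist_ok=True)

    copied = []
    for plot in sorted(source.glob(pattern)):
        # copy2 also preserves timestamps, which helps tell runs apart.
        shutil.copy2(plot, target / plot.name)
        copied.append(plot.name)
    return copied
```

For example, `backup_plots(".", "previous-results")` copies the current plots into a separate folder before the next run replaces them.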
The test.sh script performs some basic checks to support maintaining the analysis script:
- It installs the required packages.
- It runs the flake8 linter to find programming mistakes and code style issues.
- It runs the analysis script and checks that the expected plots are produced.
The script runs as part of the GitLab build pipeline to find errors introduced by new commits.
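The last check, verifying that the expected plots are produced, can be sketched in Python as follows. This is an illustration only and does not reproduce the contents of test.sh: `missing_plots` is a hypothetical helper, and the expected file names must be supplied by the caller.

```python
from pathlib import Path


def missing_plots(directory, expected_names):
    """Return the expected file names that do not exist in directory."""
    folder = Path(directory)
    return [name for name in expected_names if not (folder / name).is_file()]
```

A build step could run the analysis script, call `missing_plots` with the list of plot names it expects, and fail if the returned list is not empty.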
If you use this work in a research publication, please cite the specific version that you used, based on the citation metadata on Zenodo.
You can find an overview of the different versions in the changelog.
The main contributors to the material are:
- Martin Stoffers
- Tobias Schlauch
The changelog documents all notable changes.
Please see the file LICENSE.md for further information about how the content is licensed.