Limassol, 17.6.2019
This is the introductory session of the Reproducible Research (RR@AGILE) workshop at the AGILE 2019 conference. This year, the workshop is centred on the results of the AGILE initiative on Reproducible Research, which aimed to discuss and draft the Reproducible Publication Guidelines for AGILE authors.
Reproducibility can be tackled from the perspective of readers/reviewers or authors. The idea in this workshop is to take both perspectives and to get you started with the tools and platforms that increase the level of reproducibility of computational analyses.
The task is to read through the provided publication and reproduce the analysis using the provided R script and input data in this folder. Depending on your previous knowledge of R, this may require getting familiar with the specific computational environment chosen by the authors of the publication.
Provided material: A publication (paper) with an R analysis and input data as supplementary material (download)
Software required: RStudio
Outputs: A successful run of the R script that produces the plot contained in the paper.
Questions to discuss:
- Which benefits do you observe from providing code and data together with the publication?
- Do the figures produced by the R script match the figures in the text document?
- What would make your job as a reader/reviewer easier?
Now, you are asked to work through this GeoPandas tutorial and to assess the improvements regarding reproducibility in comparison to the first example.
Look at the repo, examine the files it contains, and start an interactive session on Binder for the first file of the tutorial, 01-introduction-geospatial-data.ipynb. If you feel comfortable with Python, feel free to run other files of the tutorial.
Provided material: Geopandas tutorial
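A Binder session is usually started from a "launch" link that points at the repository and, optionally, at one file in it. The sketch below shows how such a link can be assembled for mybinder.org. The owner and repository names are placeholders, not the actual tutorial repository. The `master` branch name and the `filepath` query parameter are assumptions based on how mybinder.org launch links have commonly been written, so check the tutorial's own Binder badge for the exact link.

```python
from urllib.parse import quote, urlencode

BINDER_HOST = "https://mybinder.org"


def binder_url(owner, repo, ref="master", filepath=None, urlpath=None):
    """Build a mybinder.org launch URL for a GitHub repository.

    owner, repo -- GitHub account and repository name
    ref         -- branch, tag, or commit to build (assumed default: "master")
    filepath    -- notebook to open when the session starts (optional)
    urlpath     -- alternative interface to open, e.g. "rstudio" (optional)
    """
    base = f"{BINDER_HOST}/v2/gh/{quote(owner)}/{quote(repo)}/{quote(ref, safe='')}"
    params = {}
    if filepath:
        params["filepath"] = filepath
    if urlpath:
        params["urlpath"] = urlpath
    # Percent-encode query values; "/" in paths is left readable.
    return f"{base}?{urlencode(params, quote_via=quote)}" if params else base


# Placeholder owner/repo names, used for illustration only.
print(binder_url("example-user", "example-tutorial",
                 filepath="01-introduction-geospatial-data.ipynb"))
```

Pinning `ref` to a tag or commit hash instead of a branch name makes the link point at a fixed state of the repository, which is generally the more reproducible choice.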
The author perspective focuses on the reproducibility of research work using R, GitHub and Binder. The guiding principle is to integrate the computational analysis with the text using R Markdown, and to provide future readers with the computational environment used during development by means of a Binder repository.
- R/RStudio provides the language for your analysis code and your development environment.
- GitHub hosts your Git repository; Git is the version control system that traces the changes to your analysis.
- Binder generates a virtual execution environment so that others can recreate your analysis under identical execution conditions.
In practice, you commit your R code and push it to a remote repository on GitHub, and Binder takes that repository as input to build a virtual container that runs your R code in the cloud.
Software required: RStudio, GitHub (or GitLab) account, myBinder
Provided material: R script (.R file) + incomplete R Markdown document (.Rmd) in this folder.
Background material: Chapter 4 git with RStudio and GitLab, and Chapter 5 R Markdown in reproducible research, from the EGU 2018 course session on Writing reproducible geoscience papers using R Markdown, Docker, and GitLab by Daniel Nüst, Vicky Steeves, Rémi Rampin, Markus Konkol, Edzer Pebesma.
Outputs:
- Complete the .Rmd file by adding code chunks (map creation) and inline R expressions
- Add a software and data availability statement to the R Markdown according to the reproducible paper guidelines
Outputs:
- Add the Binder configuration needed to run the R Markdown file remotely
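As a rough illustration of what a Binder configuration for an R project can look like: Binder's build tool (repo2docker) has conventionally read a `runtime.txt` file that names an R snapshot date in the form `r-YYYY-MM-DD`, and an `install.R` script that installs the required packages. The Python sketch below only generates these two small files. The snapshot date and the package names are hypothetical examples, not the ones the workshop material requires, so replace them with what your own .Rmd file uses.

```python
from pathlib import Path
import tempfile


def write_binder_config(repo_dir, snapshot_date, packages):
    """Write a minimal Binder configuration for an R project.

    repo_dir      -- root folder of the Git repository
    snapshot_date -- package snapshot date as "YYYY-MM-DD" (hypothetical value below)
    packages      -- R package names needed by the analysis (hypothetical values below)
    """
    repo = Path(repo_dir)
    repo.mkdir(parents=True, exist_ok=True)

    # runtime.txt tells Binder to set up R with packages as of the given date.
    runtime = repo / "runtime.txt"
    runtime.write_text(f"r-{snapshot_date}\n")

    # install.R is an R script that Binder runs while building the environment.
    install = repo / "install.R"
    lines = [f'install.packages("{pkg}")' for pkg in packages]
    install.write_text("\n".join(lines) + "\n")

    return runtime, install


# Example run in a temporary folder; date and packages are illustrative only.
with tempfile.TemporaryDirectory() as tmp:
    runtime, install = write_binder_config(tmp, "2019-06-17", ["rmarkdown", "sf"])
    print(runtime.read_text(), end="")
    print(install.read_text(), end="")
```

In practice both files can just as well be written by hand. Once they are committed and pushed, a launch link with the `urlpath=rstudio` query parameter is the usual way to open the repository in an RStudio session on Binder.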