A template file and folder structure for a data analysis project/paper done with R/Rmarkdown/Github.
This is a template for a data analysis project using R, Rmarkdown (and variants, e.g. bookdown), Github and a reference manager that can handle bibtex (I recommend Jabref or Zotero). It is also assumed that you have a word processor installed (e.g. MS Word or LibreOffice). You need that software stack to make use of this template.
- All data goes into the subfolders inside the
data
folder. - All code goes into the
code
folder or subfolders. - All results (figures, tables, computed values) go into
results
folder or subfolders. - All products (manuscripts, supplement, presentation slides, web apps, etc.) go into
products
subfolders. - See the various
readme.md
files in those folders for some more information.
The template comes with a few files that are meant as illustrative examples of the kinds of content you would place in the different folders.
- There is a simple, made-up dataset in the
raw_data
folder. - The
processing_code
folder contains a single R script which loads the raw data, performs a bit of cleaning, and saves the result in theprocessed_data
folder. - The
analysis_code
folder contains an R script which loads the processed data, fits a simple model, and produces a figure and some numeric output, which is saved in theresults
folder. - The
products
folder contains an examplebibtex
and CSL style file for references. Those files are used by the example manuscript, poster and slides. - The
poster
andslides
folders contain very basic examples of posters and slides made with R Markdown. Note that especially for slides, there are many different formats. You might find a different format more suitable. Check the R Markdown documentation. - The
manuscript
folder contains a template for a report written in Rmarkdown (bookdown, to be precise). If you access this repository as part of my Modern Applied Data Science course, the sections are guides for your project. If you found your way to this repository outside the course, you might only be interested in seeing how the file pulls in results and references and generates a word document as output, without paying attention to the detailed structure.
This is a Github template repository. The best way to get it and start using it is by following these steps.
Once you got the repository, you can check out the examples by executing them in order. First run the cleaning script, which will produce the processed data. Then run the analysis script, which will take the processed data and produce some results. Then you can run the manuscript, poster and slides example files in any order. Those files pull in the generated results and display them. These files also pull in references from the bibtex
file and format them according to the CSL style.