- An introduction to `ggplot2` using volcano plots (Updated)
- Using `DiagrammeR` to draw flow charts (Updated)
- Data cleaning using `data.table` or `tidyverse` (or Python `Pandas`) (Updated)
- Cleaning strings using regular expressions with base R or `stringr` (Updated)
- Creating SQL <> R workflows - Part 1 (Updated)
- Creating SQL <> R workflows - Part 2 (Updated)
- Automating R Markdown report generation - Part 1 (Updated)
- Automating R Markdown report generation - Part 2 (Updated)
The resources below also cover a comprehensive range of practical R tutorials.
- Statistical Computing by Alex Reinhart and Christopher Genovese
- Data Science Toolkit by David Benkeser
- What They Forgot to Teach You About R by Jennifer Bryan and Jim Hester
This repository now contains the following file naming and code style rules.
- Folders are no longer ordered with a numerical prefix and names are no longer case sensitive, e.g. `r_tips\tutorials\...` and `r_tips\figures\...`
- Tutorial subtopics share the same prefix, e.g. `r_tips\tutorials\dv-...` and `r_tips\tutorials\st-...`
- File names contain `-` to separate file name prefixes and `_` instead of other white space, e.g. `r_tips\figures\dv-using_diagrammer-simple_flowchart.svg`
- Comments are styled according to the tidyverse style guide:
  - The first comment explains the purpose of the code chunk and is styled differently for enhanced readability, e.g. `# Code as header --------`
  - Comments are written in sentence case and only end with a full stop if they contain at least two sentences
  - Short comments explaining a function argument do not have to be written on a new line
  - Comments should not be followed by a blank line, unless the comment is a stand-alone paragraph containing in-depth rationale or an alternative solution
- R code chunks are styled as follows:
  - Each R chunk should be named with a short unique description written in the active voice, e.g. `create basic plot` and `modify plot labels`
  - Arguments inside code chunks should not contain white space and boolean argument options should be written in capitals, e.g. `{r load libraries, message=FALSE, warning=FALSE}`
  - To render the GitHub document, results are generally suppressed using `results='hide'` and manually entered in a new line beneath the code
  - To render the GitHub document, figures are generally output using `fig.show='markdown'` and figure outputs can then be suppressed at the local chunk level using `fig.show='hide'`
- Set a margin of 80 characters in RStudio through `Tools\Global options --> Code --> Display --> Show margin` and use this margin as the cut-off for code and comment length
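
As a rough illustration of the rules above, the sketch below checks two file names against the naming convention, with comments written in the style described. The regular expression and the object names `file_name_pattern` and `file_names` are assumptions made for this example, inferred from the sample file name in this README, and are not an official rule of the repository.

```r
# Check file names against the naming rules --------
# The pattern below is an assumption based on the examples in this README. It
# expects a lowercase subtopic prefix, then `-` separated parts that use `_`
# in place of white space.
file_name_pattern <- "^[a-z]+(-[a-z0-9_]+)+\\.[a-z]+$"

file_names <- c(
  "dv-using_diagrammer-simple_flowchart.svg",
  "DV using diagrammer.svg" # contains capitals and white space
)

# Returns TRUE for names that follow the convention
grepl(file_name_pattern, file_names)
```

The first name follows the convention and the second does not, so only the first should match.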
Citing packages is good practice when publishing research papers. To do this, use `citation("package")` to print the relevant package publication. A non-exhaustive list of R packages used in this repository is found below.
- R Core Team (2021). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/.
- Wickham et al., (2019). Welcome to the `tidyverse`. Journal of Open Source Software, 4(43), 1686, https://doi.org/10.21105/joss.01686
- H. Wickham. `ggplot2`: Elegant Graphics for Data Analysis. Springer-Verlag New York, 2016.
- Matt Dowle and Arun Srinivasan (2021). `data.table`: Extension of `data.frame`. R package version 1.14.2. https://CRAN.R-project.org/package=data.table
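
Package citations like those listed above can be looked up from the R console. The minimal sketch below uses `citation()` and `toBibtex()` from the `utils` package. It uses `stats` as the example only because that package ships with base R; the object name `stats_citation` is chosen for this example.

```r
# Print package citations --------
citation() # cites R itself when no package is named

# Any installed package can be named here
stats_citation <- citation("stats")
print(stats_citation)

# Convert the citation to a BibTeX entry for a reference manager
toBibtex(stats_citation)
```

Replace `"stats"` with the name of any installed package, such as `"data.table"`, to print the citation its authors supply.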
Many kudos to Dr Chuanxin Liu, my former PhD student and code editor, for teaching me how to code in R in my past life as an immunologist.