diff --git a/CODE_OF_CONDUCT.md b/CODE_OF_CONDUCT.md new file mode 100644 index 0000000..45aaae5 --- /dev/null +++ b/CODE_OF_CONDUCT.md @@ -0,0 +1,15 @@ +--- +title: "Contributor Code of Conduct" +--- + +As contributors and maintainers of this project, +we pledge to follow the [The Carpentries Code of Conduct][coc]. + +Instances of abusive, harassing, or otherwise unacceptable behavior +may be reported by following our [reporting guidelines][coc-reporting]. + + +[coc-reporting]: https://docs.carpentries.org/topic_folders/policies/incident-reporting.html +[coc]: https://docs.carpentries.org/topic_folders/policies/code-of-conduct.html + +We also adhere to the [Epiverse-TRACE Code of Conduct](https://github.com/epiverse-trace/.github/blob/main/CODE_OF_CONDUCT.md). diff --git a/LICENSE.md b/LICENSE.md new file mode 100644 index 0000000..49a8cbd --- /dev/null +++ b/LICENSE.md @@ -0,0 +1,74 @@ +--- +title: "Licenses" +--- + +## Instructional Material + +The tutorials in this repository are developed by Epiverse-TRACE, based on the [lesson template from the Carpentries](https://github.com/carpentries/workbench-template-rmd) (template under CC BY license). + +All Epiverse-TRACE +instructional material is made available under the [Creative Commons +Attribution license][cc-by-human]. The following is a human-readable summary of +(and not a substitute for) the [full legal text of the CC BY 4.0 +license][cc-by-legal]. + +You are free: + +- to **Share**---copy and redistribute the material in any medium or format +- to **Adapt**---remix, transform, and build upon the material + +for any purpose, even commercially. + +The licensor cannot revoke these freedoms as long as you follow the license +terms. 
+ +Under the following terms: + +- **Attribution**---You must give appropriate credit (mentioning that your work + is derived from work that is Copyright (c) Epiverse-TRACE, where + practical, linking to ), provide a [link to the + license][cc-by-human], and indicate if changes were made. You may do so in + any reasonable manner, but not in any way that suggests the licensor endorses + you or your use. + +- **No additional restrictions**---You may not apply legal terms or + technological measures that legally restrict others from doing anything the + license permits. With the understanding that: + +Notices: + +* You do not have to comply with the license for elements of the material in + the public domain or where your use is permitted by an applicable exception + or limitation. +* No warranties are given. The license may not give you all of the permissions + necessary for your intended use. For example, other rights such as publicity, + privacy, or moral rights may limit how you use the material. + +## Software + +Except where otherwise noted, the example programs and other software provided +by Epiverse-TRACE are made available under the [OSI][osi]-approved [MIT +license][mit-license]. + +Permission is hereby granted, free of charge, to any person obtaining a copy of +this software and associated documentation files (the "Software"), to deal in +the Software without restriction, including without limitation the rights to +use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies +of the Software, and to permit persons to whom the Software is furnished to do +so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + +[cc-by-human]: https://creativecommons.org/licenses/by/4.0/ +[cc-by-legal]: https://creativecommons.org/licenses/by/4.0/legalcode +[mit-license]: https://opensource.org/licenses/mit-license.html +[osi]: https://opensource.org diff --git a/appendix.md b/appendix.md new file mode 100644 index 0000000..2062061 --- /dev/null +++ b/appendix.md @@ -0,0 +1,185 @@ +--- +title: 'Appendix' +teaching: 10 +exercises: 2 +--- + +:::::::::::::::::::::::::::::::::::::: questions + +- Where can I add my functions? +- How do I need to document my functions? +- How can I read the documentation of my functions? +- How can I write a manuscript with my project outputs? + +:::::::::::::::::::::::::::::::::::::::::::::::: + +::::::::::::::::::::::::::::::::::::: objectives + +- Write your functions documentation following the `{rcompendium}` template. +- Load your functions and update its documentation using `{devtools}`. +- Create a website for the project using `{usethis}`. +- Create a manuscript template with `{rrtools}`. + +:::::::::::::::::::::::::::::::::::::::::::::::: + +## What about my functions? + +### How do I write my functions documentation? + +We must __write__ our custom functions as _Modular functions_ and save them in the `R/` folder. +You can __write__ the documentation of your functions following a [standard documentation method](https://r-pkgs.org/man.html). The `{rcompendium}` template already contains a `fun-demo.R` for this. + +![](fig/fun-demo.png) + +:::callout + +Remember that _documented functions_ can facilitate further efforts to reuse them and create a specific R package! + +::: + + +### How do I load my functions? 
+ +To __load__ your project functions, as written in `line 20` of the `make.R` file, run: + +```r +devtools::load_all(here::here()) +``` + +### How do I read my functions documentation? + +Is there an easier way to read the documentation of my _modular functions_? + +Remember that after you write the documentation of new functions in the `R/` folder, you must update your function and project documentation files, which are in _different_ files and folders. To do this, run: + +```r +devtools::document() +``` + +This last step will update the following: + +- the `man/` folder, which stores the project documentation, and +- the `NAMESPACE`, which registers the functions that your project exports for your data analysis to run. + +Lastly, you can type `?function` in the `Console` to read the documentation for your functions, as for any other function from the packages you installed. Try to run this: + +```r +?print_msg +``` + + +## Create a project website + +An alternative way to navigate all the files generated by the `{rcompendium}` template is with a website. + +We can create a website using [GitHub Pages](https://pages.github.com/). To make this possible, run: + +```r +usethis::use_pkgdown_github_pages() +``` + +This function implements the GitHub setup needed to automatically publish your site to GitHub Pages using the [`{pkgdown}` package](https://pkgdown.r-lib.org/). + +The function works in two steps: + +- First, it prepares to _publish_ the pkgdown site from a new `gh-pages` branch. +- Then, it configures a [GitHub Action](https://github.com/features/actions) to automatically _build_ the site and _deploy_ it via GitHub Pages. + +Lastly, the `pkgdown` site's URL is added to the `pkgdown` configuration file, to the URL field of `DESCRIPTION`, and to the GitHub repo. + +Commit and push your changes. + +:::callout + +Remember that when using _GitHub Actions_, you will see the _status icon_ of the actions next to the `SHA/hash`. 
+ +- Yellow ball for "Job running", +- Red cross for "Failed Run", and +- Green check for "job done!". + +::: + +Please wait for it to get green and inspect the Reference tab on the navigation bar. + +Now, let's compare the `fun-demo.R` file, the `?print_msg` output, and the website format: + +![](fig/fun-demo-web.png) + +:::callout + +A `pkgdown` website format can facilitate the navigation through: + +- Community files and +- Function documentation. + +::: + +:::instructor + +If required, you can add vignettes for an `{rcompendium}` using: + +```r +rcompendium::add_vignette(filename = "vignette-01") +``` + +Vignettes look more suitable for package documentation than a project. But knowing that we have that alternative with the `{rcompendium}` template is helpful. + +::: + +## How do I write a manuscript for my project? + +You can use handy functions from another research compendium package called `{rrtools}`. + +To get a template of files required to fill a manuscript run: + +```r +rrtools::use_analysis(location = "inst", data_in_git = FALSE) +``` + +This function will create a folder `inst/` with a new set of folders for data and figures. You can avoid using them and only use the `.qmd` as a template for your manuscript. + +![](fig/rrtools-paper.png) + +The `.qmd` files get formatted from several template files like references using `.bib` and citation style using `.csl`. + +![](fig/rrtools-templates.png) + +Using `rrtools::use_analysis()` with those arguments will not modify your `{rcompendium}` configuration. Other functions can change it. + +:::instructor + +This manuscript will not be visible on a website unless moved to the `vignette/` folder. We have yet to test that behaviour. + +::: + +## Reproducible research features + +We also relate Reproducibility with the practice of _describing_ and _documenting_ the research process so that another researcher can re-run the software on the same data input to get the same data outputs. 
+ +Features related to this are: + +- __Documentation strings__ in one or two lines using active verbs to describe how inputs turn into outputs ([Irving et al. 2021](https://merely-useful.tech/py-rse/documentation.html)). The documentation of functions, like the `fun-demo.R` template file, follows this good practice. + +- __Manuscripts__ using [literate programming](https://books.ropensci.org/targets/literate-programming.html) with tools like [Rmarkdown](https://rstudio.github.io/cheatsheets/html/rmarkdown.html) or [Quarto](https://quarto.org/). The template provided by `{rrtools}` facilitates files to start with this practice. + + +:::callout + +Remember that if you have all your changes as commits with git, you can _revert_ any modification with the button __`Revert`__, located between the `Stage` and `Ignore` buttons. + +![](fig/git-revert.png) + +::: + +::::::::::::::::::::::::::::::::::::: keypoints + +- Write your functions documentation following the `R/fun-demo.R` template. +- Run your project functions with `devtools::load_all()`. +- Update your functions documentation with `devtools::document()`. +- Read your functions documentation with the `?function` notation in the R console. +- Create a website for the project with `usethis::use_pkgdown_github_pages()`. +- Use a manuscript template with `rrtools::use_analysis(location = "inst", data_in_git = FALSE)`. +- _Documentation strings_ and _Manuscripts_ using _literate programming_ are features related to Reproducible research. + +:::::::::::::::::::::::::::::::::::::::::::::::: + diff --git a/beforestart.md b/beforestart.md new file mode 100644 index 0000000..58a3489 --- /dev/null +++ b/beforestart.md @@ -0,0 +1,60 @@ +--- +title: 'Before we start' +teaching: 10 +exercises: 0 +--- + +:::::::::::::::::::::::::::::::::::::: questions + +- Where can I meet other workshop participants? +- Where can I fill in my questions about the workshop topic? +- Where can I find the Code of Conduct? 
+- How can I report unacceptable behaviour? + +:::::::::::::::::::::::::::::::::::::::::::::::: + +::::::::::::::::::::::::::::::::::::: objectives + +- Share our communication forum. +- Share our Code of Conduct. + +:::::::::::::::::::::::::::::::::::::::::::::::: + +## Roll call + +:::checklist + +Hello! + +Before we start, tell us something about you on our _communication forum_ called [GitHub Discussions](https://github.com/epiverse-trace/research-compendium/discussions/1). + + + +::: + +## Welcome + +:::checklist + +A reminder of our [Code of conduct](https://github.com/epiverse-trace/.github/blob/main/CODE_OF_CONDUCT.md): + +- If you experience or witness unacceptable behaviour or have any other concerns, please report by completing this short form: + +- To report an issue involving one of the organisers, please use the LSHTM’s [Report and Support tool](https://reportandsupport.lshtm.ac.uk/), where your concern will be triaged by a member of LSHTM’s Equity and Diversity Team. + +::: + +## Contributors + +This material has contributions from: + +- [James Azam, PhD](https://www.lshtm.ac.uk/aboutus/people/azam.james), RSE at Epiverse. +- [Carmen Tamayo-Cuartero, PhD](https://www.lshtm.ac.uk/aboutus/people/tamayo-cuartero.carmen), RF at Epiverse. + +::::::::::::::::::::::::::::::::::::: keypoints + +- Use the `GitHub Discussions` as our communication forum for the workshop. +- Use the Code of Conduct to report unacceptable behaviour. + +:::::::::::::::::::::::::::::::::::::::::::::::: + diff --git a/compendium.md b/compendium.md new file mode 100644 index 0000000..3728416 --- /dev/null +++ b/compendium.md @@ -0,0 +1,195 @@ +--- +title: 'Research compendium' +teaching: 25 +exercises: 5 +--- + +:::::::::::::::::::::::::::::::::::::: questions + +- How do you create a research compendium for an R project? +- How do I facilitate users and collaborators to participate in my project? +- What features are related to sustainable software? 
+ +:::::::::::::::::::::::::::::::::::::::::::::::: + +::::::::::::::::::::::::::::::::::::: objectives + +- Adapt a research compendium template with files and folders organized logically with `{rcompendium}`. +- Add community files for users to seek support and contribute with `{usethis}`. +- Identify your project features related to sustainable software. + +:::::::::::::::::::::::::::::::::::::::::::::::: + +## What is a research compendium? + +A research compendium collects all digital parts of a research project, including data, code, and texts (protocols, reports, questionnaires, metadata). We create this collection in such a way that reproducing all results is straightforward ([The Turing Way Community, 2022](https://the-turing-way.netlify.app/reproducible-research/compendia#summary)). + +Using templates facilitates having all the required files from the beginning of your project. + +![Artwork by Allison Horst https://allisonhorst.com/](fig/folder_tidy_cooking.png) + +We understand that creativity can be “messy” sometimes. You will be able to handle it in the present, but your collaborators (and the future you) may have problems understanding it. Reproducibility is as much about the humans that interact with the code as the machines that need to run it ([Campitelli and Corrales, 2022](https://eliocamp.github.io/reproducibility-with-r/materials/day1/02-projects/)). + +![Artwork by Allison Horst https://allisonhorst.com/](fig/folder_mess_cooking.png) + +## Let's code + +### Create an RStudio Project + +Go to `Project`, which is in the top right corner of RStudio, and select `New Project...`. Follow these steps: + +- Select `New directory`, +- Select `New project`, and +- Check the `[x] Create a git repository` option. + +:::callout + +#### Stop! Find a name! + +Don't use `projectname` as your R project name! + +Create a new one, thinking about your current research project. + +::: + +Your `projectname` must follow some rules for everything to work. 
It must: + +- contain only ASCII letters, numbers, and dots "`.`" (it cannot have a hyphen "`-`") +- have at least two characters +- start with a letter (not a number) +- not end with a dot "`.`" + +![](fig/project-init.png){alt='New Project Wizard panel with Directory name and the Create a git repository box checked'} + + +### Create a research compendium + +To create a new research compendium run: + +```r +rcompendium::new_compendium() +``` + +This function will create new files and folders as a template. You can rearrange the folder elements by size to identify its components. + +![](fig/projectname-01.png) + +We will explore the content of each new element during the workshop. + +This function will also create the GitHub repository for your project. This step will open a new tab in your browser. + +![](fig/projectname-01-github.png) + +### Add community files + +We are going to add more files to the default template. For this, we are going to use a package with helper functions called [`{usethis}`](https://usethis.r-lib.org/). + +To add community files, run: + +```r +usethis::use_tidy_github() +``` + +This function is a convenience [wrapper function](https://stackoverflow.com/q/44783295/6702544) that adds four template files in a new folder called `.github/`: + +- `SUPPORT.md` with resources to seek support. +- `CONTRIBUTING.md` with contributing guidelines. +- `issue_template.md` with steps on how to report issues. +- `CODE_OF_CONDUCT.md` with guidelines to foster an environment of inclusiveness and to explicitly discourage inappropriate behaviour. + +These four files follow the tidyverse standards. You can edit them [writing with `Markdown`](https://rstudio.github.io/cheatsheets/html/rmarkdown.html#write-with-markdown) to fit your specific project content purposes. + +:::prereq + +Now `commit` and `push` your changes using `git`. 
+ +#### Git reminders + +- We use [`git commit`](https://www.atlassian.com/git/tutorials/saving-changes/git-commit) to capture a snapshot of the project's currently staged changes. We use `git add` to 'stage' changes that we will store in a commit. + +- We use [`git push`](https://www.atlassian.com/git/tutorials/syncing/git-push) to upload `local` repository content to a `remote` repository. + +![Source: https://www.gitkraken.com/learn/git/git-remote](fig/push-and-pull.png) + +- You can use [Git with RStudio](https://rviews.rstudio.com/2020/04/23/10-commands-to-get-started-with-git/) to perform these tasks. + +::: + +:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: instructor + +#### Git with RStudio reminder + +- From the “Review changes” pane: + + Go to the “History” tab in the top left. + + Show that each commit has an ID under the [SHA/hash](https://happygitwithr.com/repeated-amend.html?q=sha#um-what-if-i-did-push) column + +- Go to GitHub: + + Identify where this ID, called SHA/hash, is located. + +:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: + +### Where are community files visible? + +GitHub automatically recognizes these files and adds them as hyperlinks in specific places. + +1. Go to the __About__ section in the upper right corner of your repository, to read the `Code of conduct`: + +![](fig/code_of_conduct.png) + +2. Go to the __Issues__ tab on the navigation bar at the top of your repository on GitHub. You will find a link to the `issue templates` you added there. + +![](fig/issue_template.png) + +3. Press the `"Get started"` button on the right to write on top of the template. In the lower right corner, the __Contributing__ and __Support__ files are accessible under the _Helpful resources_ subtitle. 
+ +![](fig/helpful_resources.png) + +These community files are also known as [community health files](https://github.blog/changelog/2019-02-21-organization-wide-community-health-files/). + +:::discussion + +- Do you find the links to the Community files visible enough on GitHub? + +- Have you ever found them in a different place in the past? + +::: + +:::checklist + +![](fig/concept-map-01.png) + +::: + +## Sustainable software features + +Software is sustainable when it is easier to __maintain__ and __extend__ than to __replace__. This _ease_ depends on the: + +- _Quality_ of the software, +- _Skills_ of the potential maintainers, and +- _How much_ the user community is willing to invest to keep the software up to date. + +Features like a __Research compendium__ template and __Version control__ increase the quality of the software. + +- A _Research compendium_ follows Project organization good practices. This gives a logical and familiar structure to the project. +- _Version control_ follows the Keep track of changes good practice. This records the project's history and how one or multiple contributors wrote code and made decisions. + +Additionally, __Community files__ follow Collaboration good practices. They address gaps in the community of users, making it easier for them to participate and to interact with maintainers. + +:::testimonial + +_Is a data analysis also considered a piece of software?_ + +Nick Huber, from the blog Towards Data Science, concludes that [data analysis best practices/tools are starting to strongly resemble practices/tools from software engineering](https://towardsdatascience.com/data-analysis-is-a-form-of-software-engineering-876232bd3ebc). + +The repository of this lesson also came from a [template](https://carpentries.github.io/workbench/) that looks like a derivative of a research compendium, which in turn looks like a piece of software such as an R package. 
+ +::: + +::::::::::::::::::::::::::::::::::::: keypoints + +- Use `{rcompendium}` templates to reuse all the files and folders a research project needs. +- Use `{usethis}` to add complementary community files to a research project. +- _Version control_, _Research compendium_, and _Community files_ are features related to Sustainable software. + +:::::::::::::::::::::::::::::::::::::::::::::::: + diff --git a/config.yaml b/config.yaml new file mode 100644 index 0000000..5269470 --- /dev/null +++ b/config.yaml @@ -0,0 +1,85 @@ +#------------------------------------------------------------ +# Values for this lesson. +#------------------------------------------------------------ + +# Which carpentry is this (swc, dc, lc, or cp)? +# swc: Software Carpentry +# dc: Data Carpentry +# lc: Library Carpentry +# cp: Carpentries (to use for instructor training for instance) +# incubator: The Carpentries Incubator +carpentry: 'incubator' + +# Overall title for pages. +title: 'Improve your code for Epidemic Analysis with R' + +# Date the lesson was created (YYYY-MM-DD, this is empty by default) +created: '2023-08-29' + +# Comma-separated list of keywords for the lesson +keywords: 'software, data, lesson, The Carpentries' + +# Life cycle stage of the lesson +# possible values: pre-alpha, alpha, beta, stable +life_cycle: 'pre-alpha' + +# License of the lesson materials (recommended CC-BY 4.0) +license: 'CC-BY 4.0' + +# Link to the source repository for this lesson +source: 'https://github.com/avallecam/research-compendium' + +# Default branch of your lesson +branch: 'main' + +# Who to contact if there are any issues +contact: 'andree.valle-campos@lshtm.ac.uk' + +# Navigation ------------------------------------------------ +# +# Use the following menu items to specify the order of +# individual pages in each dropdown section. Leave blank to +# include all pages in the folder. 
+# +# Example ------------- +# +# episodes: +# - introduction.md +# - first-steps.md +# +# learners: +# - setup.md +# +# instructors: +# - instructor-notes.md +# +# profiles: +# - one-learner.md +# - another-learner.md + +# Order of episodes in your lesson +episodes: +- beforestart.Rmd +- introduction.Rmd +- compendium.Rmd +- reproducible.Rmd +- readmefile.Rmd +- wrapup.Rmd +- appendix.Rmd +- definitions.Rmd + +# Information for Learners +learners: + +# Information for Instructors +instructors: + +# Learner Profiles +profiles: + +# Customisation --------------------------------------------- +# +# This space below is where custom yaml items (e.g. pinning +# sandpaper and varnish versions) should live +varnish: epiverse-trace/varnish@epiversetheme + diff --git a/definitions.md b/definitions.md new file mode 100644 index 0000000..0d17190 --- /dev/null +++ b/definitions.md @@ -0,0 +1,98 @@ +--- +title: 'Definitions' +teaching: 10 +exercises: 0 +--- + +:::::::::::::::::::::::::::::::::::::: questions + +- How can I define Reliability, Usability and Sustainability? + +:::::::::::::::::::::::::::::::::::::::::::::::: + +::::::::::::::::::::::::::::::::::::: objectives + +- Define the concepts of Open science, Reproducible research, and Sustainable software. + +- Define related concepts like Reliability and Usability. + +- Define related features for each concept. + +:::::::::::::::::::::::::::::::::::::::::::::::: + +## Introduction + +Three introductory concepts informed our approach to this material. + +### Open science + +Definition: + +- Make data inputs, software, and data outputs freely available by publishing all of them with open licences ([Irving et al. 2021](https://merely-useful.tech/py-rse/index.html#intro-big-picture)), to facilitate project reuse. 
+ +- Also make their dissemination available to any member of an inquiring society, from professionals to citizens ([ORION Open Science, 2020](https://www.orion-openscience.eu/index.php/resources/open-science)), to improve its transparency and public ownership. + +Related feature: + +- _Open Licence_: An open licence that permits reuse using MIT or GPL for software ([Choose a License, 2023](https://choosealicense.com/)) and CC BY or CC0 for data, prose and other creative products ([Creative Commons, 2023](https://creativecommons.org/about/cclicenses/), [Irving et al. 2021](https://merely-useful.tech/py-rse/glossary.html#open_license)). + +- _DEI_: Diversity, Equity, and Inclusion. There are four CHAOSS’s metrics for projects: Project Access, Communication Transparency, Newcomer Experience, Inclusive Leadership. ([The GitHub Blog, 2023](https://github.blog/2023-06-07-announcing-the-all-in-chaoss-dei-badging-pilot-initiative/)) + +### Reproducible research + +Definition: + +- Ensure that anyone with access to data inputs and software can feasibly generate the data outputs, both to check or build on them. ([Irving et al. 2021](https://merely-useful.tech/py-rse/glossary.html#open_license)) + +- Practice of describing and documenting the research process in such a way that another researcher can re-run the software on the same data input to get the same data outputs. + +Related features: + +- _Documentation strings_: in one or two lines using active verbs to describe how inputs turn into outputs ([Irving et al. 2021](https://merely-useful.tech/py-rse/documentation.html)). + +- _Literate programming_ is the practice of mixing code and descriptive writing in order to execute and explain a data analysis simultaneously in the same document ([Eli Lilly and Company, 2022](https://books.ropensci.org/targets/literate-programming.html)). 
+ +- _Software descriptions_ structured in four types with complementing purposes: tutorials, how-to guides, technical references, and explanations. ([Documentation System, 2023](https://documentation.divio.com/)). + +Related concepts: + +- _Reliability_: Result consistency across many repetitions of the same experiment. ([Dymocks Tutoring, 2022](https://www.dymockstutoring.edu.au/scientific-skills-accuracy-validity-and-reliability/)) + +- _Usability_: Capacity to provide conditions to perform the tasks safely, effectively, and efficiently. ([Wikipedia, 2023](https://en.wikipedia.org/wiki/Usability)) + +### Sustainable software + +Definition: + +- The ease with which to maintain and extend rather than replace. ([Irving et al. 2021](https://merely-useful.tech/py-rse/index.html#intro-big-picture)) +It depends on the quality of the software, the skills of the potential maintainers, and if users can afford to keep up to date (how much the community is willing to invest). + +Related features: + +- _Modular code_: Build programs out of short, single-purpose functions with clearly-defined inputs and outputs ([Wilson et al, 2017](https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1005510#sec005)) + +- _Unit testing_: Small test of one particular feature of a piece of software. ([Wilson et al, 2017](https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1005510#sec027)) + +- _Version control_: Keeping track of changes that you or your collaborators make to data and software. ([Wilson et al, 2017](https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1005510#sec014)) + +- _Community around software_: Users and collaborators that can communicate effectively with maintainers given the software documentation and by public or private platforms like chat channels, video conferencing, and more. 
([Wilson et al, 2017](https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1005510#sec007)) + +## How to use these concepts? + +These concepts are often used interchangeably, but distinguishing them helps to differentiate the characteristics of a project ([Irving et al. 2021](https://merely-useful.tech/py-rse/index.html#intro-big-picture)): + +- We can have open science projects without documentation, thus not reproducible. + +- We can have an automated and documented project not open to the public, thus not open science. + +- We can have open and reproducible software that lacks incentives for maintainers, thus not sustainable. + + +::::::::::::::::::::::::::::::::::::: keypoints + +- The definitions of Open science, Reproducible research, and Sustainable software help us identify their specific software features. + +- Differentiating these concepts helps us to differentiate the characteristics of a project. + +:::::::::::::::::::::::::::::::::::::::::::::::: + diff --git a/fig/add_dependencies-github.png b/fig/add_dependencies-github.png new file mode 100644 index 0000000..9cf9a81 Binary files /dev/null and b/fig/add_dependencies-github.png differ diff --git a/fig/citation-cff.png b/fig/citation-cff.png new file mode 100644 index 0000000..18ea5df Binary files /dev/null and b/fig/citation-cff.png differ diff --git a/fig/code_of_conduct.png b/fig/code_of_conduct.png new file mode 100644 index 0000000..ff5087d Binary files /dev/null and b/fig/code_of_conduct.png differ diff --git a/fig/concept-map-00.png b/fig/concept-map-00.png new file mode 100644 index 0000000..659eaea Binary files /dev/null and b/fig/concept-map-00.png differ diff --git a/fig/concept-map-01.png b/fig/concept-map-01.png new file mode 100644 index 0000000..08b760b Binary files /dev/null and b/fig/concept-map-01.png differ diff --git a/fig/concept-map-02.png b/fig/concept-map-02.png new file mode 100644 index 0000000..f876801 Binary files /dev/null and b/fig/concept-map-02.png differ diff --git 
a/fig/concept-map-03.png b/fig/concept-map-03.png new file mode 100644 index 0000000..d8c7507 Binary files /dev/null and b/fig/concept-map-03.png differ diff --git a/fig/concept-map-04.png b/fig/concept-map-04.png new file mode 100644 index 0000000..c917cdd Binary files /dev/null and b/fig/concept-map-04.png differ diff --git a/fig/description-title.png b/fig/description-title.png new file mode 100644 index 0000000..b8dab8b Binary files /dev/null and b/fig/description-title.png differ diff --git a/fig/folder_mess_cooking.png b/fig/folder_mess_cooking.png new file mode 100644 index 0000000..61dd25e Binary files /dev/null and b/fig/folder_mess_cooking.png differ diff --git a/fig/folder_tidy_cooking.png b/fig/folder_tidy_cooking.png new file mode 100644 index 0000000..7c2ea4c Binary files /dev/null and b/fig/folder_tidy_cooking.png differ diff --git a/fig/fun-demo-web.png b/fig/fun-demo-web.png new file mode 100644 index 0000000..81b4ad7 Binary files /dev/null and b/fig/fun-demo-web.png differ diff --git a/fig/fun-demo.png b/fig/fun-demo.png new file mode 100644 index 0000000..1f489e8 Binary files /dev/null and b/fig/fun-demo.png differ diff --git a/fig/git-revert.png b/fig/git-revert.png new file mode 100644 index 0000000..d51b9ca Binary files /dev/null and b/fig/git-revert.png differ diff --git a/fig/git-token.png b/fig/git-token.png new file mode 100644 index 0000000..89da622 Binary files /dev/null and b/fig/git-token.png differ diff --git a/fig/goal-intro.png b/fig/goal-intro.png new file mode 100644 index 0000000..c1d4bdc Binary files /dev/null and b/fig/goal-intro.png differ diff --git a/fig/helpful_resources.png b/fig/helpful_resources.png new file mode 100644 index 0000000..c9a149a Binary files /dev/null and b/fig/helpful_resources.png differ diff --git a/fig/issue_template-content.png b/fig/issue_template-content.png new file mode 100644 index 0000000..c0cb32b Binary files /dev/null and b/fig/issue_template-content.png differ diff --git 
a/fig/issue_template.png b/fig/issue_template.png new file mode 100644 index 0000000..809a74d Binary files /dev/null and b/fig/issue_template.png differ diff --git a/fig/markdown-header.png b/fig/markdown-header.png new file mode 100644 index 0000000..35bdf1e Binary files /dev/null and b/fig/markdown-header.png differ diff --git a/fig/markdown-readme.png b/fig/markdown-readme.png new file mode 100644 index 0000000..c0c2e71 Binary files /dev/null and b/fig/markdown-readme.png differ diff --git a/fig/non-reproducible-workflow.png b/fig/non-reproducible-workflow.png new file mode 100644 index 0000000..38b9706 Binary files /dev/null and b/fig/non-reproducible-workflow.png differ diff --git a/fig/open-sustainable-reproducible.jpeg b/fig/open-sustainable-reproducible.jpeg new file mode 100644 index 0000000..2778a84 Binary files /dev/null and b/fig/open-sustainable-reproducible.jpeg differ diff --git a/fig/project-init.png b/fig/project-init.png new file mode 100644 index 0000000..64e333e Binary files /dev/null and b/fig/project-init.png differ diff --git a/fig/projectname-01-github.png b/fig/projectname-01-github.png new file mode 100644 index 0000000..3d2709b Binary files /dev/null and b/fig/projectname-01-github.png differ diff --git a/fig/projectname-01.png b/fig/projectname-01.png new file mode 100644 index 0000000..4c88291 Binary files /dev/null and b/fig/projectname-01.png differ diff --git a/fig/push-and-pull.png b/fig/push-and-pull.png new file mode 100644 index 0000000..ec949f1 Binary files /dev/null and b/fig/push-and-pull.png differ diff --git a/fig/readme-github.png b/fig/readme-github.png new file mode 100644 index 0000000..f0d63ca Binary files /dev/null and b/fig/readme-github.png differ diff --git a/fig/readme_so-preview.png b/fig/readme_so-preview.png new file mode 100644 index 0000000..9f6a48b Binary files /dev/null and b/fig/readme_so-preview.png differ diff --git a/fig/readme_so-sections.png b/fig/readme_so-sections.png new file mode 100644 index 
0000000..9400f41 Binary files /dev/null and b/fig/readme_so-sections.png differ diff --git a/fig/renv.png b/fig/renv.png new file mode 100644 index 0000000..c35692c Binary files /dev/null and b/fig/renv.png differ diff --git a/fig/reproducible_analysis.png b/fig/reproducible_analysis.png new file mode 100644 index 0000000..852a6d8 Binary files /dev/null and b/fig/reproducible_analysis.png differ diff --git a/fig/rrtools-paper.png b/fig/rrtools-paper.png new file mode 100644 index 0000000..0b4f676 Binary files /dev/null and b/fig/rrtools-paper.png differ diff --git a/fig/rrtools-templates.png b/fig/rrtools-templates.png new file mode 100644 index 0000000..ff9e89f Binary files /dev/null and b/fig/rrtools-templates.png differ diff --git a/index.md b/index.md new file mode 100644 index 0000000..101c019 --- /dev/null +++ b/index.md @@ -0,0 +1,10 @@ +--- +site: sandpaper::sandpaper_site +--- + +This is an [Epiverse-TRACE][epiversetrace] lesson built with [The Carpentries Workbench][workbench]. + + +[epiversetrace]: https://epiverse-trace.github.io/ +[workbench]: https://carpentries.github.io/sandpaper-docs + diff --git a/instructor-notes.md b/instructor-notes.md new file mode 100644 index 0000000..6494d35 --- /dev/null +++ b/instructor-notes.md @@ -0,0 +1,5 @@ +--- +title: Instructor Notes +--- + +This is a placeholder file. Please add content here. diff --git a/introduction.md b/introduction.md new file mode 100644 index 0000000..b5fc711 --- /dev/null +++ b/introduction.md @@ -0,0 +1,126 @@ +--- +title: 'Introduction' +teaching: 5 +exercises: 5 +--- + +:::::::::::::::::::::::::::::::::::::: questions + +- Why improve our code for analysis? +- What can we do to improve it? +- How can we start improving it? + +:::::::::::::::::::::::::::::::::::::::::::::::: + +::::::::::::::::::::::::::::::::::::: objectives + +- Explain our vision of an improved epidemic analysis code. +- Share our strategy to incorporate good practices in scientific computing.
+- Define our plan to incorporate practical and quick-to-learn solutions. + +:::::::::::::::::::::::::::::::::::::::::::::::: + + + +## Why improve our code for epidemic analysis? + +When we want to improve our analysis code's reliability and reusability, we want to make it reproducible. + +Reproducible research aims to ensure that _anyone_ with access to data inputs and software can _feasibly generate_ the data outputs, either to check them or to _build on them_. Reproducibility is improved when combined with Open science and Sustainable software features. + +Our vision for this workshop is to increase the awareness of good practices that will increase the reproducibility of data analysis workflows that already use R and Git. + +![Our vision: Increase the awareness of good practices that complement an R and Git workflow](fig/goal-intro.png) + +The figure above helps us to visualize and potentially evaluate the **processes** we are following. A process-centred approach helps us remove the focus on human error, be aware that processes can fail people with good intentions, and accept that we can enter a continuous improvement cycle. + +> "By defining the process, we can begin to borrow from the rich field of operations, +which focuses primarily on (the) process. One paradigm that proves especially useful is +the concept of human error. The seminal book +_The Field Guide to Understanding Human Error_ +argues for a paradigm shift from the “Old World View” (that +when an error occurs it is an individual actor’s fault) to the “New World View” +(that when an error occurs, it is a symptom of a flawed system that failed that +individual actor) (Dekker 2014). When an error in an analysis occurs, it is safe +to assume (aside from nefarious actors) that the analyst did not want that error +to occur.
Given that she thought she was producing an analysis free from errors, +you must look at the way she developed the analysis to understand where the +error occurred, and create safeguards so that the error does not occur again." ([Parker, 2017a](https://sachsmc.github.io/rpackage-workshop/opinionated-analysis-dev.pdf) and [Parker, 2017b](https://posit.co/resources/videos/opinionated-analysis-development/)) + +Repetitive events (like outbreak response and research data analysis projects) give us the opportunity to: + +- Focus on the **process** we have followed, +- Evaluate **where** bottlenecks occur, and then +- **Adopt** new practices to be better protected against errors in the next iteration. + +::::::::::::::::: callout + +### Deming Cycle + +This approach aims to follow a [Deming cycle](https://www.lean.org/lexicon-terms/pdca/) of Plan, Do, Check, and Act, as a foundation for continuous improvement. + +::::::::::::::::::::::::: + +::::::::::::::::: discussion + +Exercise: Your experience analyzing outbreak data (the latest... or the most chaotic!) + +Take 5 minutes. + +Reflect on these questions: + +- How do you **organize** your files and folders? +- Where do you **describe** what your project does or how to use it? Was it all in one accessible place? +- Could your project be **reused** by colleagues? Do you think it is? + +Share one idea from your neighbour. + +:::::::::::::::::::::::::::: + +:::::::: instructor + +The questions above are self-assessment questions to let the participants hear and report back on their or others' practices. This motivates a self-evaluation to include or replace previous practices, and potentially leave the session aware or motivated to proceed differently in the next iteration. + +:::::::::::::::: + +## What can we do? + +A fair strategy to follow is to **gradually** incorporate good practices in scientific computing ([Wilson et al. 
2017](https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1005510)) that include: + +- Data management, +- Software development, +- Collaboration, +- Project organization, +- Keeping track of changes, and +- Manuscript writing. + +:::testimonial + +### Do I need to use them all from today? + +No, we do not expect you to adopt all of this workshop's good practices and tools at once. + +If you already use a programming language like R and Git for version control, [you are already on the path!](https://www.nature.com/articles/s41559-017-0160/figures/1) + +We support the opinion of [Jaime Quinn](https://sorse.github.io/blog/a-reproducible-phd/): "It can be challenging to absorb so many different good practices while still getting research done. However, I would argue that _anything helps_. While all good practices in open science are important, even just incorporating one example is good for the community and provides a solid personal foundation for _gradually incorporating_ more good practices." + +::: + +## How can we start? + +Our plan for this workshop is to prioritize three tools, given their [usefulness once mastered and the time to master them](https://teachtogether.tech/en/index.html#s:motivation-authentic): + +- Use research compendium templates. +- Make a reproducible analysis. +- Write informative READMEs. + +We'll relate relevant features for Sustainable software, Reproducible research, and Open science for each tool. + +::::::::::::::::::::::::::::::::::::: keypoints + +- Our vision is to increase the awareness of tools to improve the reproducibility of data analysis. +- Our strategy is to incorporate good practices in scientific computing gradually. +- We plan to share specific tools to create a research compendium, make a reproducible analysis, and write READMEs.
+ +:::::::::::::::::::::::::::::::::::::::::::::::: + diff --git a/learner-profiles.md b/learner-profiles.md new file mode 100644 index 0000000..75b2c5c --- /dev/null +++ b/learner-profiles.md @@ -0,0 +1,5 @@ +--- +title: FIXME +--- + +This is a placeholder file. Please add content here. diff --git a/links.md b/links.md new file mode 100644 index 0000000..4c5cd2f --- /dev/null +++ b/links.md @@ -0,0 +1,10 @@ + + +[pandoc]: https://pandoc.org/MANUAL.html +[r-markdown]: https://rmarkdown.rstudio.com/ +[rstudio]: https://www.rstudio.com/ +[carpentries-workbench]: https://carpentries.github.io/sandpaper-docs/ + diff --git a/materials.md b/materials.md new file mode 100644 index 0000000..e09deb6 --- /dev/null +++ b/materials.md @@ -0,0 +1,13 @@ +--- +title: Instructor Materials +--- + +## Slides + +- Curso internacional en Análisis de Brotes y Modelamiento en Salud Pública, Bogotá 2023. Event: . Slides: . Language: Español. + +- IDDconf 2023. A Conference on Infectious Disease Dynamics. Event: Slides: . Language: English. 
+ +## Timetable + +- IDDconf 2023 diff --git a/md5sum.txt b/md5sum.txt new file mode 100644 index 0000000..253c68f --- /dev/null +++ b/md5sum.txt @@ -0,0 +1,20 @@ +"file" "checksum" "built" "date" +"CODE_OF_CONDUCT.md" "4002c196560d1b6395ee1375e8190314" "site/built/CODE_OF_CONDUCT.md" "2024-03-03" +"LICENSE.md" "14377518ee654005a18cf28549eb30e3" "site/built/LICENSE.md" "2024-03-03" +"config.yaml" "19731cdf9da2cdcc1d7eea744866170e" "site/built/config.yaml" "2024-03-03" +"index.md" "ca324507113d77941fd0f97f7aae49e6" "site/built/index.md" "2024-03-03" +"links.md" "8184cf4149eafbf03ce8da8ff0778c14" "site/built/links.md" "2024-03-03" +"episodes/beforestart.Rmd" "1c71e52033982952220732b11411c3f3" "site/built/beforestart.md" "2024-03-03" +"episodes/introduction.Rmd" "f273b5b217270c2c658db6931cf5522e" "site/built/introduction.md" "2024-03-03" +"episodes/compendium.Rmd" "346914df8e0a0695f4791a7b57718696" "site/built/compendium.md" "2024-03-03" +"episodes/reproducible.Rmd" "11be248121bf6ff3fd7f72c3123107a5" "site/built/reproducible.md" "2024-03-03" +"episodes/readmefile.Rmd" "3cc0212d4220e050446c4ebc49bf83f6" "site/built/readmefile.md" "2024-03-03" +"episodes/wrapup.Rmd" "276a00aa8663d254002abd5dabfc8956" "site/built/wrapup.md" "2024-03-03" +"episodes/appendix.Rmd" "8701920cce9c7a3ee879612cef7424c8" "site/built/appendix.md" "2024-03-03" +"episodes/definitions.Rmd" "81b8514022c64661acac40958b01b325" "site/built/definitions.md" "2024-03-03" +"instructors/instructor-notes.md" "5cf113fd22defb29d17b64597f3c9bc0" "site/built/instructor-notes.md" "2024-03-03" +"instructors/materials.md" "e4fa9e868b71a6fb12a91e64f46fa5a3" "site/built/materials.md" "2024-03-03" +"learners/reference.md" "527a12e217602daae51c5fd9ef8958df" "site/built/reference.md" "2024-03-03" +"learners/setup.md" "9d8df14921d2ed23a628a065e34342f6" "site/built/setup.md" "2024-03-03" +"profiles/learner-profiles.md" "5fe5bf7537072422b91ed393ada03f9a" "site/built/learner-profiles.md" "2024-03-03" 
+"renv/profiles/lesson-requirements/renv.lock" "78c2cf03b0fd4676cef717318b80607b" "site/built/renv.lock" "2024-03-03" diff --git a/readmefile.md b/readmefile.md new file mode 100644 index 0000000..a2d1d24 --- /dev/null +++ b/readmefile.md @@ -0,0 +1,294 @@ +--- +title: 'README files' +teaching: 40 +exercises: 5 +--- + +:::::::::::::::::::::::::::::::::::::: questions + +- Where can I give proper installation instructions? +- What licenses can I add for text, figures, and data? +- How do I generate a citation for my project? +- How can I increase the visibility of community guidelines? + +:::::::::::::::::::::::::::::::::::::::::::::::: + +::::::::::::::::::::::::::::::::::::: objectives + +- Recognise good practices for `README` files. +- Complement the `{rcompendium}` `README` template. +- Identify your project features related to Open science. + +:::::::::::::::::::::::::::::::::::::::::::::::: + +## README files + +`README` files can include a whole range of information from an _overview_ of the project, _installation_ instructions and _licensing_ details to information on how to contribute to the code and cite the software. With modern text markup and formatting through `Markdown`, `README` files can also be rendered in a much more _accessible_ and _appealing_ manner than traditional plain-text `README` files. ([Cohen and Crouch, 2023](https://www.imperial.ac.uk/computational-methods/rse/events/byte-sized-rse/)) + +### Good practices + +There is no standard for `README` files, but we can use some widely used approaches. Here we list some `README` good practices collected by [Cohen and Crouch, 2023](https://www.imperial.ac.uk/computational-methods/rse/events/byte-sized-rse/): + +- Consider a formatting, layout, or _structure_. +- Ensure _clear and concise_ descriptions. +- Avoid overloading the `README` with content that could be hosted elsewhere. +- Consider including a table of contents if you have many sections. 
+- _Know your audience_ - Is your `README` aimed at other developers or end-users of your software/code? + +### Structure + +Using an [online editor called readme.so](https://readme.so/), we selected some typical sections frequently found in R packages: + +![README file sections selected from https://readme.so/](fig/readme_so-sections.png) + +This selection generates this `README` file preview template: + +![](fig/readme_so-preview.png) + +We can find room for improvement if we compare this [readme.so](https://readme.so/) template with the `README` file from the `{rcompendium}` template. + +![](fig/readme-github.png) + +In this episode, we will complement this template with some key sections. + +:::callout + +We invite you to edit your `README` as you prefer! You can also use this simple [readme.so editor to generate more section templates](https://readme.so/) than the ones we will cover here. + +::: + +## Let's code + +First, let's `Knit` the `README.Rmd`. + +We must remember that our `README.md` is generated from the `README.Rmd` file. So we need to edit that file and `Knit` it after any update. This step is not done automatically for this template. + +:::instructor + +The `{usethis}` package provides control of flow functionality to commit the `README.md` only when synchronised with `README.Rmd`. We have yet to test that behaviour. + +::: + +### Installation + +The `Usage` section includes the installation steps of: + +- Clone a repository, and +- Use R/Rstudio. + +We can assess our [target audience](https://merely-useful.tech/py-rse/documentation.html#documentation-audience) and adapt this content to our projects. + +Let's assume that the following personas are examples of the types of people that are your target audience: + +- [Patricia](https://epiverse-trace.github.io/personas/patricia-discoverer.html) is a PhD student. She uses R to analyse infectious disease data and wants it to be reproducible. She is unfamiliar with GitHub and the terminal window. 
+- [Lucia](https://epiverse-trace.github.io/personas/lucia-outbreaks.html) is a Field epidemiologist. She uses R to clean data and create plots for outbreak response. She wants to communicate her doubts and ideas with package maintainers. She does not track the versions of her code with Git. + +If we want to add external guides to facilitate the `git clone` step, we can complement our installation steps with external resources. + +Copy, edit as you prefer, and paste it to your `README` file: + +``` +### Usage + +First, clone this repository. You can follow [steps on creating a new Rstudio Project from a GitHub repository](https://www.epirhandbook.com/en/version-control-and-collaboration-with-git-and-github.html?q=clone#clone-from-a-github-repository). + +Then, run: +``` + +:::checklist + +#### Checkpoint + +`Knit` the `README.Rmd` file. + +::: + +:::callout + +`Notes` is not a section of the structure; it holds information about the `Usage` step, so we can add one more `#` to its heading. + +::: + +### Citation + +We can take advantage of the `DESCRIPTION` file to generate a `CITATION` file. + +First, open the `DESCRIPTION` file. + +Note that in the 5th line, the `Authors@R` section is already filled with your details. You set this up when running the Configuration steps with `rcompendium::set_credentials()`. + +Second, write a `Title` for the Project in the 3rd line. The [`Title` should be written](https://r-pkgs.org/man.html#title-description-details) in sentence case, not ending in a full stop. + +![](fig/description-title.png) + +:::callout + +[`CITATION.cff` is a file format](https://ropensci.org/blog/2021/11/23/cffr/) that facilitates software citation in ecosystems like GitHub, Zenodo and Zotero. + +::: + +Third, to generate a `CITATION.cff` file from the `DESCRIPTION` file, we can install the `{cffr}` package: + +```r +install.packages("cffr") +``` + +Fourth, create a `.cff` file: + +```r +cffr::cff_write(dependencies = FALSE) +``` + +Commit and Push your changes.
Notice that GitHub has built-in support for this citation. + +![](fig/citation-cff.png) + +:::challenge + +#### How can I paste the CITATION in the README file? + +:::solution + +First, write an `inst/CITATION` file: + +```r +cffr::write_citation(x = "CITATION.cff") +``` + +Our default `CITATION.cff` does not record the `year` of creation. To solve this, follow these steps: + +- Open the `inst/CITATION` file. Within the `bibentry()` call, add: + +``` +year = 2023, +``` + +- Then, paste this chunk with the `echo=FALSE` option in the `README.Rmd`: + +```r +readCitationFile(file = "inst/CITATION") +``` + +- `Knit` the `README.Rmd` file. + +- Finally, re-run this line to update the `.cff` file with the `year`: + +```r +cffr::cff_write(dependencies = FALSE) +``` + + +::: + +::: + +### Licenses + +Our project has a [GPLv2](https://www.gnu.org/licenses/old-licenses/gpl-2.0.html) license registered in the `LICENSE.md` file and in the `DESCRIPTION` file as a [GPL (>=2)](https://choosealicense.com/licenses/gpl-3.0/). + +We adapted text generated by the [`{rrtools}` package](https://github.com/benmarwick/rrtools/) template. + +Copy, edit as you prefer, and paste it to your `README` file: + +``` +### Licenses + +**Text and figures :** [CC-BY-4.0](http://creativecommons.org/licenses/by/4.0/) + +**Code :** See the [DESCRIPTION](DESCRIPTION) file + +**Data :** [CC-0](http://creativecommons.org/publicdomain/zero/1.0/) attribution requested in reuse +``` + +:::checklist + +#### Checkpoint + +`Knit` the `README.Rmd` file. + +::: + +### Contributing + +We adapted this format from the template generated from [readme.so](https://readme.so/). We added hyperlinks to redirect to the Community files in the `.github/` folder. + +Copy, edit as you prefer, and paste it to your `README` file: + +``` +### Contributing + +Contributions are always welcome! + +See our [Contributing guide](/.github/CONTRIBUTING.md) for ways to get started.
+ +Please adhere to this project's [Code of Conduct](/.github/CODE_OF_CONDUCT.md). + +### Support + +Please see our [Getting help guide](/.github/SUPPORT.md) for support. +``` + +:::checklist + +#### Checkpoint + +`Knit` the `README.Rmd` file. + +::: + +:::instructor + +Contributing guides and Function documentation are also visible in a website format. Please look at the Appendix episode to learn how to do it. + +::: + +### Markdown + +In Markdown, the `Header 2` generates an underline that can help isolate sections of our chosen structure. + +![](fig/markdown-header.png) + +Remove one `#` from all the main headers. This edit generates a final `README` file that looks like this: + +![](fig/markdown-readme.png) + +:::discussion + +Consider your research project: + +- Would you add or remove any section from the `README` template above? Why? + +Explore the [online editor called readme.so](https://readme.so/) to identify more sections that could suit your research project. + +::: + +:::testimonial + +- We recommend you to [listen to the Code for Thought podcast](https://codeforthought.buzzsprout.com/1326658/12979597-en-bytesized-rse-the-readme-with-julian-lenz) episode on the `README` file. They also have a few links that you might find helpful. + +- For Badges, we recommend reading a Blog post on [Communicating development stages of open-source software](https://epiverse-trace.github.io/posts/comm-software-devel/) at the Epiverse-TRACE website. + +::: + +:::checklist + +![](fig/concept-map-03.png) + +::: + + +## Open science features + +We define Open science as making software, data inputs and outputs _freely available_ by publishing all of them with open licences to facilitate project reuse. + +A vital feature of this practice is the __Licenses__. Explicit licenses that include the _software_ and the specific license for _text and figures_ and _data_, in particular, are also relevant. 
+ +::::::::::::::::::::::::::::::::::::: keypoints + +- Complement the `README` template with Installation steps, Citations, Licenses and Contributing guides. +- Use different types of licenses of text and figures, software code, and data. +- _Licenses_ is a feature related to Open Science. + +:::::::::::::::::::::::::::::::::::::::::::::::: + diff --git a/reference.md b/reference.md new file mode 100644 index 0000000..97b2a44 --- /dev/null +++ b/reference.md @@ -0,0 +1,7 @@ +--- +title: Reference +--- + +## Glossary + +This is a placeholder file. Please add content here. diff --git a/renv.lock b/renv.lock new file mode 100644 index 0000000..c417c40 --- /dev/null +++ b/renv.lock @@ -0,0 +1,397 @@ +{ + "R": { + "Version": "4.3.2", + "Repositories": [ + { + "Name": "carpentries", + "URL": "https://carpentries.r-universe.dev" + }, + { + "Name": "carpentries_archive", + "URL": "https://carpentries.github.io/drat" + }, + { + "Name": "CRAN", + "URL": "https://cran.rstudio.com" + } + ] + }, + "Packages": { + "R6": { + "Package": "R6", + "Version": "2.5.1", + "Source": "Repository", + "Repository": "CRAN", + "Requirements": [ + "R" + ], + "Hash": "470851b6d5d0ac559e9d01bb352b4021" + }, + "base64enc": { + "Package": "base64enc", + "Version": "0.1-3", + "Source": "Repository", + "Repository": "CRAN", + "Requirements": [ + "R" + ], + "Hash": "543776ae6848fde2f48ff3816d0628bc" + }, + "bslib": { + "Package": "bslib", + "Version": "0.6.1", + "Source": "Repository", + "Repository": "CRAN", + "Requirements": [ + "R", + "base64enc", + "cachem", + "grDevices", + "htmltools", + "jquerylib", + "jsonlite", + "lifecycle", + "memoise", + "mime", + "rlang", + "sass" + ], + "Hash": "c0d8599494bc7fb408cd206bbdd9cab0" + }, + "cachem": { + "Package": "cachem", + "Version": "1.0.8", + "Source": "Repository", + "Repository": "CRAN", + "Requirements": [ + "fastmap", + "rlang" + ], + "Hash": "c35768291560ce302c0a6589f92e837d" + }, + "cli": { + "Package": "cli", + "Version": "3.6.2", + "Source": 
"Repository", + "Repository": "CRAN", + "Requirements": [ + "R", + "utils" + ], + "Hash": "1216ac65ac55ec0058a6f75d7ca0fd52" + }, + "digest": { + "Package": "digest", + "Version": "0.6.34", + "Source": "Repository", + "Repository": "CRAN", + "Requirements": [ + "R", + "utils" + ], + "Hash": "7ede2ee9ea8d3edbf1ca84c1e333ad1a" + }, + "ellipsis": { + "Package": "ellipsis", + "Version": "0.3.2", + "Source": "Repository", + "Repository": "CRAN", + "Requirements": [ + "R", + "rlang" + ], + "Hash": "bb0eec2fe32e88d9e2836c2f73ea2077" + }, + "evaluate": { + "Package": "evaluate", + "Version": "0.23", + "Source": "Repository", + "Repository": "CRAN", + "Requirements": [ + "R", + "methods" + ], + "Hash": "daf4a1246be12c1fa8c7705a0935c1a0" + }, + "fastmap": { + "Package": "fastmap", + "Version": "1.1.1", + "Source": "Repository", + "Repository": "CRAN", + "Hash": "f7736a18de97dea803bde0a2daaafb27" + }, + "fontawesome": { + "Package": "fontawesome", + "Version": "0.5.2", + "Source": "Repository", + "Repository": "CRAN", + "Requirements": [ + "R", + "htmltools", + "rlang" + ], + "Hash": "c2efdd5f0bcd1ea861c2d4e2a883a67d" + }, + "fs": { + "Package": "fs", + "Version": "1.6.3", + "Source": "Repository", + "Repository": "CRAN", + "Requirements": [ + "R", + "methods" + ], + "Hash": "47b5f30c720c23999b913a1a635cf0bb" + }, + "glue": { + "Package": "glue", + "Version": "1.7.0", + "Source": "Repository", + "Repository": "CRAN", + "Requirements": [ + "R", + "methods" + ], + "Hash": "e0b3a53876554bd45879e596cdb10a52" + }, + "highr": { + "Package": "highr", + "Version": "0.10", + "Source": "Repository", + "Repository": "CRAN", + "Requirements": [ + "R", + "xfun" + ], + "Hash": "06230136b2d2b9ba5805e1963fa6e890" + }, + "htmltools": { + "Package": "htmltools", + "Version": "0.5.7", + "Source": "Repository", + "Repository": "CRAN", + "Requirements": [ + "R", + "base64enc", + "digest", + "ellipsis", + "fastmap", + "grDevices", + "rlang", + "utils" + ], + "Hash": 
"2d7b3857980e0e0d0a1fd6f11928ab0f" + }, + "jquerylib": { + "Package": "jquerylib", + "Version": "0.1.4", + "Source": "Repository", + "Repository": "CRAN", + "Requirements": [ + "htmltools" + ], + "Hash": "5aab57a3bd297eee1c1d862735972182" + }, + "jsonlite": { + "Package": "jsonlite", + "Version": "1.8.8", + "Source": "Repository", + "Repository": "CRAN", + "Requirements": [ + "methods" + ], + "Hash": "e1b9c55281c5adc4dd113652d9e26768" + }, + "knitr": { + "Package": "knitr", + "Version": "1.45", + "Source": "Repository", + "Repository": "CRAN", + "Requirements": [ + "R", + "evaluate", + "highr", + "methods", + "tools", + "xfun", + "yaml" + ], + "Hash": "1ec462871063897135c1bcbe0fc8f07d" + }, + "lifecycle": { + "Package": "lifecycle", + "Version": "1.0.4", + "Source": "Repository", + "Repository": "CRAN", + "Requirements": [ + "R", + "cli", + "glue", + "rlang" + ], + "Hash": "b8552d117e1b808b09a832f589b79035" + }, + "magrittr": { + "Package": "magrittr", + "Version": "2.0.3", + "Source": "Repository", + "Repository": "CRAN", + "Requirements": [ + "R" + ], + "Hash": "7ce2733a9826b3aeb1775d56fd305472" + }, + "memoise": { + "Package": "memoise", + "Version": "2.0.1", + "Source": "Repository", + "Repository": "CRAN", + "Requirements": [ + "cachem", + "rlang" + ], + "Hash": "e2817ccf4a065c5d9d7f2cfbe7c1d78c" + }, + "mime": { + "Package": "mime", + "Version": "0.12", + "Source": "Repository", + "Repository": "CRAN", + "Requirements": [ + "tools" + ], + "Hash": "18e9c28c1d3ca1560ce30658b22ce104" + }, + "rappdirs": { + "Package": "rappdirs", + "Version": "0.3.3", + "Source": "Repository", + "Repository": "CRAN", + "Requirements": [ + "R" + ], + "Hash": "5e3c5dc0b071b21fa128676560dbe94d" + }, + "rlang": { + "Package": "rlang", + "Version": "1.1.3", + "Source": "Repository", + "Repository": "CRAN", + "Requirements": [ + "R", + "utils" + ], + "Hash": "42548638fae05fd9a9b5f3f437fbbbe2" + }, + "rmarkdown": { + "Package": "rmarkdown", + "Version": "2.25", + "Source": "Repository", 
+ "Repository": "CRAN", + "Requirements": [ + "R", + "bslib", + "evaluate", + "fontawesome", + "htmltools", + "jquerylib", + "jsonlite", + "knitr", + "methods", + "stringr", + "tinytex", + "tools", + "utils", + "xfun", + "yaml" + ], + "Hash": "d65e35823c817f09f4de424fcdfa812a" + }, + "sass": { + "Package": "sass", + "Version": "0.4.8", + "Source": "Repository", + "Repository": "CRAN", + "Requirements": [ + "R6", + "fs", + "htmltools", + "rappdirs", + "rlang" + ], + "Hash": "168f9353c76d4c4b0a0bbf72e2c2d035" + }, + "stringi": { + "Package": "stringi", + "Version": "1.8.3", + "Source": "Repository", + "Repository": "RSPM", + "Requirements": [ + "R", + "stats", + "tools", + "utils" + ], + "Hash": "058aebddea264f4c99401515182e656a" + }, + "stringr": { + "Package": "stringr", + "Version": "1.5.1", + "Source": "Repository", + "Repository": "CRAN", + "Requirements": [ + "R", + "cli", + "glue", + "lifecycle", + "magrittr", + "rlang", + "stringi", + "vctrs" + ], + "Hash": "960e2ae9e09656611e0b8214ad543207" + }, + "tinytex": { + "Package": "tinytex", + "Version": "0.49", + "Source": "Repository", + "Repository": "CRAN", + "Requirements": [ + "xfun" + ], + "Hash": "5ac22900ae0f386e54f1c307eca7d843" + }, + "vctrs": { + "Package": "vctrs", + "Version": "0.6.5", + "Source": "Repository", + "Repository": "CRAN", + "Requirements": [ + "R", + "cli", + "glue", + "lifecycle", + "rlang" + ], + "Hash": "c03fa420630029418f7e6da3667aac4a" + }, + "xfun": { + "Package": "xfun", + "Version": "0.41", + "Source": "Repository", + "Repository": "CRAN", + "Requirements": [ + "stats", + "tools" + ], + "Hash": "460a5e0fe46a80ef87424ad216028014" + }, + "yaml": { + "Package": "yaml", + "Version": "2.3.8", + "Source": "Repository", + "Repository": "CRAN", + "Hash": "29240487a071f535f5e5d5a323b7afbd" + } + } +} diff --git a/reproducible.md b/reproducible.md new file mode 100644 index 0000000..09b955f --- /dev/null +++ b/reproducible.md @@ -0,0 +1,372 @@ +--- +title: 'Reproducible analysis' +teaching: 
55 +exercises: 5 +--- + +:::::::::::::::::::::::::::::::::::::: questions + +- How do I make my research project reproducible? +- How do I include packages as dependencies of my project? +- What features are related to reproducible research? + +:::::::::::::::::::::::::::::::::::::::::::::::: + +::::::::::::::::::::::::::::::::::::: objectives + +- Add dependencies of a project using the `DESCRIPTION` file. +- Create an isolated and specific reproducible environment with `{renv}`. +- Identify your project features related to reproducible software. + +:::::::::::::::::::::::::::::::::::::::::::::::: + +## How do I make my analysis reproducible? + +### The reproducible environment + +Any analysis with R needs packages. These packages on which your project relies are called [dependencies](https://r-pkgs.org/dependencies-mindset-background.html). To make an analysis reproducible, we need to register these packages (and their versions) somewhere as your project's dependencies. That place is the `DESCRIPTION` file. + +In the [`DESCRIPTION` file](https://r-pkgs.org/description.html#the-description-file), dependencies are registered at the end of the file by package name and, usually, a minimum version (`dplyr (>= 1.0.0)`). We can add dependencies using functions (`rcompendium::add_dependencies()`), and also use this file to automate version recovery (`devtools::install_deps()`). However, `DESCRIPTION` files are most useful for R packages. + +For [non-package projects](https://r-pkgs.org/description.html#footnotes) we can use `{renv}`. It registers specific dependencies by implementing project-specific environments, which means that `{renv}` registers even the [SHA/hash](https://happygitwithr.com/repeated-amend.html?q=sha#um-what-if-i-did-push) from GitHub packages, a feature that the `DESCRIPTION` file cannot provide. Also, `{renv}` isolates your project packages from your computer packages.
Lastly, `{renv}` can detect new dependencies automatically, apart from adding them with functions (`renv::snapshot()`), and it can also automate the recovery of the whole project (`renv::restore()`). + +:::callout + +The `{renv}` package: + +- _Isolates_ the dependencies of your project from your computer. +- Registers the _specific_ version of packages from CRAN or GitHub. +- Provides an _automated_ package management solution to restore an external project. + +::: + +### The analysis workflow + +In addition to managing dependencies, your analysis workflow must follow some good practices in scientific computing. + +First, for __Data management__, we need to save input data as originally created and, preferably, configure it as a read-only file. In your project, you can differentiate `raw-data` from `derived-data`. + +Second, for __Project organization__, we need to store analysis and generated files in specific and isolated folders. In your project, you can differentiate `analyses` files (like `.R` scripts and `.Rmd` files) from `figures` and other `outputs`. + +![](fig/reproducible_analysis.png) + +### Automate your analysis + +The `make.R` file helps automate your analysis project. This file includes a script line to automatically restore your dependencies (`renv::restore()`) and run all the analysis scripts in your preferred order. The `make.R` file is the only `.R` file stored in the project's root given by the `{rcompendium}` template. You can use the `make.R` file as the only script to run and regenerate all your project outputs. + +:::callout + +The `make.R` file is inspired by, but not equivalent to, a GNU `Make` file. + +[GNU `Make`](https://www.gnu.org/software/make/) files can identify out-of-date files and re-execute any downstream code that needs to be updated; they are usually used with [`bash` scripts](http://book.biologistsguide2computing.com/en/stable/automation-is-your-friend.html).
+ +To use this functionality for your `R` project, you can use the [`{targets}` package](https://books.ropensci.org/targets/). + +::: + + +## Let's code + +We need to play by the rules of the `{rcompendium}` template. + +### The reproducible environment + +We will use `{renv}` instead of `DESCRIPTION` files for this. + +Usually, to [initiate](https://rstudio.github.io/renv/) a reproducible environment with `{renv}`, we need to run `renv::init()`. + +![Source: https://rstudio.github.io/renv/](fig/renv.png) + +However, when working in a `{rcompendium}` template, your first step must be to run: + +```r +rcompendium::add_renv() +``` + +```output +This project contains a DESCRIPTION file. +Which files should renv use for dependency discovery in this project? + +1: Use only the DESCRIPTION file. (explicit mode) +2: Use all files in this project. (implicit mode) +``` + +Write `2` and press ENTER to use `{renv}` instead of the `DESCRIPTION` file. + +:::challenge + +### Question + +Why not use `{renv}` in addition to `DESCRIPTION`? + +:::solution + +We can use `{renv}` in addition to `DESCRIPTION`. + +However, we opt to use `{renv}` instead of `DESCRIPTION` because the `rcompendium::add_dependencies(".")` function assumes that all packages to add to `DESCRIPTION` are from CRAN. If you want to add GitHub packages, [you need to add them manually](https://remotes.r-lib.org/articles/dependencies.html) in a different section called `Remotes:` and write `owner/repository`. The `{renv}` package solves this automatically. + +![We need to fix the DESCRIPTION file manually. Packages like {cfr} and {epiparameter} are on GitHub.](fig/add_dependencies-github.png) + +However, this still needs to be assessed with different scenarios to confirm this as the final best decision.
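
The manual `Remotes:` step mentioned in this solution can be illustrated with a small, self-contained sketch. It writes a hypothetical `DESCRIPTION` file to a temporary folder and reads it back with base R's `read.dcf()`. The package name `mycompendium` is made up for illustration, and the `epiverse-trace/cfr` entry assumes that `{cfr}` is installed from that GitHub repository; adapt both to your own project.

```r
# Hypothetical DESCRIPTION content, for illustration only.
# `Imports:` lists package names; `Remotes:` tells installers
# where to find the packages that are not on CRAN.
desc_lines <- c(
  "Package: mycompendium",
  "Version: 0.0.1",
  "Imports:",
  "    cfr,",
  "    here",
  "Remotes:",
  "    epiverse-trace/cfr"
)

# Write the file to a temporary folder so no real project file is touched
desc_path <- file.path(tempdir(), "DESCRIPTION")
writeLines(desc_lines, desc_path)

# Read back only the two fields of interest (returns a character matrix)
fields <- read.dcf(desc_path, fields = c("Imports", "Remotes"))

print(fields[1, "Imports"])
print(fields[1, "Remotes"])
```

With a file like this in a real project, `devtools::install_deps()` would be expected to look up `{cfr}` on GitHub instead of CRAN.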
+ +If you decide to use `{renv}` in addition to `DESCRIPTION`, run: + +```r +rcompendium::add_dependencies(".") +``` + +Note that this function requires one argument, `"."`, which means that your [working directory](https://www.epirhandbook.com/en/r-basics.html?q=getwd#working-directory) must be at the root of the R project. + +The output below details which packages were included in the `DESCRIPTION` file: + +``` +✔ Scanning 'Imports' dependencies + (*) Found 2 package(s) + (*) Adding the following line in 'DESCRIPTION': `Imports: devtools, here` +``` + +::: + +::: + + +If you get an error message like: + +```error +Error in renv_snapshot_validate_report(valid, prompt, force) : + aborting snapshot due to pre-flight validation failure +``` + +Run the `rcompendium::add_renv()` function again. You may get the following message: + +```output +This project already has a private library. What would you like to do? + +1: Activate the project and use the existing library. +2: Re-initialize the project with a new library. +3: Abort project initialization. +``` + +Write option `1` and press ENTER. + +This step creates a `renv/` folder and modifies the content of `make.R` at `line 15`, replacing the default `devtools::install_deps()` with `renv::restore()`. + +Second, to get the status of the project, run: + +```r +renv::status() +``` + +```output +This project does not contain a lockfile. +Use renv::snapshot() to create a lockfile. +``` + +:::callout + +Always follow the suggestions of the `renv::status()` output. You can also get a message from it each time you reopen your project. + +::: + +Third, to create the `lockfile`, run: + +```r +renv::snapshot() +``` + +This step creates a `renv.lock` file detailing the following: + +- the R version at the top, and +- specific version details of all the packages in the project's dependency tree (including SHA/hash for GitHub packages).
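
Both items in the list above can be read back from a lockfile programmatically. The sketch below is a minimal illustration, assuming the `{jsonlite}` package is installed; it parses a shortened, hypothetical lockfile written to a temporary file rather than your project's real `renv.lock`.

```r
# Shortened, hypothetical lockfile content for illustration only
lock_json <- '{
  "R": {"Version": "4.2.2"},
  "Packages": {
    "R6":   {"Package": "R6",   "Version": "2.5.1", "Source": "Repository"},
    "yaml": {"Package": "yaml", "Version": "2.3.8", "Source": "Repository"}
  }
}'

# Write it to a temporary file so no real project file is touched
lock_path <- tempfile(fileext = ".lock")
writeLines(lock_json, lock_path)

# Parse the JSON into nested lists (assumes {jsonlite} is installed)
lock <- jsonlite::fromJSON(lock_path, simplifyVector = FALSE)

# R version recorded at the top of the lockfile
r_version <- lock$R$Version

# Named character vector: package name -> recorded version
pkg_versions <- vapply(lock$Packages, function(p) p$Version, character(1))

print(r_version)
print(pkg_versions)
```

In a real project, `lock_path` would point to the `renv.lock` file in the project root.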
+ +``` +{ + "R": { + "Version": "4.2.2", + "Repositories": [ + { + "Name": "CRAN", + "URL": "https://packagemanager.posit.co/cran/latest" + } + ] + }, + "Packages": { + "R6": { + "Package": "R6", + "Version": "2.5.1", + "Source": "Repository", + "Repository": "RSPM", + "Requirements": [ + "R" + ], + "Hash": "470851b6d5d0ac559e9d01bb352b4021" + }, + ... +``` + +Now, you have completed your reproducible environment configuration. + +### The analysis workflow + +The workflow will follow these three paths: + +- Read `raw-data/`, clean it with `clean.R`, and save it to `derived-data/`. +- Read `derived-data/`, make a plot with `plot.R`, and save it to `figures/`. +- Read `derived-data/`, make a table with `table.R`, and save it to `outputs/`. + +First, [download the sample data set](https://github.com/reconhub/learn/raw/master/static/data/linelist_20140701.xlsx). + +Since this is raw data, save it in the `data/raw-data/` folder. + +Second, create the analysis script to clean this raw data set. Name it `01-clean.R`. Save it in the `analyses/` folder. Copy and paste these lines of code: + +```r +# Load packages +library(readxl) +library(tidyverse) + +# Read raw data +dat <- readxl::read_xlsx("data/raw-data/linelist_20140701.xlsx") + +# Clean raw data: keep key columns, convert dates, set outcome as a factor +dat_clean <- dat %>% + select(case_id, date_of_onset, date_of_outcome, outcome) %>% + mutate(across(.cols = c(date_of_onset, date_of_outcome), + .fns = as.Date)) %>% + mutate(outcome = fct(outcome, levels = c("Death", "Recover"), na = "NA")) + +# Write clean data +dat_clean %>% + write_rds("data/derived-data/linelist_clean.rds") +``` + +Notice that we are writing a new cleaned data set in a different path: `data/derived-data/`. + +:::callout + +- The default folder to save R scripts will be `R/`. This path is the place to write your _Modular functions_. Go to the `analyses/` folder to save your analysis script. + +- Yes, it is named `analyses/` not "analysis". + +::: + +Rstudio will invite you to install new packages. Press Install.
Always run `renv::status()` after installing new packages: + +```r +renv::status() +``` + +```output +The following package(s) are in an inconsistent state: + + package installed recorded used + backports y n y + bit y n y +``` + +In this case, we need to follow the instructions in the [Missing packages](https://rstudio.github.io/renv/reference/status.html#missing-packages) section of the `?renv::status` documentation. + +```r +renv::install() +``` + +```output +- There are no packages to install. +- Automatic snapshot has updated '~/0projects/projectname/renv.lock'. +``` + +Third, create an analysis script to create an [incidence plot](https://www.reconverse.org/incidence2/) for this cleaned data set. Name it `02-plot.R`. Save it in the `analyses/` folder. Copy and paste these lines of code: + +```r +# Load packages +library(tidyverse) +library(incidence2) + +# Read data +ebola_dat <- read_rds("data/derived-data/linelist_clean.rds") + +# Create incidence2 object +ebola_onset <- + incidence2::incidence( + x = ebola_dat, + date_index = c("date_of_onset"), + interval = "epiweek" + ) + +# Read incidence2 object +ebola_onset + +# Plot incidence data and keep the plot object +ebola_plot <- plot(ebola_onset) +ebola_plot + +# Write ggplot as figure +# (passing the plot explicitly also works when the script is run with source()) +ggsave("figures/02-plot_incidence.png", plot = ebola_plot, height = 3, width = 5) +``` + +Notice that we are writing the new plot in a different path: `figures/`. + +:::challenge + +- Explore `i2extras::fit_curve()` to [fit a model](https://www.reconverse.org/i2extras/articles/fitting_epicurves.html#modeling-incidence) to the incidence curve. +- Save the output table in the corresponding folder. + +:::hint + +- You can reuse the `incidence2` object as input in the same file. +- Remember to update the `{renv}` status if you need to install and use a new package for this task. + +::: + +::: + +### Automate your analysis + +The easiest step to forget!
+ +Lastly, list all `.R` scripts and `.Rmd` in a sequential order in the `make.R` file after `line 32`: + +```r +## Run Project ---- + +# List all R scripts in a sequential order and using the following form: +# source(here::here("analyses", "script_X.R")) + +source(here::here("analyses", "01-clean.R")) +source(here::here("analyses", "02-plot.R")) +``` + +:::checklist + +![](fig/concept-map-02.png) + +::: + + +## Reproducible research features + +We defined Reproducible research as a practice that wants to ensure that _anyone with access_ to data inputs and software can _feasibly generate_ the outputs to check or build on them. + +A key feature of this practice is the combination of __`{renv}`__ with the __`make.R` file__. With this file, and any other more sophisticated alternatives like `GNU Make` or `{targets}`, we are sure that we: + +- Can feasibly _regenerate_ the outputs. +- Can inform about the _reliability_ of the project. +- Have an _isolated_ time-proof capsule of dependencies. + +::::::::::::::::::::::::::::::::::::: keypoints + +- A _dependency_ is a package that your project needs to run. + +- Use the `DESCRIPTION` file to register your project dependencies. + +- Use `{renv}` to isolate and create package-specific reproducible environments for your dependencies. + +- Use the folder template to differentiate your `raw-data/` and `derived-data/`. + +- Save analysis and generated files in isolated folders like `analyses/`, `figures/`, and `outputs/`. + +- Use the `make.R` to list your analysis scripts and facilitate the regeneration of all your outputs. + +- _Reproducible environments_ and _Make files_ are features related to Reproducible research. 
+ +:::::::::::::::::::::::::::::::::::::::::::::::: + diff --git a/setup.md b/setup.md new file mode 100644 index 0000000..7a14c85 --- /dev/null +++ b/setup.md @@ -0,0 +1,308 @@ +--- +title: Setup +--- + +## Motivation + +Have you ever wondered how cool would it be if you would be able to successfully: + +- __Reuse__ an analysis months or years later each time you need to revisit it? +- __Redo__ the analyses, figures or tables after correcting an error in the data or following a reviewer's recommendations? +- __Reuse__ data from other authors for a secondary analysis thanks to informative metadata on the primary study? + +Sadly, most of the time, this isn't easy because: + +- __We__ do not remember how we did the analysis, +- __Redoing__ figures and tables is time-consuming, +- __Data__ is not readily available or unreadable today. + + + + + + + +In this lesson, you will learn how to improve your code's _reliability_, _usability_ and _sustainability_ for epidemic analysis with R packages. You will learn how to __add specific features__ to your R project to keep it as __Open__, __Reproducible__, and __Sustainable__ as possible! + +![Open science, Sustainable software, and Reproducible analysis: Different and Complementary. Image by Bing, 2023, [CC BY 4.0](https://creativecommons.org/licenses/by/4.0/), created with [Bing Image Creator powered by DALL·E 3](https://www.bing.com/create)](episodes/fig/open-sustainable-reproducible.jpeg){alt='A puzzle of three hexagon pieces, only showing the pieces, each with one of these words: OPEN, SUSTAINABLE, REPRODUCIBLE'} + +::::::::::::::::: prereq + +In this lesson, we will use R, Git, and GitHub. Some previous experience using [RStudio projects](https://support.posit.co/hc/en-us/articles/200526207-Using-RStudio-Projects) is expected, _but isn’t mandatory_. + +For an introductory lesson on Git, please go to the [Version Control with Git in Rstudio](https://epiverse-trace.github.io/git-rstudio-basics/) lesson. 
+ +:::::::::::::::::::::::: + + + +## Software Setup + +### Install Rstudio + +Install R and Rstudio + +### Create a GitHub account + +Create a GitHub account + +### Install Git + +::::::::::::::::::::::::::::::::::::::: discussion + +### Follow software specific suggestions + +Follow the [happygitwithr recommendations](https://happygitwithr.com/install-git.html) for your operating system. + + +::::::::::::::::::::::::::::::::::::::::::::::::::: + +:::::::::::::::: solution + +### Windows + +For [Windows](https://happygitwithr.com/install-git.html#install-git-windows) + +::::::::::::::::::::::::: + +:::::::::::::::: solution + +### MacOS + +For [MacOS](https://happygitwithr.com/install-git.html#macos) + +::::::::::::::::::::::::: + + +:::::::::::::::: solution + +### Linux + +For [Linux](https://happygitwithr.com/install-git.html#linux) + +::::::::::::::::::::::::: + +### Install R packages + +These installation steps may ask `? Do you want to continue (Y/n)`. If so, write `y` and press ENTER. Installation can take up to 3 minutes to complete. + +First, we strongly suggest installing the development versions of the `{rcompendium}` and `{rrtools}` packages: + +```r +if (!require("remotes")) install.packages("remotes") + +remotes::install_github("FRBCesab/rcompendium") +remotes::install_github("benmarwick/rrtools") +``` + +Then, install all these packages: + +```r +if (!require("pak")) install.packages("pak") + +new <- c("gh", + "usethis", + "tidyverse", + "here", + "yaml", + "renv") + +pak::pak(new) +``` + +### Configure Git and GitHub + +::: prereq + +### Follow all these steps + +In these steps, we will verify that you have: + +- a correctly configured _token_, and +- a clean output when running `usethis::git_sitrep()`. + +#### 1.
Verify your git configuration + +Use `gh::gh_whoami()` to check if your local git configuration recognizes: + +- your name +- your GitHub account +- your _token_ + +```r +gh::gh_whoami() +``` + +```output +{ + "name": "Andree Valle Campos", + "login": "avallecam", + "html_url": "https://github.com/avallecam", + "scopes": "gist, repo, workflow", + "token": "gho_...AlAn" +} + +``` + +If there is no `name`, `login`, or `html_url`, then you need to run: + +```r +gert::git_config_global_set(name = "user.name", value = "John Doe") +gert::git_config_global_set(name = "user.email", value = "john.doe@domain.com") +gert::git_config_global_set(name = "github.user", value = "jdoe") +``` + +If you do not have a `token`, follow step number 3. + +#### 2. Get a situational report on your current Git/GitHub status: + +Use `usethis::git_sitrep()` to check if there is no `✖ ...` line in the output with an error message. + +An example with two errors is below: + +```r +usethis::git_sitrep() +``` + +```error +✖ Token lacks recommended scopes: + - 'user:email': needed to read user's email addresses + Consider re-creating your PAT with the missing scopes. + `create_github_token()` defaults to the recommended scopes. +✖ Can't retrieve registered email addresses from GitHub. + Consider re-creating your PAT with the 'user' or at least 'user:email' scope. +``` + +The output shows that I had a _token_ but must fix its configuration. If you do not have a _token_ or get a similar error message, follow the next step. + +If you have an error message unrelated to creating a token, copy and paste this output in your issue report to the email at the end of this page. + + +#### 3. Create your GitHub token: + +Do this with `usethis::create_github_token()`. This function should redirect you to GitHub on your browser. Once there, check all the options in the figure below. + +```r +usethis::create_github_token() +``` + +Check all of the following options: + +![](../episodes/fig/git-token.png) + +#### 4. 
Configure your token + +To complete the configuration of your token, use `gitcreds::gitcreds_set()` ([Bryan, 2023](https://happygitwithr.com/https-pat.html)), then accept that you want to `2: Replace these credentials`. Do this by writing the number `2` and pressing ENTER. + +```r +gitcreds::gitcreds_set() +``` + +```output +-> What would you like to do? + +1: Abort update with error, and keep the existing credentials +2: Replace these credentials +3: See the password / token + +Selection: 2 + +``` + +Paste your `token` to save it and complete this step. + +#### 5. Run the situational report again: + +Verify again that there is no `✖ ...` line in the output with an error message. The expected outcome should look like this: + +```r +usethis::git_sitrep() +``` + +```output +Git config (global) +• Name: 'Andree Valle' +• Email: 'avallecam@gmail.com' +• Global (user-level) gitignore file: +• Vaccinated: FALSE +ℹ See `?git_vaccinate` to learn more +• Default Git protocol: 'https' +• Default initial branch name: 'master' +GitHub +• Default GitHub host: 'https://github.com' +• Personal access token for 'https://github.com': '' +• GitHub user: 'avallecam' +• Token scopes: 'delete_repo, gist, repo, user, workflow' +• Email(s): 'avallecam@gmail.com (primary)', 'andree.valle-campos@lshtm.ac.uk' +Git repo for current project +ℹ No active usethis project +``` + +If you still have an error, close Rstudio and open it again for changes to take effect. + +If the error persists, copy and paste this output in your issue report to the email at the end of this page. + +#### 6. Two-factor authentication + +If you have an error message related to Two-factor authentication, follow the [steps in this GitHub guide](https://docs.github.com/en/authentication/securing-your-account-with-two-factor-authentication-2fa/configuring-two-factor-authentication). + + +::: + +### Configure your R environment + +::: prereq + +### Follow all these steps + +#### 1.
Set the default Git branch name + +Run the code chunk below: + +```r +usethis::git_default_branch_configure(name = "main") +``` + +This step will homogenize the name of the default branch in our computers. We need this to make some auto-generated links work downstream. + +#### 2. Add {rcompendium} credentials + +Use `rcompendium::set_credentials()` to paste your name and personal information to the `.Rprofile` configuration file. + +Adapt the code chunk below to your name, family name, email, and ORCID. Adding your ORCID is optional. + +```r +rcompendium::set_credentials( + given = "Andree", + family = "Valle-Campos", + email = "avallecam@gmail.com", + orcid = "0000-0002-7779-481X" +) +``` + +This function will automatically copy a line of code to the clipboard that starts with `options(...`, and open a file called `.Rprofile`. Paste the line of code in the file. After this, close Rstudio and open it again for changes to take effect. + +You can access the content of the `.Rprofile` file at any time with `usethis::edit_r_profile()`. + +::: + +## Your Questions + +If you need any assistance installing the software, configuring Git and GitHub, or have any other questions about the workshop, please send an email to + diff --git a/wrapup.md b/wrapup.md new file mode 100644 index 0000000..0e74562 --- /dev/null +++ b/wrapup.md @@ -0,0 +1,85 @@ +--- +title: 'Wrap up' +teaching: 10 +exercises: 2 +--- + +:::::::::::::::::::::::::::::::::::::: questions + +- Where is a full view of the concepts covered today? +- How can I self-assess my progress using these tools? +- Where can I ask for questions after this workshop? +- Where can I write my feedback on this workshop? + +:::::::::::::::::::::::::::::::::::::::::::::::: + +::::::::::::::::::::::::::::::::::::: objectives + +- Show the final concept map of the workshop. +- Share a self-assessment review checklist. +- Remind our communication forum. +- Share the feedback form of the workshop. 
+ +:::::::::::::::::::::::::::::::::::::::::::::::: + +## The goal + +![Concept map of the workshop](fig/concept-map-00.png) + +### A next step + +![Data analysis resembles software engineering](fig/concept-map-04.png) + + +## Self-assessment template + +Now, we invite you to self-assess your progress in these [good practices](https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1005510) using a __review checklist__ similar to the one used by [JOSS, the Journal of Open Source Software](https://joss.readthedocs.io/en/latest/review_checklist.html). + +:::callout + +We combined these two references in one [Google sheet](https://docs.google.com/spreadsheets/d/1dsGg9RoD3yCEfgHAr0ARQ-nJ1RWSeCEXcycGk7dFRKs/edit?usp=sharing). Take a look! + +::: + +## Write an individual learning reflection + +Before we wrap up, please take 5 minutes to think over everything we have covered so far. + +- On a piece of paper, write down something that captures what you want to remember about the day. +- The Instructor will not look at this - it is just for you. + +If you do not know where to start, consider the following list for a starting point: + +- Draw a concept map, connecting the material +- Draw pictures or a comic depicting one of the day's concepts +- Write an outline of the topics we covered +- Write a paragraph or "journal" entry about your experience of the workshop today +- Write down one thing that struck you the most + +This exercise should take about 5 minutes. + + +## Our communication channel + +:::checklist + +We remind you of our _communication forum_ called [GitHub Discussions](https://github.com/epiverse-trace/research-compendium/discussions). Here we will ask and answer questions, both ours and yours, on the topic! + +You can post your questions under the [Q&A category](https://github.com/epiverse-trace/research-compendium/discussions/categories/q-a)... at any time in the future!
+ +::: + +## Your constructive feedback + +This form is anonymous: + +If you did not fill out this form, please take 5 minutes to fill it. This form will be beneficial for further improvements to our workshop. + +::::::::::::::::::::::::::::::::::::: keypoints + +- Use the JOSS review checklist to self-assess your progress. +- Use the `GitHub Discussions` as our communication forum after the workshop. +- Use the feedback form to share your constructive comments. + +:::::::::::::::::::::::::::::::::::::::::::::::: +