diff --git a/1-reproducible-research.qmd b/1-reproducible-research.qmd
index 6495ffe..c0dd962 100644
--- a/1-reproducible-research.qmd
+++ b/1-reproducible-research.qmd
@@ -18,17 +18,13 @@
 Read this chapter and watch this week's videos.
 Afterwards go through the following assignments:
 
-```{ojs}
-//| echo: false
-
-viewof flavor = Inputs.checkbox([
-"What questions came up for you while watching the videos and going through the booklet?",
-"What are your personal hurdles for reproducible research? What can you do to address them?",
-"What are hurdles for you in producing FAIR data? What can you do for the data you work with?",
-"Which of the points in the 8 steps for Planning a Community does your research team already check off? Which should be discussed? If possible, bring open points to your team meeting. "
-])
-```
+- What questions came up for you while watching the videos and going through the booklet?
+- What are your personal hurdles for reproducible research? What can you do to address them?
+- What are hurdles for you in producing FAIR data? What can you do for the data you work with?
+- Which of the points in the 8 steps for Planning a Community does your research team already check off? Which should be discussed? If possible, bring open points to your team meeting.
 
+Discuss your progress with your accountability buddy. Bring any questions and
+problems that you cannot solve with your buddy to the weekly meeting.
 :::
 
 
diff --git a/2-1-naming.qmd b/2-1-naming.qmd
new file mode 100644
index 0000000..ffca066
--- /dev/null
+++ b/2-1-naming.qmd
@@ -0,0 +1,49 @@
+## Naming things
+
+This chapter shows you how to pick good names.
+
+**Good names for files, folders, functions** and other things can make a research project (or any project on your computer, really) more pleasant. Both for yourself and any people you work with.
+
+Let's be kind to ourselves and the people around us and get into naming 🙌!
+
+
+A few examples from [Jenny Bryan's slides](https://speakerdeck.com/jennybc/how-to-name-files) of bad and good file names:
+
+> #### BAD ❌ {.unnumbered}
+>
+> - Myabstract.docx
+>
+> - Joe's Filenames Use Spaces and Punctuation.xlsx
+>
+> - figure 1.png
+>
+> - fig 2.png
+>
+> - JW7d\^(2sl\@deletethisandyourcareerisoverWx2\*.txt
+>
+> #### GOOD ✅ {.unnumbered}
+>
+> - 2014-06-08_abstract-for-sla.docx
+>
+> - Joes-filenames-are-getting-better.xlsx
+>
+> - Fig01_scatterplot-talk-length-vs-interest.png
+>
+> - Fig02_histogram-talk-attendance.png
+>
+> - 1986-01-28_raw-data-from-challenger-o-rings.txt
+
+Names should be:
+
+- **Machine readable** 💻
+- **Human readable** 🧐
+- **Optional: Consistent** ⚙️ (decide how you use underscores \_ and dashes -, if you want to use CamelCase or not, ...)
+- **Optional: Play well with default ordering** ⬇ (e.g. start your file names with the creation date `YYYY-MM-DD`)
+
+
+### Further reading
+
+- [Naming files, folders and other things](https://the-turing-way.netlify.app/project-design/filenaming.html), The Turing Way
+- [Project structure slides](https://djnavarro.net/slides-project-structure/#1), Danielle Navarro
+- [File naming slides](https://speakerdeck.com/jennybc/how-to-name-files), Jenny Bryan
+- [ISO 8601, a standard for dates](https://en.wikipedia.org/wiki/ISO_8601), Wikipedia
diff --git a/2-2-organisation.qmd b/2-2-organisation.qmd
new file mode 100644
index 0000000..e019dc0
--- /dev/null
+++ b/2-2-organisation.qmd
@@ -0,0 +1,43 @@
+## File and folder organization
+
+**My first research project was a mess 🙈.** I had hundreds of files with dubious file names and sometimes several files with similar code written for computing on different infrastructures (my computer, the institute server, the cluster of the computing facility).
+
+I felt like the worst researcher of all time. But I wasn't.
**Many struggle with organizing their files and folders in increasingly complex research projects.**
+
+
+### How to organize files and folders well?
+
+It basically comes down to structuring folders and files **systematically from the beginning**.
+
+Think about what a good folder structure could be for your research project. A standard project of mine looks something like this:
+
+```
+.
+├── analysis            <- all things data analysis
+│   └── src             <- functions and other source files
+├── comm
+│   ├── internal_comm   <- internal communication such as meeting notes
+│   └── journal_comm    <- communication with the journal, e.g. peer review
+├── data
+│   ├── data_clean      <- clean version of the data
+│   └── data_raw        <- raw data (don't touch)
+├── dissemination
+│   ├── manuscripts
+│   ├── posters
+│   └── presentations
+├── documentation       <- documentation, e.g. data management plan
+└── misc                <- miscellaneous files that don't fit elsewhere
+```
+
+You can download this folder structure as a template from [https://github.com/HeidiSeibold/research-project-template](https://github.com/HeidiSeibold/research-project-template).
+Not every project is the same and likely your project will be more complex than this. But if you think about good organization from the beginning, it will be easier in the long run.
+
+What do you think about file or folder organization? Is your folder structure similar to mine?
+
+### Further reading
+
+- [Research Compendia](https://the-turing-way.netlify.app/reproducible-research/compendia.html), The Turing Way
+- [Towards a Standardized Research Folder Structure](https://genr.eu/wp/towards-a-standardized-research-folder-structure/), GenR blog
+- Folder structure of R packages, [Making Packages in R](https://swcarpentry.github.io/r-novice-inflammation/08-making-packages-R/), Software Carpentry
+- [Research Project Template](https://github.com/HeidiSeibold/research-project-template), Heidi Seibold
+- [Data Analysis Project Template](http://projecttemplate.net/), a [group of R users](https://github.com/KentonWhite/ProjectTemplate/graphs/contributors)
diff --git a/2-3-documentation.qmd b/2-3-documentation.qmd
new file mode 100644
index 0000000..9a67b0f
--- /dev/null
+++ b/2-3-documentation.qmd
@@ -0,0 +1,63 @@
+## Documentation
+
+In this chapter we discuss research documentation for reproducible research.
+
+*How can I document my research outputs?*
+
+There is actually no super-clear, catch-all answer to this question. It really depends on your needs, on your audience as well as on the types of research outputs you generate. In the following you find a few ideas to start from.
+
+
+### Documenting research projects
+
+One thing that I always do is to add a README text file to each project. In the README I write the **most important info about the project**: What is it about? Who is involved? Where to find files? How to cite it? Where to find the paper? ...
+
+As an example, check out my project on [personalised medicine](https://github.com/HeidiSeibold/personalised_medicine).
+
+For more complex research projects, you can create a whole wiki or similar to
+describe the project. For most projects a README will be just fine.
+
+### Documenting data
+
+Metadata is central to documentation of data. Metadata is information about your data. It's information on the license of the data, who owns it, what information the data contain, ...
In short, it is the essential documentation of your data.
+
+Many research fields have **standards for metadata**. If you can't find one for your field you can use a common standard (e.g. [Dublin Core](https://www.dublincore.org/specifications/dublin-core/dces/)) or just ask a data manager or librarian at your institution. You can write metadata similar to a README (see e.g. this [guide from Cornell University](https://data.research.cornell.edu/content/readme)). If you upload your data to a data platform (e.g. [Dryad](https://datadryad.org/)) you won't have to think about it as the platform usually takes care of that (Dryad uses Dublin Core).
+
+
+### Documenting code
+
+To make my code as understandable as possible for others, I use **literate programming** (mixing text and code to make it easier to read, e.g. [Quarto](https://quarto.org/)) or add clear **code comments**. When writing functions in R I additionally use the standardized way to document R functions (via [**Roxygen2**](https://cran.r-project.org/web/packages/roxygen2/vignettes/roxygen2.html)).
+
+An example of code comments in R (`##`):
+```{r}
+#| eval: false
+
+## Load package + data
+library("model4you")
+data("MathExam14W", package = "psychotools")
+
+## scale points achieved to [0, 100] percent
+MathExam14W$tests <- 100 * MathExam14W$tests/26
+MathExam14W$pcorrect <- 100 * MathExam14W$nsolved/13
+
+## select variables to be used
+MathExam <- MathExam14W[ , c("pcorrect", "group", "tests", "study",
+                             "attempt", "semester", "gender")]
+```
+
+
+
+### Documenting other things
+
+Whatever you work on, there might be parts of your research project that are difficult to understand. Say you work in a lab, then your documentation is a **lab notebook**. Or you do interviews, then your documentation may be your interview strategy. **Anything that might be useful for others is worth keeping and worth sharing**.
*After all, we all want to build on the work of others in order to make the world a little better.*
+
+
+### Further reading
+
+Want to learn more? Check out:
+
+- [Landing Page - README file](https://the-turing-way.netlify.app/project-design/project-repo/project-repo-readme.html?highlight=readme), The Turing Way
+- [A beginner's guide to writing documentation](https://www.writethedocs.org/guide/writing/beginners-guide-to-docs/), Write The Docs
+- [R Markdown: The Definitive Guide](https://bookdown.org/yihui/rmarkdown/), Yihui Xie, J. J. Allaire, Garrett Grolemund
+- [knitr](https://yihui.org/knitr/) - Elegant, flexible, and fast dynamic report generation with R, Yihui Xie
+- [Guide to writing "readme" style metadata](https://data.research.cornell.edu/content/readme), research data management service group, Cornell University
+
diff --git a/2-project-organization.qmd b/2-project-organization.qmd
index 545c3e7..e2891c5 100644
--- a/2-project-organization.qmd
+++ b/2-project-organization.qmd
@@ -11,12 +11,17 @@
 
 :::
 
-## Naming things
+::: {.callout-caution}
+## Tasks
 
-## File and folder organization
+Read this chapter and watch this week's videos.
+Afterwards go through the following assignments:
 
-## Documentation
+- What questions came up for you while watching the videos and going through the booklet?
+- What are your personal hurdles for reproducible research? What can you do to address them?
+- What are hurdles for you in producing FAIR data? What can you do for the data you work with?
+- Which of the points in the 8 steps for Planning a Community does your research team already check off? Which should be discussed? If possible, bring open points to your team meeting.
 
-### Documenting data
-### Documenting code
-### Documenting research projects
\ No newline at end of file
+Discuss your progress with your accountability buddy. Bring any questions and
+problems that you cannot solve with your buddy to the weekly meeting.
+:::
diff --git a/_quarto.yml b/_quarto.yml
index 36979ab..81ecba1 100644
--- a/_quarto.yml
+++ b/_quarto.yml
@@ -15,6 +15,10 @@ book:
         - 1-3-fair.qmd
         - 1-4-team-work.qmd
     - part: 2-project-organization.qmd
+      chapters:
+        - 2-1-naming.qmd
+        - 2-2-organisation.qmd
+        - 2-3-documentation.qmd
     - part: 3-computational-workflows.qmd
     - part: 4-publishing-research.qmd
     - part: summary.qmd