project organisation
HeidiSeibold committed Aug 24, 2023
1 parent 42a2795 commit cd440de
Showing 6 changed files with 176 additions and 16 deletions.
16 changes: 6 additions & 10 deletions 1-reproducible-research.qmd
@@ -18,17 +18,13 @@
Read this chapter and watch this week's videos.
Afterwards go through the following assignments:

```{ojs}
//| echo: false
viewof flavor = Inputs.checkbox([
"What questions came up for you while watching the videos and going through the booklet?",
"What are your personal hurdles for reproducible research? What can you do to address them?",
"What are hurdles for you in producing FAIR data? What can you do for the data you work with?",
"Which of the points in the 8 steps for Planning a Community does your research team already check off? Which should be discussed? If possible, bring open points to your team meeting. "
])
```
- What questions came up for you while watching the videos and going through the booklet?
- What are your personal hurdles for reproducible research? What can you do to address them?
- What are hurdles for you in producing FAIR data? What can you do for the data you work with?
- Which of the points in the 8 steps for Planning a Community does your research team already check off? Which should be discussed? If possible, bring open points to your team meeting.

Discuss your progress with your accountability buddy. Bring any questions and
problems that you cannot solve with your buddy to the weekly meeting.
:::


49 changes: 49 additions & 0 deletions 2-1-naming.qmd
@@ -0,0 +1,49 @@
## Naming things

This chapter shows you how to pick good names.

**Good names for files, folders, functions** and other things can make a research project (or any project on your computer, really) more pleasant, both for yourself and for the people you work with.

Let's be kind to ourselves and the people around us and get into naming 🙌!


A few examples of bad and good file names from [Jenny Bryan's slides](https://speakerdeck.com/jennybc/how-to-name-files):

> #### BAD ❌ {.unnumbered}
>
> - Myabstract.docx
>
> - Joe's Filenames Use Spaces and Punctuation.xlsx
>
> - figure 1.png
>
> - fig 2.png
>
> - JW7d\^(2sl\@deletethisandyourcareerisoverWx2\*.txt
>
> #### GOOD ✅ {.unnumbered}
>
> - 2014-06-08_abstract-for-sla.docx
>
> - Joes-filenames-are-getting-better.xlsx
>
> - Fig01_scatterplot-talk-length-vs-interest.png
>
> - Fig02_histogram-talk-attendance.png
>
> - 1986-01-28_raw-data-from-challenger-o-rings.txt

Names should be:

- **Machine readable** 💻
- **Human readable** 🧐
- **Optional: Consistent** ⚙️ (decide how you use underscores \_ and dashes -, if you want to use CamelCase or not, ...)
- **Optional: Play well with default ordering** ⬇ (e.g. start your file names with the creation date `YYYY-MM-DD`)
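
The machine-readable part of this list can be checked automatically. Below is a minimal Python sketch, assuming a convention that is only one possible choice: names may contain letters, digits, dashes and underscores, followed by a single extension, and may optionally start with a `YYYY-MM-DD_` date. The function names and the regular expressions are illustrative assumptions, not part of the slides. Note that the check says nothing about human readability: `Myabstract.docx` passes it but is still a poor name.

```python
import re

# Assumed convention: letters, digits, "-" and "_" only, then one extension.
SAFE_NAME = re.compile(r"[A-Za-z0-9_-]+\.[A-Za-z0-9]+")
# Assumed convention: an ISO 8601 date (YYYY-MM-DD) followed by an underscore.
ISO_DATE_PREFIX = re.compile(r"\d{4}-\d{2}-\d{2}_")


def is_machine_readable(filename: str) -> bool:
    """Return True if the name has no spaces or special punctuation."""
    return SAFE_NAME.fullmatch(filename) is not None


def has_date_prefix(filename: str) -> bool:
    """Return True if the name starts with YYYY-MM-DD and an underscore."""
    return ISO_DATE_PREFIX.match(filename) is not None


# File names taken from the BAD and GOOD examples above.
examples = [
    "figure 1.png",
    "Joe's Filenames Use Spaces and Punctuation.xlsx",
    "2014-06-08_abstract-for-sla.docx",
    "Fig01_scatterplot-talk-length-vs-interest.png",
]
for name in examples:
    print(f"{name!r}: machine readable={is_machine_readable(name)}, "
          f"date prefix={has_date_prefix(name)}")
```

A date prefix in this format makes alphabetical ordering match chronological ordering, which is why it plays well with default sorting.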


### Further reading

- [Naming files, folders and other things](https://the-turing-way.netlify.app/project-design/filenaming.html), The Turing Way
- [Project structure slides](https://djnavarro.net/slides-project-structure/#1), Danielle Navarro
- [File naming slides](https://speakerdeck.com/jennybc/how-to-name-files), Jenny Bryan
- [ISO 8601, a standard for dates](https://en.wikipedia.org/wiki/ISO_8601), Wikipedia
43 changes: 43 additions & 0 deletions 2-2-organisation.qmd
@@ -0,0 +1,43 @@
## File and folder organization

**My first research project was a mess 🙈.** I had hundreds of files with dubious file names and sometimes several files with similar code written for computing on different infrastructures (my computer, the institute server, the cluster of the computing facility).

I felt like the worst researcher of all time. But I wasn't. **Many struggle with organizing their files and folders in increasingly complex research projects.**


### How to organize files and folders well?

It basically comes down to structuring folders and files **systematically from the beginning**.

Think about what a good folder structure could be for your research project. A standard project of mine looks something like this:

```
.
├── analysis              <- all things data analysis
│   └── src               <- functions and other source files
├── comm
│   ├── internal_comm     <- internal communication such as meeting notes
│   └── journal_comm      <- communication with the journal, e.g. peer review
├── data
│   ├── data_clean        <- clean version of the data
│   └── data_raw          <- raw data (don't touch)
├── dissemination
│   ├── manuscripts
│   ├── posters
│   └── presentations
├── documentation         <- documentation, e.g. data management plan
└── misc                  <- miscellaneous files that don't fit elsewhere
```
```

You can download this folder structure as a template from [https://github.com/HeidiSeibold/research-project-template](https://github.com/HeidiSeibold/research-project-template).
Not every project is the same and likely your project will be more complex than this. But if you think about good organization from the beginning, it will be easier in the long run.
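
If you prefer to create the folders with a script rather than by downloading the template, here is a minimal Python sketch. The folder names are copied from the tree above; the function name `create_project` and the root folder `my-project` are hypothetical, and the script is not part of the template repository.

```python
from pathlib import Path

# Folder names copied from the tree above.
FOLDERS = [
    "analysis/src",
    "comm/internal_comm",
    "comm/journal_comm",
    "data/data_clean",
    "data/data_raw",
    "dissemination/manuscripts",
    "dissemination/posters",
    "dissemination/presentations",
    "documentation",
    "misc",
]


def create_project(root):
    """Create the folder skeleton under `root` and return the created paths."""
    created = []
    for folder in FOLDERS:
        path = Path(root) / folder
        # parents=True creates intermediate folders such as "analysis";
        # exist_ok=True makes the script safe to run more than once.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created


if __name__ == "__main__":
    # "my-project" is a hypothetical project name.
    for path in create_project("my-project"):
        print(path)
```

Because existing folders are left untouched, you can extend `FOLDERS` later and rerun the script on a project that is already in use.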

What do you think about file or folder organization? Is your folder structure similar to mine?

### Further reading

- [Research Compendia](https://the-turing-way.netlify.app/reproducible-research/compendia.html), The Turing Way
- [Towards a Standardized Research Folder Structure](https://genr.eu/wp/towards-a-standardized-research-folder-structure/), GenR blog
- Folder structure of R packages, [Making Packages in R](https://swcarpentry.github.io/r-novice-inflammation/08-making-packages-R/), Software Carpentry
- [Research Project Template](https://github.com/HeidiSeibold/research-project-template), Heidi Seibold
- [Data Analysis Project Template](http://projecttemplate.net/), a [group of R users](https://github.com/KentonWhite/ProjectTemplate/graphs/contributors)
63 changes: 63 additions & 0 deletions 2-3-documentation.qmd
@@ -0,0 +1,63 @@
## Documentation

In this chapter we discuss research documentation for reproducible research.

*How can I document my research outputs?*

There is no clear catch-all answer to this question. It depends on your needs, on your audience, and on the types of research outputs you generate. Below you find a few ideas to start from.


### Documenting research projects

One thing that I always do is add a README text file to each project. In the README I write the **most important info about the project**: What is it about? Who is involved? Where to find files? How to cite it? Where to find the paper? ...

As an example, check out my project on [personalised medicine](https://github.com/HeidiSeibold/personalised_medicine).

For more complex research projects, you can create a whole wiki or similar to
describe the project. For most projects a README will be just fine.
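
To get started, a README skeleton can be generated from the questions above. This is a minimal Python sketch; the section headings, the function name `readme_skeleton` and the example title are assumptions for illustration, so adapt them to your project.

```python
# One section per question from the text; the heading wording is an assumption.
README_SECTIONS = [
    ("About", "What is the project about?"),
    ("People", "Who is involved?"),
    ("Files", "Where to find files?"),
    ("Citation", "How to cite it?"),
    ("Paper", "Where to find the paper?"),
]


def readme_skeleton(title: str) -> str:
    """Return a Markdown README skeleton with one section per question."""
    lines = [f"# {title}", ""]
    for heading, question in README_SECTIONS:
        # The question is kept as an HTML comment to be replaced by real text.
        lines += [f"## {heading}", "", f"<!-- {question} -->", ""]
    return "\n".join(lines)


# "My research project" is a placeholder title.
print(readme_skeleton("My research project"))
```

Save the output as `README.md` in the project root and replace each comment with a sentence or two.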

### Documenting data

Metadata is central to documenting data. Metadata is information about your data: its license, who owns it, what information the data contain, and so on. In short, it is the essential documentation of your data.

Many research fields have **standards for metadata**. If you can't find one for your field you can use a common standard (e.g. [Dublin Core](https://www.dublincore.org/specifications/dublin-core/dces/)) or just ask a data manager or librarian at your institution. You can write metadata similar to a README (see e.g. this [guide from Cornell University](https://data.research.cornell.edu/content/readme)). If you upload your data to a data platform (e.g. [Dryad](https://datadryad.org/)) you won't have to think about it as the platform usually takes care of that (Dryad uses Dublin Core).
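
To give an idea of what a metadata record can look like, here is a minimal Python sketch that uses element names from the Dublin Core Metadata Element Set as keys and writes the record as JSON. All values are made-up placeholders, the helper `unknown_elements` is hypothetical, and JSON is just one possible serialization: your repository or field standard may expect a different format.

```python
import json

# The 15 element names of the Dublin Core Metadata Element Set.
DUBLIN_CORE_ELEMENTS = {
    "contributor", "coverage", "creator", "date", "description",
    "format", "identifier", "language", "publisher", "relation",
    "rights", "source", "subject", "title", "type",
}

# Hypothetical record: every value here is a made-up placeholder.
metadata = {
    "title": "Example survey data",
    "creator": "Jane Doe",
    "date": "2024-01-31",
    "description": "Placeholder description of what the data contain.",
    "format": "text/csv",
    "language": "en",
    "rights": "CC BY 4.0",
}


def unknown_elements(record: dict) -> set:
    """Return the keys in `record` that are not Dublin Core element names."""
    return set(record) - DUBLIN_CORE_ELEMENTS


# Warn about misspelled or non-standard keys before writing the record.
print("Unknown elements:", unknown_elements(metadata))
print(json.dumps(metadata, indent=2))
```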


### Documenting code

To make my code as understandable as possible for others, I use **literate programming** (mixing text and code to make it easier to read, e.g. [Quarto](https://quarto.org/)) or add clear **code comments**. When writing functions in R I additionally use the standardized way to document R functions (via [**Roxygen2**](https://cran.r-project.org/web/packages/roxygen2/vignettes/roxygen2.html)).

An example of code comments in R (`##`):
```{r}
#| eval: false
## Load package + data
library("model4you")
data("MathExam14W", package = "psychotools")
## scale points achieved to [0, 100] percent
MathExam14W$tests <- 100 * MathExam14W$tests/26
MathExam14W$pcorrect <- 100 * MathExam14W$nsolved/13
## select variables to be used
MathExam <- MathExam14W[ , c("pcorrect", "group", "tests", "study",
                             "attempt", "semester", "gender")]
```



### Documenting other things

Whatever you work on, there might be parts of your research project that are difficult to understand. If you work in a lab, your documentation may be a **lab notebook**. If you do interviews, your documentation may be your interview strategy. **Anything that might be useful for others is worth keeping and worth sharing**. *After all, we all want to build on the work of others in order to make the world a little better.*


### Further reading

Want to learn more? Check out:

- [Landing Page - README file](https://the-turing-way.netlify.app/project-design/project-repo/project-repo-readme.html?highlight=readme), The Turing Way
- [A beginner's guide to writing documentation](https://www.writethedocs.org/guide/writing/beginners-guide-to-docs/), Write The Docs
- [R Markdown: The Definitive Guide](https://bookdown.org/yihui/rmarkdown/), Yihui Xie, J. J. Allaire, Garrett Grolemund
- [knitr](https://yihui.org/knitr/) - Elegant, flexible, and fast dynamic report generation with R, Yihui Xie
- [Guide to writing "readme" style metadata](https://data.research.cornell.edu/content/readme), research data management service group, Cornell University

17 changes: 11 additions & 6 deletions 2-project-organization.qmd
@@ -11,12 +11,17 @@
:::


## Naming things
::: {.callout-caution}
## Tasks

## File and folder organization
Read this chapter and watch this week's videos.
Afterwards go through the following assignments:

## Documentation
- What questions came up for you while watching the videos and going through the booklet?
- What are your personal hurdles for reproducible research? What can you do to address them?
- What are hurdles for you in producing FAIR data? What can you do for the data you work with?
- Which of the points in the 8 steps for Planning a Community does your research team already check off? Which should be discussed? If possible, bring open points to your team meeting.

### Documenting data
### Documenting code
### Documenting research projects
Discuss your progress with your accountability buddy. Bring any questions and
problems that you cannot solve with your buddy to the weekly meeting.
:::
4 changes: 4 additions & 0 deletions _quarto.yml
@@ -15,6 +15,10 @@ book:
        - 1-3-fair.qmd
        - 1-4-team-work.qmd
    - part: 2-project-organization.qmd
      chapters:
        - 2-1-naming.qmd
        - 2-2-organisation.qmd
        - 2-3-documentation.qmd
    - part: 3-computational-workflows.qmd
    - part: 4-publishing-research.qmd
    - part: summary.qmd