workflows + beginning of publishing
HeidiSeibold committed Aug 24, 2023
1 parent cd440de commit 6ad9694
Showing 13 changed files with 196 additions and 9 deletions.
3 changes: 1 addition & 2 deletions 1-reproducible-research.qmd
@@ -23,8 +23,7 @@ Afterwards go through the following assignments:
- What are hurdles for you in producing FAIR data? What can you do for the data you work with?
- Which of the points in the 8 steps for Planning a Community does your research team already check off? Which should be discussed? If possible, bring open points to your team meeting.

Discuss your progress with your accountability buddy. Bring any questions and
problems that you cannot solve with your buddy to the weekly meeting.
Bring any questions and problems that come up to the weekly meeting.
:::


6 changes: 2 additions & 4 deletions 2-project-organization.qmd
@@ -17,10 +17,8 @@
Read this chapter and watch this week's videos.
Afterwards go through the following assignments:

- What questions came up for you while watching the videos and going through the booklet?
- What are your personal hurdles for reproducible research? What can you do to address them?
- What are hurdles for you in producing FAIR data? What can you do for the data you work with?
- Which of the points in the 8 steps for Planning a Community does your research team already check off? Which should be discussed? If possible, bring open points to your team meeting.
- Create a folder structure template that would work for most of your projects. If possible, discuss it with your team.
- For your current project, implement the organization and documentation practices that you learned about. Write down questions that come up in the process.

Discuss your progress with your accountability buddy. Bring any questions and
problems that you cannot solve with your buddy to the weekly meeting.
50 changes: 50 additions & 0 deletions 3-1-version-control.qmd
@@ -0,0 +1,50 @@
## Version control

Version control is a big topic for me. It completely changed the way I work. I am happy that we get to talk about version control as part of this 6-step process towards reproducibility.

### What is version control?

Let's say you are writing a paper. You will edit your paper and might want to keep different versions of it. A common way to handle that is by using different file names for different versions.



::: captioned-image-container
![http://www.phdcomics.com/comics/archive/phd101212s.gif](http://www.phdcomics.com/comics/archive/phd101212s.gif){fig-alt="http://www.phdcomics.com/comics/archive/phd101212s.gif"}
:::

This way of "version control" is outdated and error-prone (hence the pixelated image 😜). The most common proper version control system today is [Git](https://git-scm.com/), which I'd like to introduce to you now.

### Git for version control

Git is free and open source 😃🙌.

With Git you can track different versions of your paper. For each version you can add a description ("commit message") and you even automatically track who made which change if you are working in a group. You can always go back to old versions.

The way you work with Git is that you have the version database both on your computer and on a server. To move changes between the two you use commands (`pull` = download changes from the server, `push` = upload changes to the server).

::: captioned-image-container
![](images/vc-server.jpg){fig-alt="Version control with a server: changes are exchanged between computers and a server via pull and push."}
:::
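
To make `pull` and `push` a bit more concrete, here is a minimal sketch you can run in a terminal. It rests on a few assumptions of mine: a local "bare" repository stands in for the server (in real life that would be GitLab or GitHub), and the file name `paper.md` as well as the author name and email are placeholders.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Work in a throwaway directory so nothing else on your computer is touched.
workdir="$(mktemp -d)"
cd "$workdir"

# A bare repository plays the role of the server (GitLab or GitHub in real life).
git init --quiet --bare server.git

# Get a working copy of the (still empty) version database.
git clone --quiet server.git paper
cd paper
git config user.name "Example Author"        # placeholder identity
git config user.email "author@example.org"   # placeholder identity

# Version 1: write a draft and record it with a commit message.
echo "First draft of the introduction." > paper.md
git add paper.md
git commit --quiet -m "Add first draft of introduction"

# Version 2: change the text and record the new version.
echo "Revised introduction." > paper.md
git add paper.md
git commit --quiet -m "Revise introduction"

# push = upload your versions to the server.
branch="$(git rev-parse --abbrev-ref HEAD)"
git push --quiet origin "$branch"

# pull = download versions from the server (nothing new here, we just pushed).
git pull --quiet origin "$branch"

# List all versions with their authors, then show the file as it was one version ago.
git log --format='%an: %s'
git show HEAD~1:paper.md
```

The last command prints the first draft again, even though `paper.md` now contains the revised text. That is what "going back to old versions" looks like in practice.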

Most researchers use GitLab or GitHub as the server for their Git projects. These platforms provide a neat front end and some extra features for collaboration (e.g. issues, wikis, ...).

Learning Git can be daunting 🙀. I recommend learning it with a group or in a class. I am always happy to teach version control. You can also check if there is a free [Software Carpentry](https://software-carpentry.org/) class in your area.

### Other version control systems

There are many other ways of doing version control out there.

**Subversion:** Simpler systems like Subversion are used less these days as Git offers more flexibility.

**Google Docs and friends:** Many online text editors (Google Docs, OneDrive, ...) offer versioning now. It is not as advanced and versatile, but a nice way to work in a [WYSIWYG](https://en.wikipedia.org/wiki/WYSIWYG) (What You See Is What You Get) editor. Git only works well with plain text files, so people usually use LaTeX or Markdown (not WYSIWYG) to write texts when using Git.

**Versioning data:** Version control of data is a difficult task. Let's leave that for another day. See [here](https://the-turing-way.netlify.app/reproducible-research/vcs/vcs-data.html) for more info for now.


### Further reading

- [Version Control](https://the-turing-way.netlify.app/reproducible-research/vcs.html), The Turing Way
- [Version Control with Git](https://swcarpentry.github.io/git-novice/), Software Carpentry
- [Version Control with Git](https://annakrystalli.me/rrresearchACCE20/version-control-with-git.html) (for R users), Anna Krystalli
- [Set up Git with RStudio & GitLab](https://gitlab.com/HeidiSeibold/setup-git-rstudio-gitlab), Heidi Seibold


100 changes: 100 additions & 0 deletions 3-2-stabilize.qmd
@@ -0,0 +1,100 @@
## Stabilize your computing environment and software

This topic may sound technical and boring at first, but please bear
with me 🙏. **It will be useful!**


Have you ever run old code and found that it just did not work anymore?
After hours of digging into the issue you find that it's because the
software package you use has changed in the meantime 🧐

Or have you tried to run someone else's code, which seems to work on
their machine but not on yours, and you just don't know why?

This chapter is all about avoiding such problems in the future
by **stabilizing your computing environment and software**. ✅


### What is a computing environment?

Your computing environment is defined by your computer, the operating
system and the software installed. If you update your operating system
or your software, your computing environment changes. In R, for example,
you can learn a lot about your computing environment by typing
`sessionInfo()`.

```{r}
sessionInfo()
```

It tells you the R version, the operating system, and the loaded R
packages along with their versions.

### Options for stabilizing your computing environment

#### 1) Record your computing environment {.unnumbered}

Document the software versions you used. For example, if you use R, you
could copy the output of `sessionInfo()` into your README or somewhere
else where future you (and others) can find this information. This is
not exactly "stabilizing" but it gives the possibility to install the
same software versions again.
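
If you like to work in a terminal, recording can look roughly like the sketch below. The file names `environment.txt` and `session-info.txt` and the list of tools are my assumptions, so adapt them to the software you actually use.

```bash
#!/usr/bin/env bash
set -euo pipefail

outfile="environment.txt"   # hypothetical file name, keep it next to your README

{
  echo "Recorded on: $(date -u +%Y-%m-%d)"
  echo "Operating system: $(uname -srm)"
  # Record the version of each tool, if it is installed (example list, adapt it).
  for tool in R git python3; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "$tool: $("$tool" --version 2>&1 | head -n 1)"
    else
      echo "$tool: not installed"
    fi
  done
} > "$outfile"

# For R users: also save the output of sessionInfo() to a file.
if command -v Rscript >/dev/null 2>&1; then
  Rscript -e 'writeLines(capture.output(sessionInfo()), "session-info.txt")'
fi

cat "$outfile"
```

Note that a fresh `Rscript` session only lists the base packages. To capture the packages your analysis really uses, call `sessionInfo()` at the end of your analysis script instead.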

#### 2) Use one virtual machine per research project {.unnumbered}

You don't need to know what a virtual machine is or how to set it up to
be able to do this. I used to ask the wonderful IT person at my
institute to set up a virtual machine for me and if your IT supporters
know their job, they'll be able to help you here.

A virtual machine is essentially a virtual computer on another computer
or server (To those nerds out there, I know I am probably explaining it
incorrectly but for the purpose of what we want to achieve here, it's
good enough). If you have one virtual machine for each project, you can
keep the computing environment stable by not installing or updating
software after you've finished the research project.

The downside of this strategy is that it only helps future you and
your collaborators, but not other researchers who want to work with
the same computing environment.

#### 3) Use one container per research project {.unnumbered}

Containers are similar to virtual machines (think little computer inside
your computer). The big difference is that you can make them available
for others. So you can send your container image (or the file describing
it) to others.

::: captioned-image-container
![](images/docker-computers.jpg)
:::

Popular container tools are **Docker and Apptainer** (formerly
Singularity). Learning to work with containers is not super easy, but it
is worth the time and actually can be applied in so many other
situations. So, a great skill to have even if you decide to quit
research.
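
To give you an idea of what "the file describing it" can look like, here is a minimal sketch for Docker. Everything specific in it is an assumption on my side: the base image `rocker/r-ver:4.3.1` (an image with a fixed R version from the Rocker project), the folder `/home/project` and the script name `analysis.R`. The sketch only writes the file. The commands for building and running the container are shown as comments because they need Docker to be installed.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Write a minimal Dockerfile: the text file that describes the container image.
cat > Dockerfile <<'EOF'
# Start from an image with a fixed R version (assumption: Rocker project image).
FROM rocker/r-ver:4.3.1
# Copy the project files into the image (hypothetical folder).
COPY . /home/project
WORKDIR /home/project
# Run the analysis when the container starts (hypothetical script name).
CMD ["Rscript", "analysis.R"]
EOF

cat Dockerfile

# Next steps on a computer with Docker installed:
#   docker build -t my-project .    # create the image from the Dockerfile
#   docker run --rm my-project      # start a container and run the analysis
```

Because the Dockerfile is a plain text file, you can put it under version control together with your code and share it with others.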

#### 4) Other {.unnumbered}

There are many other options out there. I wrote down the three that are
least dependent on the actual software you use. For R users, check out
packages `logrx`, `rang`, `packrat`, `versions`, and `renv`.


### Further reading

- [Reproducible
Environments](https://the-turing-way.netlify.app/reproducible-research/renv.html),
The Turing Way
- Video: [How can software containers help your
research?](https://youtu.be/HelrQnm3v4g), Paula Andrea Martinez +
Australian Research Data Commons
- [R Docker tutorial](https://jsta.github.io/r-docker-tutorial/),
maintained by Jemma Stachelek

That's all for this chapter. I hope it was helpful and not too technical. Happy
researching! 🙌



3 changes: 3 additions & 0 deletions 3-3-automate.qmd
@@ -0,0 +1,3 @@
## Automate your code

TODO
21 changes: 21 additions & 0 deletions 3-computational-workflows.qmd
@@ -1,4 +1,5 @@
# Computational workflows

::: {.callout-note}
## Learning targets

@@ -9,3 +10,23 @@
- You will have a roadmap on how to automate your code.
:::

::: {.callout-important}
This week's assignments are a bit more difficult and will take longer to implement
than the previous ones. Please plan for this.
:::

::: {.callout-caution}
## Tasks

Read this chapter and watch this week's videos.
Afterwards go through the following assignments:

- Install Git: For R users, I recommend following [these](https://gitlab.com/HeidiSeibold/setup-git-rstudio-gitlab) instructions. Otherwise the Carpentries have good [instructions](https://swcarpentry.github.io/git-novice/index.html) for all systems (follow along until the end of `2. Setting up Git`).
- Create a project on GitLab (or GitHub). Then `clone` it, edit a file, `add`,
`commit`, and `push` your changes.
- View the open issues for our course booklet and choose one where you can make a contribution. Create a merge request with your contribution and mark it in the respective issue.
- Optional: try using Make to automate something in your current research project.

Discuss your progress with your accountability buddy. Bring any questions and
problems that you cannot solve with your buddy to the weekly meeting.
:::
1 change: 1 addition & 0 deletions 4-1-licensing.qmd
@@ -0,0 +1 @@
## Licensing
1 change: 1 addition & 0 deletions 4-2-where-to-publish.qmd
@@ -0,0 +1 @@
## Where to publish
1 change: 1 addition & 0 deletions 4-3-fair-revisited.qmd
@@ -0,0 +1 @@
## FAIR revisited
15 changes: 12 additions & 3 deletions 4-publishing-research.qmd
@@ -10,8 +10,17 @@
- You will know how to implement the FAIR principles in practice.
:::

## Licensing

## Where to publish

## FAIR revisited

::: {.callout-caution}
## Tasks

Read this chapter and watch this week's videos.
Afterwards go through the following assignments:

- Upload something (e.g. data or a slide deck) to a repository of your choice. How FAIR can you make it?

Discuss your progress with your accountability buddy. Bring any questions and
problems that you cannot solve with your buddy to the weekly meeting.
:::
4 changes: 4 additions & 0 deletions _quarto.yml
@@ -20,6 +20,10 @@ book:
- 2-2-organisation.qmd
- 2-3-documentation.qmd
- part: 3-computational-workflows.qmd
chapters:
- 3-1-version-control.qmd
- 3-2-stabilize.qmd
- 3-3-automate.qmd
- part: 4-publishing-research.qmd
- part: summary.qmd
chapters:
Binary file added images/docker-computers.jpg
Binary file added images/vc-server.jpg
