-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathcomputo_RR23.qmd
110 lines (81 loc) · 9.99 KB
/
computo_RR23.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
---
title: "Computo"
subtitle: "An academic journal promoting reproductibility via Quarto and Continuous Integration"
date: last-modified
date-modified: last-modified
toc: false
author:
- name: Julien Chiquet
affiliation: Université Paris-Saclay/AgroParisTech/INRAE
orcid: 0000-0002-3629-3429
email: [email protected]
corresponding: true
- name: Chloé Azencott
affiliation: MinesParisTech
orcid: 0000-0003-1003-301X
- name: François-David Collin
affiliation: IMAG/CNRS
orcid: 0000-0002-8374-7163
- name: Ghislain Durif
affiliation: LBMC/CNRS/ENS Lyon
orcid: 0000-0003-2567-1401
- name: Mathurin Massias
affiliation: INRIA Lyon OCKHAM
orcid: 0000-0002-8950-0356
- name: Pierre Neuvial
affiliation: IMT/CNRS
orcid: 0000-0003-3584-9998
- name: Nelle Varoquaux
afiliation: Université Grenoble Alpes/CNRS
orcid: 0000-0002-8748-6546
format:
computo-pdf: default
bibliography: references.bib
csl: journal-of-open-research-software.csl
abstract: >
Computo is a newborn journal published by the [French Statistical Society](https://sfds.asso.fr/) which aims at promoting computational/algorithmic contributions in statistics and machine learning. In order to achieve this goal, Computo goes beyond classical static publications by leveraging technical advances in literate programming and scientific reporting. We present in this talk the solution that we put in place so far to support both editorial and scientific reproducibility, that relies on community-based tools like Quarto, Github actions and open review. See our webpage \url{https://computo.sfds.asso.fr/}, and github group \url{https://github.com/computorg}.
keywords: [Reproducible research -- Statistics -- Machine-Learning -- Quarto -- R -- Python -- Julia]
---
## About
### Aims and scope
*Computo* has been created in the context of a reproducibility crisis in science [@ioannidis2005; @steen2011; @allison2016; @bastian2016; @whitfield2021], which calls for improvements of scientific research implementation [@desquilbet2019; @the_turing_way2022] and for higher standards in the publication of scientific results. *Computo* aims at promoting computational and algorithmic contributions in statistics and machine learning (ML) that provide insight into which models or methods are the most appropriate to address a specific scientific question. The journal welcomes the following types of contributions:
1. New methods with original stats/ML developments, or numerical studies that illustrate theoretical results in stats/ML;
2. Case studies or surveys on stats/ML methods to address a specific (type of) question in data analysis, neutral comparison studies that provide insight into when, how, and why the compared methods perform well or less well;
3. Software/tutorial papers to present implementations of stats/ML algorithms or to feature the use of a package/toolbox/library.
### An open access journal with reproducible contributions
*Computo* is free for readers and authors. It is a diamond open access journal [@ancion2022] which means that all contents are freely published without any fee for the authors, and are freely available without any charge to the reader or their institution. Users are allowed to read, download, copy, distribute, print, search, or link to the full texts of the articles, or use them for any other lawful purpose, without asking prior permission from the publisher or the author^[as long as original authors are duly cited.]. This is in accordance with the [Budapest Open Access Initiative (BOAI)](https://www.budapestopenaccessinitiative.org/) definition of open access and with the [Plan S](https://www.coalition-s.org/addendum-to-the-coalition-s-guidance-on-the-implementation-of-plan-s/principles-and-implementation/) initiative [@planS].
The reproducibility of numerical results is a necessary condition for publication in *Computo*. In particular, submissions must include all necessary data (e.g. via Zenodo repositories) and codes. For contributions featuring the implementation of methods/algorithms, the quality of the provided code is assessed during the review process.
The reviews are open, i.e. visible to any reader after acceptance of the contribution. Reviewers may choose to remain anonymous or not.
## Submission
### Overview
Submissions require both scientific content (equations, codes and figures, data) and a proof that this content is reproducible. This is achieved by means of a notebook system, a virtual environment fixing the dependencies and continuous integration (plus, if needed, an external data repository to store data files such as [Zenodo](https://zenodo.org/) [@zenodo] or [OSF](https://osf.io/) [@foster2017]). A *Computo* submission is thus a git repository containing
- the source file of the notebook with auxiliary files: a markdown file with yaml metadata and some statics files, e.g. figures or small .csv data tables;
- configuration files to set up the dependencies in a virtual environment;
- configuration files to set up the continuous integration rendering the final documents.
The compiled notebook (HTML and PDF) is directly generated in the git repository via continuous integration (e.g., Github action or Gitlab CI) and published, if successful, to a web page (e.g. gh-page). The PDF and the git repository address are then submitted to our submission platform.
::: {.callout-note}
### On the use of [Quarto](https://quarto.org/)
Our choice to rely on `Quarto` has been carefully considered. Leveraging its experience in literate programming [@knuth1984] with [`Rmarkdown`](https://pkgs.rstudio.com/rmarkdown/) [@rmarkdown] and relying on top community tools like universal document converter [`Pandoc`](https://pandoc.org/) [@pandoc], RStudio/Posit developed [`Quarto`](https://quarto.org/) [@quarto], a standalone, language-agnostic publishing tool^[including support for R, Python, Julia and Observable JS.]. This point was decisive for the editorial committee of *Computo*, which in no way wishes to privilege one language over another. We thus encourage authors to use Quarto, which hopefully offer numerous publishing oriented features, a centralized workflow which facilitates the final production, and a literate programming experience well adapted to VScode/Rstudio/Neovim users.
:::
::: {.callout-warning}
### **Warning**
*Computo* is not just about publishing a notebook and proving that it can be compiled with CI! This part of the process is what we call _"Editorial Reproducibility"_. _"Scientific"_ or _"numerical"_ reproducibility^[We don't ask people reproducing their data... yet!] ^[We also don't ask for bit-wise reproducibility but at least statistical reproducibility.] of the analyses is also mandatory, on top of classical peer-review evaluation.
:::
### Available templates
`Quarto`-based templates are available from our [github group](https://github.com/computorg/), which serves as material for starting to write a submission, and as a documentation for doing so. We developed a [Quarto extension](https://github.com/computorg/computo-quarto-extension) and dedicated templates for setting everything up either for R, Python and Julia users:
- [*Computo* Template for R users](https://github.com/computorg/template-computo-R)^[`Quarto` R-based notebooks are similar to Rmarkdown notebooks which are also natively supported by `Quarto`.]
- [*Computo* Template for Python users](https://github.com/computorg/template-computo-python)
- [*Computo* Template for Julia users](https://github.com/computorg/template-computo-julia)
<!--
If you are attached to Jupyter book or do not prefer to use Quarto, you are still encouraged to submit to *Computo*:a Jupyter-myst template is available that requires more formatting work for production, but author comfort is a priority:
- [Jupyter Myst template](https://github.com/computorg/template-computo-myst)
-->
## Reviewing and publication
Submitted papers are reviewed by external reviewers selected by the Associate Editor in charge of the paper. *Computo* strives for fast reviewing cycles, but cannot provide strict guarantees on the matter; the current time between submission and publication is under six months. In order to ensure an efficient reviewing process, authors are requested upon submission to suggest the names of four potential referees. To avoid conflicts of interests, recent co-authors or collaborators should not be suggested. At the moment, *Computo* relies on [Scholastica](https://scholasticahq.com/) for the review process. We plan to migrate very soon to the [openreview plateform](https://openreview.net/).
Once a manuscript is accepted, its reviews are made available on the finally rendered article as an issues associated to the git repository. Reviewers can choose to remain anonymous or not. *Computo*’s accepted papers are published electronically immediately upon receipt under [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/) [@ccby4]. Authors retain the copyright and full publishing rights without restrictions.
At the moment, we have [3 published contributions](https://computo.sfds.asso.fr/publications/) and are processing 6 additional submissions. We strongly encourage you to submit your work to our journal.
## Perspectives
Boundaries between “editorial” and “scientific/numerical” reproducibilities are not always clear and may be blurred or moving. For example, some contributions which computational workload completely fits the CI rendering process and provides also near-perfect reproducibility of their materials, if approved, achieve both reproducibilities in a single step. But other contributions, for example involving simulations, large amount of data or complex deep-learning models may require a more “hands-on” reproducibility process. One of the perspectives of *Computo* is to broaden the automated reproducibility scope to include more and more “scientific” reproducibility, and, ultimately, to provide authors a full array of computational resources to allow them to include their own scientific/numerical reproducibility developing process in the editorial one.
## References
::: {#refs}
:::