\documentclass[onecolumn]{article}
\usepackage{lmodern}
\usepackage{amssymb,amsmath}
\usepackage{ifxetex,ifluatex}
\usepackage{fixltx2e} % provides \textsubscript
\ifnum 0\ifxetex 1\fi\ifluatex 1\fi=0 % if pdftex
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\else % if luatex or xelatex
\ifxetex
\usepackage{mathspec}
\else
\usepackage{fontspec}
\fi
\defaultfontfeatures{Ligatures=TeX,Scale=MatchLowercase}
\fi
% use upquote if available, for straight quotes in verbatim environments
\IfFileExists{upquote.sty}{\usepackage{upquote}}{}
% use microtype if available
\IfFileExists{microtype.sty}{%
\usepackage{microtype}
\UseMicrotypeSet[protrusion]{basicmath} % disable protrusion for tt fonts
}{}
\usepackage[margin=1in]{geometry}
\usepackage{hyperref}
\hypersetup{unicode=true,
pdftitle={Publishing computational research -- A review of infrastructures for reproducible and transparent scholarly communication},
pdfauthor={Markus Konkol (m.konkol {[}at{]} uni-muenster {[}dot{]} de), Daniel Nüst, Laura Goulier (Institute for Geoinformatics, University~of~Münster, Münster,~Germany)},
pdfborder={0 0 0},
breaklinks=true}
\urlstyle{same} % don't use monospace font for urls
\usepackage{longtable,booktabs}
\usepackage{graphicx,grffile}
\makeatletter
\def\maxwidth{\ifdim\Gin@nat@width>\linewidth\linewidth\else\Gin@nat@width\fi}
\def\maxheight{\ifdim\Gin@nat@height>\textheight\textheight\else\Gin@nat@height\fi}
\makeatother
% Scale images if necessary, so that they will not overflow the page
% margins by default, and it is still possible to overwrite the defaults
% using explicit options in \includegraphics[width, height, ...]{}
\setkeys{Gin}{width=\maxwidth,height=\maxheight,keepaspectratio}
\IfFileExists{parskip.sty}{%
\usepackage{parskip}
}{% else
\setlength{\parindent}{0pt}
\setlength{\parskip}{6pt plus 2pt minus 1pt}
}
\setlength{\emergencystretch}{3em} % prevent overfull lines
\providecommand{\tightlist}{%
\setlength{\itemsep}{0pt}\setlength{\parskip}{0pt}}
\setcounter{secnumdepth}{0}
% Redefines (sub)paragraphs to behave more like sections
\ifx\paragraph\undefined\else
\let\oldparagraph\paragraph
\renewcommand{\paragraph}[1]{\oldparagraph{#1}\mbox{}}
\fi
\ifx\subparagraph\undefined\else
\let\oldsubparagraph\subparagraph
\renewcommand{\subparagraph}[1]{\oldsubparagraph{#1}\mbox{}}
\fi
%%% Use protect on footnotes to avoid problems with footnotes in titles
\let\rmarkdownfootnote\footnote%
\def\footnote{\protect\rmarkdownfootnote}
%%% Change title format to be more compact
\usepackage{titling}
% Create subtitle command for use in maketitle
\providecommand{\subtitle}[1]{
\posttitle{
\begin{center}\large#1\end{center}
}
}
\setlength{\droptitle}{-2em}
\title{Publishing computational research -- A review of infrastructures for
reproducible and transparent scholarly communication}
\pretitle{\vspace{\droptitle}\centering\huge}
\posttitle{\par}
\author{Markus Konkol (m.konkol {[}at{]} uni-muenster {[}dot{]} de), Daniel
Nüst, Laura Goulier (Institute for Geoinformatics,
University~of~Münster, Münster,~Germany)}
\preauthor{\centering\large\emph}
\postauthor{\par}
\date{}
\predate{}\postdate{}
\setlength{\columnsep}{18pt}
\usepackage{url} \usepackage{breakurl}
\PassOptionsToPackage{hyperindex,breaklinks}{hyperref}
\usepackage{caption} \captionsetup{width=5in}
\begin{document}
\maketitle
\begin{abstract}
Funding agencies increasingly ask applicants to include data and
software management plans in their proposals. In addition, the author
guidelines of scientific journals and conferences more often include a
statement on data availability, and some reviewers reject unreproducible
submissions. This trend towards open science increases the pressure on
authors to provide access to the source code and data underlying the
computational results in their scientific papers. Still, publishing
reproducible articles is a demanding task and not achieved simply by
providing access to code scripts and data files. Consequently, several
projects develop solutions to support the publication of executable
analyses alongside articles considering the needs of the aforementioned
stakeholders. The key contribution of this paper is a review of
applications addressing the issue of publishing executable computational
research results. We compare the approaches across properties relevant
for the involved stakeholders, e.g., provided features and deployment
options, and also critically discuss trends and limitations. The review
can help publishers decide which system to integrate into their
submission process, editors to recommend tools for researchers, and
authors of scientific papers to adhere to reproducibility principles.
\end{abstract}
\hypertarget{introduction}{%
\section{Introduction}\label{introduction}}
Many scientific articles report on results based on computations, e.g.,
a statistical analysis implemented in R. Publishing the underlying source code
and data to adhere to open reproducible research (ORR) principles (i.e.,
public access to code and data underlying the reported results (Stodden
et al. 2016)) seems simple. However, several studies concluded that
papers rarely link to these materials (Stagge et al. 2019; Nüst,
Granell, et al. 2018). Moreover, due to technical challenges, e.g.,
capturing the original computational environment of the analyst, even
accessible materials do not guarantee reproducibility (Chen et al. 2018;
Konkol, Kray, and Pfeiffer 2018). These issues have several implications
(Morin et al. 2012): It is difficult (often even impossible) to find
errors within the analysis, but publishing erroneous papers can damage
an author's reputation (Herndon, Ash, and Pollin 2013) as well as trust
in science (National Academies of Sciences, Medicine, and others 2019).
Also, reviewers cannot verify the results, because they need to
understand the analysis just by reading the text (Bailey, Borwein, and
Stodden 2016). Furthermore, other researchers cannot build upon existing
work but have to collect data and implement the analysis from scratch
(Powers and Hampton 2018). Finally, libraries cannot preserve the
materials for future use or education. These issues are also to
society's disadvantage as it cannot benefit fully from publicly funded
research (Piwowar and Piwowar 2007). Fortunately, funding bodies, e.g.,
Horizon 2020
(\url{https://ec.europa.eu/research/participants/docs/h2020-funding-guide/cross-cutting-issues/open-access-dissemination_en.htm},
last accessed, as were all following URLs, on 20 Dec 2019), increasingly
consider data and software management plans as part of grant proposals.
Accordingly, more editors add a section on code and data availability
to their author guidelines (see, e.g., Nüst, Ostermann, et al. (2019);
Hrynaszkiewicz (2019)), and reviewers consider reproducibility in their
decision process (Stark 2018). Nevertheless, these cultural and
systematic developments (Munafò et al. 2017) alone do not solve the
plethora of reproducibility issues. Authors often do not know how to
fulfill the requirements of funding bodies and journals, such as the TOP
guidelines (Nosek et al. 2015). It is important to consider that
researchers' programming expertise ranges from trained research
software engineers to self-taught beginners. For these reasons, more and
more projects work on solutions to support the publication of executable
supplements. The key contribution of this paper is a review of
applications that support the publication of executable computational
research for transparent and reproducible research. This review can be
used as decision support by publishers who want to comply with
reproducibility principles, editors and programme committees planning to
adopt reproducibility requirements in the author guidelines and
integrate code evaluation in their review process (Eglen and Nüst 2019),
applicants in the process of creating data and software management plans
for their funding proposals, and authors searching for tools to
disseminate their work in a convincing, sustainable, and effective
manner. We also consider aspects related to preservation relevant for
librarians dealing with long-term accessibility of research materials.
Based on the survey, we critically discuss trends and limitations in the
area of reproducible research infrastructures.

\emph{Scope:} This work focuses on applications that support the
publication of research results based on executable source code scripts
(e.g., R or Python) and the underlying data. Hence, we did not consider
workflow systems (e.g., Taverna (Wolstencroft et al. 2013)) or online
repositories (e.g., Open Science Framework, \url{https://osf.io/}).
Also, this paper does not discuss how to work reproducibly since this is
already covered in the literature (e.g., Rule et al. (2019), Sandve et al.
(2013), Greenbaum et al. (2017), Markowetz (2015)). The review is a
snapshot of the highly dynamic area of publishing infrastructures.
Hence, some of the collected information might become outdated, e.g., an
application might extend the set of functionalities or be discontinued.
Still, reviewing the current state of the landscape to reflect on
available options is helpful for publishers, editors, reviewers,
authors, and librarians. All collected data is available in the
supplements (see Data and Software Availability). The paper is
structured as follows: First, we survey fundamental concepts and tools
underlying the applications. We then introduce each application and the
comparison criteria followed by the actual comparison. The paper
concludes with a discussion of our observations, trends, and
limitations.
\hypertarget{fundamental-concepts-and-tools}{%
\section{Fundamental concepts and
tools}\label{fundamental-concepts-and-tools}}
Before reviewing applications for publishing reproducible research, we
briefly survey fundamental concepts and tools that underpin the
applications. This overview is needed to understand how the applications
work and what the limitations are.
\hypertarget{packaging-computational-research-reproducibly}{%
\subsection{Packaging computational research
reproducibly}\label{packaging-computational-research-reproducibly}}
The traditional research article alone is not sufficient to communicate
a complex computational analysis (Donoho 2010). Computational
reproducibility addresses this issue through the publication of the code and data
underlying a research paper. This form of publishing research allows
reviewers to verify the reported results and readers to reuse the
materials (Barba 2018). To achieve that, all materials are needed,
including not only the data and code but also the computational
environment. A basic concept for such a collection is the research
compendium, a ``mechanism that combines text, data, and auxiliary
software into a distributable and executable unit'' (Gentleman and Lang
2007). The concept was extended by a description and snapshot of the
software environment using containerization resulting in the executable
research compendium (Nüst et al. 2017). Containerization and
virtualization are mechanisms to capture the full software stack of a
computational environment, including all software dependencies in a
portable snapshot (Perkel 2019). In contrast to containerization,
virtualization also includes the operating system kernel. Despite this
difference, both approaches have proven to improve transparency and
reproducibility (Boettiger 2015; Howe 2012). One containerization
technology is Docker, which is based on so-called Dockerfiles, human- and
machine-readable recipes to create the image of a virtual environment
(Boettiger 2015). These recipes add a layer of documentation,
making Docker a popular tool in the area of computational
reproducibility (Nüst and Hinz 2019). A research compendium should
contain an entry point, i.e., a main file that needs to be executed to
run the entire analysis. One option to realize these entry points is the
concept of literate programming, an approach for interweaving source
code and text in one notebook (Knuth 1984). Two popular realizations of
such notebooks are Jupyter Notebooks (Kluyver et al. 2016) and R
Markdown (Baumer et al. 2014). Combining source code and text in one
document is advantageous over other approaches, such as having code
scripts and the article separated, which might result in inconsistencies
between the two. A further advantage is the possibility to execute the
analysis with a single click, so-called one-click-reproduce (Pebesma
2013). This form of making computational results available lowers the
barrier for others to reproduce the results and thus increases trust and
transparency of computer-based research.
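
As a minimal sketch of such a recipe, a Dockerfile for a research
compendium whose entry point is an R Markdown document might look as
follows. The base image tag, the directory \texttt{/compendium}, and the
file name \texttt{main.Rmd} are illustrative assumptions and not taken
from any of the reviewed applications; the sketch further assumes that
the base image provides R, pandoc, and the \texttt{rmarkdown} package.

\begin{verbatim}
# Illustrative sketch; image tag, paths, and file names are assumptions.
# Base image that pins the R version and ships pandoc and rmarkdown
FROM rocker/verse:3.6.1
# Copy the compendium (text, code, data) into the image
COPY . /compendium
WORKDIR /compendium
# Entry point: render the main document to run the entire analysis
CMD ["R", "-e", "rmarkdown::render('main.Rmd')"]
\end{verbatim}

Building an image from this file with \texttt{docker build} and starting
a container with \texttt{docker run} would then execute the analysis in
the captured environment instead of the reader's local setup.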
\hypertarget{licensing-and-citation}{%
\subsection{Licensing and Citation}\label{licensing-and-citation}}
Appropriate licensing of research components is crucial yet complex, as
copyright laws differ between component types, e.g., data, software, and
text (Stodden 2009). This is particularly important when it comes to
reusing research components, which is one of the main goals of research
compendia. A further level of complexity emerges if research compendia
include, for example, parts of the data and the code of several already
published papers. A typical use case is reusing a specific version of
code that was published in a repository, while the same code continues to be
developed on a public platform (e.g., GitLab). Besides conscious handling
of licenses and copyrights, building on top of the work of others
requires adequate citations. This can be supported by connecting the
research components with the help of metadata including permanent and
global identifiers, e.g., DOIs (Stodden et al. 2016), which can also be
used for data (Park and Wolfram 2019) and software (Fenner et al. 2016).
\hypertarget{ethical-and-technical-issues}{%
\subsection{Ethical and technical
issues}\label{ethical-and-technical-issues}}
Frequently mentioned issues related to computational reproducibility
concern sensitive data and large data files. To tackle the issue of
sensitive data, a first step would be to anonymize the data. Another
option is to involve a trustworthy authority which ensures that the
results in the article can be achieved based on the used data (Pérignon
et al. 2019). In this case, public access is not required. To ensure
that these solutions are not exploited, authors should argue why hiding
or providing synthetic data is required and reviewers can then decide
whether the reasons are valid. A further solution is the concept of
cloud-based data enclaves, which provide data access only to authorized
persons (Foster 2017). Such approaches for access control could be
connected with the applications discussed in this paper.

Large data files, e.g., global remote sensing datasets, can quickly reach
several petabytes. However, a large number of papers are based on
datasets that can be stored on public and free data repositories, such
as Open Science Framework (file size limit only for individual files,
\url{https://help.osf.io/hc/en-us/articles/360019737894-FAQs\#what-is-the-individual-file-size-limit})
or Zenodo (max. 50~GB by default, extension possible,
\url{https://help.zenodo.org/whatsnew/}). Further limiting factors are
long computation times and the need for specialized hardware, such as
high-performance computing clusters (Ahn et al. 2013).
\hypertarget{materials-and-methods}{%
\section{Materials and Methods}\label{materials-and-methods}}
To obtain an overview of what the applications supporting the
publication of reproducible analyses provide as well as the trends and
limitations, we compared them across a set of criteria.
\hypertarget{materials}{%
\subsection{Materials}\label{materials}}
To ensure that the stakeholders receive current recommendations, we
considered an application as part of our analysis if \textbf{(i)} it was
actively maintained at the time the data for this paper was collected
(5th-13th Dec 2019), \textbf{(ii)} it supported publishing executable
code and data which can be inspected and reused, and \textbf{(iii)} the
application was explicitly connected to the publication process. Hence,
we did not consider technologies that alone cannot support the
publication process of code and data as further infrastructure is needed
(e.g., Docker) or applications that only provide access to data or code
(e.g., Zenodo). We found the applications during literature research and
discussions at conferences or workshops.
\hypertarget{applications}{%
\subsection{Applications}\label{applications}}
Based on these inclusion criteria, ten applications were selected for the
review. In the following, we briefly introduce them in alphabetical
order.

Researchers who have a repository (e.g., on GitHub, GitLab, or Zenodo) containing,
e.g., a Jupyter Notebook can use \textbf{Binder}
(\url{https://mybinder.org/}) to make it available in an executable
environment (Jupyter et al. 2018). Readers can launch the analysis from
a Binder-ready repository and inspect the workflow in a browser. Binder
creates a containerized environment from a repository based on
configuration files. In \textbf{Code Ocean} (Clyburne-Sherin, Fei, and
Green 2019), authors can create so-called ``capsules'' which contain
code, data, and the computational environment including the version of
the operating system and dependencies. Readers can, while studying the
article, execute and inspect the analysis in a separate window below the
online version of the article or on Code Ocean's website. The
\textbf{eLife Reproducible Document Stack} (RDS,
\url{https://elifesciences.org/labs/b521cf4d/reproducible-document-stack-towards-a-scalable-solution-for-reproducible-articles})
enables authors to publish executable documents based on Stencila
(\url{https://stenci.la/}), an open-source editor for articles. The
executable document, which contains the whole narrative and executable
code snippets, is not only a supplement but the actual scientific
article. \textbf{Galaxy} (Goecks et al. 2010) is a web-based application
for developing computational analyses without programming expertise.
Scientists can upload and analyze data by using Jupyter Notebooks
(Grüning et al. 2017). \textbf{Gigantum} (\url{https://gigantum.com/})
builds on top of Git and packages code, data, the computational
environment, and the work history into a Git repository. Gigantum is
composed of a client application for creating as well as executing
analyses locally, and a cloud-based infrastructure for sharing
computations and collaborating with peers. \textbf{Manuscripts}
(\url{https://www.manuscripts.io/about/}) is an online tool for writing
executable documents collaboratively based on the concept of literate
programming, but featuring a ``What you see is what you get'' user
interface. The runtime environment of the author is, however, not
considered. \textbf{o2r} (Nüst et al. 2017) addresses publishers who
want to extend their existing infrastructure with a reproducibility
service during the process of paper submission (Nüst 2018). Authors can
also create interactive figures, allowing reviewers and readers to check
the robustness of the results, e.g., by changing model parameters using
a slider (Konkol, Kray, and Suleiman 2019). \textbf{REANA} (Šimko et al.
2019; Chen et al. 2018) provides a formal specification to guide authors
through the process of capturing input datasets, code, and the
computational environment. Based on this structure and after creating
some configuration files manually, REANA provides a set of command line
interface (CLI) commands to run large analyses on a remote REANA cloud.
\textbf{ReproZip} (Steeves, Rampin, and Chirigati 2017; Chirigati,
Rampin, et al. 2016) provides a set of CLI commands for encapsulating
data, code, and the computational environment automatically. Users can
execute the resulting bundle on a server provided by ReproZip (Rampin et
al. 2018) or locally on different computer systems. With \textbf{Whole
Tale} (Brinckman et al. 2019), authors can create so-called ``Tales''
that combine narrative, data, code, and the computational environment.
Readers can inspect the materials and execute the analysis in the
original environment.
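
To illustrate what a configuration file for Binder can look like, the
following sketch shows a conda \texttt{environment.yml} placed in the
root of a repository. The environment name, the listed packages, and the
Python version are hypothetical and would have to match the
dependencies of the actual analysis.

\begin{verbatim}
# environment.yml (illustrative; names and versions are assumptions)
name: example-analysis
channels:
  - conda-forge
dependencies:
  - python=3.7
  - numpy
  - matplotlib
\end{verbatim}

Other configuration files, such as \texttt{requirements.txt} for Python
packages or \texttt{install.R} for R packages, serve the same purpose.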
\hypertarget{rationale-for-the-comparison-criteria}{%
\subsection{Rationale for the comparison
criteria}\label{rationale-for-the-comparison-criteria}}
We identified the comparison criteria considering the needs of
stakeholders of the scholarly publication process described by Nüst et
al. (2017), i.e., those of publishers, editors, authors, reviewers,
readers, and librarians. There is some overlap regarding stakeholder
needs, for example, publishers as well as authors aim at attracting
readers and providing a convenient reading experience for reviewers.

\textbf{Publishers} need to know whether they can integrate the
application into their existing infrastructure. The applications can be
made available either as open source tools for self-hosting or as a
service hosted by the provider. If the tool is available for free under
an open license, publishers only have to consider costs for maintaining
the infrastructure. Moreover, publishers gain full control and can
customize the interface or processes according to their own
specifications. In case of a paid service, publishers can take advantage
of not being responsible for the maintenance. A further criterion
relevant for publishers is the development stage of the application,
i.e., if it was already used in published articles.

\textbf{Editors} of journals need to ensure that a service for
publishing reproducible research is consistent with the tools the
authors typically use and common practices in their scientific field.
For example, journals regularly receiving submissions containing Jupyter
Notebooks should not choose a service that supports only R Markdown.
This aspect might also affect the author and reviewer guidelines, for
which the editors are responsible. A further relevant aspect is the
addressed research area. Some applications might address specific fields
and thus provide features tailored to domain-specific requirements.

\textbf{Authors} need to submit research materials efficiently. Hence,
we checked how authors can upload their files and which submission
formats and programming languages are supported. We also considered
which license submitted materials receive, since this is a frequently
mentioned aspect of papers discussing reproducibility guidelines
(Stodden et al. 2016). Although licensing is relevant for all
stakeholders, authors are particularly responsible for taking care of
it. We also checked whether the applications can deal with sensitive
data.

For \textbf{readers} and \textbf{reviewers}, open reproducible research
comes with several benefits, such as advanced search capabilities,
re-running workflows, inspecting results in detail (i.e., looking at code
or data files), modifying parameter settings, and reusing the data or
the analysis for their own work (Konkol and Kray 2018). We thus checked
whether the tools provide any specific support for such investigations
of the research materials.

\textbf{Librarians} are tasked with preserving research materials. We
checked how the materials are stored and shared, and if modifying or
deleting them after publication is possible.

Based on these comparison criteria, we investigated the project
websites, the actual applications, GitHub/Lab repositories, scientific
articles (if available), and blog posts. Since most of the sources were
not scientific articles, the supplements contain screenshots and URLs to
show where we found the corresponding information.
\hypertarget{results}{%
\section{Results}\label{results}}
In the following, we compare the applications considering the needs of
the stakeholders. \emph{Table 1} summarizes aspects relevant for
publishers, i.e., if self-hosting is possible, which license is assigned
to the application, whether it is already in use or in a beta stage, and
the funding source. Of the ten applications, eight allow self-hosting.
Code Ocean and Gigantum provide the service themselves. eLife RDS, o2r,
and REANA (in \emph{Table 1} marked by *) require self-hosted installations
since no free online deployments exist. Three applications are released
under the \emph{BSD-3-Clause License}, three under \emph{MIT} (Gigantum
assigns this license only to its local tool, not to its cloud
service), one under \emph{Apache 2.0}, one under \emph{CPAL 1.0}, and
one under \emph{Academic Free License 3.0}. These licenses allow
operators to host their own service as well as to modify the software
according to their individual needs and styles. This means, however,
they also have to maintain the infrastructure and provide the required
technological resources as well as personnel. In contrast, Code Ocean's
infrastructure and Gigantum's cloud service are provided in exchange for
payment. Of the reviewed applications, four are rather experimental
and six are already in use as shown by the example papers with workflows
based on the corresponding application. Seven applications receive
funding from public or private science foundations. Code Ocean and
Gigantum offer a commercial service.

\emph{Table 2} summarizes aspects relevant for editors and authors,
i.e., the scientific domains, supported submission formats, upload
mechanisms, and license terms. Although none of the investigated
applications are strictly tied to a specific domain, we observed that
some of them focus on particular areas. For example, Galaxy provides a
rich set of features tailored to use cases in the life sciences. Other
applications originate from a particular domain, e.g., eLife's RDS comes
from the life sciences whereas REANA focuses on particle physics. Of
the ten applications, nine support literate programming approaches by
default, e.g., Jupyter Notebook or R Markdown. Manuscripts supports
Markdown, but also code execution via embedded Jupyter Notebooks.
\begin{longtable}[]{@{}lllll@{}}
\caption{Overview of properties relevant for publishers, i.e., if
self-hosting is possible (* denotes only self-hosting is possible),
which license the applications have, the stage of the project (in use or
beta), and the funding source.}\tabularnewline
\toprule
\begin{minipage}[b]{0.11\columnwidth}\raggedright
\strut
\end{minipage} & \begin{minipage}[b]{0.11\columnwidth}\raggedright
Self-hosting\strut
\end{minipage} & \begin{minipage}[b]{0.17\columnwidth}\raggedright
Open license\strut
\end{minipage} & \begin{minipage}[b]{0.17\columnwidth}\raggedright
Stage\strut
\end{minipage} & \begin{minipage}[b]{0.29\columnwidth}\raggedright
Funding\strut
\end{minipage}\tabularnewline
\midrule
\endfirsthead
\toprule
\begin{minipage}[b]{0.11\columnwidth}\raggedright
\strut
\end{minipage} & \begin{minipage}[b]{0.11\columnwidth}\raggedright
Self-hosting\strut
\end{minipage} & \begin{minipage}[b]{0.17\columnwidth}\raggedright
Open license\strut
\end{minipage} & \begin{minipage}[b]{0.17\columnwidth}\raggedright
Stage\strut
\end{minipage} & \begin{minipage}[b]{0.29\columnwidth}\raggedright
Funding\strut
\end{minipage}\tabularnewline
\midrule
\endhead
\begin{minipage}[t]{0.11\columnwidth}\raggedright
Binder\strut
\end{minipage} & \begin{minipage}[t]{0.11\columnwidth}\raggedright
yes\strut
\end{minipage} & \begin{minipage}[t]{0.17\columnwidth}\raggedright
BSD 3-Clause ``New'' or ``Revised''\strut
\end{minipage} & \begin{minipage}[t]{0.17\columnwidth}\raggedright
in use by Nüst, Granell, et al. (2018)\strut
\end{minipage} & \begin{minipage}[t]{0.29\columnwidth}\raggedright
Moore Foundation, Google Cloud Platform\strut
\end{minipage}\tabularnewline
\begin{minipage}[t]{0.11\columnwidth}\raggedright
Code Ocean\strut
\end{minipage} & \begin{minipage}[t]{0.11\columnwidth}\raggedright
no\strut
\end{minipage} & \begin{minipage}[t]{0.17\columnwidth}\raggedright
Commercial application\strut
\end{minipage} & \begin{minipage}[t]{0.17\columnwidth}\raggedright
in use by Chitre (2018)\strut
\end{minipage} & \begin{minipage}[t]{0.29\columnwidth}\raggedright
commercial\strut
\end{minipage}\tabularnewline
\begin{minipage}[t]{0.11\columnwidth}\raggedright
eLife RDS\strut
\end{minipage} & \begin{minipage}[t]{0.11\columnwidth}\raggedright
yes*\strut
\end{minipage} & \begin{minipage}[t]{0.17\columnwidth}\raggedright
MIT\strut
\end{minipage} & \begin{minipage}[t]{0.17\columnwidth}\raggedright
in use by Lewis et al. (2018)\strut
\end{minipage} & \begin{minipage}[t]{0.29\columnwidth}\raggedright
Howard Hughes Medical Institute, Max Planck Society, Wellcome Trust, Knut
and Alice Wallenberg Foundation\strut
\end{minipage}\tabularnewline
\begin{minipage}[t]{0.11\columnwidth}\raggedright
Galaxy\strut
\end{minipage} & \begin{minipage}[t]{0.11\columnwidth}\raggedright
yes\strut
\end{minipage} & \begin{minipage}[t]{0.17\columnwidth}\raggedright
Academic Free 3.0\strut
\end{minipage} & \begin{minipage}[t]{0.17\columnwidth}\raggedright
in use by Ide et al. (2016)\strut
\end{minipage} & \begin{minipage}[t]{0.29\columnwidth}\raggedright
National Institutes of Health, National Science Foundation, Penn State,
Johns Hopkins, and the Pennsylvania Department of Public Health\strut
\end{minipage}\tabularnewline
\begin{minipage}[t]{0.11\columnwidth}\raggedright
Gigantum\strut
\end{minipage} & \begin{minipage}[t]{0.11\columnwidth}\raggedright
no\strut
\end{minipage} & \begin{minipage}[t]{0.17\columnwidth}\raggedright
MIT\strut
\end{minipage} & \begin{minipage}[t]{0.17\columnwidth}\raggedright
beta\strut
\end{minipage} & \begin{minipage}[t]{0.29\columnwidth}\raggedright
commercial\strut
\end{minipage}\tabularnewline
\begin{minipage}[t]{0.11\columnwidth}\raggedright
Manuscripts\strut
\end{minipage} & \begin{minipage}[t]{0.11\columnwidth}\raggedright
yes\strut
\end{minipage} & \begin{minipage}[t]{0.17\columnwidth}\raggedright
CPAL-1.0\strut
\end{minipage} & \begin{minipage}[t]{0.17\columnwidth}\raggedright
beta\strut
\end{minipage} & \begin{minipage}[t]{0.29\columnwidth}\raggedright
no information available\strut
\end{minipage}\tabularnewline
\begin{minipage}[t]{0.11\columnwidth}\raggedright
o2r\strut
\end{minipage} & \begin{minipage}[t]{0.11\columnwidth}\raggedright
yes*\strut
\end{minipage} & \begin{minipage}[t]{0.17\columnwidth}\raggedright
Apache 2.0\strut
\end{minipage} & \begin{minipage}[t]{0.17\columnwidth}\raggedright
beta\strut
\end{minipage} & \begin{minipage}[t]{0.29\columnwidth}\raggedright
DFG (German Research Foundation)\strut
\end{minipage}\tabularnewline
\begin{minipage}[t]{0.11\columnwidth}\raggedright
REANA\strut
\end{minipage} & \begin{minipage}[t]{0.11\columnwidth}\raggedright
yes*\strut
\end{minipage} & \begin{minipage}[t]{0.17\columnwidth}\raggedright
MIT\strut
\end{minipage} & \begin{minipage}[t]{0.17\columnwidth}\raggedright
in use by Prelipcean (2019)\strut
\end{minipage} & \begin{minipage}[t]{0.29\columnwidth}\raggedright
CERN, National Science Foundation\strut
\end{minipage}\tabularnewline
\begin{minipage}[t]{0.11\columnwidth}\raggedright
ReproZip\strut
\end{minipage} & \begin{minipage}[t]{0.11\columnwidth}\raggedright
yes\strut
\end{minipage} & \begin{minipage}[t]{0.17\columnwidth}\raggedright
BSD 3-Clause ``New'' or ``Revised''\strut
\end{minipage} & \begin{minipage}[t]{0.17\columnwidth}\raggedright
in use by Chirigati, Doraiswamy, et al. (2016)\strut
\end{minipage} & \begin{minipage}[t]{0.29\columnwidth}\raggedright
Moore and Sloan Foundations\strut
\end{minipage}\tabularnewline
\begin{minipage}[t]{0.11\columnwidth}\raggedright
Whole Tale\strut
\end{minipage} & \begin{minipage}[t]{0.11\columnwidth}\raggedright
yes\strut
\end{minipage} & \begin{minipage}[t]{0.17\columnwidth}\raggedright
BSD 3-Clause ``New'' or ``Revised''\strut
\end{minipage} & \begin{minipage}[t]{0.17\columnwidth}\raggedright
beta\strut
\end{minipage} & \begin{minipage}[t]{0.29\columnwidth}\raggedright
National Science Foundation\strut
\end{minipage}\tabularnewline
\bottomrule
\end{longtable}
Seven applications are extensible, i.e., they can be configured to
support further submission formats or programming languages. Except for
Code Ocean, which also supports MATLAB and Stata, all applications
support only non-proprietary programming languages. For making code and
data available on the platform, five applications provide file upload,
and five allow authors to import materials from an external cloud
service or repository, e.g., Zenodo. However, uploading materials can be
impractical for papers based on large data files. For these cases,
eLife's RDS (based on Stencila), REANA, and ReproZip allow local usage.
Researchers can also work locally with Gigantum, but then need to
synchronize with the online service to access all features. Despite the
importance of licensing, we could not find information on copyright for
research materials for four applications. Whole Tale and Gigantum only
allow open licenses, whereas Code Ocean, Galaxy, and o2r encourage them.
eLife assigns an open license to the article text only.
\begin{longtable}[]{@{}lllll@{}}
\caption{Overview of aspects relevant for editors and authors, i.e., the
addressed research area, which submission formats are supported, how
authors can upload materials, and copyright.}\tabularnewline
\toprule
\begin{minipage}[b]{0.11\columnwidth}\raggedright
\strut
\end{minipage} & \begin{minipage}[b]{0.11\columnwidth}\raggedright
Research area\strut
\end{minipage} & \begin{minipage}[b]{0.21\columnwidth}\raggedright
Submission formats/ programming languages\strut
\end{minipage} & \begin{minipage}[b]{0.21\columnwidth}\raggedright
Upload\strut
\end{minipage} & \begin{minipage}[b]{0.21\columnwidth}\raggedright
Copyright\strut
\end{minipage}\tabularnewline
\midrule
\endfirsthead
\toprule
\begin{minipage}[b]{0.11\columnwidth}\raggedright
\strut
\end{minipage} & \begin{minipage}[b]{0.11\columnwidth}\raggedright
Research area\strut
\end{minipage} & \begin{minipage}[b]{0.21\columnwidth}\raggedright
Submission formats/ programming languages\strut
\end{minipage} & \begin{minipage}[b]{0.21\columnwidth}\raggedright
Upload\strut
\end{minipage} & \begin{minipage}[b]{0.21\columnwidth}\raggedright
Copyright\strut
\end{minipage}\tabularnewline
\midrule
\endhead
\begin{minipage}[t]{0.11\columnwidth}\raggedright
Binder\strut
\end{minipage} & \begin{minipage}[t]{0.11\columnwidth}\raggedright
all\strut
\end{minipage} & \begin{minipage}[t]{0.21\columnwidth}\raggedright
R Markdown, Jupyter Notebooks, extensible\strut
\end{minipage} & \begin{minipage}[t]{0.21\columnwidth}\raggedright
via URL/DOI from Git(Hub/Lab), Gist, Zenodo, Figshare, Dataverse\strut
\end{minipage} & \begin{minipage}[t]{0.21\columnwidth}\raggedright
no information found\strut
\end{minipage}\tabularnewline
\begin{minipage}[t]{0.11\columnwidth}\raggedright
Code Ocean\strut
\end{minipage} & \begin{minipage}[t]{0.11\columnwidth}\raggedright
all\strut
\end{minipage} & \begin{minipage}[t]{0.21\columnwidth}\raggedright
R Markdown, Jupyter Notebooks, C/C++, Fortran, Java, Lua, MATLAB, Stata,
extensible\strut
\end{minipage} & \begin{minipage}[t]{0.21\columnwidth}\raggedright
File upload, via URL from Git repository\strut
\end{minipage} & \begin{minipage}[t]{0.21\columnwidth}\raggedright
self-determined, MIT for code/ CC0 for data encouraged\strut
\end{minipage}\tabularnewline
\begin{minipage}[t]{0.11\columnwidth}\raggedright
eLife RDS\strut
\end{minipage} & \begin{minipage}[t]{0.11\columnwidth}\raggedright
all, focus on life sciences\strut
\end{minipage} & \begin{minipage}[t]{0.21\columnwidth}\raggedright
R Markdown, Jupyter Notebooks, Markdown, Excel, Word, LaTeX, JATS,
extensible\strut
\end{minipage} & \begin{minipage}[t]{0.21\columnwidth}\raggedright
created locally using Stencila\strut
\end{minipage} & \begin{minipage}[t]{0.21\columnwidth}\raggedright
CC-BY for text, for code/data not discussed\strut
\end{minipage}\tabularnewline
\begin{minipage}[t]{0.11\columnwidth}\raggedright
Galaxy\strut
\end{minipage} & \begin{minipage}[t]{0.11\columnwidth}\raggedright
all, focus on life sciences\strut
\end{minipage} & \begin{minipage}[t]{0.21\columnwidth}\raggedright
Jupyter Notebooks, extensible\strut
\end{minipage} & \begin{minipage}[t]{0.21\columnwidth}\raggedright
File upload, FTP, SRA\strut
\end{minipage} & \begin{minipage}[t]{0.21\columnwidth}\raggedright
encourage open license for software\strut
\end{minipage}\tabularnewline
\begin{minipage}[t]{0.11\columnwidth}\raggedright
Gigantum\strut
\end{minipage} & \begin{minipage}[t]{0.11\columnwidth}\raggedright
all\strut
\end{minipage} & \begin{minipage}[t]{0.21\columnwidth}\raggedright
R Markdown, Jupyter Notebooks\strut
\end{minipage} & \begin{minipage}[t]{0.21\columnwidth}\raggedright
Synchronization\strut
\end{minipage} & \begin{minipage}[t]{0.21\columnwidth}\raggedright
self-determined but has to be open\strut
\end{minipage}\tabularnewline
\begin{minipage}[t]{0.11\columnwidth}\raggedright
Manuscripts\strut
\end{minipage} & \begin{minipage}[t]{0.11\columnwidth}\raggedright
all\strut
\end{minipage} & \begin{minipage}[t]{0.21\columnwidth}\raggedright
Markdown, Word, LaTeX, JATS, R, Julia, Python\strut
\end{minipage} & \begin{minipage}[t]{0.21\columnwidth}\raggedright
File upload\strut
\end{minipage} & \begin{minipage}[t]{0.21\columnwidth}\raggedright
no information found\strut
\end{minipage}\tabularnewline
\begin{minipage}[t]{0.11\columnwidth}\raggedright
o2r\strut
\end{minipage} & \begin{minipage}[t]{0.11\columnwidth}\raggedright
all, focus on geosciences\strut
\end{minipage} & \begin{minipage}[t]{0.21\columnwidth}\raggedright
R Markdown\strut
\end{minipage} & \begin{minipage}[t]{0.21\columnwidth}\raggedright
File upload, ownCloud\strut
\end{minipage} & \begin{minipage}[t]{0.21\columnwidth}\raggedright
self-determined but open is encouraged\strut
\end{minipage}\tabularnewline
\begin{minipage}[t]{0.11\columnwidth}\raggedright
REANA\strut
\end{minipage} & \begin{minipage}[t]{0.11\columnwidth}\raggedright
all, focus on particle physics\strut
\end{minipage} & \begin{minipage}[t]{0.21\columnwidth}\raggedright
Jupyter Notebooks, extensible\strut
\end{minipage} & \begin{minipage}[t]{0.21\columnwidth}\raggedright
created locally\strut
\end{minipage} & \begin{minipage}[t]{0.21\columnwidth}\raggedright
no information found\strut
\end{minipage}\tabularnewline
\begin{minipage}[t]{0.11\columnwidth}\raggedright
ReproZip\strut
\end{minipage} & \begin{minipage}[t]{0.11\columnwidth}\raggedright
all\strut
\end{minipage} & \begin{minipage}[t]{0.21\columnwidth}\raggedright
Jupyter Notebooks, extensible\strut
\end{minipage} & \begin{minipage}[t]{0.21\columnwidth}\raggedright
created locally\strut
\end{minipage} & \begin{minipage}[t]{0.21\columnwidth}\raggedright
no information found\strut
\end{minipage}\tabularnewline
\begin{minipage}[t]{0.11\columnwidth}\raggedright
Whole Tale\strut
\end{minipage} & \begin{minipage}[t]{0.11\columnwidth}\raggedright
all\strut
\end{minipage} & \begin{minipage}[t]{0.21\columnwidth}\raggedright
R Markdown, Jupyter Notebooks, extensible\strut
\end{minipage} & \begin{minipage}[t]{0.21\columnwidth}\raggedright
File upload, URL/DOI from DataONE/ Dataverse, Materials Data
Facility\strut
\end{minipage} & \begin{minipage}[t]{0.21\columnwidth}\raggedright
self-determined but has to be open\strut
\end{minipage}\tabularnewline
\bottomrule
\end{longtable}
\emph{Table 3} summarizes aspects relevant for reviewers and readers.
Of the ten applications, five provide a keyword-based search for papers,
whereas five do not provide any search feature. o2r additionally
provides a spatiotemporal search combined with thematic properties, such
as libraries used in the code. Nine applications provide tools for
inspecting code and data, six of them through their own user interface
(UI) and three by embedding a programming environment (e.g., JupyterLab,
RStudio). Though REANA does not provide tools for inspection, the
materials can be viewed when stored in public repositories, e.g.,
GitLab. Nine applications provide tools for downloading materials.
Projects created with REANA can be downloaded if they are stored in
public repositories, which already provide download functionality. Eight
applications allow readers to execute the analysis in the browser on a
remote server. Gigantum provides a UI running locally, whereas REANA
projects are executed via the CLI in a remote REANA cloud. Every
application allows users to manipulate the code and rerun it with new
parameter values. Most commonly, users can directly manipulate the code
in the browser (six applications provide this option) or locally
(Gigantum). In REANA, users can pass new parameter values via the CLI;
in ReproZip, via the CLI or via input fields in ReproServer. The o2r
platform allows authors to configure UI widgets that give reviewers and
readers the chance to interactively manipulate parameter values, e.g.,
by using a slider to change a model parameter within a certain range.
\begin{longtable}[]{@{}llllll@{}}
\caption{Overview of features relevant for reviewers and readers, i.e.,
searching for papers and materials, inspecting code and data,
downloading materials, executing the analysis, and manipulating the
code.}\tabularnewline
\toprule
\begin{minipage}[b]{0.12\columnwidth}\raggedright
\strut
\end{minipage} & \begin{minipage}[b]{0.12\columnwidth}\raggedright
Searching\strut
\end{minipage} & \begin{minipage}[b]{0.18\columnwidth}\raggedright
Inspection\strut
\end{minipage} & \begin{minipage}[b]{0.09\columnwidth}\raggedright
Download\strut
\end{minipage} & \begin{minipage}[b]{0.18\columnwidth}\raggedright
Execution\strut
\end{minipage} & \begin{minipage}[b]{0.15\columnwidth}\raggedright
Manipulation\strut
\end{minipage}\tabularnewline
\midrule
\endfirsthead
\toprule
\begin{minipage}[b]{0.12\columnwidth}\raggedright
\strut
\end{minipage} & \begin{minipage}[b]{0.12\columnwidth}\raggedright
Searching\strut
\end{minipage} & \begin{minipage}[b]{0.18\columnwidth}\raggedright
Inspection\strut
\end{minipage} & \begin{minipage}[b]{0.09\columnwidth}\raggedright
Download\strut
\end{minipage} & \begin{minipage}[b]{0.18\columnwidth}\raggedright
Execution\strut
\end{minipage} & \begin{minipage}[b]{0.15\columnwidth}\raggedright
Manipulation\strut
\end{minipage}\tabularnewline
\midrule
\endhead
\begin{minipage}[t]{0.12\columnwidth}\raggedright
Binder\strut
\end{minipage} & \begin{minipage}[t]{0.12\columnwidth}\raggedright
no support\strut
\end{minipage} & \begin{minipage}[t]{0.18\columnwidth}\raggedright
within UI of JupyterLab in browser\strut
\end{minipage} & \begin{minipage}[t]{0.09\columnwidth}\raggedright
via UI\strut
\end{minipage} & \begin{minipage}[t]{0.18\columnwidth}\raggedright
within UI in browser\strut
\end{minipage} & \begin{minipage}[t]{0.15\columnwidth}\raggedright
manually within code in browser\strut
\end{minipage}\tabularnewline
\begin{minipage}[t]{0.12\columnwidth}\raggedright
Code Ocean\strut
\end{minipage} & \begin{minipage}[t]{0.12\columnwidth}\raggedright
keyword-based\strut
\end{minipage} & \begin{minipage}[t]{0.18\columnwidth}\raggedright
below article, or in UI of Code Ocean\strut
\end{minipage} & \begin{minipage}[t]{0.09\columnwidth}\raggedright
via UI\strut
\end{minipage} & \begin{minipage}[t]{0.18\columnwidth}\raggedright
within UI in browser\strut
\end{minipage} & \begin{minipage}[t]{0.15\columnwidth}\raggedright
manually within code in browser\strut
\end{minipage}\tabularnewline
\begin{minipage}[t]{0.12\columnwidth}\raggedright
eLife RDS\strut
\end{minipage} & \begin{minipage}[t]{0.12\columnwidth}\raggedright
keyword-based\strut
\end{minipage} & \begin{minipage}[t]{0.18\columnwidth}\raggedright
within article in browser\strut
\end{minipage} & \begin{minipage}[t]{0.09\columnwidth}\raggedright
via UI\strut
\end{minipage} & \begin{minipage}[t]{0.18\columnwidth}\raggedright
within UI in browser\strut
\end{minipage} & \begin{minipage}[t]{0.15\columnwidth}\raggedright
manually within code in browser\strut
\end{minipage}\tabularnewline
\begin{minipage}[t]{0.12\columnwidth}\raggedright
Galaxy\strut
\end{minipage} & \begin{minipage}[t]{0.12\columnwidth}\raggedright
keyword-based\strut
\end{minipage} & \begin{minipage}[t]{0.18\columnwidth}\raggedright
within UI of JupyterLab in browser\strut
\end{minipage} & \begin{minipage}[t]{0.09\columnwidth}\raggedright
via UI\strut
\end{minipage} & \begin{minipage}[t]{0.18\columnwidth}\raggedright
within UI in browser\strut
\end{minipage} & \begin{minipage}[t]{0.15\columnwidth}\raggedright
manually within code in browser\strut
\end{minipage}\tabularnewline
\begin{minipage}[t]{0.12\columnwidth}\raggedright
Gigantum\strut
\end{minipage} & \begin{minipage}[t]{0.12\columnwidth}\raggedright
no support\strut
\end{minipage} & \begin{minipage}[t]{0.18\columnwidth}\raggedright
within UI of local installation\strut
\end{minipage} & \begin{minipage}[t]{0.09\columnwidth}\raggedright
via UI\strut
\end{minipage} & \begin{minipage}[t]{0.18\columnwidth}\raggedright
within UI of local installation\strut
\end{minipage} & \begin{minipage}[t]{0.15\columnwidth}\raggedright
within UI of local installation\strut
\end{minipage}\tabularnewline
\begin{minipage}[t]{0.12\columnwidth}\raggedright
Manuscripts\strut
\end{minipage} & \begin{minipage}[t]{0.12\columnwidth}\raggedright
no support\strut
\end{minipage} & \begin{minipage}[t]{0.18\columnwidth}\raggedright
within UI of Manuscripts\strut
\end{minipage} & \begin{minipage}[t]{0.09\columnwidth}\raggedright
via UI\strut
\end{minipage} & \begin{minipage}[t]{0.18\columnwidth}\raggedright
within UI in browser\strut
\end{minipage} & \begin{minipage}[t]{0.15\columnwidth}\raggedright
manually within code in browser\strut
\end{minipage}\tabularnewline
\begin{minipage}[t]{0.12\columnwidth}\raggedright
o2r\strut
\end{minipage} & \begin{minipage}[t]{0.12\columnwidth}\raggedright
spatiotemporal and keyword-based search\strut
\end{minipage} & \begin{minipage}[t]{0.18\columnwidth}\raggedright
within UI of o2r\strut
\end{minipage} & \begin{minipage}[t]{0.09\columnwidth}\raggedright
via UI\strut
\end{minipage} & \begin{minipage}[t]{0.18\columnwidth}\raggedright
within UI in browser\strut
\end{minipage} & \begin{minipage}[t]{0.15\columnwidth}\raggedright
using UI widgets\strut
\end{minipage}\tabularnewline
\begin{minipage}[t]{0.12\columnwidth}\raggedright
REANA\strut
\end{minipage} & \begin{minipage}[t]{0.12\columnwidth}\raggedright
no support\strut
\end{minipage} & \begin{minipage}[t]{0.18\columnwidth}\raggedright
no support\strut
\end{minipage} & \begin{minipage}[t]{0.09\columnwidth}\raggedright
no support\strut
\end{minipage} & \begin{minipage}[t]{0.18\columnwidth}\raggedright
via CLI in remote REANA cloud\strut
\end{minipage} & \begin{minipage}[t]{0.15\columnwidth}\raggedright
manually via CLI\strut
\end{minipage}\tabularnewline
\begin{minipage}[t]{0.12\columnwidth}\raggedright
ReproZip\strut
\end{minipage} & \begin{minipage}[t]{0.12\columnwidth}\raggedright
no support\strut
\end{minipage} & \begin{minipage}[t]{0.18\columnwidth}\raggedright
within UI of ReproServer\strut
\end{minipage} & \begin{minipage}[t]{0.09\columnwidth}\raggedright
via UI\strut
\end{minipage} & \begin{minipage}[t]{0.18\columnwidth}\raggedright
locally via CLI, within UI in browser\strut
\end{minipage} & \begin{minipage}[t]{0.15\columnwidth}\raggedright
manually via CLI/input fields in browser\strut
\end{minipage}\tabularnewline
\begin{minipage}[t]{0.12\columnwidth}\raggedright
Whole Tale\strut
\end{minipage} & \begin{minipage}[t]{0.12\columnwidth}\raggedright
keyword-based\strut
\end{minipage} & \begin{minipage}[t]{0.18\columnwidth}\raggedright
within UI of JupyterLab/ RStudio in browser\strut
\end{minipage} & \begin{minipage}[t]{0.09\columnwidth}\raggedright
via UI\strut
\end{minipage} & \begin{minipage}[t]{0.18\columnwidth}\raggedright
within UI in browser or locally\strut
\end{minipage} & \begin{minipage}[t]{0.15\columnwidth}\raggedright
manually within code in browser\strut
\end{minipage}\tabularnewline
\bottomrule
\end{longtable}
\emph{Table 4} addresses libraries and other institutions with a mandate
to preserve and provide access to research outputs. It includes
information on how the research materials are stored and shared, and
whether modifying or deleting content is possible once it has been
published. Five applications provide storage, though it remains unclear
whether they run the servers themselves or rely on third-party services,
and what kind of backup and archiving is implemented. Seven applications
give hosts the option to store research materials independently, e.g.,
on the publisher's infrastructure. The freely available instance of
Binder, MyBinder.org (\url{https://mybinder.org/}), stores Docker images
temporarily, but provides no storage beyond that. Whole Tale and o2r
use existing long-term preservation services, e.g., Zenodo and DataONE.
Regarding the possibility to modify or delete materials once published,
we assigned ``possible'' if there is any way to do so. In Binder, REANA,
and ReproZip, modifying or deleting content is possible if the research
materials are stored on GitHub or GitLab, but not if they are stored on
Zenodo. Modification and deletion are also possible in Galaxy, Gigantum,
and Manuscripts, which allow users to edit or delete content stored in
the cloud. Code Ocean and Whole Tale assign DOIs to published content,
making it impossible to edit that content after publication.
The same applies to o2r but only if the materials are