Commit 29c9e43

add example
1 parent e9367ec commit 29c9e43

5 files changed, +235 -0 lines changed

LICENSE

+21
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2023 Fabio Matti

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

README.md

+108
@@ -0,0 +1,108 @@
# Re-Pro

This repository helps you set up a reproducibility proof for your project. It's pretty simple, trust me.

## About provable reproducibility

There is an age-old saying in the Swiss army:

> Vertrauen ist gut; Kontrolle ist besser.

> Trust is good; checking is better.

How can scientists unmistakably know whether their results can be reproduced by other people? How can reviewers verify that a certain numerical experiment in an article is correct? And how can collaborators quickly understand how to use the source code you have written for your project?

Provable reproducibility is an initiative that aims to make results published in articles, theses, and software packages easy to reproduce and verify. No more ambiguity, data manipulation, cherry-picked parameters, or hand-crafted results. Every figure, plot, and table in a provably reproducible project can be unequivocally traced back to its origin.

## Quick start

To add provable reproducibility to your GitHub repository, run the following command:

```bash
git pull https://github.com/FMatti/Re-Pro <preset>
```

You may choose from the following presets:

| Preset | Description |
| ------ | ------------- |
| python-latex | Python scripts and LaTeX project |
| python-latex-bibtex | Python scripts and LaTeX project with BibTeX bibliography component |
| matlab-latex | MATLAB scripts and LaTeX project |
| julia-latex | Julia scripts and LaTeX project |

In the `.github/workflows` directory, you may have to modify the `reproduce.yml` file as follows:

- Change the `LATEX_PROJECT` variable to the name/path of your LaTeX main file (without the `.tex` extension)
- Change the `PYTHON_SCRIPT` variable to the location of the Python script(s) you want to execute with the pipeline (glob patterns are supported)
- Change the Python package imports to the packages you use in your Python scripts
- Change the LaTeX setup to the packages/compilers your project requires
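
To get a feel for which scripts a glob pattern in `PYTHON_SCRIPT` might select, the following sketch expands a few patterns with Python's `pathlib` in a throwaway directory. The file layout and the `expand` helper are hypothetical and only serve as an illustration. The pipeline's own expansion may differ in details, for example in whether `**` matches recursively.

```python
import tempfile
from pathlib import Path


def expand(pattern, root):
    """Return sorted POSIX-style paths below root that match a glob pattern."""
    return sorted(p.relative_to(root).as_posix() for p in Path(root).glob(pattern))


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Hypothetical repository layout (not part of Re-Pro itself)
    for name in ["plot.py", "scripts/figure1.py", "scripts/figure2.py", "main.tex"]:
        path = root / name
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch()

    top_level = expand("*.py", root)           # scripts in the repository root only
    in_scripts = expand("scripts/*.py", root)  # scripts in one subdirectory
    everywhere = expand("**/*.py", root)       # scripts at any depth

print(top_level)
print(in_scripts)
print(everywhere)
```

In this sketch, `*` does not cross directory boundaries, so a pattern such as `scripts/*.py` is needed to reach scripts outside the repository root.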

Finally, commit and push the changes to GitHub:

```bash
git commit --all -m "add reproducibility proof"
git push
```

## Explanations

Every time you push to the main branch of your GitHub repository, a pipeline is executed. If everything goes well, a green check mark appears next to the commit message.

[Check mark image]

You can view all pipeline runs in the Actions tab of your repository. Click on a run to see its details. This is also where you can download a ZIP archive containing the generated PDF.

[PDF artifact generated.]

If something goes wrong, a red cross appears instead. Click it to see where the failure happened.

[Failed image]

## Example

This repository serves as an example of what a provably reproducible project may look like. [Elaborate more in detail what is done]

## The Re-Pro badge

The Re-Pro badge is the seal of reproducibility, which can be displayed in a document. It certifies that the document was indeed produced from the given commit.

[Image of badge]

## Extended setup

Unless you are using some extraordinary dependencies or features in your project, your repository should now be configured for provable reproducibility. Otherwise, the following adjustments may be needed:

- If your repository uses a non-default main branch, change the branch name in `reproduce.yml`.
- Add any additional packages your project needs to the TeX Live installation.
- To let the pipeline commit generated files to the repository, grant it write permission using

```yaml
...

jobs:
  build:
    permissions:
      contents: write
    ...
```
and subsequently add the step
```yaml
...

- name: Commit and push generated files to repository
  run: |
    git config --global user.name "github-actions[bot]"
    git config --global user.email "41898282+github-actions[bot]@users.noreply.github.com"
    git add [FILES]
    git commit -m "reproduce thesis"
    git push
```

## About PGF plots

This example repository also demonstrates how matplotlib plots are to be exported and included in LaTeX projects. Any other way than using `.pgf` files for this purpose should be prosecuted as a criminal offence.

## Opening a GitHub repository with Overleaf

Another advantage of tracking your code in a GitHub repository is that you can view and edit your project from Overleaf. The process for setting this up is described in the [Overleaf guide on GitHub Synchronisation](https://www.overleaf.com/learn/how-to/GitHub_Synchronization).

bibliography.bib

+13
@@ -0,0 +1,13 @@
@article{article2021,
  author  = {A. B. Surname},
  title   = {Title},
  journal = {Journal},
  year    = {Year},

  volume  = {Volume},
  number  = {Number},
  pages   = {Start--End},
  note    = {Note},

  doi     = {DOI}
}

main.tex

+25
@@ -0,0 +1,25 @@
\documentclass[12pt]{article}

% Import pgf plots
\usepackage{pgf}
\def\mathdefault#1{#1}

\title{Re-Pro example}
\author{FMatti}

\begin{document}

\maketitle

This is a demonstration project that is provably reproducible~\cite{article2021}.

\begin{figure}[ht]
    \centering
    \input{plot.pgf}
    \caption{Example plot.}
\end{figure}

\bibliographystyle{acm}
% BibTeX expects the database name without the .bib extension
\bibliography{bibliography}

\end{document}

plot.py

+68
@@ -0,0 +1,68 @@
import os
import shutil
import tarfile
import tempfile
import urllib.request

import matplotlib
import matplotlib.pyplot as plt
import scipy.io
import scipy.sparse
import scipy.sparse.linalg

matplotlib.rc("text", usetex=True)
matplotlib.rcParams["pgf.texsystem"] = "pdflatex"
matplotlib.rcParams["font.family"] = "serif"
matplotlib.rcParams["font.size"] = 10


def download_matrix(url, save_path="matrices", save_name=None):
    """
    Download a matrix from an online matrix market archive.

    Parameters
    ----------
    url : str
        The URL, e.g. https://www.[...].com/matrix.tar.gz.
    save_path : str
        The path under which the matrix should be saved.
    save_name : str or None
        The filename of the matrix. When None, the name is inferred from
        the name of the ".mtx" file inside the archive.
    """
    # Create a temporary directory
    temp_dir = tempfile.mkdtemp()

    try:
        # Download the archive containing the matrix
        file_name = os.path.join(temp_dir, "archive.tar.gz")
        archive_file_path, _ = urllib.request.urlretrieve(url, file_name)

        # Open the archive
        with tarfile.open(archive_file_path, "r:gz") as tar:
            # Extract only the ".mtx" files
            mtx_members = [m for m in tar.getmembers() if m.name.endswith(".mtx")]
            tar.extractall(path=temp_dir, members=mtx_members)

        # Make sure the target directory exists before saving
        os.makedirs(save_path, exist_ok=True)

        # Convert and save the sparse matrices in the ".npz" format
        for m in mtx_members:
            matrix = scipy.io.mmread(os.path.join(temp_dir, m.name))
            # Dense arrays cannot be stored with save_npz, so skip them
            if not scipy.sparse.issparse(matrix):
                continue
            # Infer the name per file so that matrices do not overwrite each other
            name = save_name
            if name is None:
                name = os.path.splitext(os.path.basename(m.name))[0]
            scipy.sparse.save_npz(os.path.join(save_path, name), matrix)

    finally:
        # Clean up: Delete the temporary directory and its contents
        shutil.rmtree(temp_dir)


# Download sparse matrix from suitesparse collection
download_matrix(
    "https://suitesparse-collection-website.herokuapp.com/MM/VDOL/orbitRaising_1.tar.gz",
    ".",
    "matrix.npz",
)
A = scipy.sparse.load_npz("matrix.npz")

# Extract principal components of matrix
u, _, _ = scipy.sparse.linalg.svds(A, k=2)
pc = u.T @ A

# Visualize principal components
plt.figure(figsize=(3, 3))
plt.scatter(pc[0], pc[1], color="#2F455C")
plt.savefig("plot.pgf", bbox_inches="tight")
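
The archive handling in `download_matrix` can be tried out without network access. The sketch below is an illustration under stated assumptions, not part of the commit: it builds a small stand-in archive with made-up contents, extracts only the `.mtx` members, and derives the file stem the same way `download_matrix` does. The helper `extract_mtx` is hypothetical, and `tempfile.TemporaryDirectory` is used here as one possible alternative to the `mkdtemp`/`finally` cleanup.

```python
import io
import os
import tarfile
import tempfile


def extract_mtx(archive_path, target_dir):
    """Extract only the ".mtx" files of a gzipped tar archive; return their names."""
    with tarfile.open(archive_path, "r:gz") as tar:
        members = [m for m in tar.getmembers() if m.isfile() and m.name.endswith(".mtx")]
        tar.extractall(path=target_dir, members=members)
    return [m.name for m in members]


with tempfile.TemporaryDirectory() as tmp:
    # Build a stand-in archive (hypothetical contents, not a real matrix)
    archive_path = os.path.join(tmp, "archive.tar.gz")
    with tarfile.open(archive_path, "w:gz") as tar:
        for name, text in [("demo/demo.mtx", "placeholder"), ("demo/README.txt", "notes")]:
            data = text.encode()
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))

    extracted = extract_mtx(archive_path, tmp)
    # Same stem inference as in download_matrix
    stems = [os.path.splitext(os.path.basename(n))[0] for n in extracted]
    mtx_exists = os.path.exists(os.path.join(tmp, "demo", "demo.mtx"))
    txt_skipped = not os.path.exists(os.path.join(tmp, "demo", "README.txt"))

print(extracted, stems, mtx_exists, txt_skipped)
```

The temporary directory is removed automatically when the `with` block ends, even if extraction raises an exception.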
