Convert tutorials from sphinx-gallery to plain rst (#31)

The tutorials have much more text than Python code so rendering them in sphinx-gallery means writing a bunch of rst content as comments in the `.py` files, which is very awkward. Use jupyter-sphinx instead so we can have the tutorials as plain rst files with code blocks executed by the extension. The output looks pretty much the same with some slight differences in style (had to tweak the CSS to match). Another advantage is that the "Edit on GitHub" links from sphinx-book-theme now work for these (they don't for the gallery output).
fatiando · May 3, 2022 · 3125152
1 parent 4ad880b
Showing 10 changed files with 153 additions and 142 deletions.
diff --git a/.gitignore b/.gitignore
@@ -11,7 +11,6 @@ dist/
 doc/_build
 doc/api/generated
 doc/gallery
-doc/tutorial
 .ipynb_checkpoints
 *.eggs
 *.egg-info

diff --git a/doc/_static/style.css b/doc/_static/style.css
@@ -47,3 +47,23 @@ main.bd-content #main-content a.btn:hover {
   border-top: 1px solid #dddddd;
   padding: 1.8rem 0;
 }
+
+/* Try to make the jupyter-sphinx style match sphinx-gallery for consistency */
+div.jupyter_container {
+  border: none;
+  box-shadow: none;
+}
+
+div.jupyter_container .cell_input {
+  margin-bottom: 1.15rem;
+  border-radius: 0.4em;
+  box-shadow: 1px 1px 1px #d8d8d8;
+}
+
+div.jupyter_container .cell_output {
+  margin-bottom: 1.15rem;
+}
+
+.jupyter_container div.code_cell pre {
+  padding: 10px;
+}
diff --git a/doc/conf.py b/doc/conf.py
@@ -35,6 +35,7 @@
     "sphinx.ext.napoleon",
     "sphinx_panels",
     "sphinx_gallery.gen_gallery",
+    "jupyter_sphinx",
 ]
 
 # Disable including bootstrap CSS for sphinx_panels since it's already included
@@ -79,9 +80,9 @@
 # -----------------------------------------------------------------------------
 sphinx_gallery_conf = {
     # path to your examples scripts
-    "examples_dirs": ["gallery_src", "tutorial_src"],
+    "examples_dirs": ["gallery_src"],
     # path where to save gallery generated examples
-    "gallery_dirs": ["gallery", "tutorial"],
+    "gallery_dirs": ["gallery"],
     "filename_pattern": r"\.py",
     # Remove the "Download all examples" button from the top level gallery
     "download_all_examples": False,

diff --git a/doc/tutorial/developers.rst b/doc/tutorial/developers.rst
@@ -0,0 +1,51 @@
+.. _developers:
+
+Using Ensaio in your project
+----------------------------
+
+One of the main use cases of Ensaio is to provide reproducible and
+easy-to-access data for the documentation of other Python projects.
+These are a few tips and tricks for using Ensaio in your own project.
+
+Explicitly set data versions
+++++++++++++++++++++++++++++
+
+New versions of each dataset may be included in new Ensaio releases. We'll do
+our very best to always keep the older data versions available as well to
+avoid breaking existing tutorials and documentation.
+
+We recommend always explicitly setting the data version when fetching a
+dataset:
+
+.. jupyter-execute::
+
+    import ensaio
+
+    fname = ensaio.fetch_southern_africa_gravity(version=1)
+
+This way, your documentation/tutorial should still use the same data (and
+hopefully still produce the same result) even if new versions of Ensaio are
+installed.
+Otherwise, people going through older examples with newer versions of Ensaio
+could get different results (or worse, broken code).
+
+.. tip::
+
+    We still recommend updating to the latest data versions in new tutorials
+    and documentation whenever you can.
+
+Download from GitHub on CI
+++++++++++++++++++++++++++
+
+By default, the data sources for Ensaio are the archives with the given DOIs
+for each dataset (usually
+`Zenodo <https://zenodo.org/communities/fatiando>`__).
+Alternatively, you can ask Ensaio to download from the GitHub release of each
+dataset by setting the environment variable ``ENSAIO_DATA_FROM_GITHUB=true``.
+
+We recommend using the environment variable when running on continuous
+integration (CI).
+This will minimize the load that is placed on public data servers like
+Zenodo.
+When using GitHub Actions, this may even make the downloads much faster since
+the data source is likely physically closer to the CI infrastructure.
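
The CI recommendation in `doc/tutorial/developers.rst` could also be applied from Python, for example near the top of a docs `conf.py`. The sketch below is only an illustration: the helper name `prefer_github_on_ci` is hypothetical, and detecting CI through the `CI` variable (which GitHub Actions sets to `true`) is an assumption. Only the `ENSAIO_DATA_FROM_GITHUB=true` setting comes from the tutorial text, and it is assumed here that the variable needs to be set before any `fetch_*` call.

```python
import os


def prefer_github_on_ci(environ=os.environ):
    """Set ENSAIO_DATA_FROM_GITHUB=true when running on CI.

    Hypothetical helper: the "CI" check is an assumption (GitHub Actions
    sets CI=true), not part of Ensaio. Returns True if the variable was set.
    """
    if environ.get("CI", "").lower() == "true":
        # Variable name and value taken from the tutorial text.
        environ["ENSAIO_DATA_FROM_GITHUB"] = "true"
        return True
    return False


# Demonstrate on plain dicts so the real environment is left untouched.
ci_env = {"CI": "true"}
local_env = {}
print(prefer_github_on_ci(ci_env), ci_env.get("ENSAIO_DATA_FROM_GITHUB"))
print(prefer_github_on_ci(local_env), local_env.get("ENSAIO_DATA_FROM_GITHUB"))
```

In a real setup, `prefer_github_on_ci()` would be called with no arguments before fetching any data. Setting the variable directly in the CI workflow configuration is likely the simpler option.
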
diff --git a/doc/tutorial/using.rst b/doc/tutorial/using.rst
@@ -0,0 +1,77 @@
+.. _using:
+
+Downloading data
+================
+
+Ensaio provides functions for downloading datasets from the `Fatiando a Terra
+Datasets <https://github.com/fatiando-data>`__ collection to your computer.
+These functions don't attempt to do any loading of the data into memory and
+only return the path of the downloaded file on your computer.
+
+To take care of the actual loading of the data, we'll import
+`Pandas <https://pandas.pydata.org/>`__ as well since the data we'll use is in
+CSV format.
+
+.. jupyter-execute::
+
+    import pandas as pd
+
+    import ensaio
+
+To download a particular dataset, say version 1 of our Southern Africa
+gravity data, call the corresponding ``fetch_*`` function:
+
+.. jupyter-execute::
+
+    fname = ensaio.fetch_southern_africa_gravity(version=1)
+    print(fname)
+
+.. tip::
+
+    The version of the data should **always** be explicitly included so that
+    your code continues to work in the same way even if a newer version of the
+    data is released.
+
+If the data are not yet available on your computer, Ensaio will automatically
+download them and return the path to the downloaded file.
+If the file has already been downloaded, Ensaio won't repeat the download and
+will only return the path to the existing file.
+
+Placing the code above in a Python script or Jupyter notebook
+guarantees that whoever runs it will get the data on their
+computer.
+Running the code multiple times or using the same data in multiple places
+will only trigger a single download, saving bandwidth and storage space.
+
+.. note::
+
+    Ensaio uses `Pooch <https://www.fatiando.org/pooch/>`__ under the hood to
+    make all of this work.
+
+Once we have the path to the data file, we can load it like we would any
+other data file. In this case, our data is in a CSV file so the natural
+choice is to use `Pandas <https://pandas.pydata.org/>`__:
+
+.. jupyter-execute::
+
+    data = pd.read_csv(fname)
+    data
+
+.. seealso::
+
+    You can browse a list of all available datasets in :ref:`api` or
+    :ref:`gallery`.
+
+Where are the data?
+-------------------
+
+The location of the cache folder varies by operating system. Use the
+:func:`ensaio.locate` function to get its location on your computer.
+
+.. jupyter-execute::
+
+    print(ensaio.locate())
+
+You can also set the location manually by creating an ``ENSAIO_DATA_DIR``
+environment variable with the desired path. Ensaio will search for this
+variable and if found will use its value instead of the default cache folder.
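
The lookup described at the end of `doc/tutorial/using.rst` can be sketched with the standard library alone. This illustrates the described behavior and is not Ensaio's actual implementation: `resolve_data_dir` and its `default` argument are hypothetical, and treating an empty variable as unset is an assumption. Only the `ENSAIO_DATA_DIR` name comes from the tutorial.

```python
import os
from pathlib import Path


def resolve_data_dir(default, environ=os.environ):
    """Return the folder named by ENSAIO_DATA_DIR, else the given default.

    Hypothetical helper illustrating the lookup described in the tutorial;
    Ensaio itself reports its cache folder through ensaio.locate().
    """
    custom = environ.get("ENSAIO_DATA_DIR")
    # Assumption: an unset or empty variable falls back to the default.
    return Path(custom) if custom else Path(default)


# Plain dicts stand in for the process environment in this demonstration.
print(resolve_data_dir("default-cache", {}))
print(resolve_data_dir("default-cache", {"ENSAIO_DATA_DIR": "my-data"}))
```

To see where Ensaio actually stores files on a given machine, call `ensaio.locate()` as shown in the tutorial.
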
diff --git a/doc/tutorial_src/README.txt b/doc/tutorial_src/README.txt
diff --git a/doc/tutorial_src/developers.py b/doc/tutorial_src/developers.py
diff --git a/doc/tutorial_src/using.py b/doc/tutorial_src/using.py
diff --git a/env/requirements-docs.txt b/env/requirements-docs.txt
@@ -2,6 +2,7 @@ sphinx==4.3.*
 sphinx-book-theme==0.1.*
 sphinx-gallery==0.10.*
 sphinx-panels==0.6.*
+jupyter-sphinx==0.3.*
 numpy
 pandas
 xarray

diff --git a/environment.yml b/environment.yml
@@ -20,6 +20,7 @@ dependencies:
   - sphinx-book-theme==0.1.*
   - sphinx-gallery==0.10.*
   - sphinx-panels==0.6.*
+  - jupyter-sphinx==0.3.*
   - numpy
   - pandas
   - xarray