Convert tutorials from sphinx-gallery to plain rst (#31)
The tutorials have much more text than Python code so rendering them in
sphinx-gallery means writing a bunch of rst content as comments in the
`.py` files, which is very awkward. Use jupyter-sphinx instead so we can
have the tutorials as plain rst files with code blocks executed by the
extension. The output looks pretty much the same with some slight
differences in style (had to tweak the CSS to match). Another advantage
is that the "Edit on GitHub" links from sphinx-book-theme now work for
these (they don't for the gallery output).
leouieda authored May 3, 2022
1 parent 4ad880b commit 3125152
Showing 10 changed files with 153 additions and 142 deletions.
1 change: 0 additions & 1 deletion .gitignore
@@ -11,7 +11,6 @@ dist/
doc/_build
doc/api/generated
doc/gallery
doc/tutorial
.ipynb_checkpoints
*.eggs
*.egg-info
20 changes: 20 additions & 0 deletions doc/_static/style.css
@@ -47,3 +47,23 @@ main.bd-content #main-content a.btn:hover {
border-top: 1px solid #dddddd;
padding: 1.8rem 0;
}

/* Try to make the jupyter-sphinx style match sphinx-gallery for consistency */
div.jupyter_container {
border: none;
box-shadow: none;
}

div.jupyter_container .cell_input {
margin-bottom: 1.15rem;
border-radius: 0.4em;
box-shadow: 1px 1px 1px #d8d8d8;
}

div.jupyter_container .cell_output {
margin-bottom: 1.15rem;
}

.jupyter_container div.code_cell pre {
padding: 10px;
}
5 changes: 3 additions & 2 deletions doc/conf.py
@@ -35,6 +35,7 @@
"sphinx.ext.napoleon",
"sphinx_panels",
"sphinx_gallery.gen_gallery",
"jupyter_sphinx",
]

# Disable including bootstrap CSS for sphinx_panels since it's already included
@@ -79,9 +80,9 @@
# -----------------------------------------------------------------------------
sphinx_gallery_conf = {
# path to your examples scripts
"examples_dirs": ["gallery_src", "tutorial_src"],
"examples_dirs": ["gallery_src"],
# path where to save gallery generated examples
"gallery_dirs": ["gallery", "tutorial"],
"gallery_dirs": ["gallery"],
"filename_pattern": r"\.py",
# Remove the "Download all examples" button from the top level gallery
"download_all_examples": False,
51 changes: 51 additions & 0 deletions doc/tutorial/developers.rst
@@ -0,0 +1,51 @@
.. _developers:

Using Ensaio in your project
----------------------------

One of the main use cases of Ensaio is to provide reproducible and
easy-to-access data for the documentation of other Python projects.
These are a few tips and tricks for using Ensaio in your own project.

Explicitly set data versions
++++++++++++++++++++++++++++

New versions of each dataset may be included in new Ensaio releases. We'll do
our very best to always keep the older data versions available as well to
avoid breaking existing tutorials and documentation.

We recommend always explicitly setting the data version when fetching a
dataset:

.. jupyter-execute::

    import ensaio

    fname = ensaio.fetch_southern_africa_gravity(version=1)

This way, your documentation/tutorial should still use the same data (and
hopefully still produce the same result) even if new versions of Ensaio are
installed.
Otherwise, people going through older examples with newer versions of Ensaio
could get different results (or worse, broken code).

.. tip::

    We still recommend updating to the latest data versions in new tutorials
    and documentation whenever you can.

Download from GitHub on CI
++++++++++++++++++++++++++

By default, the data sources for Ensaio are the archives with the given DOIs
for each dataset (usually
`Zenodo <https://zenodo.org/communities/fatiando>`__).
Alternatively, you can ask Ensaio to download from the GitHub release of each
dataset by setting the environment variable ``ENSAIO_DATA_FROM_GITHUB=true``.

We recommend using the environment variable when running on continuous
integration (CI).
This will minimize the load that is placed on public data servers like
Zenodo.
When using GitHub Actions, this may even make the downloads much faster since
the data source is likely physically closer to the CI infrastructure.
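
As a rough illustration of the recommendation above, the variable can also be
set from Python at the top of a documentation build script. This is only a
sketch: the helper name ``prefer_github_on_ci`` is hypothetical, and it relies
on the ``CI=true`` variable that GitHub Actions (and many other CI services)
define. Setting ``ENSAIO_DATA_FROM_GITHUB`` before importing Ensaio is a
conservative assumption about when the variable is read.

```python
import os


def prefer_github_on_ci(environ):
    """Set ENSAIO_DATA_FROM_GITHUB=true when the environment looks like CI.

    Hypothetical helper, not part of Ensaio. An existing value is kept.
    """
    # GitHub Actions always defines CI=true; many other services do as well.
    if environ.get("CI", "").lower() == "true":
        environ.setdefault("ENSAIO_DATA_FROM_GITHUB", "true")
    return environ


# Demonstrate on plain dictionaries first so the behaviour is easy to see.
on_ci = prefer_github_on_ci({"CI": "true"})
local = prefer_github_on_ci({})
print(on_ci.get("ENSAIO_DATA_FROM_GITHUB"), local.get("ENSAIO_DATA_FROM_GITHUB"))

# In a real build, apply it to the process environment before importing Ensaio.
prefer_github_on_ci(os.environ)
```

On a local machine the helper leaves the environment untouched, so downloads
keep coming from the default DOI-based sources.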
77 changes: 77 additions & 0 deletions doc/tutorial/using.rst
@@ -0,0 +1,77 @@
.. _using:

Downloading data
================

Ensaio provides functions for downloading datasets from the `Fatiando a Terra
Datasets <https://github.com/fatiando-data>`__ collection to your computer.
These functions don't attempt to do any loading of the data into memory and
only return the path of the downloaded file on your computer.

To take care of the actual loading of the data, we'll import
`Pandas <https://pandas.pydata.org/>`__ as well since the data we'll use is in
CSV format.

.. jupyter-execute::

    import pandas as pd

    import ensaio

To download a particular dataset, say version 1 of our Southern Africa
gravity data, call the corresponding ``fetch_*`` function:

.. jupyter-execute::

    fname = ensaio.fetch_southern_africa_gravity(version=1)
    print(fname)

.. tip::

    The version of the data should **always** be explicitly included so that
    your code continues to work in the same way even if a newer version of the
    data is released.

If the data are not yet available on your computer, Ensaio will automatically
download them and return the path to the downloaded file.
If the file has already been downloaded, Ensaio won't repeat the download and
will only return the path to the existing file.

Placing the code above in a Python script or Jupyter notebook guarantees
that whoever runs it gets the data on their computer.
Running the code multiple times or using the same data in multiple places
will only trigger a single download, saving bandwidth and storage space.

.. note::

    Ensaio uses `Pooch <https://www.fatiando.org/pooch/>`__ under the hood to
    make all of this work.
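
The download-once behaviour described above can be pictured with a small,
self-contained sketch. It is not how Ensaio or Pooch are implemented: the
names ``fetch_once`` and ``fake_download`` are hypothetical, and the
"download" is a stand-in function that returns bytes instead of touching the
network.

```python
import tempfile
from pathlib import Path


def fetch_once(name, cache_dir, download):
    """Return the path to ``name`` in ``cache_dir``, downloading only if missing."""
    path = Path(cache_dir) / name
    if not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        # Only reached the first time, when the file is not in the cache yet.
        path.write_bytes(download(name))
    return path


calls = []  # records every time the stand-in download actually runs


def fake_download(name):
    """Hypothetical stand-in for a network request."""
    calls.append(name)
    return b"longitude,latitude,gravity\n"


with tempfile.TemporaryDirectory() as tmp:
    first = fetch_once("example.csv", tmp, fake_download)
    second = fetch_once("example.csv", tmp, fake_download)
    same_path = first == second
    n_downloads = len(calls)

print(same_path, n_downloads)
```

Both calls return the same path, and the stand-in download runs only once,
which is the property that saves bandwidth when the same data are used in
several scripts.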

Once we have the path to the data file, we can load it like we would any
other data file. In this case, our data is in a CSV file so the natural
choice is to use `Pandas <https://pandas.pydata.org/>`__:

.. jupyter-execute::

    data = pd.read_csv(fname)
    data

.. seealso::

    You can browse a list of all available datasets in :ref:`api` or
    :ref:`gallery`.

Where are the data?
-------------------

The location of the cache folder varies by operating system. Use the
:func:`ensaio.locate` function to get its location on your computer.

.. jupyter-execute::

    print(ensaio.locate())

You can also set the location manually by creating an ``ENSAIO_DATA_DIR``
environment variable with the desired path. Ensaio will search for this
variable and, if found, will use its value instead of the default cache folder.
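
A minimal sketch of setting the variable from Python rather than from the
shell is shown below. The folder name ``ensaio-data-example`` is a
hypothetical choice for illustration, and setting the variable before
importing Ensaio is a conservative assumption about when it is read. The
``try``/``except`` only keeps the example runnable where Ensaio is not
installed.

```python
import os
import tempfile
from pathlib import Path

# Hypothetical location chosen for illustration; any writable folder works.
data_dir = Path(tempfile.gettempdir()) / "ensaio-data-example"

# Set the variable before importing Ensaio so it is visible whenever it is read.
os.environ["ENSAIO_DATA_DIR"] = str(data_dir)

try:
    import ensaio
except ImportError:
    # Ensaio is not installed here, so only show the value that was set.
    print("ENSAIO_DATA_DIR =", os.environ["ENSAIO_DATA_DIR"])
else:
    # With Ensaio available, ask it where the data folder is.
    print(ensaio.locate())
```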
4 changes: 0 additions & 4 deletions doc/tutorial_src/README.txt

This file was deleted.

58 changes: 0 additions & 58 deletions doc/tutorial_src/developers.py

This file was deleted.

77 changes: 0 additions & 77 deletions doc/tutorial_src/using.py

This file was deleted.

1 change: 1 addition & 0 deletions env/requirements-docs.txt
@@ -2,6 +2,7 @@ sphinx==4.3.*
sphinx-book-theme==0.1.*
sphinx-gallery==0.10.*
sphinx-panels==0.6.*
jupyter-sphinx==0.3.*
numpy
pandas
xarray
1 change: 1 addition & 0 deletions environment.yml
@@ -20,6 +20,7 @@ dependencies:
- sphinx-book-theme==0.1.*
- sphinx-gallery==0.10.*
- sphinx-panels==0.6.*
- jupyter-sphinx==0.3.*
- numpy
- pandas
- xarray