Doc: Add example of user-defined function

isciences · Aug 13, 2024 · f39e093 · f39e093
1 parent 8c7a065
commit f39e093
Show file tree

Hide file tree

Showing 4 changed files with 131 additions and 1 deletion.
diff --git a/ci/download_sample_data.sh b/ci/download_sample_data.sh
@@ -12,3 +12,9 @@ wget https://github.com/isciences/exactextractr/raw/master/inst/sao_miguel/clc20
 wget https://github.com/isciences/exactextractr/raw/master/inst/sao_miguel/eu_dem_v11.tif -P ${DEST_DIR}
 wget https://github.com/isciences/exactextractr/raw/master/inst/sao_miguel/gpw_v411_2020_count_2020.tif -P ${DEST_DIR}
 wget https://github.com/isciences/exactextractr/raw/master/inst/sao_miguel/gpw_v411_2020_density_2020.tif -P ${DEST_DIR}
+
+# Hawaii
+DEST_DIR=${DATA_ROOT}/hawaii
+mkdir -p ${DEST_DIR}
+wget https://exactextract-datasets.s3.us-east-2.amazonaws.com/hawaii/HI_L50dBA_sumDay_exi.tif -P ${DEST_DIR}
+wget https://www2.census.gov/geo/tiger/TIGER2023/COUSUB/tl_2023_15_cousub.zip -P ${DEST_DIR}
diff --git a/python/doc/basic_usage.ipynb b/python/doc/basic_usage.ipynb
@@ -424,7 +424,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.10.12"
+   "version": "3.11.2"
   }
  },
  "nbformat": 4,

diff --git a/python/doc/examples.rst b/python/doc/examples.rst
@@ -6,3 +6,4 @@ Usage examples
    :caption: Contents:
 
    basic_usage
+   noise_average_udf
diff --git a/python/doc/noise_average_udf.ipynb b/python/doc/noise_average_udf.ipynb
@@ -0,0 +1,123 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "0",
+   "metadata": {},
+   "source": [
+    "# User-defined functions: summarizing noise data\n",
+    "\n",
+    "This example works with a [model of sound levels for the United States](https://www.nps.gov/subjects/sound/soundmap.htm) produced by the US National Park Service. We want to compute the average noise level for each municipality in Hawaii, but the standard `mean` operation provided by exactextract is not appropriate for the logarithmic [decibel](https://en.wikipedia.org/wiki/Decibel) scale used for sound levels.\n",
+    "\n",
+    "To avoid working with large data files, we use the sound level grid provided for Hawaii, along with the [TIGER shapefile](https://www2.census.gov/geo/tiger/TIGER2023/COUSUB/) of 2023 municipal boundaries (\"county subdivisions\") published by the US Census Bureau:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "1",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "sound_fname = 'hawaii/HI_L50dBA_sumDay_exi.tif'\n",
+    "municipalities_fname = 'hawaii/tl_2023_15_cousub.zip'"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "2",
+   "metadata": {},
+   "source": [
+    "The sound boundaries are published in UTM 4N (meters), while the municipal boundaries use NAD 1983 geographic coordinates. Rather than pass the filenames to `exact_extract`, we will first load the datasets and reproject the municipal boundaries to the UTM coordinate system used by the sound grid."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "3",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from exactextract import exact_extract\n",
+    "from matplotlib import pyplot\n",
+    "import rasterio\n",
+    "import rasterio.plot\n",
+    "import geopandas as gpd\n",
+    "\n",
+    "sound_ds = rasterio.open(sound_fname)\n",
+    "municipalities = gpd.read_file(municipalities_fname).to_crs(sound_ds.crs)\n",
+    "\n",
+    "fig, ax = pyplot.subplots()\n",
+    "ax.set_xlim([0.3e6, 1.0e6])\n",
+    "ax.set_ylim([2.0e6, 2.5e6])\n",
+    "\n",
+    "rasterio.plot.show(sound_ds, ax=ax)\n",
+    "municipalities.plot(ax=ax, facecolor='none', edgecolor='black', linewidth=0.2)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "4",
+   "metadata": {},
+   "source": [
+    "The following function can be used to compute the mean sound levels:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "5",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import numpy as np\n",
+    "\n",
+    "def mean_sound_db(values, coverage):\n",
+    "    # transform values, take average, transform result\n",
+    "    values = np.power(10, values)\n",
+    "    mean = np.average(values, weights=coverage)\n",
+    "    return np.log10(mean)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "6",
+   "metadata": {},
+   "source": [
+    "The user-defined function can now be used in the same way as a built-in summary operation. Unsurprisingly, we see that Honolulu is the loudest municipality in Hawaii."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "7",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "results = exact_extract(sound_ds, municipalities, mean_sound_db, include_cols=['GEOID', 'NAME'], output='pandas')\n",
+    "results.sort_values(by=['mean_sound_db'], ascending=False)"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.11.2"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}