diff --git a/docs/example/ncompare-example-usage.ipynb b/docs/example/ncompare-example-usage.ipynb index 14afb08..1fa7d2d 100644 --- a/docs/example/ncompare-example-usage.ipynb +++ b/docs/example/ncompare-example-usage.ipynb @@ -1,538 +1,538 @@ { - "cells": [ - { - "cell_type": "markdown", - "id": "214b2e0a-4a8a-48bb-b1f5-b457b69ece57", - "metadata": {}, - "source": [ - "# Brief demonstration of `ncompare`: to compare the structure, groups, variables, and attributes of two netCDF files\"" - ] - }, - { - "cell_type": "markdown", - "id": "351983d5-1c2f-45ee-8a24-cd2a3b621405", - "metadata": {}, - "source": [ - "Installation instructions for `ncompare` can be found in either of these locations:\n", - "\n", - "- [GitHub repository](https://github.com/nasa/ncompare)\n", - "- [Pip entry](https://pypi.org/project/ncompare/)" - ] - }, - { - "cell_type": "markdown", - "id": "569c088b-0929-43c3-8d0f-6da3b6c89cce", - "metadata": {}, - "source": [ - "## `ncompare`'s command line arguments, provided by the `--help` description" - ] - }, - { - "cell_type": "markdown", - "id": "6a145933-e57b-4e33-bed1-95b13800878d", - "metadata": {}, - "source": [ - "***✍️ Syntax Note:*** Commands preceeded by an exclamation point \"!\" \n", - "(which is needed to [run shell commands in a Jupyter notebook](https://stackoverflow.com/a/48529220)) can be run from a terminal. \n", - "In a shell/terminal, the exclamation point should not be used." 
- ] - }, - { - "cell_type": "code", - "execution_count": 1, - "id": "07e397b3-4964-4a90-b7f5-ae35185f86e5", - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "usage: ncompare [-h] [-v COMPARISON_VAR_NAME] [-g COMPARISON_VAR_GROUP]\n", - " [--only-diffs] [--file-text FILE_TEXT] [--file-csv FILE_CSV]\n", - " [--file-xlsx FILE_XLSX] [--no-color] [--show-attributes]\n", - " [--show-chunks]\n", - " [--column-widths COLUMN_WIDTHS COLUMN_WIDTHS COLUMN_WIDTHS]\n", - " [--version]\n", - " nc_a nc_b\n", - "\n", - "Compare the variables contained within two different NetCDF datasets\n", - "\n", - "positional arguments:\n", - " nc_a First NetCDF file\n", - " nc_b First NetCDF file\n", - "\n", - "options:\n", - " -h, --help show this help message and exit\n", - " -v COMPARISON_VAR_NAME, --comparison_var_name COMPARISON_VAR_NAME\n", - " Comparison variable name\n", - " -g COMPARISON_VAR_GROUP, --comparison_var_group COMPARISON_VAR_GROUP\n", - " Comparison variable group\n", - " --only-diffs Only display variables and attributes that are\n", - " different\n", - " --file-text FILE_TEXT\n", - " A text file to which the output will be written.\n", - " --file-csv FILE_CSV A csv (comma separated values) file to which the\n", - " output will be written.\n", - " --file-xlsx FILE_XLSX\n", - " An Excel file to which the output will be written.\n", - " --no-color Turn off all colorized output\n", - " --show-attributes Include variable attributes in comparison\n", - " --show-chunks Include chunk sizes in the table that compares\n", - " variables\n", - " --column-widths COLUMN_WIDTHS COLUMN_WIDTHS COLUMN_WIDTHS\n", - " Width, in number of characters, of the three columns\n", - " in the comparison report\n", - " --version Show the current version.\n" - ] - } - ], - "source": [ - "! 
ncompare --help" - ] - }, - { - "cell_type": "markdown", - "id": "4028d153-a1d2-4f8b-aad5-ca736b6c8292", - "metadata": {}, - "source": [ - "## Example 1: Two netCDF files with the same groups, variables, and attributes\n", - "----" - ] - }, - { - "cell_type": "markdown", - "id": "6992fa0f-7460-42c8-b0d4-d0634ddcc798", - "metadata": {}, - "source": [ - "Data files are first defined. The examples here rely on three files: two from NOAA National Centers of Environmental Information's (NCEI) (a) _[Global Precipitation Climatology Project (GPCP) Climate Data Record (CDR), Monthly V2.3](https://doi.org/10.7289/V56971M6)_ and one from the (b) _[Climate Data Record (CDR) of Precipitation Estimation from Remotely Sensed Information using Artificial Neural Networks (PERSIANN-CDR), Version 1 Revision 1)](https://doi.org/10.7289/V51V5BWQ)_ (a daily quasi-global precipitation product), accessible via [this GPCP catalog](https://www.ncei.noaa.gov/thredds/catalog/cdr/gpcp_final/2023/catalog.html) and [this PERSIANN catalog](https://www.ncei.noaa.gov/thredds/catalog/cdr/persiann/catalog.html):\n", - "\n", - "1. https://www.ncei.noaa.gov/thredds/catalog/cdr/gpcp_final/2023/catalog.html?dataset=cdr_gpcp_final/2023/gpcp_v02r03_monthly_d202301_c20230411.nc\n", - "2. https://www.ncei.noaa.gov/thredds/catalog/cdr/gpcp_final/2023/catalog.html?dataset=cdr_gpcp_final/2023/gpcp_v02r03_monthly_d202302_c20230505.nc\n", - "3. 
https://www.ncei.noaa.gov/thredds/fileServer/cdr/persiann/2023/PERSIANN-CDR_v01r01_20230419_c20231030.nc" - ] - }, - { - "cell_type": "code", - "execution_count": 2, - "id": "136bbeb8-6d74-4373-8ef7-1c20c1fe6afc", - "metadata": {}, - "outputs": [], - "source": [ - "from pathlib import Path\n", - "\n", - "file_urls = [\n", - " \"https://www.ncei.noaa.gov/thredds/fileServer/cdr/gpcp_final/2023/gpcp_v02r03_monthly_d202301_c20230411.nc\",\n", - " \"https://www.ncei.noaa.gov/thredds/fileServer/cdr/gpcp_final/2023/gpcp_v02r03_monthly_d202302_c20230505.nc\",\n", - " \"https://www.ncei.noaa.gov/thredds/fileServer/cdr/persiann/2023/PERSIANN-CDR_v01r01_20230419_c20231030.nc\",\n", - "]\n", - "\n", - "file_names = [Path(url).name for url in file_urls]" - ] - }, - { - "cell_type": "markdown", - "id": "2b635084-8b99-4824-9f36-27b1c31bd2a5", - "metadata": {}, - "source": [ - "To download these files (e.g., for the first time running this notebook), run the following:" - ] - }, - { - "cell_type": "code", - "execution_count": 3, - "id": "10a025b9-4483-4925-873e-6653b64441e3", - "metadata": {}, - "outputs": [], - "source": [ - "import requests\n", - "\n", - "for url, filename in zip(file_urls, file_names):\n", - " r = requests.get(url, allow_redirects=True)\n", - " open(filename, 'wb').write(r.content)" - ] - }, - { - "cell_type": "markdown", - "id": "56b0eeba-20ed-46ed-a14b-593b59c2d9cd", - "metadata": {}, - "source": [ - "Next, we pass the two filepaths to `ncompare`, and any differences would be printed in red. In this case, there are no differences; therefore, all of the variables are printed in black." - ] - }, - { - "cell_type": "markdown", - "id": "a2ea8513-19ce-4089-8494-e0fba9aea789", - "metadata": {}, - "source": [ - "***✍️ Syntax Note:*** the curly brackets, \"{\" and \"}\", that follow are simply a way to [substitute python variables into a shell command](https://stackoverflow.com/a/35497161). 
\n", - "In a shell/terminal, one can just write out the full arguments, separated by spaces.\n", - "For example, the following command would be run at the terminal as `ncompare notebook_example_data/MOP03JM-202205-L3V95.6.3.he5 notebook_example_data/MOP03JM-202205-L3V95.9.3.he5`\n", - "\n", - "***✍️ `ncompare` Options Note:*** the `--column-widths 33 26 26` arguments are optional, and they are being used here to shrink the columns width-wise from their defaults to a size that fits better in the GitHub notebook renderer." - ] - }, - { - "cell_type": "code", - "execution_count": 4, - "id": "43cace42-aa55-469e-84d9-13a45115267e", - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "\u001b[37m\u001b[0mFile A: gpcp_v02r03_monthly_d202301_c20230411.nc\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0mFile B: gpcp_v02r03_monthly_d202302_c20230505.nc\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m\u001b[94m\n", - "Root-level Dimensions:\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m\t\u001b[36mAre all items the same? ---> True.\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m\t\u001b[36m[('latitude', 72), ('longitude', 144), ('nv', 2), ('time', 1)]\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m\u001b[94m\n", - "Root-level Groups:\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m\t\u001b[36mAre all items the same? ---> True. (No items exist.)\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m\u001b[90m\n", - "No variable group selected for comparison. 
Skipping..\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m\u001b[94m\n", - "All variables:\u001b[0m\n", - "\u001b[0m File A File B\u001b[0m\n", - "\u001b[0m All Variables \u001b[0m\n", - "\u001b[0m - -------------------------- --------------------------\u001b[0m\n", - "\u001b[0m \u001b[0m\n", - "\u001b[0m GROUP #00 -------------------------/ -------------------------/\u001b[0m\n", - "\u001b[0m num variables in group: 8 8\u001b[0m\n", - "\u001b[0m - -------------------------- --------------------------\u001b[0m\n", - "\u001b[0m -----VARIABLE-----: lat_bounds lat_bounds\u001b[0m\n", - "\u001b[0m dtype: float32 float32\u001b[0m\n", - "\u001b[0m shape: (72, 2) (72, 2)\u001b[0m\n", - "\u001b[0m -----VARIABLE-----: latitude latitude\u001b[0m\n", - "\u001b[0m dtype: float32 float32\u001b[0m\n", - "\u001b[0m shape: (72,) (72,)\u001b[0m\n", - "\u001b[0m -----VARIABLE-----: lon_bounds lon_bounds\u001b[0m\n", - "\u001b[0m dtype: float32 float32\u001b[0m\n", - "\u001b[0m shape: (144, 2) (144, 2)\u001b[0m\n", - "\u001b[0m -----VARIABLE-----: longitude longitude\u001b[0m\n", - "\u001b[0m dtype: float32 float32\u001b[0m\n", - "\u001b[0m shape: (144,) (144,)\u001b[0m\n", - "\u001b[0m -----VARIABLE-----: precip precip\u001b[0m\n", - "\u001b[0m dtype: float32 float32\u001b[0m\n", - "\u001b[0m shape: (1, 72, 144) (1, 72, 144)\u001b[0m\n", - "\u001b[0m -----VARIABLE-----: precip_error precip_error\u001b[0m\n", - "\u001b[0m dtype: float32 float32\u001b[0m\n", - "\u001b[0m shape: (1, 72, 144) (1, 72, 144)\u001b[0m\n", - "\u001b[0m -----VARIABLE-----: time time\u001b[0m\n", - "\u001b[0m dtype: float32 float32\u001b[0m\n", - "\u001b[0m shape: (1,) (1,)\u001b[0m\n", - "\u001b[0m -----VARIABLE-----: time_bounds time_bounds\u001b[0m\n", - "\u001b[0m dtype: float32 float32\u001b[0m\n", - "\u001b[0m shape: (1, 2) (1, 2)\u001b[0m\n", - "\u001b[0m - -------------------------- --------------------------\u001b[0m\n", - "\u001b[0m Total number of shared items: 8 8\u001b[0m\n", - "\u001b[0m Total 
number of non-shared items: 0 0\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m\n", - "Done.\u001b[0m\n", - "\u001b[0m\u001b[0m" - ] - } - ], - "source": [ - "! ncompare --column-widths 33 26 26 {file_names[0]} {file_names[1]}" - ] - }, - { - "cell_type": "markdown", - "id": "220888cd-92d1-4bb4-9b5d-8187f89bda87", - "metadata": {}, - "source": [ - "## Example 2: Two netCDF files with different groups, variables, and attributes\n", - "----" - ] - }, - { - "cell_type": "code", - "execution_count": 5, - "id": "c48728a0-1379-4a05-b7e6-ad50694510df", - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "\u001b[37m\u001b[0mFile A: gpcp_v02r03_monthly_d202301_c20230411.nc\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0mFile B: PERSIANN-CDR_v01r01_20230419_c20231030.nc\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m\u001b[94m\n", - "Root-level Dimensions:\u001b[0m\n", - "/usr/local/Caskroom/miniconda/base/envs/ncompare-jupyter-example/lib/python3.12/site-packages/xarray/conventions.py:428: SerializationWarning: variable 'precipitation' has multiple fill values {-9999.0, -1.0}, decoding all values to NaN.\n", - " new_vars[k] = decode_cf_variable(\n", - "\u001b[0m\u001b[37m\u001b[0m\tAre all items the same? ---> \u001b[31mFalse. 
(2 items are shared, out of 6 total.)\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m\t\u001b[31mWhich items are different?\u001b[0m\n", - "\u001b[0m File A File B\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31m #00 ------------------------------ ------------------('lat', 480)\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31m #01 --------------('latitude', 72) ------------------------------\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31m #02 ------------------------------ -----------------('lon', 1440)\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31m #03 ------------('longitude', 144) ------------------------------\u001b[0m\n", - "\u001b[0m #04 ---------------------('nv', 2) ---------------------('nv', 2)\u001b[0m\n", - "\u001b[0m #05 -------------------('time', 1) -------------------('time', 1)\u001b[0m\n", - "\u001b[0m Number of non-shared items: 2 2\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m\u001b[94m\n", - "Root-level Groups:\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m\t\u001b[36mAre all items the same? ---> True. (No items exist.)\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m\u001b[90m\n", - "No variable group selected for comparison. 
Skipping..\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m\u001b[94m\n", - "All variables:\u001b[0m\n", - "\u001b[0m File A File B\u001b[0m\n", - "\u001b[0m All Variables \u001b[0m\n", - "\u001b[0m - ------------------------------ ------------------------------\u001b[0m\n", - "\u001b[0m \u001b[0m\n", - "\u001b[0m GROUP #00 -----------------------------/ -----------------------------/\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mnum variables in group: 8 6\u001b[0m\n", - "\u001b[0m - ------------------------------ ------------------------------\u001b[0m\n", - "\u001b[0m -----VARIABLE-----: lat\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mdtype: float32\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mshape: (480,)\u001b[0m\n", - "\u001b[0m -----VARIABLE-----: lat_bnds\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mdtype: float32\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mshape: (480, 2)\u001b[0m\n", - "\u001b[0m -----VARIABLE-----: lat_bounds \u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mdtype: float32 \u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mshape: (72, 2) \u001b[0m\n", - "\u001b[0m -----VARIABLE-----: latitude \u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mdtype: float32 \u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mshape: (72,) \u001b[0m\n", - "\u001b[0m -----VARIABLE-----: lon\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mdtype: float32\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mshape: (1440,)\u001b[0m\n", - "\u001b[0m -----VARIABLE-----: lon_bnds\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mdtype: float32\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mshape: (1440, 2)\u001b[0m\n", - "\u001b[0m -----VARIABLE-----: lon_bounds \u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mdtype: float32 \u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mshape: (144, 2) \u001b[0m\n", - "\u001b[0m -----VARIABLE-----: longitude \u001b[0m\n", - 
"\u001b[0m\u001b[37m\u001b[0m \u001b[31mdtype: float32 \u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mshape: (144,) \u001b[0m\n", - "\u001b[0m -----VARIABLE-----: precip \u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mdtype: float32 \u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mshape: (1, 72, 144) \u001b[0m\n", - "\u001b[0m -----VARIABLE-----: precip_error \u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mdtype: float32 \u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mshape: (1, 72, 144) \u001b[0m\n", - "\u001b[0m -----VARIABLE-----: precipitation\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mdtype: float32\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mshape: (1, 1440, 480)\u001b[0m\n", - "\u001b[0m -----VARIABLE-----: time time\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mdtype: float32 int32\u001b[0m\n", - "\u001b[0m shape: (1,) (1,)\u001b[0m\n", - "\u001b[0m -----VARIABLE-----: time_bounds \u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mdtype: float32 \u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mshape: (1, 2) \u001b[0m\n", - "\u001b[0m - ------------------------------ ------------------------------\u001b[0m\n", - "\u001b[0m Total number of shared items: 1 1\u001b[0m\n", - "\u001b[0m Total number of non-shared items: 7 5\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m\n", - "Done.\u001b[0m\n", - "\u001b[0m\u001b[0m\u001b[0m" - ] - } - ], - "source": [ - "! 
ncompare --column-widths 33 30 30 {file_names[0]} {file_names[2]}" - ] - }, + "cells": [ + { + "cell_type": "markdown", + "id": "214b2e0a-4a8a-48bb-b1f5-b457b69ece57", + "metadata": {}, + "source": [ + "# Brief demonstration of `ncompare`: comparing the structure, groups, variables, and attributes of two netCDF files" + ] + }, + { + "cell_type": "markdown", + "id": "351983d5-1c2f-45ee-8a24-cd2a3b621405", + "metadata": {}, + "source": [ + "Installation instructions for `ncompare` can be found in either of these locations:\n", + "\n", + "- [GitHub repository](https://github.com/nasa/ncompare)\n", + "- [PyPI entry](https://pypi.org/project/ncompare/)" + ] + }, + { + "cell_type": "markdown", + "id": "569c088b-0929-43c3-8d0f-6da3b6c89cce", + "metadata": {}, + "source": [ + "## `ncompare`'s command line arguments, provided by the `--help` description" + ] + }, + { + "cell_type": "markdown", + "id": "6a145933-e57b-4e33-bed1-95b13800878d", + "metadata": {}, + "source": [ + "***✍️ Syntax Note:*** Commands preceded by an exclamation point \"!\" \n", + "(which is needed to [run shell commands in a Jupyter notebook](https://stackoverflow.com/a/48529220)) can be run from a terminal. \n", + "In a shell/terminal, the exclamation point should not be used."
+ ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "07e397b3-4964-4a90-b7f5-ae35185f86e5", + "metadata": {}, + "outputs": [ { - "cell_type": "markdown", - "id": "11a23041-6f24-491b-a9e3-124ace151736", - "metadata": {}, - "source": [ - "#### More file details can be examined by using the `--show-attributes` and `--show-chunks` options" - ] - }, + "name": "stdout", + "output_type": "stream", + "text": [ + "usage: ncompare [-h] [-v COMPARISON_VAR_NAME] [-g COMPARISON_VAR_GROUP]\n", + " [--only-diffs] [--file-text FILE_TEXT] [--file-csv FILE_CSV]\n", + " [--file-xlsx FILE_XLSX] [--no-color] [--show-attributes]\n", + " [--show-chunks]\n", + " [--column-widths COLUMN_WIDTHS COLUMN_WIDTHS COLUMN_WIDTHS]\n", + " [--version]\n", + " nc_a nc_b\n", + "\n", + "Compare the variables contained within two different NetCDF datasets\n", + "\n", + "positional arguments:\n", + " nc_a First NetCDF file\n", + " nc_b First NetCDF file\n", + "\n", + "options:\n", + " -h, --help show this help message and exit\n", + " -v COMPARISON_VAR_NAME, --comparison_var_name COMPARISON_VAR_NAME\n", + " Comparison variable name\n", + " -g COMPARISON_VAR_GROUP, --comparison_var_group COMPARISON_VAR_GROUP\n", + " Comparison variable group\n", + " --only-diffs Only display variables and attributes that are\n", + " different\n", + " --file-text FILE_TEXT\n", + " A text file to which the output will be written.\n", + " --file-csv FILE_CSV A csv (comma separated values) file to which the\n", + " output will be written.\n", + " --file-xlsx FILE_XLSX\n", + " An Excel file to which the output will be written.\n", + " --no-color Turn off all colorized output\n", + " --show-attributes Include variable attributes in comparison\n", + " --show-chunks Include chunk sizes in the table that compares\n", + " variables\n", + " --column-widths COLUMN_WIDTHS COLUMN_WIDTHS COLUMN_WIDTHS\n", + " Width, in number of characters, of the three columns\n", + " in the comparison report\n", + " --version Show the 
current version.\n" + ] + } + ], + "source": [ + "! ncompare --help" + ] + }, + { + "cell_type": "markdown", + "id": "4028d153-a1d2-4f8b-aad5-ca736b6c8292", + "metadata": {}, + "source": [ + "## Example 1: Two netCDF files with the same groups, variables, and attributes\n", + "----" + ] + }, + { + "cell_type": "markdown", + "id": "6992fa0f-7460-42c8-b0d4-d0634ddcc798", + "metadata": {}, + "source": [ + "The data files are defined first. The examples here rely on three files from NOAA's National Centers for Environmental Information (NCEI): two from (a) the _[Global Precipitation Climatology Project (GPCP) Climate Data Record (CDR), Monthly V2.3](https://doi.org/10.7289/V56971M6)_ and one from (b) the _[Climate Data Record (CDR) of Precipitation Estimation from Remotely Sensed Information using Artificial Neural Networks (PERSIANN-CDR), Version 1 Revision 1](https://doi.org/10.7289/V51V5BWQ)_ (a daily quasi-global precipitation product), accessible via [this GPCP catalog](https://www.ncei.noaa.gov/thredds/catalog/cdr/gpcp_final/2023/catalog.html) and [this PERSIANN catalog](https://www.ncei.noaa.gov/thredds/catalog/cdr/persiann/catalog.html):\n", + "\n", + "1. https://www.ncei.noaa.gov/thredds/catalog/cdr/gpcp_final/2023/catalog.html?dataset=cdr_gpcp_final/2023/gpcp_v02r03_monthly_d202301_c20230411.nc\n", + "2. https://www.ncei.noaa.gov/thredds/catalog/cdr/gpcp_final/2023/catalog.html?dataset=cdr_gpcp_final/2023/gpcp_v02r03_monthly_d202302_c20230505.nc\n", + "3. 
https://www.ncei.noaa.gov/thredds/fileServer/cdr/persiann/2023/PERSIANN-CDR_v01r01_20230419_c20231030.nc" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "id": "136bbeb8-6d74-4373-8ef7-1c20c1fe6afc", + "metadata": {}, + "outputs": [], + "source": [ + "from pathlib import Path\n", + "\n", + "file_urls = [\n", + " \"https://www.ncei.noaa.gov/thredds/fileServer/cdr/gpcp_final/2023/gpcp_v02r03_monthly_d202301_c20230411.nc\",\n", + " \"https://www.ncei.noaa.gov/thredds/fileServer/cdr/gpcp_final/2023/gpcp_v02r03_monthly_d202302_c20230505.nc\",\n", + " \"https://www.ncei.noaa.gov/thredds/fileServer/cdr/persiann/2023/PERSIANN-CDR_v01r01_20230419_c20231030.nc\",\n", + "]\n", + "\n", + "file_names = [Path(url).name for url in file_urls]" + ] + }, + { + "cell_type": "markdown", + "id": "2b635084-8b99-4824-9f36-27b1c31bd2a5", + "metadata": {}, + "source": [ + "To download these files (e.g., for the first time running this notebook), run the following:" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "id": "10a025b9-4483-4925-873e-6653b64441e3", + "metadata": {}, + "outputs": [], + "source": [ + "import requests\n", + "\n", + "for url, filename in zip(file_urls, file_names):\n", + " r = requests.get(url, allow_redirects=True)\n", + " open(filename, \"wb\").write(r.content)" + ] + }, + { + "cell_type": "markdown", + "id": "56b0eeba-20ed-46ed-a14b-593b59c2d9cd", + "metadata": {}, + "source": [ + "Next, we pass the two filepaths to `ncompare`, and any differences would be printed in red. In this case, there are no differences; therefore, all of the variables are printed in black." + ] + }, + { + "cell_type": "markdown", + "id": "a2ea8513-19ce-4089-8494-e0fba9aea789", + "metadata": {}, + "source": [ + "***✍️ Syntax Note:*** the curly brackets, \"{\" and \"}\", that follow are simply a way to [substitute python variables into a shell command](https://stackoverflow.com/a/35497161). 
\n", + "In a shell/terminal, one can just write out the full arguments, separated by spaces.\n", + "For example, the following command would be run at the terminal as `ncompare --column-widths 33 26 26 gpcp_v02r03_monthly_d202301_c20230411.nc gpcp_v02r03_monthly_d202302_c20230505.nc`.\n", + "\n", + "***✍️ `ncompare` Options Note:*** the `--column-widths 33 26 26` arguments are optional; they are used here to narrow the columns from their default widths so that the output fits better in the GitHub notebook renderer." + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "id": "43cace42-aa55-469e-84d9-13a45115267e", + "metadata": {}, + "outputs": [ { - "cell_type": "code", - "execution_count": 6, - "id": "1dd4c51a-394c-4569-b8b1-053743e63cb9", - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "\u001b[37m\u001b[0mFile A: gpcp_v02r03_monthly_d202301_c20230411.nc\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0mFile B: PERSIANN-CDR_v01r01_20230419_c20231030.nc\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m\u001b[94m\n", - "Root-level Dimensions:\u001b[0m\n", - "/usr/local/Caskroom/miniconda/base/envs/ncompare-jupyter-example/lib/python3.12/site-packages/xarray/conventions.py:428: SerializationWarning: variable 'precipitation' has multiple fill values {-9999.0, -1.0}, decoding all values to NaN.\n", - " new_vars[k] = decode_cf_variable(\n", - "\u001b[0m\u001b[37m\u001b[0m\tAre all items the same? ---> \u001b[31mFalse. 
(2 items are shared, out of 6 total.)\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m\t\u001b[31mWhich items are different?\u001b[0m\n", - "\u001b[0m File A File B\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31m #00 ------------------------------ ------------------('lat', 480)\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31m #01 --------------('latitude', 72) ------------------------------\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31m #02 ------------------------------ -----------------('lon', 1440)\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31m #03 ------------('longitude', 144) ------------------------------\u001b[0m\n", - "\u001b[0m #04 ---------------------('nv', 2) ---------------------('nv', 2)\u001b[0m\n", - "\u001b[0m #05 -------------------('time', 1) -------------------('time', 1)\u001b[0m\n", - "\u001b[0m Number of non-shared items: 2 2\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m\u001b[94m\n", - "Root-level Groups:\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m\t\u001b[36mAre all items the same? ---> True. (No items exist.)\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m\u001b[90m\n", - "No variable group selected for comparison. 
Skipping..\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m\u001b[94m\n", - "All variables:\u001b[0m\n", - "\u001b[0m File A File B\u001b[0m\n", - "\u001b[0m All Variables \u001b[0m\n", - "\u001b[0m - ------------------------------ ------------------------------\u001b[0m\n", - "\u001b[0m \u001b[0m\n", - "\u001b[0m GROUP #00 -----------------------------/ -----------------------------/\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mnum variables in group: 8 6\u001b[0m\n", - "\u001b[0m - ------------------------------ ------------------------------\u001b[0m\n", - "\u001b[0m -----VARIABLE-----: lat\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mdtype: float32\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mshape: (480,)\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mchunksize: contiguous\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mbounds: lat_bnds\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mlong_name: latitude\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mstandard_name: latitude\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31munits: degrees_north\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mvalid_max: 60.0\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mvalid_min: -60.0\u001b[0m\n", - "\u001b[0m -----VARIABLE-----: lat_bnds\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mdtype: float32\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mshape: (480, 2)\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mchunksize: contiguous\u001b[0m\n", - "\u001b[0m -----VARIABLE-----: lat_bounds \u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mdtype: float32 \u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mshape: (72, 2) \u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mchunksize: contiguous \u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mcomment: latitude values at the north and south bounds of each pixel. 
\u001b[0m\n", - "\u001b[0m -----VARIABLE-----: latitude \u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mdtype: float32 \u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mshape: (72,) \u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mchunksize: contiguous \u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31maxis: Y \u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mbounds: lat_bounds \u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mlong_name: Latitude \u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mstandard_name: latitude \u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31munits: degrees_north \u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mvalid_range: [-90.0, 90.0, ...] \u001b[0m\n", - "\u001b[0m -----VARIABLE-----: lon\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mdtype: float32\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mshape: (1440,)\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mchunksize: contiguous\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mbounds: lon_bnds\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mlong_name: longitude\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mstandard_name: longitude\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31munits: degrees_east\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mvalid_max: 360.0\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mvalid_min: 0.0\u001b[0m\n", - "\u001b[0m -----VARIABLE-----: lon_bnds\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mdtype: float32\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mshape: (1440, 2)\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mchunksize: contiguous\u001b[0m\n", - "\u001b[0m -----VARIABLE-----: lon_bounds \u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mdtype: float32 \u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mshape: (144, 2) \u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mchunksize: contiguous 
\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mcomment: longitude values at the west and east bounds of each pixel. \u001b[0m\n", - "\u001b[0m -----VARIABLE-----: longitude \u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mdtype: float32 \u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mshape: (144,) \u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mchunksize: contiguous \u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31maxis: X \u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mbounds: lon_bounds \u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mlong_name: Longitude \u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mstandard_name: longitude \u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31munits: degrees_east \u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mvalid_range: [0.0, 360.0, ...] \u001b[0m\n", - "\u001b[0m -----VARIABLE-----: precip \u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mdtype: float32 \u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mshape: (1, 72, 144) \u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mchunksize: contiguous \u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mcell_methods: area: mean time: mean \u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mcoordinates: time latitude longitude \u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mlong_name: NOAA Climate Data Record (CDR) of GPCP Monthly Satellite-Gauge Combined Precipitation \u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mmissing_value: -9999.0 \u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mstandard_name: precipitation amount \u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31munits: mm/day \u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mvalid_range: [0.0, 100.0, ...] 
\u001b[0m\n", - "\u001b[0m -----VARIABLE-----: precip_error \u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mdtype: float32 \u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mshape: (1, 72, 144) \u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mchunksize: contiguous \u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mcoordinates: time latitude longitude \u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mlong_name: NOAA CDR of GPCP Satellite-Gauge Combined Precipitation Error \u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mmissing_value: -9999.0 \u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31munits: mm/day \u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mvalid_range: [0.0, 100.0, ...] \u001b[0m\n", - "\u001b[0m -----VARIABLE-----: precipitation\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mdtype: float32\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mshape: (1, 1440, 480)\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mchunksize: [1, 1440, 480]\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31m_FillValue: -1.0\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mcell_method: sum\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mlong_name: NOAA Climate Data Record of PERSIANN-CDR daily precipitation\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mmissing_value: -9999.0\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mstandard_name: precipitation_amount\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31munits: mm\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mvalid_max: 999999.0\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mvalid_min: 0.0\u001b[0m\n", - "\u001b[0m -----VARIABLE-----: time time\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mdtype: float32 int32\u001b[0m\n", - "\u001b[0m shape: (1,) (1,)\u001b[0m\n", - "\u001b[0m chunksize: contiguous contiguous\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31maxis: T \u001b[0m\n", - 
"\u001b[0m\u001b[37m\u001b[0m \u001b[31mbounds: time_bounds \u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mcalendar: Gregorian \u001b[0m\n", - "\u001b[0m long_name: time time\u001b[0m\n", - "\u001b[0m standard_name: time time\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31munits: days since 1970-01-01 00:00:00 0:00 days since 1979-01-01 0:0:0\u001b[0m\n", - "\u001b[0m -----VARIABLE-----: time_bounds \u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mdtype: float32 \u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mshape: (1, 2) \u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mchunksize: contiguous \u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m \u001b[31mcomment: time bounds for each time value \u001b[0m\n", - "\u001b[0m - ------------------------------ ------------------------------\u001b[0m\n", - "\u001b[0m Total number of shared items: 1 1\u001b[0m\n", - "\u001b[0m Total number of non-shared items: 7 5\u001b[0m\n", - "\u001b[0m\u001b[37m\u001b[0m\n", - "Done.\u001b[0m\n", - "\u001b[0m\u001b[0m\u001b[0m" - ] - } - ], - "source": [ - "! ncompare --show-attributes --show-chunks --column-widths 33 30 30 {file_names[0]} {file_names[2]}" - ] - }, + "name": "stdout", + "output_type": "stream", + "text": [ + "\u001b[37m\u001b[0mFile A: gpcp_v02r03_monthly_d202301_c20230411.nc\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0mFile B: gpcp_v02r03_monthly_d202302_c20230505.nc\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m\u001b[94m\n", + "Root-level Dimensions:\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m\t\u001b[36mAre all items the same? ---> True.\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m\t\u001b[36m[('latitude', 72), ('longitude', 144), ('nv', 2), ('time', 1)]\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m\u001b[94m\n", + "Root-level Groups:\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m\t\u001b[36mAre all items the same? ---> True. 
(No items exist.)\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m\u001b[90m\n", + "No variable group selected for comparison. Skipping..\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m\u001b[94m\n", + "All variables:\u001b[0m\n", + "\u001b[0m File A File B\u001b[0m\n", + "\u001b[0m All Variables \u001b[0m\n", + "\u001b[0m - -------------------------- --------------------------\u001b[0m\n", + "\u001b[0m \u001b[0m\n", + "\u001b[0m GROUP #00 -------------------------/ -------------------------/\u001b[0m\n", + "\u001b[0m num variables in group: 8 8\u001b[0m\n", + "\u001b[0m - -------------------------- --------------------------\u001b[0m\n", + "\u001b[0m -----VARIABLE-----: lat_bounds lat_bounds\u001b[0m\n", + "\u001b[0m dtype: float32 float32\u001b[0m\n", + "\u001b[0m shape: (72, 2) (72, 2)\u001b[0m\n", + "\u001b[0m -----VARIABLE-----: latitude latitude\u001b[0m\n", + "\u001b[0m dtype: float32 float32\u001b[0m\n", + "\u001b[0m shape: (72,) (72,)\u001b[0m\n", + "\u001b[0m -----VARIABLE-----: lon_bounds lon_bounds\u001b[0m\n", + "\u001b[0m dtype: float32 float32\u001b[0m\n", + "\u001b[0m shape: (144, 2) (144, 2)\u001b[0m\n", + "\u001b[0m -----VARIABLE-----: longitude longitude\u001b[0m\n", + "\u001b[0m dtype: float32 float32\u001b[0m\n", + "\u001b[0m shape: (144,) (144,)\u001b[0m\n", + "\u001b[0m -----VARIABLE-----: precip precip\u001b[0m\n", + "\u001b[0m dtype: float32 float32\u001b[0m\n", + "\u001b[0m shape: (1, 72, 144) (1, 72, 144)\u001b[0m\n", + "\u001b[0m -----VARIABLE-----: precip_error precip_error\u001b[0m\n", + "\u001b[0m dtype: float32 float32\u001b[0m\n", + "\u001b[0m shape: (1, 72, 144) (1, 72, 144)\u001b[0m\n", + "\u001b[0m -----VARIABLE-----: time time\u001b[0m\n", + "\u001b[0m dtype: float32 float32\u001b[0m\n", + "\u001b[0m shape: (1,) (1,)\u001b[0m\n", + "\u001b[0m -----VARIABLE-----: time_bounds time_bounds\u001b[0m\n", + "\u001b[0m dtype: float32 float32\u001b[0m\n", + "\u001b[0m shape: (1, 2) (1, 2)\u001b[0m\n", + "\u001b[0m - -------------------------- 
--------------------------\u001b[0m\n", + "\u001b[0m Total number of shared items: 8 8\u001b[0m\n", + "\u001b[0m Total number of non-shared items: 0 0\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m\n", + "Done.\u001b[0m\n", + "\u001b[0m\u001b[0m" + ] + } + ], + "source": [ + "! ncompare --column-widths 33 26 26 {file_names[0]} {file_names[1]}" + ] + }, + { + "cell_type": "markdown", + "id": "220888cd-92d1-4bb4-9b5d-8187f89bda87", + "metadata": {}, + "source": [ + "## Example 2: Two netCDF files with different groups, variables, and attributes\n", + "----" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "id": "c48728a0-1379-4a05-b7e6-ad50694510df", + "metadata": {}, + "outputs": [ { - "cell_type": "markdown", - "id": "dccb326d-3b47-4d0f-b96d-93577d3e7c54", - "metadata": {}, - "source": [ - "END of Notebook." - ] + "name": "stdout", + "output_type": "stream", + "text": [ + "\u001b[37m\u001b[0mFile A: gpcp_v02r03_monthly_d202301_c20230411.nc\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0mFile B: PERSIANN-CDR_v01r01_20230419_c20231030.nc\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m\u001b[94m\n", + "Root-level Dimensions:\u001b[0m\n", + "/usr/local/Caskroom/miniconda/base/envs/ncompare-jupyter-example/lib/python3.12/site-packages/xarray/conventions.py:428: SerializationWarning: variable 'precipitation' has multiple fill values {-9999.0, -1.0}, decoding all values to NaN.\n", + " new_vars[k] = decode_cf_variable(\n", + "\u001b[0m\u001b[37m\u001b[0m\tAre all items the same? ---> \u001b[31mFalse. 
(2 items are shared, out of 6 total.)\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m\t\u001b[31mWhich items are different?\u001b[0m\n", + "\u001b[0m File A File B\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31m #00 ------------------------------ ------------------('lat', 480)\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31m #01 --------------('latitude', 72) ------------------------------\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31m #02 ------------------------------ -----------------('lon', 1440)\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31m #03 ------------('longitude', 144) ------------------------------\u001b[0m\n", + "\u001b[0m #04 ---------------------('nv', 2) ---------------------('nv', 2)\u001b[0m\n", + "\u001b[0m #05 -------------------('time', 1) -------------------('time', 1)\u001b[0m\n", + "\u001b[0m Number of non-shared items: 2 2\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m\u001b[94m\n", + "Root-level Groups:\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m\t\u001b[36mAre all items the same? ---> True. (No items exist.)\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m\u001b[90m\n", + "No variable group selected for comparison. 
Skipping..\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m\u001b[94m\n", + "All variables:\u001b[0m\n", + "\u001b[0m File A File B\u001b[0m\n", + "\u001b[0m All Variables \u001b[0m\n", + "\u001b[0m - ------------------------------ ------------------------------\u001b[0m\n", + "\u001b[0m \u001b[0m\n", + "\u001b[0m GROUP #00 -----------------------------/ -----------------------------/\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mnum variables in group: 8 6\u001b[0m\n", + "\u001b[0m - ------------------------------ ------------------------------\u001b[0m\n", + "\u001b[0m -----VARIABLE-----: lat\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mdtype: float32\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mshape: (480,)\u001b[0m\n", + "\u001b[0m -----VARIABLE-----: lat_bnds\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mdtype: float32\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mshape: (480, 2)\u001b[0m\n", + "\u001b[0m -----VARIABLE-----: lat_bounds \u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mdtype: float32 \u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mshape: (72, 2) \u001b[0m\n", + "\u001b[0m -----VARIABLE-----: latitude \u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mdtype: float32 \u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mshape: (72,) \u001b[0m\n", + "\u001b[0m -----VARIABLE-----: lon\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mdtype: float32\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mshape: (1440,)\u001b[0m\n", + "\u001b[0m -----VARIABLE-----: lon_bnds\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mdtype: float32\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mshape: (1440, 2)\u001b[0m\n", + "\u001b[0m -----VARIABLE-----: lon_bounds \u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mdtype: float32 \u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mshape: (144, 2) \u001b[0m\n", + "\u001b[0m -----VARIABLE-----: longitude \u001b[0m\n", + 
"\u001b[0m\u001b[37m\u001b[0m \u001b[31mdtype: float32 \u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mshape: (144,) \u001b[0m\n", + "\u001b[0m -----VARIABLE-----: precip \u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mdtype: float32 \u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mshape: (1, 72, 144) \u001b[0m\n", + "\u001b[0m -----VARIABLE-----: precip_error \u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mdtype: float32 \u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mshape: (1, 72, 144) \u001b[0m\n", + "\u001b[0m -----VARIABLE-----: precipitation\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mdtype: float32\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mshape: (1, 1440, 480)\u001b[0m\n", + "\u001b[0m -----VARIABLE-----: time time\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mdtype: float32 int32\u001b[0m\n", + "\u001b[0m shape: (1,) (1,)\u001b[0m\n", + "\u001b[0m -----VARIABLE-----: time_bounds \u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mdtype: float32 \u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mshape: (1, 2) \u001b[0m\n", + "\u001b[0m - ------------------------------ ------------------------------\u001b[0m\n", + "\u001b[0m Total number of shared items: 1 1\u001b[0m\n", + "\u001b[0m Total number of non-shared items: 7 5\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m\n", + "Done.\u001b[0m\n", + "\u001b[0m\u001b[0m\u001b[0m" + ] } - ], - "metadata": { - "kernelspec": { - "display_name": "ncompare-jupyter-example", - "language": "python", - "name": "ncompare-jupyter-example" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.12.0" + ], + "source": [ + "! 
ncompare --column-widths 33 30 30 {file_names[0]} {file_names[2]}" + ] + }, + { + "cell_type": "markdown", + "id": "11a23041-6f24-491b-a9e3-124ace151736", + "metadata": {}, + "source": [ + "#### More file details can be examined by using the `--show-attributes` and `--show-chunks` options" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "id": "1dd4c51a-394c-4569-b8b1-053743e63cb9", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\u001b[37m\u001b[0mFile A: gpcp_v02r03_monthly_d202301_c20230411.nc\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0mFile B: PERSIANN-CDR_v01r01_20230419_c20231030.nc\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m\u001b[94m\n", + "Root-level Dimensions:\u001b[0m\n", + "/usr/local/Caskroom/miniconda/base/envs/ncompare-jupyter-example/lib/python3.12/site-packages/xarray/conventions.py:428: SerializationWarning: variable 'precipitation' has multiple fill values {-9999.0, -1.0}, decoding all values to NaN.\n", + " new_vars[k] = decode_cf_variable(\n", + "\u001b[0m\u001b[37m\u001b[0m\tAre all items the same? ---> \u001b[31mFalse. 
(2 items are shared, out of 6 total.)\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m\t\u001b[31mWhich items are different?\u001b[0m\n", + "\u001b[0m File A File B\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31m #00 ------------------------------ ------------------('lat', 480)\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31m #01 --------------('latitude', 72) ------------------------------\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31m #02 ------------------------------ -----------------('lon', 1440)\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31m #03 ------------('longitude', 144) ------------------------------\u001b[0m\n", + "\u001b[0m #04 ---------------------('nv', 2) ---------------------('nv', 2)\u001b[0m\n", + "\u001b[0m #05 -------------------('time', 1) -------------------('time', 1)\u001b[0m\n", + "\u001b[0m Number of non-shared items: 2 2\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m\u001b[94m\n", + "Root-level Groups:\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m\t\u001b[36mAre all items the same? ---> True. (No items exist.)\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m\u001b[90m\n", + "No variable group selected for comparison. 
Skipping..\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m\u001b[94m\n", + "All variables:\u001b[0m\n", + "\u001b[0m File A File B\u001b[0m\n", + "\u001b[0m All Variables \u001b[0m\n", + "\u001b[0m - ------------------------------ ------------------------------\u001b[0m\n", + "\u001b[0m \u001b[0m\n", + "\u001b[0m GROUP #00 -----------------------------/ -----------------------------/\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mnum variables in group: 8 6\u001b[0m\n", + "\u001b[0m - ------------------------------ ------------------------------\u001b[0m\n", + "\u001b[0m -----VARIABLE-----: lat\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mdtype: float32\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mshape: (480,)\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mchunksize: contiguous\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mbounds: lat_bnds\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mlong_name: latitude\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mstandard_name: latitude\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31munits: degrees_north\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mvalid_max: 60.0\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mvalid_min: -60.0\u001b[0m\n", + "\u001b[0m -----VARIABLE-----: lat_bnds\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mdtype: float32\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mshape: (480, 2)\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mchunksize: contiguous\u001b[0m\n", + "\u001b[0m -----VARIABLE-----: lat_bounds \u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mdtype: float32 \u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mshape: (72, 2) \u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mchunksize: contiguous \u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mcomment: latitude values at the north and south bounds of each pixel. 
\u001b[0m\n", + "\u001b[0m -----VARIABLE-----: latitude \u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mdtype: float32 \u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mshape: (72,) \u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mchunksize: contiguous \u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31maxis: Y \u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mbounds: lat_bounds \u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mlong_name: Latitude \u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mstandard_name: latitude \u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31munits: degrees_north \u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mvalid_range: [-90.0, 90.0, ...] \u001b[0m\n", + "\u001b[0m -----VARIABLE-----: lon\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mdtype: float32\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mshape: (1440,)\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mchunksize: contiguous\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mbounds: lon_bnds\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mlong_name: longitude\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mstandard_name: longitude\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31munits: degrees_east\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mvalid_max: 360.0\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mvalid_min: 0.0\u001b[0m\n", + "\u001b[0m -----VARIABLE-----: lon_bnds\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mdtype: float32\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mshape: (1440, 2)\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mchunksize: contiguous\u001b[0m\n", + "\u001b[0m -----VARIABLE-----: lon_bounds \u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mdtype: float32 \u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mshape: (144, 2) \u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mchunksize: contiguous 
\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mcomment: longitude values at the west and east bounds of each pixel. \u001b[0m\n", + "\u001b[0m -----VARIABLE-----: longitude \u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mdtype: float32 \u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mshape: (144,) \u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mchunksize: contiguous \u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31maxis: X \u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mbounds: lon_bounds \u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mlong_name: Longitude \u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mstandard_name: longitude \u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31munits: degrees_east \u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mvalid_range: [0.0, 360.0, ...] \u001b[0m\n", + "\u001b[0m -----VARIABLE-----: precip \u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mdtype: float32 \u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mshape: (1, 72, 144) \u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mchunksize: contiguous \u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mcell_methods: area: mean time: mean \u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mcoordinates: time latitude longitude \u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mlong_name: NOAA Climate Data Record (CDR) of GPCP Monthly Satellite-Gauge Combined Precipitation \u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mmissing_value: -9999.0 \u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mstandard_name: precipitation amount \u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31munits: mm/day \u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mvalid_range: [0.0, 100.0, ...] 
\u001b[0m\n", + "\u001b[0m -----VARIABLE-----: precip_error \u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mdtype: float32 \u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mshape: (1, 72, 144) \u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mchunksize: contiguous \u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mcoordinates: time latitude longitude \u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mlong_name: NOAA CDR of GPCP Satellite-Gauge Combined Precipitation Error \u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mmissing_value: -9999.0 \u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31munits: mm/day \u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mvalid_range: [0.0, 100.0, ...] \u001b[0m\n", + "\u001b[0m -----VARIABLE-----: precipitation\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mdtype: float32\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mshape: (1, 1440, 480)\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mchunksize: [1, 1440, 480]\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31m_FillValue: -1.0\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mcell_method: sum\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mlong_name: NOAA Climate Data Record of PERSIANN-CDR daily precipitation\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mmissing_value: -9999.0\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mstandard_name: precipitation_amount\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31munits: mm\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mvalid_max: 999999.0\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mvalid_min: 0.0\u001b[0m\n", + "\u001b[0m -----VARIABLE-----: time time\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mdtype: float32 int32\u001b[0m\n", + "\u001b[0m shape: (1,) (1,)\u001b[0m\n", + "\u001b[0m chunksize: contiguous contiguous\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31maxis: T \u001b[0m\n", + 
"\u001b[0m\u001b[37m\u001b[0m \u001b[31mbounds: time_bounds \u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mcalendar: Gregorian \u001b[0m\n", + "\u001b[0m long_name: time time\u001b[0m\n", + "\u001b[0m standard_name: time time\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31munits: days since 1970-01-01 00:00:00 0:00 days since 1979-01-01 0:0:0\u001b[0m\n", + "\u001b[0m -----VARIABLE-----: time_bounds \u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mdtype: float32 \u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mshape: (1, 2) \u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mchunksize: contiguous \u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m \u001b[31mcomment: time bounds for each time value \u001b[0m\n", + "\u001b[0m - ------------------------------ ------------------------------\u001b[0m\n", + "\u001b[0m Total number of shared items: 1 1\u001b[0m\n", + "\u001b[0m Total number of non-shared items: 7 5\u001b[0m\n", + "\u001b[0m\u001b[37m\u001b[0m\n", + "Done.\u001b[0m\n", + "\u001b[0m\u001b[0m\u001b[0m" + ] } + ], + "source": [ + "! ncompare --show-attributes --show-chunks --column-widths 33 30 30 {file_names[0]} {file_names[2]}" + ] + }, + { + "cell_type": "markdown", + "id": "dccb326d-3b47-4d0f-b96d-93577d3e7c54", + "metadata": {}, + "source": [ + "END of Notebook." 
+ ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "ncompare-jupyter-example", + "language": "python", + "name": "ncompare-jupyter-example" }, - "nbformat": 4, - "nbformat_minor": 5 + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.12.0" + } + }, + "nbformat": 4, + "nbformat_minor": 5 } diff --git a/ncompare/console.py b/ncompare/console.py index 91f85a2..e530043 100755 --- a/ncompare/console.py +++ b/ncompare/console.py @@ -25,6 +25,7 @@ # See the License for the specific language governing permissions and limitations under the License. """Command line interface for `ncompare` -- to compare the structure of two NetCDF files.""" + import argparse import importlib.metadata import sys @@ -34,7 +35,7 @@ from ncompare.core import compare -__version__ = importlib.metadata.version('ncompare') +__version__ = importlib.metadata.version("ncompare") def _cli(args: Optional[Sequence[str]]) -> argparse.Namespace: @@ -65,7 +66,10 @@ def _cli(args: Optional[Sequence[str]]) -> argparse.Namespace: ) parser.add_argument("--file-xlsx", help="An Excel file to which the output will be written.") parser.add_argument( - "--no-color", action="store_true", default=False, help="Turn off all colorized output" + "--no-color", + action="store_true", + default=False, + help="Turn off all colorized output", ) parser.add_argument( "--show-attributes", @@ -90,8 +94,8 @@ def _cli(args: Optional[Sequence[str]]) -> argparse.Namespace: parser.add_argument( "--version", - action='version', - version=f'%(prog)s {__version__}', + action="version", + version=f"%(prog)s {__version__}", default=False, help="Show the current version.", ) @@ -103,7 +107,7 @@ def main() -> None: # pragma: no cover """Run from the command line.""" args = _cli(None) - delattr(args, 'version') + delattr(args, "version") 
try: compare(**vars(args)) @@ -113,5 +117,5 @@ def main() -> None: # pragma: no cover sys.exit(0) # a clean, no-issue, exit -if __name__ == '__main__': # pragma: no cover +if __name__ == "__main__": # pragma: no cover main() diff --git a/ncompare/core.py b/ncompare/core.py index 8b8d040..66be6a5 100644 --- a/ncompare/core.py +++ b/ncompare/core.py @@ -28,10 +28,11 @@ # pylint: disable=fixme """Compare the structure of two NetCDF files.""" + import random import traceback from collections import namedtuple -from collections.abc import Iterable +from collections.abc import Iterable, Iterator from pathlib import Path from typing import Optional, Union @@ -46,6 +47,12 @@ VarProperties = namedtuple("VarProperties", "varname, variable, dtype, shape, chunking, attributes") +GroupPair = namedtuple( + "GroupPair", + "group_a_name group_a group_b_name group_b", + defaults=("", None, "", None), +) + def compare( nc_a: Union[str, Path], @@ -195,7 +202,11 @@ def run_through_comparisons( + f"\nChecking multiple random values within specified variable <{comparison_var_name}>:" ) compare_multiple_random_values( - out, nc_a, nc_b, groupname=comparison_var_group, varname=comparison_var_name + out, + nc_a, + nc_b, + groupname=comparison_var_group, + varname=comparison_var_name, ) except KeyError: @@ -250,12 +261,12 @@ def compare_multiple_random_values( out.print("Done.", colors=False) -def walk_common_groups_tree( # type:ignore[misc] +def walk_common_groups_tree( top_a_name: str, top_a: Union[netCDF4.Dataset, netCDF4.Group], top_b_name: str, top_b: Union[netCDF4.Dataset, netCDF4.Group], -) -> tuple[str, netCDF4.Group, str, netCDF4.Group]: +) -> Iterator[GroupPair]: """Yield names and groups from a netCDF4's group tree. 
Parameters @@ -267,25 +278,30 @@ def walk_common_groups_tree( # type:ignore[misc] Yields ------ - group A name : str - group A object : netCDF4.Group - group B name : str - group B object : netCDF4.Group + tuple + group A name : str + group A object : netCDF4.Group or None + group B name : str + group B object : netCDF4.Group or None """ - yield ( - ( - top_a_name + "/" + group_a_name if group_a_name else "", - top_a[group_a_name] if (group_a_name and (group_a_name in top_a.groups)) else None, - top_b_name + "/" + group_b_name if group_b_name else "", - top_b[group_b_name] if (group_b_name and (group_b_name in top_b.groups)) else None, - ) - for (_, group_a_name, group_b_name) in common_elements( - top_a.groups if top_a is not None else "", top_b.groups if top_b is not None else "" + for _, group_a_name, group_b_name in common_elements( + top_a.groups if top_a is not None else "", + top_b.groups if top_b is not None else "", + ): + yield GroupPair( + group_a_name=top_a_name + "/" + group_a_name if group_a_name else "", + group_a=( + top_a[group_a_name] if (group_a_name and (group_a_name in top_a.groups)) else None + ), + group_b_name=top_b_name + "/" + group_b_name if group_b_name else "", + group_b=( + top_b[group_b_name] if (group_b_name and (group_b_name in top_b.groups)) else None + ), ) - ) for _, subgroup_a_name, subgroup_b_name in common_elements( - top_a.groups if top_a is not None else "", top_b.groups if top_b is not None else "" + top_a.groups if top_a is not None else "", + top_b.groups if top_b is not None else "", ): yield from walk_common_groups_tree( top_a_name + "/" + subgroup_a_name if subgroup_a_name else "", @@ -311,50 +327,57 @@ def compare_two_nc_files( show_attributes: bool = False, ) -> tuple[int, int, int]: """Go through all groups and all variables, and show them side by side - whether they align and where they don't.""" - out.side_by_side(' ', 'File A', 'File B', force_display_even_if_same=True) + out.side_by_side(" ", "File A", "File B", 
force_display_even_if_same=True) num_var_diffs = {"left": 0, "right": 0, "both": 0} with netCDF4.Dataset(nc_one) as nc_a, netCDF4.Dataset(nc_two) as nc_b: out.side_by_side( - 'All Variables', ' ', ' ', dash_line=False, force_display_even_if_same=True + "All Variables", " ", " ", dash_line=False, force_display_even_if_same=True ) - out.side_by_side('-', '-', '-', dash_line=True, force_display_even_if_same=True) + out.side_by_side("-", "-", "-", dash_line=True, force_display_even_if_same=True) group_counter = 0 _print_group_details_side_by_side( - out, nc_a, "/", nc_b, "/", group_counter, num_var_diffs, show_attributes, show_chunks + out, + nc_a, + "/", + nc_b, + "/", + group_counter, + num_var_diffs, + show_attributes, + show_chunks, ) group_counter += 1 - for group_pairs in walk_common_groups_tree("", nc_a, "", nc_b): - for group_a_name, group_a, group_b_name, group_b in group_pairs: - _print_group_details_side_by_side( - out, - group_a, - group_a_name, - group_b, - group_b_name, - group_counter, - num_var_diffs, - show_attributes, - show_chunks, - ) - group_counter += 1 + for group_pair in walk_common_groups_tree("", nc_a, "", nc_b): + _print_group_details_side_by_side( + out, + group_pair.group_a, + group_pair.group_a_name, + group_pair.group_b, + group_pair.group_b_name, + group_counter, + num_var_diffs, + show_attributes, + show_chunks, + ) + group_counter += 1 - out.side_by_side('-', '-', '-', dash_line=True, force_display_even_if_same=True) + out.side_by_side("-", "-", "-", dash_line=True, force_display_even_if_same=True) out.side_by_side( - 'Total number of shared items:', - str(num_var_diffs['both']), - str(num_var_diffs['both']), + "Total number of shared items:", + str(num_var_diffs["both"]), + str(num_var_diffs["both"]), force_display_even_if_same=True, ) out.side_by_side( - 'Total number of non-shared items:', - str(num_var_diffs['left']), - str(num_var_diffs['right']), + "Total number of non-shared items:", + str(num_var_diffs["left"]), + 
str(num_var_diffs["right"]), force_display_even_if_same=True, ) - return num_var_diffs['left'], num_var_diffs['right'], num_var_diffs['both'] + return num_var_diffs["left"], num_var_diffs["right"], num_var_diffs["both"] def _print_group_details_side_by_side( @@ -369,7 +392,12 @@ def _print_group_details_side_by_side( show_chunks: bool, ) -> None: out.side_by_side( - " ", " ", " ", dash_line=False, highlight_diff=False, force_display_even_if_same=True + " ", + " ", + " ", + dash_line=False, + highlight_diff=False, + force_display_even_if_same=True, ) out.side_by_side( f"GROUP #{group_counter:02}", @@ -388,19 +416,19 @@ def _print_group_details_side_by_side( if group_b: vars_b_sorted = sorted(group_b.variables) out.side_by_side( - 'num variables in group:', + "num variables in group:", len(vars_a_sorted), len(vars_b_sorted), highlight_diff=True, force_display_even_if_same=True, ) - out.side_by_side('-', '-', '-', dash_line=True, force_display_even_if_same=True) + out.side_by_side("-", "-", "-", dash_line=True, force_display_even_if_same=True) # Count differences between the lists of variables in this group. left, right, both = count_diffs(vars_a_sorted, vars_b_sorted) - num_var_diffs['left'] += left - num_var_diffs['right'] += right - num_var_diffs['both'] += both + num_var_diffs["left"] += left + num_var_diffs["right"] += right + num_var_diffs["both"] += both # Go through each variable in the current group. for variable_pair in common_elements(vars_a_sorted, vars_b_sorted): @@ -473,7 +501,10 @@ def _print_var_properties_side_by_side( for attr_a_key, attr_a, attr_b_key, attr_b in get_and_check_variable_attributes(v_a, v_b): # Check whether attr_a_key is empty, because it might be if the variable doesn't exist in File A. 
out.side_by_side( - f"{attr_a_key if attr_a_key else attr_b_key}:", attr_a, attr_b, highlight_diff=True + f"{attr_a_key if attr_a_key else attr_b_key}:", + attr_a, + attr_b, + highlight_diff=True, ) # Scale Factor @@ -483,14 +514,14 @@ def _print_var_properties_side_by_side( def get_and_check_variable_scale_factor(v_a, v_b) -> Union[None, tuple[str, str]]: - if getattr(v_a.variable, 'scale_factor', None): + if getattr(v_a.variable, "scale_factor", None): sf_a = v_a.variable.scale_factor else: - sf_a = ' ' - if getattr(v_b.variable, 'scale_factor', None): + sf_a = " " + if getattr(v_b.variable, "scale_factor", None): sf_b = v_b.variable.scale_factor else: - sf_b = ' ' + sf_b = " " if (sf_a != " ") or (sf_b != " "): return str(sf_a), str(sf_b) else: diff --git a/ncompare/printing.py b/ncompare/printing.py index 13d9ae3..339c513 100644 --- a/ncompare/printing.py +++ b/ncompare/printing.py @@ -25,6 +25,7 @@ # pylint: disable=too-many-arguments """Utility functions for printing to the console or a text file.""" + import csv import re import warnings @@ -43,7 +44,7 @@ # Set up regex remover of ANSI color escape sequences # From ansi_escape = re.compile( - r''' + r""" \x1B # ESC (?: # 7-bit C1 Fe (except CSI) [@-Z\\-_] @@ -53,7 +54,7 @@ [ -/]* # Intermediate bytes [@-~] # Final byte ) -''', +""", re.VERBOSE, ) @@ -117,9 +118,7 @@ def __init__( if filepath.exists(): pass # This will overwrite any existing file at this path if one exists. 
- self._text_file_obj: Optional[TextIO] = open( - filepath, "w", encoding="utf-8" - ) # pylint: disable=consider-using-with + self._text_file_obj: Optional[TextIO] = open(filepath, "w", encoding="utf-8") # pylint: disable=consider-using-with else: self._text_file_obj = None @@ -131,7 +130,11 @@ def __exit__(self, exc_type, exc_value, exc_traceback): # noqa: D105 self._text_file_obj.close() def print( - self, string: str = "", colors: bool = False, add_to_history: bool = False, **print_args + self, + string: str = "", + colors: bool = False, + add_to_history: bool = False, + **print_args, ) -> None: """Print text using custom options. @@ -156,7 +159,7 @@ def print( # Optional - write text to file if self._text_file_obj: # Remove ANSI escape sequences. - result = ansi_escape.sub('', text_to_print) + result = ansi_escape.sub("", text_to_print) self._text_file_obj.write(result + "\n") # Optional - save text to a history list @@ -168,7 +171,7 @@ def _add_to_history(self, *args): def _parse_single_str(s): # pylint: disable=invalid-name # Remove ANSI escape sequences before adding to a parsed string list. - result = ansi_escape.sub('', s) + result = ansi_escape.sub("", s) # Remove any leading or trailing newlines. return result.strip("\n") @@ -321,16 +324,16 @@ def lists_diff( # print(Fore.RED + "Which items are different? ---> %s." 
% # str(set(list_a).symmetric_difference(list_b))) - self.side_by_side(' ', 'File A', 'File B') + self.side_by_side(" ", "File A", "File B") self.side_by_side_list_diff(list_a, list_b) - self.side_by_side('Number of non-shared items:', str(left), str(right)) + self.side_by_side("Number of non-shared items:", str(left), str(right)) return left, right, both def write_history_to_csv(self, filename: Union[str, Path] = "test.csv"): """Save the line history that's been stored to a CSV file.""" - headers = ['Info', 'File A', 'File B', 'Other marks'] - with open(filename, 'w', encoding="utf-8") as target: + headers = ["Info", "File A", "File B", "Other marks"] + with open(filename, "w", encoding="utf-8") as target: writer = csv.writer(target) writer.writerow(headers) writer.writerows(self._line_history) @@ -341,7 +344,7 @@ def write_history_to_excel(self, filename: Union[str, Path] = "test.xlsx"): sheet = workbook.active # Add a header row - sheet.append(['Info', 'File A', 'File B']) + sheet.append(["Info", "File A", "File B"]) # Add rows and apply styles for row in self._line_history: @@ -350,7 +353,7 @@ def write_history_to_excel(self, filename: Union[str, Path] = "test.xlsx"): # First, remove difference marker that is redundant with styles applied to the row (unlike in the CSV) del row[3] sheet.append(_excel_red_cells(row, sheet)) - elif (len(row) == 1) or ((len(row) == 3) and ((row[1] == '') and (row[2] == ''))): + elif (len(row) == 1) or ((len(row) == 3) and ((row[1] == "") and (row[2] == ""))): # The case where there is a subheader and no information in the second and third columns. 
sheet.append(_excel_bold_underline_cells(row, sheet)) else: @@ -379,5 +382,5 @@ def _excel_bold_underline_cells(data, sheet): """Stylize cells in Excel with a bold and underlined font.""" for cell in data: cell = Cell(sheet, column="A", row=1, value=cell) - cell.font = Font(bold=True, underline='single') + cell.font = Font(bold=True, underline="single") yield cell diff --git a/ncompare/sequence_operations.py b/ncompare/sequence_operations.py index 10dc8f3..5baef68 100644 --- a/ncompare/sequence_operations.py +++ b/ncompare/sequence_operations.py @@ -24,6 +24,7 @@ # See the License for the specific language governing permissions and limitations under the License. """Helper functions for operating on iterables, such as lists or sets.""" + from collections.abc import Generator, Iterable from typing import Union @@ -62,9 +63,9 @@ def common_elements( ) if item not in a_sorted: - item_a = '' + item_a = "" elif item not in b_sorted: - item_b = '' + item_b = "" yield i, item_a, item_b diff --git a/ncompare/utils.py b/ncompare/utils.py index 0c03149..6c2d720 100644 --- a/ncompare/utils.py +++ b/ncompare/utils.py @@ -24,6 +24,7 @@ # See the License for the specific language governing permissions and limitations under the License. 
"""Helper utilities.""" + from pathlib import Path from typing import Union diff --git a/tests/conftest.py b/tests/conftest.py index 1e33e03..6a11927 100644 --- a/tests/conftest.py +++ b/tests/conftest.py @@ -35,7 +35,7 @@ @pytest.fixture(scope="session") def temp_data_dir(tmpdir_factory) -> Path: - return Path(tmpdir_factory.mktemp('data')) + return Path(tmpdir_factory.mktemp("data")) @pytest.fixture(scope="function") @@ -130,25 +130,25 @@ def ds_3dims_3vars_4coords_1group(temp_data_dir): filepath = temp_data_dir / "test_3dims_3vars_4coords_1group.nc" f = nC.Dataset(filename=filepath, mode="w") - grp1 = f.createGroup('Group1') + grp1 = f.createGroup("Group1") # A root variable - f.createVariable('var0', "i2", ()) + f.createVariable("var0", "i2", ()) # New/modified coordinates in grp1 - grp1.createDimension('x', 2) - grp1.createDimension('step', 3) - grp1.createDimension('track', 7) + grp1.createDimension("x", 2) + grp1.createDimension("step", 3) + grp1.createDimension("track", 7) # Variables in grp1 - grp1.createVariable('var1', 'f8', ()) + grp1.createVariable("var1", "f8", ()) # - grp1.createVariable('var2', 'f4', ()) + grp1.createVariable("var2", "f4", ()) # - grp1.createVariable('step', 'f4', ('step',), fill_value=False) - grp1['step'][:] = [-0.9, -1.8, -2.7] + grp1.createVariable("step", "f4", ("step",), fill_value=False) + grp1["step"][:] = [-0.9, -1.8, -2.7] # - grp1.createVariable('w', 'u1', ('x', 'step'), fill_value=False) + grp1.createVariable("w", "u1", ("x", "step"), fill_value=False) # Wrap up f.close() @@ -161,35 +161,35 @@ def ds_3dims_3vars_4coords_2groups(temp_data_dir): filepath = temp_data_dir / "test_3dims_3vars_4coords_2groups.nc" f = nC.Dataset(filename=filepath, mode="w") - grp1 = f.createGroup('Group1') - grp2 = f.createGroup('Group2') + grp1 = f.createGroup("Group1") + grp2 = f.createGroup("Group2") # A root variable - f.createVariable('var0', "i2", ()) + f.createVariable("var0", "i2", ()) # New/modified coordinates in grp1 - 
grp1.createDimension('x', 2) - grp1.createDimension('step', 3) - grp1.createDimension('track', 7) + grp1.createDimension("x", 2) + grp1.createDimension("step", 3) + grp1.createDimension("track", 7) # Variables in grp1 - grp1.createVariable('var1', 'f8', ()) + grp1.createVariable("var1", "f8", ()) # - grp1.createVariable('var2', 'f4', ()) + grp1.createVariable("var2", "f4", ()) # - grp1.createVariable('step', 'f4', ('step',), fill_value=False) - grp1['step'][:] = [-0.9, -1.8, -2.7] + grp1.createVariable("step", "f4", ("step",), fill_value=False) + grp1["step"][:] = [-0.9, -1.8, -2.7] # - grp1.createVariable('w', 'u1', ('x', 'step'), fill_value=False) + grp1.createVariable("w", "u1", ("x", "step"), fill_value=False) # New/modified coordinates in grp2 - grp2.createDimension('x', 2) - grp2.createDimension('step', 3) - grp2.createDimension('track', 7) - grp2.createDimension('level', 4) + grp2.createDimension("x", 2) + grp2.createDimension("step", 3) + grp2.createDimension("track", 7) + grp2.createDimension("level", 4) # Variables in grp2 - grp2.createVariable('var3', 'f8', ('level',), fill_value=False) + grp2.createVariable("var3", "f8", ("level",), fill_value=False) # Wrap up f.close() @@ -202,38 +202,38 @@ def ds_3dims_3vars_4coords_1subgroup(temp_data_dir): filepath = temp_data_dir / "test_3dims_3vars_4coords_1subgroup.nc" f = nC.Dataset(filename=filepath, mode="w") - grp1 = f.createGroup('Group1') - grp2 = f.createGroup('Group2') - grp2_subgroup = grp2.createGroup('Group2_subgroup') + grp1 = f.createGroup("Group1") + grp2 = f.createGroup("Group2") + grp2_subgroup = grp2.createGroup("Group2_subgroup") # A root variable - f.createVariable('var0', "i2", ()) + f.createVariable("var0", "i2", ()) # New/modified coordinates in grp1 - grp1.createDimension('x', 2) - grp1.createDimension('step', 3) - grp1.createDimension('track', 7) + grp1.createDimension("x", 2) + grp1.createDimension("step", 3) + grp1.createDimension("track", 7) # Variables in grp1 - 
grp1.createVariable('var1', 'f8', ()) + grp1.createVariable("var1", "f8", ()) # - grp1.createVariable('var2', 'f4', ()) + grp1.createVariable("var2", "f4", ()) # - grp1.createVariable('step', 'f4', ('step',), fill_value=False) - grp1['step'][:] = [-0.9, -1.8, -2.7] + grp1.createVariable("step", "f4", ("step",), fill_value=False) + grp1["step"][:] = [-0.9, -1.8, -2.7] # - grp1.createVariable('w', 'u1', ('x', 'step'), fill_value=False) + grp1.createVariable("w", "u1", ("x", "step"), fill_value=False) # New/modified coordinates in grp2 - grp2.createDimension('step', 3) - grp2.createDimension('level', 4) + grp2.createDimension("step", 3) + grp2.createDimension("level", 4) # Variables in grp2 - grp2.createVariable('var3', 'f8', ('step', 'level'), fill_value=False) + grp2.createVariable("var3", "f8", ("step", "level"), fill_value=False) # New/modified coordinates in grp2 # Variables in grp2 - grp2_subgroup.createVariable('var4', 'f8', ('level',), fill_value=False) + grp2_subgroup.createVariable("var4", "f8", ("level",), fill_value=False) # Wrap up f.close() diff --git a/tests/data/create_a-b_test_netcdfs.ipynb b/tests/data/create_a-b_test_netcdfs.ipynb index afb7ab6..49a396c 100644 --- a/tests/data/create_a-b_test_netcdfs.ipynb +++ b/tests/data/create_a-b_test_netcdfs.ipynb @@ -8,8 +8,9 @@ "outputs": [], "source": [ "import time\n", - "import numpy as np\n", - "import netCDF4 as nc" + "\n", + "import netCDF4 as nc\n", + "import numpy as np" ] }, { @@ -38,72 +39,70 @@ } ], "source": [ - "some_4d_data = np.array([\n", + "some_4d_data = np.array(\n", " [\n", " [\n", - " [300, 305, 290, 287],\n", - " [300, 301, 295, 287],\n", - " [300, 301, 295, 287],\n", + " [\n", + " [300, 305, 290, 287],\n", + " [300, 301, 295, 287],\n", + " [300, 301, 295, 287],\n", + " ],\n", + " [\n", + " [300, 305, 290, 287],\n", + " [300, 301, 295, 287],\n", + " [300, 301, 295, 287],\n", + " ],\n", " ],\n", " [\n", - " [300, 305, 290, 287],\n", - " [300, 301, 295, 287],\n", - " [300, 301, 295, 287], 
\n", - " ]\n", - " ],\n", - " [\n", - " [\n", - " [300, 305, 290, 287],\n", - " [300, 301, 295, 287],\n", - " [300, 301, 295, 287],\n", + " [\n", + " [300, 305, 290, 287],\n", + " [300, 301, 295, 287],\n", + " [300, 301, 295, 287],\n", + " ],\n", + " [\n", + " [300, 305, 290, 287],\n", + " [300, 301, 295, 287],\n", + " [300, 301, 295, 287],\n", + " ],\n", " ],\n", " [\n", - " [300, 305, 290, 287],\n", - " [300, 301, 295, 287], \n", - " [300, 301, 295, 287], \n", - " ]\n", - " \n", - " ],\n", - " [\n", - " [\n", - " [300, 305, 290, 287],\n", - " [300, 301, 295, 287],\n", - " [300, 301, 295, 287],\n", + " [\n", + " [300, 305, 290, 287],\n", + " [300, 301, 295, 287],\n", + " [300, 301, 295, 287],\n", + " ],\n", + " [\n", + " [300, 305, 290, 287],\n", + " [300, 301, 295, 287],\n", + " [300, 301, 295, 287],\n", + " ],\n", " ],\n", " [\n", - " [300, 305, 290, 287],\n", - " [300, 301, 295, 287], \n", - " [300, 301, 295, 287], \n", - " ]\n", - " \n", - " ],\n", - " [\n", - " [\n", - " [300, 305, 290, 287],\n", - " [300, 301, 295, 287],\n", - " [300, 301, 295, 287],\n", + " [\n", + " [300, 305, 290, 287],\n", + " [300, 301, 295, 287],\n", + " [300, 301, 295, 287],\n", + " ],\n", + " [\n", + " [300, 305, 290, 287],\n", + " [300, 301, 295, 287],\n", + " [300, 301, 295, 287],\n", + " ],\n", " ],\n", " [\n", - " [300, 305, 290, 287],\n", - " [300, 301, 295, 287], \n", - " [300, 301, 295, 287], \n", - " ]\n", - " \n", - " ],\n", - " [\n", - " [\n", - " [300, 305, 290, 287],\n", - " [300, 301, 295, 287],\n", - " [300, 301, 295, 287],\n", + " [\n", + " [300, 305, 290, 287],\n", + " [300, 301, 295, 287],\n", + " [300, 301, 295, 287],\n", + " ],\n", + " [\n", + " [300, 305, 290, 287],\n", + " [300, 301, 295, 287],\n", + " [300, 301, 295, 287],\n", + " ],\n", " ],\n", - " [\n", - " [300, 305, 290, 287],\n", - " [300, 301, 295, 287], \n", - " [300, 301, 295, 287], \n", - " ]\n", - " \n", " ]\n", - "])\n", + ")\n", "some_4d_data.shape" ] }, @@ -128,81 +127,95 @@ "\n", "with 
nc.Dataset(\"test_a.nc\", \"w\", format=\"NETCDF4\") as rootgrp:\n", " print(f\"Creating {rootgrp.data_model}...\")\n", - " \n", + "\n", " # --- Create Groups ---\n", " groups[\"Position\"] = rootgrp.createGroup(\"Position\")\n", " groups[\"Statistics\"] = rootgrp.createGroup(\"Statistics\")\n", - " \n", + "\n", " groups[\"Data\"] = rootgrp.createGroup(\"Data\")\n", " groups[\"Data_Products_Subgroup\"] = rootgrp.createGroup(\"/Data/Products\")\n", " groups[\"Data_Quality_Subgroup\"] = rootgrp.createGroup(\"/Data/Quality\")\n", - " \n", - " \n", + "\n", " # --- Create Dimensions ---\n", " dims[\"time\"] = rootgrp.createDimension(\"time\", None)\n", " dims[\"latitude\"] = rootgrp.createDimension(\"lat\", 3)\n", " dims[\"longitude\"] = rootgrp.createDimension(\"lon\", 4)\n", " dims[\"conditions\"] = rootgrp.createDimension(\"conditions\", 2)\n", - " \n", + "\n", " dims[\"level\"] = groups[\"Data\"].createDimension(\"level\", None)\n", - " \n", - " \n", + "\n", " # --- Create Variables ---\n", " conditions = rootgrp.createVariable(\"conditions\", \"i4\", (\"conditions\",))\n", " times = rootgrp.createVariable(\"time\", \"f8\", (\"time\",))\n", - " \n", + "\n", " latitudes = rootgrp.createVariable(\"/Position/lat\", \"f4\", (\"lat\",))\n", " longitudes = rootgrp.createVariable(\"/Position/lon\", \"f4\", (\"lon\",))\n", - " \n", + "\n", " mean_values = rootgrp.createVariable(\"/Statistics/mean_value\", \"f4\", (\"time\",))\n", - " \n", + "\n", " levels = rootgrp.createVariable(\"/Data/level\", \"i4\", (\"level\",))\n", - " product_temp = rootgrp.createVariable(\"/Data/Products/temp\", \"f4\", (\"time\", \"level\", \"lat\", \"lon\",))\n", - " quality_flag = rootgrp.createVariable(\"/Data/Quality/quality_flag\", \"i4\", (\"time\", \"level\", \"lat\", \"lon\",))\n", - " \n", - " \n", + " product_temp = rootgrp.createVariable(\n", + " \"/Data/Products/temp\",\n", + " \"f4\",\n", + " (\n", + " \"time\",\n", + " \"level\",\n", + " \"lat\",\n", + " \"lon\",\n", + " ),\n", + " 
)\n", + " quality_flag = rootgrp.createVariable(\n", + " \"/Data/Quality/quality_flag\",\n", + " \"i4\",\n", + " (\n", + " \"time\",\n", + " \"level\",\n", + " \"lat\",\n", + " \"lon\",\n", + " ),\n", + " )\n", + "\n", " # --- Assign Attributes ---\n", " rootgrp.description = \"Example netCDF file\"\n", " rootgrp.history = \"Created \" + time.ctime(time.time())\n", " rootgrp.source = \"test data creation script\"\n", - " \n", + "\n", " groups[\"Position\"].description = \"This group contain position data.\"\n", " groups[\"Statistics\"].description = \"This group contains statistical information.\"\n", - " \n", + "\n", " times.units = \"hours since 0001-01-01 00:00:00.0\"\n", " times.long_name = \"Time of observation\"\n", " times.calendar = \"gregorian\"\n", " times.coordinates = \"time\"\n", - " \n", + "\n", " levels.units = \"hPa\"\n", - " \n", + "\n", " latitudes.units = \"degrees north\"\n", - " \n", + "\n", " longitudes.units = \"degrees east\"\n", - " \n", + "\n", " mean_values.long_name = \"average value for each time\"\n", " mean_values.coordinates = \"time\"\n", - " \n", + "\n", " product_temp.long_name = \"temperature\"\n", " product_temp.units = \"K\"\n", - " \n", + "\n", " quality_flag.units = \"unitless\"\n", - " \n", - " \n", + "\n", " # --- Assign Data Values ---\n", - " lats = np.arange(-90, 91, 90)\n", - " lons = np.arange(-180, 180, 90)\n", + " lats = np.arange(-90, 91, 90)\n", + " lons = np.arange(-180, 180, 90)\n", " latitudes[:] = lats\n", " longitudes[:] = lons\n", - " \n", + "\n", " nlats = len(rootgrp.dimensions[\"lat\"])\n", " nlons = len(rootgrp.dimensions[\"lon\"])\n", - " \n", + "\n", " product_temp[0:5, 0:2, :, :] = some_4d_data\n", - " \n", + "\n", " times[0:5] = [1, 1.5, 2, 2.5, 3]\n", " levels[0:2] = [10, 20]\n", - " \n", + "\n", "print(\"Done.\")" ] }, @@ -227,97 +240,111 @@ "\n", "with nc.Dataset(\"test_b.nc\", \"w\", format=\"NETCDF4\") as rootgrp:\n", " print(f\"Creating {rootgrp.data_model}...\")\n", - " \n", + "\n", " # --- 
Create Groups ---\n", " groups[\"Position\"] = rootgrp.createGroup(\"Position\")\n", " groups[\"Statistics\"] = rootgrp.createGroup(\"Statistics\")\n", - " \n", + "\n", " groups[\"Data\"] = rootgrp.createGroup(\"Data\")\n", " groups[\"Data_Products_Subgroup\"] = rootgrp.createGroup(\"/Data/Products\")\n", " groups[\"Data_Quality_Subgroup\"] = rootgrp.createGroup(\"/Data/Quality\")\n", - " \n", + "\n", " groups[\"Data_Supplemental_Subgroup\"] = rootgrp.createGroup(\"/Data/Supplemental\")\n", " groups[\"Data_Supplemental_Details_Subgroup\"] = rootgrp.createGroup(\"/Data/Supplemental/Details\")\n", - " \n", + "\n", " # --- Create Dimensions ---\n", " dims[\"time\"] = rootgrp.createDimension(\"time\", None)\n", " dims[\"latitude\"] = rootgrp.createDimension(\"lat\", 2)\n", " dims[\"longitude\"] = rootgrp.createDimension(\"lon\", 2)\n", " dims[\"conditions\"] = rootgrp.createDimension(\"conditions\", 2)\n", - " \n", + "\n", " dims[\"level\"] = groups[\"Data\"].createDimension(\"level\", None)\n", - " \n", - " \n", + "\n", " # --- Create Variables ---\n", " conditions = rootgrp.createVariable(\"conditions\", \"i4\", (\"conditions\",))\n", " times = rootgrp.createVariable(\"time\", \"f8\", (\"time\",))\n", - " \n", + "\n", " latitudes = rootgrp.createVariable(\"/Position/lat\", \"f4\", (\"lat\",))\n", " longitudes = rootgrp.createVariable(\"/Position/lon\", \"f4\", (\"lon\",))\n", - " \n", + "\n", " std_values = rootgrp.createVariable(\"/Statistics/std_value\", \"f4\", (\"time\",))\n", - " \n", + "\n", " levels = rootgrp.createVariable(\"/Data/level\", \"i4\", (\"level\",))\n", - " product_temp = rootgrp.createVariable(\"/Data/Products/temp\", \"f4\", (\"time\", \"level\", \"lat\", \"lon\",))\n", - " quality_flag = rootgrp.createVariable(\"/Data/Quality/quality_flag\", \"i4\", (\"time\", \"level\", \"lat\", \"lon\",))\n", - " \n", - " supplemental_flag = rootgrp.createVariable(\"/Data/Supplemental/supplemental_flag\", \"i4\", (\"time\", \"conditions\"))\n", - " 
condition_details = rootgrp.createVariable(\"/Data/Supplemental/Details/condition_details\", \"f8\", (\"conditions\"))\n", - " \n", - " \n", + " product_temp = rootgrp.createVariable(\n", + " \"/Data/Products/temp\",\n", + " \"f4\",\n", + " (\n", + " \"time\",\n", + " \"level\",\n", + " \"lat\",\n", + " \"lon\",\n", + " ),\n", + " )\n", + " quality_flag = rootgrp.createVariable(\n", + " \"/Data/Quality/quality_flag\",\n", + " \"i4\",\n", + " (\n", + " \"time\",\n", + " \"level\",\n", + " \"lat\",\n", + " \"lon\",\n", + " ),\n", + " )\n", + "\n", + " supplemental_flag = rootgrp.createVariable(\n", + " \"/Data/Supplemental/supplemental_flag\", \"i4\", (\"time\", \"conditions\")\n", + " )\n", + " condition_details = rootgrp.createVariable(\n", + " \"/Data/Supplemental/Details/condition_details\", \"f8\", (\"conditions\")\n", + " )\n", + "\n", " # --- Assign Attributes ---\n", " rootgrp.description = \"Example netCDF file\"\n", " rootgrp.history = \"Created \" + time.ctime(time.time())\n", " rootgrp.source = \"test data creation script\"\n", - " \n", + "\n", " groups[\"Position\"].description = \"This group contain position data.\"\n", " groups[\"Statistics\"].description = \"This group contains statistical information.\"\n", - " \n", + "\n", " times.units = \"hours since 0001-01-01 00:00:00.0\"\n", " times.long_name = \"Time of observation\"\n", " times.calendar = \"gregorian\"\n", " times.coordinates = \"time\"\n", - " \n", + "\n", " levels.units = \"hPa\"\n", - " \n", + "\n", " latitudes.units = \"degrees north\"\n", - " \n", + "\n", " longitudes.units = \"degrees east\"\n", - " \n", + "\n", " std_values.long_name = \"standard deviation value for each time\"\n", " std_values.coordinates = \"time\"\n", - " \n", + "\n", " product_temp.long_name = \"temperature\"\n", " product_temp.units = \"Kelvin\"\n", - " \n", + "\n", " quality_flag.units = \"unitless\"\n", - " \n", + "\n", " supplemental_flag.units = \"unitless\"\n", - " \n", + "\n", " # --- Assign Data Values 
---\n", - " lats = np.arange(-90, 1, 90)\n", - " lons = np.arange(-180, 0, 90)\n", + " lats = np.arange(-90, 1, 90)\n", + " lons = np.arange(-180, 0, 90)\n", " latitudes[:] = lats\n", " longitudes[:] = lons\n", - " \n", + "\n", " nlats = len(rootgrp.dimensions[\"lat\"])\n", " nlons = len(rootgrp.dimensions[\"lon\"])\n", - " \n", + "\n", " product_temp[0:5, 0:2, :, :] = some_4d_data[:, :, :-1, :-2]\n", - " \n", + "\n", " times[0:5] = [1, 1.5, 2, 2.5, 3]\n", " levels[0:2] = [10, 20]\n", - " \n", - " supplemental_flag = [\n", - " [1, 2],\n", - " [0, 1],\n", - " [1, 2],\n", - " [0, 1],\n", - " [1, 2]\n", - " ]\n", - " \n", + "\n", + " supplemental_flag = [[1, 2], [0, 1], [1, 2], [0, 1], [1, 2]]\n", + "\n", " condition_details = [8.65, 1.23]\n", - " \n", + "\n", "print(\"Done.\")" ] }, diff --git a/tests/test_cli.py b/tests/test_cli.py index 5bb74dc..d314ae8 100644 --- a/tests/test_cli.py +++ b/tests/test_cli.py @@ -29,12 +29,12 @@ def test_console_version(): - exit_status = os.system('ncompare --version') + exit_status = os.system("ncompare --version") assert exit_status == 0 def test_console_help(): - exit_status = os.system('ncompare --help') + exit_status = os.system("ncompare --help") assert exit_status == 0 diff --git a/tests/test_complete_file_output.py b/tests/test_complete_file_output.py index d989733..d4af4ef 100644 --- a/tests/test_complete_file_output.py +++ b/tests/test_complete_file_output.py @@ -42,7 +42,10 @@ def test_full_run_to_text_output(temp_data_dir): file_text=str(out_path), ) - with open(data_for_tests_dir / "a-b_test_golden_file.txt") as f1, open(str(out_path)) as f2: + with ( + open(data_for_tests_dir / "a-b_test_golden_file.txt") as f1, + open(str(out_path)) as f2, + ): exclude_n_lines = 3 for _ in range(exclude_n_lines): @@ -65,7 +68,10 @@ def test_full_run_to_csv_output(temp_data_dir): file_csv=str(out_path), ) - with open(data_for_tests_dir / "a-b_test_golden_file.csv") as f1, open(str(out_path)) as f2: + with ( + open(data_for_tests_dir / 
"a-b_test_golden_file.csv") as f1, + open(str(out_path)) as f2, + ): exclude_n_lines = 3 for _ in range(exclude_n_lines): diff --git a/tests/test_core.py b/tests/test_core.py index e106274..8d127a6 100644 --- a/tests/test_core.py +++ b/tests/test_core.py @@ -28,6 +28,7 @@ Note that full comparison tests are performed in both directions, i.e., A -> B and B -> A. """ + from contextlib import nullcontext as does_not_raise import pytest @@ -76,10 +77,10 @@ def test_matching_random_values( ds_1dim_1var_allnan_1coord, outputter_to_console, ): - variable_array_1 = xr.open_dataset(ds_3dims_2vars_4coords).variables['z1'] - variable_array_2 = xr.open_dataset(ds_4dims_3vars_5coords).variables['z1'] - variable_array_3 = xr.open_dataset(ds_1dim_1var_1coord).variables['z1'] - variable_array_allnan = xr.open_dataset(ds_1dim_1var_allnan_1coord).variables['z1'] + variable_array_1 = xr.open_dataset(ds_3dims_2vars_4coords).variables["z1"] + variable_array_2 = xr.open_dataset(ds_4dims_3vars_5coords).variables["z1"] + variable_array_3 = xr.open_dataset(ds_1dim_1var_1coord).variables["z1"] + variable_array_allnan = xr.open_dataset(ds_1dim_1var_allnan_1coord).variables["z1"] assert ( _match_random_value( @@ -128,7 +129,10 @@ def test_matching_random_values( def test_print_values_runs_with_no_error(ds_3dims_3vars_4coords_1group, outputter_to_console): with does_not_raise(): _print_sample_values( - outputter_to_console, ds_3dims_3vars_4coords_1group, groupname="Group1", varname="step" + outputter_to_console, + ds_3dims_3vars_4coords_1group, + groupname="Group1", + varname="step", ) @@ -136,7 +140,10 @@ def test_print_values_to_text_file_runs_with_no_error( ds_3dims_3vars_4coords_1group, outputter_to_text_file, temp_test_text_file_path ): _print_sample_values( - outputter_to_text_file, ds_3dims_3vars_4coords_1group, groupname="Group1", varname="step" + outputter_to_text_file, + ds_3dims_3vars_4coords_1group, + groupname="Group1", + varname="step", ) 
outputter_to_text_file._text_file_obj.close() @@ -192,7 +199,7 @@ def test_comparison_var_no_error_for_duplicate_dataset( def test_get_vars_with_group(ds_3dims_3vars_4coords_1group): result = _get_vars(ds_3dims_3vars_4coords_1group, groupname="Group1") - assert set(result) == {'step', 'var1', 'var2', 'w'} + assert set(result) == {"step", "var1", "var2", "w"} def test_get_vars_error_when_no_group(ds_3dims_2vars_4coords): diff --git a/tests/test_printing.py b/tests/test_printing.py index e6ede64..e3223a3 100644 --- a/tests/test_printing.py +++ b/tests/test_printing.py @@ -26,7 +26,7 @@ def test_list_of_strings_diff(outputter_to_console): left, right, both = outputter_to_console.lists_diff( - ['hey', 'yo', 'beebop'], ['what', 'is', 'this', 'beebop'] + ["hey", "yo", "beebop"], ["what", "is", "this", "beebop"] ) assert (left, right, both) == (2, 3, 1) diff --git a/tests/test_sequence_operations.py b/tests/test_sequence_operations.py index f7af980..d1dd4a1 100644 --- a/tests/test_sequence_operations.py +++ b/tests/test_sequence_operations.py @@ -30,8 +30,8 @@ @pytest.fixture def two_example_lists() -> tuple[list[str], list[str]]: - a = ['yo', 'beebop', 'hey'] - b = ['what', 'does', 'this', 'beebop', 'mean'] + a = ["yo", "beebop", "hey"] + b = ["what", "does", "this", "beebop", "mean"] return a, b @@ -39,13 +39,13 @@ def test_common_elements(two_example_lists): composed_pairs = [e for e in common_elements(*two_example_lists)] should_be = [ - (0, 'beebop', 'beebop'), - (1, '', 'does'), - (2, 'hey', ''), - (3, '', 'mean'), - (4, '', 'this'), - (5, '', 'what'), - (6, 'yo', ''), + (0, "beebop", "beebop"), + (1, "", "does"), + (2, "hey", ""), + (3, "", "mean"), + (4, "", "this"), + (5, "", "what"), + (6, "yo", ""), ] assert composed_pairs == should_be diff --git a/tests/test_utils.py b/tests/test_utils.py index fe800e0..b87f535 100644 --- a/tests/test_utils.py +++ b/tests/test_utils.py @@ -60,7 +60,7 @@ def test_coerce_int_to_str(): def test_coerce_tuple_to_str(): - assert 
coerce_to_str(('step', 123)) == "('step', 123)" + assert coerce_to_str(("step", 123)) == "('step', 123)" def test_error_from_not_able_to_coerce_to_str():