Skip to content

Refactor & Optimize Remapping (NN & IDW) #1244

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 15 commits into from
May 9, 2025
Merged

Refactor & Optimize Remapping (NN & IDW) #1244

merged 15 commits into from
May 9, 2025

Conversation

philipc2
Copy link
Member

@philipc2 philipc2 commented May 8, 2025

Overview

  • Default to using a Cartesian scipy.spatial.KDTree for all remapping operations
  • Streamlined remapping functionality to operate directly on either a UxDataset or UxDataArray
  • Unifies the remapping accessor into a single Class for both UxDataset and UxDataArray
  • Fixes a bug in isel() which didn't allow a dictionary to be passed in for grid dimensions

Benchmarks

M1 Macbook Pro (32GB)

Source Zoom Destination Zoom Repeats Avg Time (excl. max) [s] Max Time [s]
8 8 2 0.096208 0.459888
8 9 2 0.430757 0.793449
8 10 2 1.46089 2.02054
9 8 2 0.110274 1.52041
9 9 2 0.367843 1.84821
9 10 2 1.67965 3.31935
10 8 2 0.180659 5.074
10 9 2 0.598835 6.16627
10 10 2 1.75738 7.19776

Derecho CPU Node

Source Zoom Destination Zoom Repeats Avg Time (excl. max) [s] Max Time [s]
8 8 2 0.055009 0.550319
8 9 2 0.097827 0.582884
8 10 2 0.301049 1.30994
8 11 2 0.997492 4.18288
9 8 2 0.056503 1.34068
9 9 2 0.103906 1.55522
9 10 2 0.274427 2.27644
9 11 2 1.01098 5.14391
10 8 2 0.081992 5.2756
10 9 2 0.12964 5.50848
10 10 2 0.310025 6.21761
10 11 2 1.03694 9.15409
11 8 2 0.149953 21.8791
11 9 2 0.218635 22.1311
11 10 2 0.381045 22.8287
11 11 2 1.09738 25.743

@philipc2 philipc2 self-assigned this May 8, 2025
@philipc2
Copy link
Member Author

philipc2 commented May 8, 2025

@aaronzedwick @rajeeja

A couple thing to note here that are relevant to #1016

I've unified howe we do our remapping and dramatically streamlined it so that we don't need separate uxds and uxda functions. Additionally, we only need one accessor now since the remapping works on a uxda and uxds interchanagbly now.

@philipc2 philipc2 added the scalability Related to scalability & performance efforts label May 8, 2025
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@rajeeja
Copy link
Contributor

rajeeja commented May 8, 2025

@aaronzedwick @rajeeja

A couple thing to note here that are relevant to #1016

I've unified howe we do our remapping and dramatically streamlined it so that we don't need separate uxds and uxda functions. Additionally, we only need one accessor now since the remapping works on a uxda and uxds interchanagbly now.

So, we should hold off the bilinear remapping PR and Mark this main first?

@philipc2
Copy link
Member Author

philipc2 commented May 8, 2025

@aaronzedwick @rajeeja
A couple thing to note here that are relevant to #1016
I've unified howe we do our remapping and dramatically streamlined it so that we don't need separate uxds and uxda functions. Additionally, we only need one accessor now since the remapping works on a uxda and uxds interchanagbly now.

So, we should hold off the bilinear remapping PR and Mark this main first?

It shouldn't be much of a delay. I'm planning to move forward with this PR before the hackathon next week.

@rajeeja
Copy link
Contributor

rajeeja commented May 8, 2025

Ok, I was hoping we get bilinear before the hackathon

@philipc2
Copy link
Member Author

philipc2 commented May 8, 2025

Ok, I was hoping we get bilinear before the hackathon

I'll have this ready today. Are the recent changes in for Bilinear?

Getting Bilinear merged for the hackathon may not be practical if it can't be run on the grids we'll be using, since most grids will be very high resolution.

@philipc2 philipc2 changed the title DRAFT: Refactor & Optimize Remapping Refactor & Optimize Remapping May 8, 2025
@philipc2 philipc2 requested a review from Copilot May 8, 2025 18:05
@philipc2 philipc2 marked this pull request as ready for review May 8, 2025 18:05
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR refactors the remapping functionality and unifies the remapping accessors for both UxDataset and UxDataArray while optimizing nearest‐neighbor and inverse‐distance‐weighted routines. Key changes include replacing deprecated parsing functions with new implementations in the remap module, consolidating accessor classes into a unified RemapAccessor, and enhancing grid nearest neighbor support via a KDTree cache.

Reviewed Changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated no comments.

Show a summary per file
File Description
uxarray/remap/utils.py Refactored remap helper functions and removed deprecated parsing code.
uxarray/remap/nearest_neighbor.py Renamed and restructured nearest neighbor functions with clearer APIs.
uxarray/remap/inverse_distance_weighted.py Revised IDW remap implementation and introduced weight calculation helper.
uxarray/remap/accessor.py Unified remapping accessors into one RemapAccessor interface.
uxarray/grid/grid.py Added KDTree caching for grid queries and new get_kdtree method.
uxarray/core/dataset.py & dataarray.py Updated remap accessor usage and revised isel signature for grid-aware slicing.
uxarray/grid/coordinates.py Minor adjustments in coordinate population and removal of unnecessary warning.

@philipc2 philipc2 requested a review from Copilot May 8, 2025 18:32
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR refactors and optimizes the remapping functionality by unifying the remapping accessor for both UxDataset and UxDataArray, streamlining the underlying nearest‐neighbor and inverse‐distance–weighted implementations, and defaulting to a Cartesian KDTree (from scipy.spatial) for all remapping operations.

  • Unified remapping functions and removed legacy dataset/dataarray accessors.
  • Updated function signatures and docstrings in remapping modules for improved clarity and error handling.
  • Incorporated KDTree caching in the grid module and updated accessor references in core dataset and dataarray modules.
  • Updated CI dependencies by adding "cmocean".

Reviewed Changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated no comments.

Show a summary per file
File Description
uxarray/remap/utils.py Added several helper functions (_assert_dimension, _construct_remapped_ds, _to_dataset, _get_remap_dims, _prepare_points) and improved docstrings and error handling.
uxarray/remap/nearest_neighbor.py Renamed and refactored nearest neighbor remapping functions; updated parameter names and documentation for clarity.
uxarray/remap/inverse_distance_weighted.py Refactored inverse-distance–weighted remapping, extracting a dedicated weight computation helper (_idw_weights) and updating function interface.
uxarray/remap/accessor.py Introduced a unified RemapAccessor providing remapping methods on both UxDataset and UxDataArray, with updated supported parameter values.
uxarray/remap/init.py Adjusted exports to reflect the refactored remapping functions.
uxarray/grid/grid.py Added KDTree caching with a new get_kdtree method using scipy.spatial.KDTree.
uxarray/grid/coordinates.py Minor changes to coordinate-related warnings (now commented out).
uxarray/core/dataset.py Updated the remap accessor to use the new unified RemapAccessor.
uxarray/core/dataarray.py Updated the remap accessor to use the new unified RemapAccessor and expanded the isel signature.
ci/docs.yml Added the "cmocean" dependency.

@aaronzedwick
Copy link
Member

Ok, I was hoping we get bilinear before the hackathon

I'll have this ready today. Are the recent changes in for Bilinear?

Getting Bilinear merged for the hackathon may not be practical if it can't be run on the grids we'll be using, since most grids will be very high resolution.

The changes are in for bilinear. It should work, but is not optimized yet, especially since I had to redo the weights calculation. With the updated tree structures that might help, but until the point contains function is more performative, bilinear will be slower, as that is the main bottleneck according to the benchmarks.

@philipc2 philipc2 changed the title Refactor & Optimize Remapping Refactor & Optimize Remapping (NN & IDW) May 8, 2025
@philipc2 philipc2 requested a review from Copilot May 9, 2025 20:17
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR refactors and optimizes the remapping functionality by unifying the remapping accessor for both UxDataset and UxDataArray, while streamlining the underlying nearest neighbor and inverse‐distance–weighted implementations. In addition, it removes the old dedicated remap accessors, updates related documentation and CI dependencies, and adds caching for KDTree construction in the Grid module.

Reviewed Changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated no comments.

Show a summary per file
File Description
uxarray/remap/***.py Refactored remapping functions and renamed internal APIs (e.g. _nearest_neighbor_query, _inverse_distance_weighted_remap).
uxarray/remap/accessor.py Introduced a unified RemapAccessor replacing the removed dataset/dataarray accessor modules.
uxarray/core/dataset.py and uxarray/core/dataarray.py Updated to use the new RemapAccessor.
uxarray/grid/grid.py Added a new method (_get_scipy_kd_tree) and caching for KDTree instances.
ci/docs.yml Added dependency on “cmocean” to the documentation build environment.
Comments suppressed due to low confidence (2)

uxarray/remap/accessor.py:38

  • The default value for the 'remap_to' parameter is set to 'faces', but the underlying remapping functions validate dimensions against values like 'nodes', 'edge centers', or 'face centers'. Consider changing the default to 'face centers' to ensure consistency in the API.
def nearest_neighbor(self, destination_grid: Grid, remap_to: str = "faces", **kwargs) -> UxDataArray | UxDataset:

uxarray/remap/accessor.py:60

  • The default value for 'remap_to' here is also set to 'faces', which may not match the expected keys in the remapping routines. Updating this default to 'face centers' would improve clarity and consistency across the API.
def inverse_distance_weighted(self, destination_grid: Grid, remap_to: str = "faces", power=2, k=8, **kwargs) -> UxDataArray | UxDataset:

@philipc2 philipc2 merged commit 92dac00 into main May 9, 2025
20 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
scalability Related to scalability & performance efforts
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants