perf(autograd): optimize grey_dilation with striding #2589

yaugenst-flex · 2025-06-20T11:39:51Z

The previous implementation of grey_dilation was based on convolution, which was slow for both the forward and backward passes.

This PR replaces it with a high-performance implementation that uses NumPy's sliding_window_view to create sliding window views of the input array. I also wrote a custom VJP that uses the same striding technique to make the backward pass faster too.

I also simplified the implementation of grey_erosion so that grey_dilation is now the only function that does the heavy lifting.

Benchmarks show speedups of 10-100x depending on the array and kernel size.

This should make these ops much more usable in topopt @groberts-flex

Greptile Summary

Significant performance optimization of the grey_dilation morphological operation by replacing the convolution-based implementation with NumPy's sliding_window_view for strided array operations.

Replaced convolution-based implementation with strided array approach in tidy3d/plugins/autograd/functions.py, achieving 10-100x speedup
Added custom VJP (vector-Jacobian product) for efficient backpropagation using the same striding technique
Simplified grey_erosion by expressing it through duality with grey_dilation
Updated morphology test cases to use first-order gradients and include kernel structure testing
Added comprehensive benchmarks showing performance improvements scaling with array and kernel sizes
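The duality mentioned in the summary can be illustrated with a small NumPy-only sketch. This is not the tidy3d implementation: `flat_dilation` and `flat_erosion` are hypothetical names, the footprint is assumed flat (all zeros) with an odd window size, and padding with -inf is an assumption chosen so padded cells never win the maximum.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def flat_dilation(array, size):
    """Flat grey dilation: maximum over each (size x size) neighborhood."""
    pad = size // 2  # assumes an odd window size
    padded = np.pad(
        np.asarray(array, dtype=float), pad, mode="constant", constant_values=-np.inf
    )
    windows = sliding_window_view(padded, window_shape=(size, size))
    return windows.max(axis=(-2, -1))


def flat_erosion(array, size):
    """Flat grey erosion expressed through dilation of the negated input."""
    return -flat_dilation(-np.asarray(array, dtype=float), size)


x = np.arange(25.0).reshape(5, 5)
eroded = flat_erosion(x, 3)
```

For a non-symmetric structuring element the duality also requires reflecting the element; the flat, odd-sized case above sidesteps that.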

greptile-apps

LGTM

3 files reviewed, no comments

github-actions · 2025-06-20T12:05:08Z

Diff Coverage

Diff: origin/develop...HEAD, staged and unstaged changes

tidy3d/plugins/autograd/functions.py (98.8%): Missing lines 67

Summary

Total: 83 lines
Missing: 1 line
Coverage: 98%

tidy3d/plugins/autograd/functions.py

  63         The indices for padding along the axis.
  64     """
  65     total_pad = sum(pad_width)
  66     if n == 0:
! 67         return numpy_module.zeros(total_pad, dtype=int)
  68 
  69     idx = numpy_module.arange(-pad_width[0], n + pad_width[1])
  70 
  71     if mode == "constant":

groberts-flex · 2025-06-23T19:17:46Z

tidy3d/plugins/autograd/functions.py

+        raise ValueError("Either size or structure must be provided.")
+    if structure is None:
+        size_np = onp.atleast_1d(size)
+        shape = (size_np[0], size_np[-1]) if size_np.size > 1 else (size_np[0], size_np[0])


in the case of a 1D structuring element or size, does this mean the dilation/erosion gets applied in 2D and then just a single dimension is extracted at the end?

The operation is still applied to the full 2D array, but with a 1D-like structuring element. You always need to define a 2D structuring element with a shape like (1, size) or (size, 1). The dilation/erosion operation slides this structuring element across the entire 2D array, effectively performing the operation along rows or columns, respectively. I updated the docstring to clarify this, because we don't explicitly handle 1D structuring elements.
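To make the answer concrete, here is a rough sketch of how (1, size) and (size, 1) windows behave on a 2D array. The shape logic in `footprint_shape` follows the diff quoted in this thread; everything else (`flat_dilation`, the flat footprint, the -inf padding) is an illustrative assumption rather than the PR's code.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def footprint_shape(size):
    """Shape logic from the quoted diff: a scalar size yields a square window."""
    size_np = np.atleast_1d(size)
    if size_np.size > 1:
        return int(size_np[0]), int(size_np[-1])
    return int(size_np[0]), int(size_np[0])


def flat_dilation(array, size):
    """Maximum over an (h, w) window slid across the full 2D array."""
    h, w = footprint_shape(size)
    pad_width = ((h // 2, h - 1 - h // 2), (w // 2, w - 1 - w // 2))
    padded = np.pad(
        np.asarray(array, dtype=float),
        pad_width,
        mode="constant",
        constant_values=-np.inf,  # assumption: padding never wins the max
    )
    windows = sliding_window_view(padded, window_shape=(h, w))
    return windows.max(axis=(-2, -1))


image = np.zeros((5, 5))
image[2, 2] = 1.0

along_rows = flat_dilation(image, (1, 3))     # spreads the peak within its row
along_columns = flat_dilation(image, (3, 1))  # spreads the peak within its column
square = flat_dilation(image, 3)              # scalar size gives a (3, 3) window
```

Note that a scalar size always produces a square window, so a 1D-like element has to be requested explicitly as a two-element shape.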

groberts-flex · 2025-06-23T19:18:48Z

tidy3d/plugins/autograd/functions.py

@@ -238,9 +204,27 @@ def convolve(
    return convolve_ag(array, kernel, axes=axes, mode=mode)


+def _get_footprint(size, structure, maxval):
+    """Helper to generate the morphological footprint from size or structure."""
+    if size is None and structure is None:


possibly the other case to catch is that both the size and the structure are specified

yeah good catch. i think this was discussed previously at some point and we went for silently having precedence of structuring element over sizes, but since it came up again i added a check instead to forbid this
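A minimal sketch of the stricter argument check described above. `get_footprint_sketch` is a hypothetical stand-in for `_get_footprint`: the first error message is taken from the diff, the second message is an assumed wording, and the `maxval` handling is omitted because the thread does not show it.

```python
import numpy as np


def get_footprint_sketch(size=None, structure=None):
    """Hypothetical stand-in for the PR's helper; returns a 2D footprint array."""
    if size is None and structure is None:
        raise ValueError("Either size or structure must be provided.")
    if size is not None and structure is not None:
        # Assumed wording: the thread only says this combination is now forbidden.
        raise ValueError("Cannot specify both size and structure.")
    if structure is None:
        size_np = np.atleast_1d(size)
        shape = (size_np[0], size_np[-1]) if size_np.size > 1 else (size_np[0], size_np[0])
        return np.zeros(shape)  # flat footprint: adds nothing to the window values
    return np.asarray(structure, dtype=float)


flat = get_footprint_sketch(size=3)
row_like = get_footprint_sketch(size=(1, 5))
```

Raising on the ambiguous call trades a little convenience for not silently ignoring one of the two arguments.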

groberts-flex · 2025-06-23T20:20:18Z

tidy3d/plugins/autograd/functions.py


+    padded_array_np = getval(padded_array)
+
+    windows = sliding_window_view(padded_array_np, window_shape=(h, w))


checking my understanding - this replaces the convolve call in the previous version which was being used to essentially create this same view but doing so through a bunch of unnecessary computation with an identity kernel?

yes, exactly. the previous implementation would create the sliding windows (which we now do directly), multiply each window element-wise with the kernel, and then sum the results. you can see how this scales horrendously with image and kernel sizes 😄
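As a small illustration of the point above, the sketch below builds the windows directly with `sliding_window_view` and reduces them with a maximum, then checks the result against an explicit loop. The array sizes and variable names are arbitrary choices for the example, not values from the PR.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

rng = np.random.default_rng(0)
padded = rng.random((6, 7))  # stands in for the already padded input
h, w = 3, 3

# Shape is (6 - h + 1, 7 - w + 1, h, w); this is a view, so no data is copied.
windows = sliding_window_view(padded, window_shape=(h, w))
dilated = windows.max(axis=(-2, -1))

# Reference result computed one window at a time.
reference = np.empty((4, 5))
for i in range(4):
    for j in range(5):
        reference[i, j] = padded[i:i + h, j:j + w].max()
```

Because the windows are only a strided view of the padded array, the cost is dominated by the single max reduction rather than by per-window multiplications and sums.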

groberts-flex · 2025-06-23T20:24:07Z

tidy3d/plugins/autograd/functions.py

+
+    # normalize the gradient for cases where multiple elements are the maximum.
+    multiplicity = onp.sum(is_max_mask, axis=(-2, -1), keepdims=True)
+    is_max_mask /= onp.maximum(multiplicity, 1)


not fully understanding this part - is it possible for values to come out of the operation as the max_val?

no, values cannot equal max_val in the output. i added a comment explaining this:

# Note: Values can never exceed maxval in the output since we add structure
# values (capped at maxval) to the input array values.
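The normalization in the quoted diff splits the incoming gradient evenly when several elements of a window tie for the maximum. The sketch below shows that idea for a flat, odd-sized window; `flat_dilation_vjp`, the -inf padding, and the loop-based scatter are assumptions made for illustration and are not the PR's VJP.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def flat_dilation_vjp(g, array, size):
    """Gradient of a flat (size x size) dilation with respect to `array`."""
    array = np.asarray(array, dtype=float)
    rows, cols = array.shape
    pad = size // 2  # assumes an odd window size
    padded = np.pad(array, pad, mode="constant", constant_values=-np.inf)
    windows = sliding_window_view(padded, window_shape=(size, size))
    output = windows.max(axis=(-2, -1), keepdims=True)

    # Mark every element that attains the window maximum, then share the
    # gradient equally among ties, as in the quoted diff.
    is_max_mask = (windows == output).astype(float)
    multiplicity = np.sum(is_max_mask, axis=(-2, -1), keepdims=True)
    is_max_mask /= np.maximum(multiplicity, 1)

    # Scatter each window's share back to the padded input positions.
    grad_padded = np.zeros_like(padded)
    for i in range(size):
        for j in range(size):
            grad_padded[i:i + rows, j:j + cols] += g * is_max_mask[..., i, j]
    return grad_padded[pad:pad + rows, pad:pad + cols]


peak = np.zeros((3, 3))
peak[1, 1] = 5.0
grad_peak = flat_dilation_vjp(np.ones((3, 3)), peak, 3)

ties = np.ones((3, 3))
grad_ties = flat_dilation_vjp(np.ones((3, 3)), ties, 3)
```

With a unique peak the whole gradient lands on that element; with an all-equal input each window's unit of gradient is divided among its tied elements, so the total is preserved.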

groberts-flex

thanks for this implementation, the speed up looks awesome especially for a function that will be in a lot of robust optimizations!

left some comments/questions, some just for my own understanding!

The previous implementation of `grey_dilation` was based on convolution, which was slow for both the forward and backward passes. This commit replaces it with a high-performance implementation that uses NumPy's `as_strided` to create sliding window views of the input array. This avoids redundant computations and memory allocations, leading to significant speedups. The VJP (gradient) for the primitive is also updated to use the same striding technique, ensuring the backward pass is also much faster. Benchmarks show speedups of 10-100x depending on the array and kernel size.

yaugenst-flex requested a review from groberts-flex June 20, 2025 11:39

yaugenst-flex self-assigned this Jun 20, 2025

greptile-apps bot reviewed Jun 20, 2025


yaugenst-flex force-pushed the yaugenst-flex/faster-morphology branch 2 times, most recently from 266d6a0 to be0b9b0 Compare June 20, 2025 15:08

groberts-flex reviewed Jun 23, 2025


yaugenst-flex force-pushed the yaugenst-flex/faster-morphology branch from be0b9b0 to 21486df Compare June 24, 2025 06:40

greg comments

3f63287
