Milestone2.2: Optimize transposes in XNNPACK partition by removing redundant to_copy ops #11316
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/11316
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit a020722 with merge base 083663b. This comment was automatically generated by Dr. CI and updates every 15 minutes.
Force-pushed from 98f4027 to 173e41f
@pytorchbot label "release notes: none"
```python
def input_dim_order(
    self, input_node: torch.fx.Node, input_order: InputDimOrder
) -> bool:
    if input_node.name == "x":
```
Can you replace this with checking if the input_node is a placeholder?
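The suggestion could be sketched like this. In torch.fx, graph inputs are nodes with `op == "placeholder"`, so no name matching is needed; the `Node` class below is a minimal illustrative stand-in for `torch.fx.Node`, not the real type:

```python
from dataclasses import dataclass


@dataclass
class Node:          # minimal stand-in for torch.fx.Node
    op: str
    name: str


def is_graph_input(node) -> bool:
    # torch.fx marks graph inputs with op == "placeholder",
    # so there is no need to match on a hard-coded name like "x"
    return node.op == "placeholder"


print(is_graph_input(Node(op="placeholder", name="x")))       # True
print(is_graph_input(Node(op="call_function", name="conv")))  # False
```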
```python
from executorch.exir.passes.memory_format_ops_pass import DimOrderOpsRevertPass


class TestChannelsLastTaggedReshapePass(unittest.TestCase):
```
Can we add a test that includes implicitly created dim order conversions? This will make sure that user-created and pass-created converts get optimized out correctly. I expect it to work, but it would be nice to cover since this is a common use case.
Maybe something like:
to_channels_last
upsample_nearest2d (not partitioned)
to_channels_first
conv
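The pattern above can be traced in plain Python (op names are illustrative stand-ins, not the real edge ops) to show where a removable pair appears once the tagging pass inserts its own convert before the conv:

```python
# hypothetical linearized graph after the tagging pass has run:
trace = [
    "to_channels_last",    # user-created
    "upsample_nearest2d",  # not partitioned
    "to_channels_first",   # user-created
    "to_channels_last",    # inserted by the tagging pass before conv
    "conv",
]

opposite = {"to_channels_last": "to_channels_first",
            "to_channels_first": "to_channels_last"}

# the adjacent user-created / pass-created pair at indices 2 and 3
# is a no-op pair that the removal pass should cancel out
print(opposite[trace[2]] == trace[3])  # True
```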
```python
# If we encounter a to_copy node, check if it is preceded by an opposite to_copy node
if node.target == exir_ops.edge.aten._to_copy.default:
    if prev and ChannelsLastTaggedReshapePass.is_nchw_node(
```
I think there may be cases where the previous node in the iteration order is not actually the first arg, especially in more complex graphs. Can you try replacing prev with node.args[0]? That should be sound in all cases.
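A small stand-alone sketch of why `prev` can go wrong (the `Node` class is an illustrative stand-in for `torch.fx.Node`): a valid topological order may interleave another consumer between a copy and its producer, while `args[0]` always points at the real input:

```python
from dataclasses import dataclass


@dataclass(eq=False)
class Node:              # minimal stand-in for torch.fx.Node
    target: str
    args: tuple = ()


copy_a = Node("_to_copy")
relu = Node("relu", (copy_a,))        # another consumer of copy_a
copy_b = Node("_to_copy", (copy_a,))  # its real producer is copy_a

order = [copy_a, relu, copy_b]        # a valid topological order

prev = order[order.index(copy_b) - 1]
print(prev.target)             # 'relu' -- not the producer of copy_b
print(copy_b.args[0].target)   # '_to_copy' -- args[0] is the producer
```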
```python
from executorch.exir.pass_base import PassResult


class RemoveRedundantOpsPass(XNNPACKPass):
```
Nit: rename this to RemoveRedundantCopyPass or something similar? The current name is too generic to infer what it's doing.
Force-pushed from 2b4643d to 85e1c4d
```python
continue

# If we encounter a to_copy node, check if its input is also a to_copy node with opposite format
if node.target == exir_ops.edge.aten._to_copy.default:
```
I thought of one more edge case while reading this code. We should probably check to make sure that the second copy is the only user of the first. It's possible to have two copies in a row, but something else could use the output of the first. It's unlikely, but would lead to an invalid graph in this case.
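The edge case could be guarded roughly like this (the `Node` class and helper name are illustrative stand-ins; `torch.fx.Node` exposes its consumers via a `users` mapping, modeled here as a list):

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Node:                          # minimal stand-in for torch.fx.Node
    name: str
    users: list = field(default_factory=list)


def pair_is_removable(first, second) -> bool:
    # only drop the pair when `second` is the sole consumer of `first`;
    # otherwise removing `first` would orphan its other users
    return len(first.users) == 1 and first.users[0] is second


copy_a, copy_b = Node("copy_a"), Node("copy_b")
copy_a.users.append(copy_b)
print(pair_is_removable(copy_a, copy_b))   # True

copy_a.users.append(Node("relu"))          # a second consumer appears
print(pair_is_removable(copy_a, copy_b))   # False
```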
Added a check, thanks for this find
Just a few extra test cases to make sure things look ok
```python
    module.eval(),
    inputs,
)
tester.export().to_edge_transform_and_lower().to_executorch().serialize().run_method_and_compare_outputs()
```
For complicated paths, can we also try quantized models?
```python
class ChannelsLastToContiguous(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = torch.nn.Conv2d(3, 3, 3)
```
Can we check conv1d as well?
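Worth noting why conv1d is a distinct case: its input is 3-D (N, C, W), so the channels-last permutation differs from the 4-D NCHW one. A small sketch (the helper name is hypothetical):

```python
def channels_last_perm(rank: int) -> tuple:
    # move the channel dim (index 1) to the end, keep the rest in order
    return (0,) + tuple(range(2, rank)) + (1,)


print(channels_last_perm(3))  # (0, 2, 1)    -- conv1d: NCW  -> NWC
print(channels_last_perm(4))  # (0, 2, 3, 1) -- conv2d: NCHW -> NHWC
```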
Summary
Optimize transposes in the XNNPACK partition by removing redundant to_copy ops.
Test plan
Constructed graphs with multiple redundant to_copy ops and asserted their removal after the pass runs.
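The idea under test can be sketched in plain Python (no torch): adjacent opposite dim-order copies cancel, including chains of them. Op names are illustrative stand-ins for the edge `_to_copy` variants, and this linear sketch ignores the branching/multiple-user checks the real pass performs:

```python
OPPOSITE = {"to_nhwc": "to_nchw", "to_nchw": "to_nhwc"}


def remove_redundant_copies(ops):
    out = []
    for op in ops:
        if out and OPPOSITE.get(out[-1]) == op:
            out.pop()          # the opposite pair cancels
        else:
            out.append(op)
    return out


print(remove_redundant_copies(["to_nhwc", "to_nchw", "conv"]))  # ['conv']
# a chain of pairs cancels all the way down:
print(remove_redundant_copies(["to_nhwc", "to_nhwc", "to_nchw", "to_nchw"]))  # []
```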