Milestone2.2: Optimize transposes in XNNPACK partition by removing redundant to_copy ops #11316
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/11316
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit a020722 with merge base 083663b. This comment was automatically generated by Dr. CI and updates every 15 minutes.
Force-pushed from 98f4027 to 173e41f
@pytorchbot label "release notes: none"
```python
def input_dim_order(
    self, input_node: torch.fx.Node, input_order: InputDimOrder
) -> bool:
    if input_node.name == "x":
```
Can you replace this with checking if the input_node is a placeholder?
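The suggestion could be sketched like this. In torch.fx, graph inputs are nodes with `op == "placeholder"`, so no name matching is needed; the `Node` class below is a minimal illustrative stand-in for `torch.fx.Node`, not the real type:

```python
from dataclasses import dataclass


@dataclass
class Node:          # minimal stand-in for torch.fx.Node
    op: str
    name: str


def is_graph_input(node) -> bool:
    # torch.fx marks graph inputs with op == "placeholder",
    # so there is no need to match on a hard-coded name like "x"
    return node.op == "placeholder"


print(is_graph_input(Node(op="placeholder", name="x")))       # True
print(is_graph_input(Node(op="call_function", name="conv")))  # False
```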
```python
from executorch.exir.passes.memory_format_ops_pass import DimOrderOpsRevertPass


class TestChannelsLastTaggedReshapePass(unittest.TestCase):
```
Can we add a test that includes implicitly created dim order conversions? This will make sure that user-created and pass-created converts get optimized out correctly. I expect it to work, but it would be nice to cover since this is a common use case.
Maybe something like:
to_channels_last
upsample_nearest2d (not partitioned)
to_channels_first
conv
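The pattern above can be traced in plain Python (op names are illustrative stand-ins, not the real edge ops) to show where a removable pair appears once the tagging pass inserts its own convert before the conv:

```python
# hypothetical linearized graph after the tagging pass has run:
trace = [
    "to_channels_last",    # user-created
    "upsample_nearest2d",  # not partitioned
    "to_channels_first",   # user-created
    "to_channels_last",    # inserted by the tagging pass before conv
    "conv",
]

opposite = {"to_channels_last": "to_channels_first",
            "to_channels_first": "to_channels_last"}

# the adjacent user-created / pass-created pair at indices 2 and 3
# is a no-op pair that the removal pass should cancel out
print(opposite[trace[2]] == trace[3])  # True
```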
```python
# If we encounter a to_copy node, check if it is preceded by an opposite to_copy node
if node.target == exir_ops.edge.aten._to_copy.default:
    if prev and ChannelsLastTaggedReshapePass.is_nchw_node(
```
I think there may be cases where the previous node in the iteration order is not actually the first arg, especially in more complex graphs. Can you try replacing prev with node.args[0]? That should be sound in all cases.
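A small stand-alone sketch of why `prev` can go wrong (the `Node` class is an illustrative stand-in for `torch.fx.Node`): a valid topological order may interleave another consumer between a copy and its producer, while `args[0]` always points at the real input:

```python
from dataclasses import dataclass


@dataclass(eq=False)
class Node:              # minimal stand-in for torch.fx.Node
    target: str
    args: tuple = ()


copy_a = Node("_to_copy")
relu = Node("relu", (copy_a,))        # another consumer of copy_a
copy_b = Node("_to_copy", (copy_a,))  # its real producer is copy_a

order = [copy_a, relu, copy_b]        # a valid topological order

prev = order[order.index(copy_b) - 1]
print(prev.target)             # 'relu' -- not the producer of copy_b
print(copy_b.args[0].target)   # '_to_copy' -- args[0] is the producer
```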
```python
from executorch.exir.pass_base import PassResult


class RemoveRedundantOpsPass(XNNPACKPass):
```
Nit: rename this to RemoveRedundantCopyPass or something similar? The current name is too generic to infer what it's doing.
Force-pushed from 2b4643d to 85e1c4d
```python
continue

# If we encounter a to_copy node, check if its input is also a to_copy node with opposite format
if node.target == exir_ops.edge.aten._to_copy.default:
```
I thought of one more edge case while reading this code. We should probably check to make sure that the second copy is the only user of the first. It's possible to have two copies in a row, but something else could use the output of the first. It's unlikely, but would lead to an invalid graph in this case.
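The edge case could be guarded roughly like this (the `Node` class and helper name are illustrative stand-ins; `torch.fx.Node` exposes its consumers via a `users` mapping, modeled here as a list):

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Node:                          # minimal stand-in for torch.fx.Node
    name: str
    users: list = field(default_factory=list)


def pair_is_removable(first, second) -> bool:
    # only drop the pair when `second` is the sole consumer of `first`;
    # otherwise removing `first` would orphan its other users
    return len(first.users) == 1 and first.users[0] is second


copy_a, copy_b = Node("copy_a"), Node("copy_b")
copy_a.users.append(copy_b)
print(pair_is_removable(copy_a, copy_b))   # True

copy_a.users.append(Node("relu"))          # a second consumer appears
print(pair_is_removable(copy_a, copy_b))   # False
```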
Added a check, thanks for this find
Just a few extra test cases to make sure things look ok
```python
    module.eval(),
    inputs,
)
tester.export().to_edge_transform_and_lower().to_executorch().serialize().run_method_and_compare_outputs()
```
For complicated paths, can we also try quantized models?
```python
class ChannelsLastToContiguous(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = torch.nn.Conv2d(3, 3, 3)
```
Can we check conv1d as well?
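Worth noting why conv1d is a distinct case: its input is 3-D (N, C, W), so the channels-last permutation differs from the 4-D NCHW one. A small sketch (the helper name is hypothetical):

```python
def channels_last_perm(rank: int) -> tuple:
    # move the channel dim (index 1) to the end, keep the rest in order
    return (0,) + tuple(range(2, rank)) + (1,)


print(channels_last_perm(3))  # (0, 2, 1)    -- conv1d: NCW  -> NWC
print(channels_last_perm(4))  # (0, 2, 3, 1) -- conv2d: NCHW -> NHWC
```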
Summary
Optimize transposes in the XNNPACK partition by removing redundant to_copy ops.
Test plan
Constructed graphs with multiple redundant to_copy ops and asserted their removal after the pass runs.
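The idea under test can be sketched in plain Python (no torch): adjacent opposite dim-order copies cancel, including chains of them. Op names are illustrative stand-ins for the edge `_to_copy` variants, and this linear sketch ignores the branching/multiple-user checks the real pass performs:

```python
OPPOSITE = {"to_nhwc": "to_nchw", "to_nchw": "to_nhwc"}


def remove_redundant_copies(ops):
    out = []
    for op in ops:
        if out and OPPOSITE.get(out[-1]) == op:
            out.pop()          # the opposite pair cancels
        else:
            out.append(op)
    return out


print(remove_redundant_copies(["to_nhwc", "to_nchw", "conv"]))  # ['conv']
# a chain of pairs cancels all the way down:
print(remove_redundant_copies(["to_nhwc", "to_nhwc", "to_nchw", "to_nchw"]))  # []
```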