
[RELAX][PASS] Annotate Custom Scope layout pass for Adreno GPU #17599

Open · srkreddy1238 wants to merge 7 commits into main from annotate_texture_scope
Conversation

srkreddy1238 (Contributor) commented on Jan 21, 2025

Texture scope annotation is handled by:

  • Layout conversion from 4D to 5D with the convert_layout pass
  • Legalization of ops with Adreno-specific legalization and fallback legalization
  • FuseOps & FuseTIR
  • Annotation of texture scopes over the fused TIR via hint_on_device
  • RealizeVDevice, which takes care of injecting to_device as needed
  • A newly introduced SpecializeTIRParams pass that updates the fused TIR PrimFunc's buffer var map with the new scope information

Changes in FuseOps and FuseTIR forward op attr and op pattern info, which is used for texture-specific scoping decisions. A rough sketch of how these passes compose follows.
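For orientation, here is a minimal sketch (not the PR's literal code) of how the passes named above might compose. ConvertLayout, LegalizeOps, FuseOps, FuseTIR, and RealizeVDevice already exist in tvm.relax.transform; the layout map and the AnnotateCustomMemoryScope/SpecializeTIRParams spellings are assumptions based on this description and may differ from the patch.

```python
# Hedged sketch of the pipeline described above.
# The commented-out passes are the ones this PR introduces; their exact
# names/signatures are assumed from the PR description.
import tvm
from tvm import relax

def adreno_texture_pipeline() -> tvm.transform.Pass:
    return tvm.transform.Sequential(
        [
            # 4D -> 5D layout conversion (the layout map here is illustrative)
            relax.transform.ConvertLayout({"relax.nn.conv2d": ["NCHW4c", "OIHW4o"]}),
            # Adreno-specific legalization plus fallback legalization
            relax.transform.LegalizeOps(),
            relax.transform.FuseOps(),
            relax.transform.FuseTIR(),
            # relax.transform.AnnotateCustomMemoryScope(),  # this PR: texture scopes via hint_on_device
            relax.transform.RealizeVDevice(),  # injects to_device as needed
            # relax.transform.SpecializeTIRParams(),  # this PR: refresh PrimFunc buffer var maps
        ]
    )
```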

srkreddy1238 force-pushed the annotate_texture_scope branch from fe15d5b to 1549733 on January 21, 2025.
srkreddy1238 (Contributor, Author) commented

@tvm-bot rerun

tqchen (Member) commented on Jan 25, 2025

@Hzfengsy do you mind taking a look, given it touches FuseOps/FuseTIR?

Review thread on src/script/printer/relax/utils.h (outdated, resolved)
@@ -1092,6 +1095,10 @@ def LegalizeOps(
legalization function is not registered. By default we don't print
warnings.

add_attributes : bool
Reviewer comment (Member):

Is it possible to compose instead, e.g., run an attribute-attach pass afterwards?

srkreddy1238 (Author) replied:

After the legalization pass we don't have any trace of operator-specific attributes.
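For context, a minimal usage sketch of the flag in the diff above, assuming the keyword spelling matches the docstring (add_attributes):

```python
# Hedged sketch: run LegalizeOps with the new flag from this diff so that
# operator attributes are carried onto the produced call_tir nodes.
import tvm
from tvm import relax

def legalize_keeping_attrs(mod: tvm.IRModule) -> tvm.IRModule:
    return relax.transform.LegalizeOps(add_attributes=True)(mod)
```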

Review thread on include/tvm/relax/transform.h (resolved)
tqchen (Member) commented on Jan 25, 2025

Also cc @yongwww for the memory-scope-related changes.

Hzfengsy (Member) left a comment:

some initial comments

Review thread on python/tvm/relax/transform/optimize_batchnorm.py (outdated, resolved)
Review thread on src/relax/op/tensor/binary.cc (resolved)
srkreddy1238 (Contributor, Author) commented

@tvm-bot rerun

Hzfengsy (Member) left a comment:

LGTM

tqchen (Member) commented on Feb 3, 2025

Thanks @srkreddy1238 for the updates. I took a closer look and now understand the motivation behind add_attributes. This is mainly to handle the case of conv2d operators where texture can be supported.

However, attaching op attributes to call_tir introduces a less desirable impact: the specification of call_tir originally did not have to deal with these attributes, and having them results in "leak-through". This would increase the surface area for developers working with call_tir.

I also now understand that the demand is to enable the finally fused call_tir function to decide whether texture memory is feasible.

I think it is cleaner to try a different approach. Instead of relying on the legalize pass, let us introduce an Adreno-specific conv_dispatch that can be used before legalize to offload these conv operators. We specifically attach the attribute tir.opencl_texture_2d_supported = true to the call node.
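The proposal above attaches the marker to the call node; as a rough illustration of the marking step only, one plausible mechanism (an assumption, not something in this PR) is to stamp the attribute on the dispatched PrimFunc with the standard with_attr API:

```python
# Illustrative only: mark a dispatched conv2d PrimFunc as texture-capable
# using the attribute key proposed in this thread. Where the proposed
# relax.adreno.conv_dispatch pass would actually record the marker is open.
import tvm

def mark_texture_supported(func: tvm.tir.PrimFunc) -> tvm.tir.PrimFunc:
    return func.with_attr("tir.opencl_texture_2d_supported", True)
```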

Now the remaining question is where the schedule can appear

  • The cleanest way is to actually have relax.adreno.conv_dispatch call the dlight schedule and construct such a call_tir, marking it as already scheduled. The only issue is that in that case the follow-up FuseOps/FuseTIR should treat it as opaque, and we do not yet have the capability to run more fusions. But we should be fine getting the right conv2d op scheduled.

To further enable fusion, one can try adopting the following customized legalize sequence (see the sketch after this list):
- S0: relax.adreno.conv_dispatch: run conv dispatch and mark it as opaque with tir.opencl_texture_2d_supported = true
- S1: Run legalize and analysis
- S2: Do a pattern match to manually fuse the ewise ops onto the conv2d, creating a sub-function that calls into conv2d and then ewise, which can then be consumed by FuseTIR
- Run FuseOps (this will try to fuse the other ops)
- Run FuseTIR
- Run dlight
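A hedged sketch of that sequence as a pass pipeline. relax.adreno.conv_dispatch and the S2 pattern-match fusion are hypothetical (they are what this comment proposes); LegalizeOps, FuseOps, FuseTIR, and dlight's ApplyDefaultSchedule exist today.

```python
# Sketch of the proposed customized legalize sequence. The commented-out
# steps (S0 and S2) do not exist yet and are named per the proposal above.
import tvm
from tvm import relax
from tvm import dlight as dl

def proposed_adreno_sequence() -> tvm.transform.Pass:
    return tvm.transform.Sequential(
        [
            # S0: relax.adreno.conv_dispatch -- dispatch conv2d and mark it
            #     opaque with tir.opencl_texture_2d_supported = True
            relax.transform.LegalizeOps(),  # S1: legalize and analysis
            # S2: pattern-match pass fusing ewise ops onto conv2d (hypothetical)
            relax.transform.FuseOps(),      # fuse the remaining ops
            relax.transform.FuseTIR(),
            dl.ApplyDefaultSchedule(dl.gpu.Fallback()),  # run dlight
        ]
    )
```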

Review thread on include/tvm/ir/global_info.h (outdated, resolved)
Review thread on src/script/printer/relax/utils.h (outdated, resolved)
Review thread on src/relax/transform/utils.h (outdated, resolved)
srkreddy1238 (Contributor, Author) commented

I belatedly realized that I could have drafted an RFC to describe the approach. I have done so now: https://discuss.tvm.apache.org/t/rfc-annotate-custom-scope-layout-relax-pass-for-adreno-gpu/18052

@tqchen thanks for the thoughts. A few concerns I have with this approach:

  • tir.opencl_texture_2d_supported = true: I assume this flag will be used to realize the VDevice in struct_info after FuseTIR. A flag alone may not be sufficient here; we might need scope information for each input, and this information must stay consistent as we pass through the fusion passes.
  • Another moderate challenge is in S2, where we need to define and maintain a BYOC-like pattern table to ensure maximum fusion possibilities.

Please advise.
