[CUDA][shared memory allocation]fix 'ptxas error : Entry function 'fu… #17267
I converted a ViT model from ONNX and then ran relay.build on an NVIDIA RTX 4090 for compilation:
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target=target, params=params)
The build then failed with a compilation error:
ptxas error : Entry function 'tvmgen_default_fused_nn_conv2d_add_kernel' uses too much shared data (0x2ab44 bytes, 0x29000 max)
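To make the size of the overflow concrete, here is a minimal sketch that decodes the two hexadecimal values in the message above. The helper name `parse_shared_overflow` is hypothetical and is not part of TVM or ptxas. It only does arithmetic on the error text.

```python
import re

# Error text copied verbatim from the ptxas message above.
msg = ("ptxas error : Entry function 'tvmgen_default_fused_nn_conv2d_add_kernel' "
       "uses too much shared data (0x2ab44 bytes, 0x29000 max)")


def parse_shared_overflow(message):
    """Hypothetical helper: return (used, limit, excess) in bytes."""
    m = re.search(r"\((0x[0-9a-fA-F]+) bytes, (0x[0-9a-fA-F]+) max\)", message)
    if m is None:
        raise ValueError("not a shared-memory overflow message")
    used = int(m.group(1), 16)   # bytes of shared memory the kernel requests
    limit = int(m.group(2), 16)  # maximum ptxas allows for the target
    return used, limit, used - limit


used, limit, excess = parse_shared_overflow(msg)
print(f"used={used} B, limit={limit} B, excess={excess} B")
```

The kernel requests 174916 bytes against a limit of 167936 bytes (164 KiB), so it is 6980 bytes over.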
I apologize for resorting to this temporary solution to address the issue I encountered. I offer it as a starting point and hope the experts can suggest a more effective way to resolve the problem. Thank you.