[Discussion] Should we enable the Ladder weight propagation when the shape is dynamic? #91

Closed
2 tasks done
LeiWang1999 opened this issue Jul 19, 2024 · 1 comment

LeiWang1999 commented Jul 19, 2024

Our project can be considered a dynamic runtime kernel library that generates different executables for specific shapes and devices on the fly. BitBLAS enables Ladder to propagate layouts based on the compute expression and the target hardware instructions, in order to avoid bank conflicts and keep global memory loads as coalesced as possible. However, our policy and schedule cannot achieve ideal performance when the shape is small, because parallelism is limited. This leads to an awkward situation: GEMV and GEMM use different instructions (for example, GEMV uses SIMT while GEMM uses MMA), so the layout propagated for GEMM may not be optimal for GEMV.

Currently, to preserve optimal GEMV performance, we disable weight propagation when the input M falls within a dynamic range. However, the performance of contiguous decoding is getting more and more attention, and in some projects, such as flute, BitBLAS performs poorly when benchmarked with a preset dynamic input range.

So it's time for us to decide: should we make a hotfix that enables weight propagation by default, to improve the performance of batched dequantize GEMV?

TODO Items:

LeiWang1999 commented

We provide an option to enable weight propagation rather than turning it on by default, since it might introduce extra overhead and has limitations on different hardware platforms.
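To make the policy discussed in this issue concrete, here is a minimal sketch of the decision logic: propagation stays off for a dynamic M unless the user opts in. All names (`PropagationPolicy`, `force_weight_propagation`, `should_propagate_weight`) are hypothetical and are not the BitBLAS API. Treating a static M of 1 as GEMV is also an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Sequence, Union

# Hypothetical types and names for illustration only; not the BitBLAS API.
Dim = Union[int, Sequence[int]]


@dataclass(frozen=True)
class PropagationPolicy:
    # Opt-in switch: propagate the weight layout even when M is dynamic.
    force_weight_propagation: bool = False


def is_dynamic(m: Dim) -> bool:
    """M is treated as dynamic when it is given as a collection of sizes."""
    return not isinstance(m, int)


def should_propagate_weight(
    m: Dim, policy: PropagationPolicy = PropagationPolicy()
) -> bool:
    if is_dynamic(m):
        # Default: keep the GEMV-friendly layout; the user may opt in.
        return policy.force_weight_propagation
    # Assumption: a static M of 1 is a GEMV (SIMT path), so the MMA-oriented
    # layout is skipped; larger static M is treated as GEMM.
    return m > 1


print(should_propagate_weight(1024))
print(should_propagate_weight([1, 16, 32]))
print(should_propagate_weight([1, 16, 32], PropagationPolicy(True)))
```

Keeping the switch as an explicit opt-in mirrors the comment above: the default protects GEMV performance, while batched decoding workloads can trade that for the propagated layout.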
