[Transformations] SDPA Decomposition: avoid unnecessary ShapeOf subgraphs #28639

Open: wants to merge 3 commits into base: master

@v-Golubev (Contributor) commented Jan 23, 2025

Details:

  • Currently, ScaledDotProductAttentionDecomposition uses ShapeOf->Gather subgraphs to extract a specific dimension from input shapes. If the extracted dimension is static but the shape as a whole is dynamic, the ConstantFolding pass does not fold these subgraphs, even though the needed value is already known. This PR updates the dimension extraction logic: after the subgraph is formed, get_constant_from_source tries to evaluate it and, if that succeeds, the subgraph is replaced with a constant.
  • This change unblocks SDPA quantization for some scenarios
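
The dimension extraction idea described above can be illustrated with a small toy model. This is only a sketch under stated assumptions: `Constant`, `ShapeOfGather`, and `extract_dim` are hypothetical stand-ins written for this example and are not OpenVINO classes or the PR's actual code. A partial shape is modeled as a list in which a static dimension is an `int` and a dynamic dimension is `None`.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Union


# Hypothetical stand-in for a constant node holding a known dimension value.
@dataclass(frozen=True)
class Constant:
    value: int


# Hypothetical stand-in for a runtime ShapeOf->Gather subgraph that reads
# one dimension of an input's shape during inference.
@dataclass(frozen=True)
class ShapeOfGather:
    input_name: str
    index: int


# Toy partial shape: int for a static dimension, None for a dynamic one.
PartialShape = Sequence[Optional[int]]


def extract_dim(
    input_name: str, shape: PartialShape, index: int
) -> Union[Constant, ShapeOfGather]:
    """Return a constant when the requested dimension is statically known,
    otherwise keep the runtime ShapeOf->Gather subgraph."""
    # Step 1: form the subgraph, as the decomposition does today.
    subgraph = ShapeOfGather(input_name, index)

    # Step 2: try to evaluate it from the static part of the shape.
    dim = shape[index]
    if dim is not None:
        # The dimension is known even though other dimensions are dynamic.
        return Constant(dim)

    # Step 3: nothing could be computed, so the subgraph stays in the model.
    return subgraph


# Shape with dynamic batch and sequence length but a static head size.
query_shape = [None, 8, None, 64]
head_size = extract_dim("query", query_shape, -1)
batch = extract_dim("query", query_shape, 0)
```

In this toy model `head_size` becomes a `Constant`, while `batch` remains a `ShapeOfGather` node because its dimension is not known ahead of time. This mirrors the case in the description where only part of the shape is static.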

Tickets:

@v-Golubev v-Golubev added this to the 2025.1 milestone Jan 23, 2025
@github-actions github-actions bot added the category: transformations OpenVINO Runtime library - Transformations label Jan 23, 2025
@v-Golubev v-Golubev marked this pull request as ready for review January 24, 2025 10:49
@v-Golubev v-Golubev requested a review from a team as a code owner January 24, 2025 10:49
@v-Golubev v-Golubev requested review from itikhono and removed request for a team January 24, 2025 10:49
@v-Golubev v-Golubev force-pushed the vg/transformations/sdpa_decomposition_improvement branch from a193890 to 1b63229 Compare January 24, 2025 10:50
@v-Golubev (Contributor, Author) commented:

@itikhono could you please assign a relevant person to review this? Thanks in advance.
