Changes to SDPA to support no kv cache export #7530

tarun292 · 2025-01-06T21:53:06Z

Stack from ghstack (oldest at bottom):

Differential Revision: D67878163

[ghstack-poisoned]

pytorch-bot · 2025-01-06T21:53:09Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/7530

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 New Failures

As of commit 84e7573 with merge base 68c0208:

NEW FAILURES - The following jobs have failed:

Lint / lintrunner / linux-job (gh)
>>> Lint for examples/models/llama/source_transformation/sdpa.py:
pull / test-llama-runner-linux (fp32, xnnpack+custom+quantize_kv) / linux-job (gh)
RuntimeError: Command docker exec -t 3c3a5cd191d7d2be0d350d5bbdcb9bff7bc3a49c05cf2714f0ecd7de20a4b2b1 /exec failed with exit code 1

This comment was automatically generated by Dr. CI and updates every 15 minutes.

tarun292 · 2025-01-06T21:57:32Z

@tarun292 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

dvorjackz · 2025-01-06T22:29:36Z

examples/models/llama/source_transformation/sdpa.py

+                q,
+                k,
+                v,
+                input_pos,


Tbh, I think you can just set this to 0. It represents the start position, so it should work for the no-KV-cache text decoder as well, and you won't need to set input_pos to torch.tensor(0) in your other PR.
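To illustrate the point about the start position, here is a minimal sketch in plain PyTorch. It is not the ExecuTorch custom SDPA op, and `sdpa_with_start_pos` is a hypothetical helper written only for this example. It assumes that, without a KV cache, the keys and values cover exactly the current sequence, in which case a start position of 0 reduces to ordinary causal attention.

```python
import torch
import torch.nn.functional as F


def sdpa_with_start_pos(q, k, v, start_pos: int = 0):
    """Reference causal attention (hypothetical helper, not the ExecuTorch op).

    q:    (batch, heads, seq_len, head_dim)
    k, v: (batch, heads, kv_len, head_dim), with kv_len >= start_pos + seq_len
    Query i sits at absolute position start_pos + i and may attend to
    keys at positions 0..start_pos + i.
    """
    seq_len = q.size(-2)
    kv_len = k.size(-2)
    q_pos = torch.arange(start_pos, start_pos + seq_len).unsqueeze(-1)  # (seq_len, 1)
    k_pos = torch.arange(kv_len).unsqueeze(0)  # (1, kv_len)
    mask = k_pos <= q_pos  # True where attention is allowed
    return F.scaled_dot_product_attention(q, k, v, attn_mask=mask)


torch.manual_seed(0)
q = torch.randn(1, 2, 4, 8)
k = torch.randn(1, 2, 4, 8)
v = torch.randn(1, 2, 4, 8)

# No KV cache: the whole sequence is processed at once, starting at position 0.
no_cache_out = sdpa_with_start_pos(q, k, v, start_pos=0)

# With start_pos=0 and kv_len == seq_len the mask is lower triangular,
# i.e. the same mask that is_causal=True applies.
causal_out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
```

Under these assumptions the two outputs should agree up to floating-point tolerance, which is why a constant 0 can stand in for a tensor-valued `input_pos` in the no-KV-cache case.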


tarun292 · 2025-01-06T22:42:16Z

@tarun292 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

tarun292 · 2025-01-06T22:49:50Z

@tarun292 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Changes to SDPA to support no kv cache export

3979fc8

[ghstack-poisoned]

facebook-github-bot added the CLA Signed label Jan 6, 2025

This was referenced Jan 6, 2025

Add utils to replace torchtune SDPA with ET Custom SDPA #7531

Open

Add test case to export, quantize and lower vision encoder model for ET #7532

Open

tarun292 added the topic: not user facing label Jan 6, 2025

dvorjackz reviewed Jan 6, 2025


Update on "Changes to SDPA to support no kv cache export"

84e7573

Differential Revision: [D67878163](https://our.internmc.facebook.com/intern/diff/D67878163) [ghstack-poisoned]
