Add test for SDPAToPagedAttention #12

Closed
wants to merge 429 commits into from

CuriousPanCake

Add test for SDPAToPagedAttention

Add a test to check whether the SDPAToPagedAttention transformation
was performed.
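
One way such a check could look is sketched below. This is not the test from this PR. It assumes the transformed graph exposes nodes with a `get_type_name()` method, and that the relevant op type names are `ScaledDotProductAttention` and `PagedAttentionExtension`. Both names should be verified against the OpenVINO version in use. `FakeOp` and `paged_attention_applied` are hypothetical helpers introduced only so the sketch runs without OpenVINO installed.

```python
from collections import Counter
from typing import Iterable, Protocol


class HasTypeName(Protocol):
    """Anything that can report its op type name, like a graph node."""

    def get_type_name(self) -> str: ...


# Assumed op type names; confirm them against the OpenVINO version in use.
SDPA_OP = "ScaledDotProductAttention"
PA_OP = "PagedAttentionExtension"


def paged_attention_applied(ops: Iterable[HasTypeName]) -> bool:
    """Return True if at least one paged-attention op is present
    and no scaled-dot-product-attention op remains in the graph."""
    counts = Counter(op.get_type_name() for op in ops)
    return counts[PA_OP] > 0 and counts[SDPA_OP] == 0


class FakeOp:
    """Hypothetical stand-in for a graph node, used only for illustration."""

    def __init__(self, type_name: str) -> None:
        self._type_name = type_name

    def get_type_name(self) -> str:
        return self._type_name


if __name__ == "__main__":
    before = [FakeOp("Parameter"), FakeOp(SDPA_OP), FakeOp("Result")]
    after = [FakeOp("Parameter"), FakeOp(PA_OP), FakeOp("Result")]
    print(paged_attention_applied(before), paged_attention_applied(after))
```

With a real model, the same helper could presumably be fed the node list of the transformed model instead of `FakeOp` instances.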

Signed-off-by: Andrii Staikov [email protected]

ilya-lavrenov and others added 30 commits March 15, 2024 18:39
…penvinotoolkit#308)

## Type of Change

- The PR fixes an issue mentioned in openvinotoolkit#254. A fix was proposed in
openvinotoolkit#246, but it was applied only to the cpp part and not to
llm_bench/python. With the current code version, the llama2
conversion still fails with the same error when using llm_bench/python.

- This PR introduces the same fix as openvinotoolkit#246 for llm_bench/python and also fixes
some missing documentation in the README and typos.

## Description

- Same as above

## Expected Behavior & Potential Risk

- N/A

## How has this PR been tested?

- N/A

## Dependency Change?

- N/A

---------

Co-authored-by: Anas Ahouzi <[email protected]>
- [x] rely on optimum configs for export
- [x] rely on optimum configs for weights compression -
openvinotoolkit#218
Data-aware compression with updated configurations.
Bumps [diffusers](https://github.com/huggingface/diffusers) from 0.26.3 to 0.27.0.
- [Release notes](https://github.com/huggingface/diffusers/releases)
- [Commits](huggingface/diffusers@v0.26.3...v0.27.0)

---
updated-dependencies:
- dependency-name: diffusers
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Bumps [diffusers](https://github.com/huggingface/diffusers) from 0.26.3 to 0.27.0.
- [Release notes](https://github.com/huggingface/diffusers/releases)
- [Commits](huggingface/diffusers@v0.26.3...v0.27.0)

---
updated-dependencies:
- dependency-name: diffusers
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
…t/pip/image_generation/stable_diffusion_1_5/cpp/scripts/diffusers-0.27.0

Bump diffusers from 0.26.3 to 0.27.0 in /image_generation/stable_diffusion_1_5/cpp/scripts
…t/pip/image_generation/lcm_dreamshaper_v7/cpp/scripts/diffusers-0.27.0

Bump diffusers from 0.26.3 to 0.27.0 in /image_generation/lcm_dreamshaper_v7/cpp/scripts
@p-wysocki @pavel-esir this fixes openvinotoolkit#302. Tested locally, added a GitHub
Actions workflow, and updated the models list in README.md.

---------

Co-authored-by: Pavel Esir <[email protected]>
`
Skipping line in requirement file
[openvino.genai/llm_bench/python/requirements.txt] because it's not
clear what it would install:
git+https://github.com/huggingface/optimum-intel.git@552de65a9c5f7fa1a2f0ce6859ebdeedaeaabe53
  (add #egg=PackageName to the URL to avoid this warning)   
Skipping line in requirement file
[openvino.genai/llm_bench/python/requirements.txt] because it's not
clear what it would install:
git+https://github.com/openvinotoolkit/nncf.git
  (add #egg=PackageName to the URL to avoid this warning)   
`

Signed-off-by: Peter Chen <[email protected]>
…issed

add submodule update into sd installation instructions
I've verified that OpenVINO supports Qwen1.5-7B and then added it to the
GitHub workflow and readme.md.
```
(base) root@8tvt:~/openvino.genai/llm_bench/python# ../../text_generation/causal_lm/cpp/build/greedy_causal_lm qwen/pytorch/dldt/FP32/ "Why is the Sun yellow?"

 

The Sun does not actually appear yellow to us when we look at it. In fact, it appears white because it emits light across a wide range of wavelengths, including all the colors of the visible spectrum. When this light reaches our eyes, our eyes combine the different colors to create the perception of white.
```

---------

Co-authored-by: Pavel Esir <[email protected]>
yatarkan and others added 28 commits May 13, 2024 15:05
CVS-133717
1. If the -ic option is set, the output token size is the same as the infer count.
2. Without the -ic option, the output token size is the default generated
according to the model.
3. Remove the default output limit of 512 tokens.
4. If the env variable LOGLEVEL=DEBUG is set, the latency of all tokens is printed.
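
The rules above can be sketched as follows. This is an illustrative outline, not the llm_bench code: `resolve_max_new_tokens` and `log_all_token_latencies` are hypothetical names, and `model_default` stands in for whatever default the model itself supplies.

```python
import os
from typing import Mapping, Optional


def resolve_max_new_tokens(infer_count: Optional[int], model_default: int) -> int:
    """Pick the output token size.

    If -ic was given (infer_count is not None), use it as is.
    Otherwise fall back to the model's own default; no 512-token cap is applied.
    """
    if infer_count is not None:
        return infer_count
    return model_default


def log_all_token_latencies(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when LOGLEVEL=DEBUG is set in the environment."""
    env = os.environ if env is None else env
    return env.get("LOGLEVEL", "") == "DEBUG"


if __name__ == "__main__":
    # Hypothetical values: -ic 1024 on a model whose default is 2048.
    print(resolve_max_new_tokens(1024, 2048))
    print(resolve_max_new_tokens(None, 2048))
    print(log_all_token_latencies({"LOGLEVEL": "DEBUG"}))
```

Passing the environment as a mapping keeps the helper easy to test without mutating `os.environ`.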

examples:
set env LOGLEVEL=DEBUG

[bloomz-560m-without-ic.txt](https://github.com/openvinotoolkit/openvino.genai/files/15245407/bloomz-560m-without-ic.txt)
[bloomz-560m-ic-1024.txt](https://github.com/openvinotoolkit/openvino.genai/files/15245409/bloomz-560m-ic-1024.txt)
[llama-2-7b-chat-without-ic-.txt](https://github.com/openvinotoolkit/openvino.genai/files/15245412/llama-2-7b-chat-without-ic-.txt)
[llama-2-7b-chat-without-ic-.txt](https://github.com/openvinotoolkit/openvino.genai/files/15245415/llama-2-7b-chat-without-ic-.txt)
---------
Co-authored-by: Chen Peter <[email protected]>
Model lists in Python tests
Changed block_size according to latest CPU changes
Removed unsuitable models from the models list
Add a test to check whether the SDPAToPagedAttention transformation
was performed.

Signed-off-by: Andrii Staikov <[email protected]>