Add test for SDPAToPagedAttention #12

CuriousPanCake · 2024-05-15T21:56:22Z

Add test for SDPAToPagedAttention

Add a test that checks whether the SDPAToPagedAttention transformation
was applied.

Signed-off-by: Andrii Staikov [email protected]
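
A minimal sketch of the kind of check such a test could make, not the test added in this PR. It assumes that the transformation replaces `ScaledDotProductAttention` nodes with a node whose type name is `PagedAttentionExtension`; both type-name constants, the helper names, and the `FakeOp` stand-in are hypothetical and exist only so the example runs without OpenVINO installed.

```python
from typing import Iterable, Protocol


class OpLike(Protocol):
    """Anything that reports its operation type name, like an OpenVINO node."""

    def get_type_name(self) -> str: ...


# Assumptions: type names of the ops before and after the transformation.
SDPA_TYPE = "ScaledDotProductAttention"
PAGED_ATTENTION_TYPE = "PagedAttentionExtension"


def count_ops(ops: Iterable[OpLike], type_name: str) -> int:
    """Count the ops whose type name equals type_name."""
    return sum(1 for op in ops if op.get_type_name() == type_name)


def transformation_was_applied(ops: Iterable[OpLike]) -> bool:
    """True if at least one paged-attention op exists and no SDPA op is left."""
    ops = list(ops)  # allow a second pass over generators
    return count_ops(ops, PAGED_ATTENTION_TYPE) > 0 and count_ops(ops, SDPA_TYPE) == 0


class FakeOp:
    """Hypothetical stand-in for a graph node, used only in this example."""

    def __init__(self, type_name: str) -> None:
        self._type_name = type_name

    def get_type_name(self) -> str:
        return self._type_name


before = [FakeOp("Parameter"), FakeOp(SDPA_TYPE), FakeOp("Result")]
after = [FakeOp("Parameter"), FakeOp(PAGED_ATTENTION_TYPE), FakeOp("Result")]

assert not transformation_was_applied(before)
assert transformation_was_applied(after)
```

With a real model, the list of fake ops would be replaced by the model's own op list after the transformation has been run on it.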

…penvinotoolkit#308)

## Type of Change
- The PR fixes an issue mentioned in openvinotoolkit#254. A fix was proposed in openvinotoolkit#246, but it was applied only to the cpp part and not to llm_bench/python. With the current code version, the llama2 conversion still fails with the same error when using llm_bench/python.
- This PR introduces the same fix as openvinotoolkit#246 for llm_bench/python, and also fixes some missing documentation in the README and some typos.

## Description
- Same as above

## Expected Behavior & Potential Risk
- N/A

## How has this PR been tested?
- N/A

## Dependency Change?
- N/A

---------

Co-authored-by: Anas Ahouzi <[email protected]>

Bumps [diffusers](https://github.com/huggingface/diffusers) from 0.26.3 to 0.27.0.
- [Release notes](https://github.com/huggingface/diffusers/releases)
- [Commits](huggingface/diffusers@v0.26.3...v0.27.0)

---
updated-dependencies:
- dependency-name: diffusers
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>

@p-wysocki

`Skipping line in requirement file [openvino.genai/llm_bench/python/requirements.txt] because it's not clear what it would install: git+https://github.com/huggingface/optimum-intel.git@552de65a9c5f7fa1a2f0ce6859ebdeedaeaabe53 (add #egg=PackageName to the URL to avoid this warning)`

`Skipping line in requirement file [openvino.genai/llm_bench/python/requirements.txt] because it's not clear what it would install: git+https://github.com/openvinotoolkit/nncf.git (add #egg=PackageName to the URL to avoid this warning)`

Signed-off-by: Peter Chen <[email protected]>

I've verified support of Qwen1.5-7B by OpenVINO and then added it to the GitHub workflow and readme.md.

```
(base) root@8tvt:~/openvino.genai/llm_bench/python# ../../text_generation/causal_lm/cpp/build/greedy_causal_lm qwen/pytorch/dldt/FP32/ "Why is the Sun yellow?"
The Sun does not actually appear yellow to us when we look at it. In fact, it appears white because it emits light across a wide range of wavelengths, including all the colors of the visible spectrum. When this light reaches our eyes, our eyes combine the different colors to create the perception of white.
```

---------

Co-authored-by: Pavel Esir <[email protected]>

CVS-133717

1. If the -ic option is given, the output token size is the same as the infer count.
2. Without the -ic option, the output token size is determined by default according to the model.
3. Remove the default output limit of 512 tokens.
4. If the environment variable LOGLEVEL=DEBUG is set, the latency of all tokens is printed.

Examples (with LOGLEVEL=DEBUG set):
- [bloomz-560m-without-ic.txt](https://github.com/openvinotoolkit/openvino.genai/files/15245407/bloomz-560m-without-ic.txt)
- [bloomz-560m-ic-1024.txt](https://github.com/openvinotoolkit/openvino.genai/files/15245409/bloomz-560m-ic-1024.txt)
- [llama-2-7b-chat-without-ic-.txt](https://github.com/openvinotoolkit/openvino.genai/files/15245412/llama-2-7b-chat-without-ic-.txt)
- [llama-2-7b-chat-without-ic-.txt](https://github.com/openvinotoolkit/openvino.genai/files/15245415/llama-2-7b-chat-without-ic-.txt)

---------

Co-authored-by: Chen Peter <[email protected]>
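
The rules in the CVS-133717 message above could be sketched roughly as follows. This is an illustration under stated assumptions, not the llm_bench implementation: the function names, the parameters, and the output line format are all hypothetical, and only the `-ic` precedence, the absence of a 512-token cap, and the `LOGLEVEL=DEBUG` switch come from the message.

```python
import os
from typing import List, Mapping, Optional


def resolve_max_new_tokens(
    infer_count: Optional[int], model_default: Optional[int]
) -> Optional[int]:
    """Pick the output token size (hypothetical helper).

    If -ic was given (infer_count is not None), it decides the output size.
    Otherwise the model's own default is used. No fixed 512-token cap is
    applied in either case.
    """
    if infer_count is not None:
        return infer_count
    return model_default


def token_latency_lines(
    latencies_ms: List[float], env: Optional[Mapping[str, str]] = None
) -> List[str]:
    """Return one line per token when LOGLEVEL=DEBUG, otherwise nothing.

    The line format here is made up for the example.
    """
    env = os.environ if env is None else env
    if env.get("LOGLEVEL", "").upper() != "DEBUG":
        return []
    return [f"token {i}: {ms:.2f} ms" for i, ms in enumerate(latencies_ms)]


# -ic 1024 overrides the model default; without -ic the default is used.
print(resolve_max_new_tokens(1024, 20))
print(resolve_max_new_tokens(None, 20))
for line in token_latency_lines([1.5, 2.0], {"LOGLEVEL": "DEBUG"}):
    print(line)
```

Passing the environment as a mapping keeps the helper easy to test without changing the process environment.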

ilya-lavrenov and others added 30 commits March 15, 2024 18:39

Generalization for SequenceGroups

b166e33

Added CacheManager

c737849

Split sample into modules

b39769b

Added scheduler logic

a008424

migrate on new optimum-intel version (openvinotoolkit#283)

91f69de

- [x] rely on optimum configs for export
- [x] rely on optimum configs for weights compression
- openvinotoolkit#218

Added prototype of data aware compression. (openvinotoolkit#253)

4e3dc3b

Data-aware compression with updated configurations.

Refactor length penalty calculation

019a92c

Add comment

09692c8

Merge branch 'master' into as/align_beam_search_with_hf

3d10d5c

Merge pull request openvinotoolkit#307 from as-suvorov/as/align_beam_…

ecf6e42

…search_with_hf

Merge pull request openvinotoolkit#311 from openvinotoolkit/dependabo…

d322733

…t/pip/image_generation/stable_diffusion_1_5/cpp/scripts/diffusers-0.27.0

Bump diffusers from 0.26.3 to 0.27.0 in /image_generation/stable_diffusion_1_5/cpp/scripts

Merge pull request openvinotoolkit#310 from openvinotoolkit/dependabo…

aee8fbc

…t/pip/image_generation/lcm_dreamshaper_v7/cpp/scripts/diffusers-0.27.0

Bump diffusers from 0.26.3 to 0.27.0 in /image_generation/lcm_dreamshaper_v7/cpp/scripts

Verify Phi-1_5 (openvinotoolkit#303)

88d08dd

@p-wysocki @pavel-esir this fixes openvinotoolkit#302. Tested locally, added a GitHub Actions workflow, and updated the models list in README.md.

---------

Co-authored-by: Pavel Esir <[email protected]>

add submodule update into sd installation instructions

90690ce

Update image_generation/stable_diffusion_1_5/cpp/README.md

fffdd64

Merge pull request openvinotoolkit#314 from eaidova/ea/sd_submodule_m…

18382fe

…issed add submodule update into sd installation instructions

Fixed bug with help in arg_pars in convert.py. (openvinotoolkit#317)

3d1e82b

Add .clang-format configuration (openvinotoolkit#312)

dacd957

Greedy sampling works

1ef7b5c

refactored beam search

694bf64

Beam search produces some output, but it's almost identical

efc5dfc

All blocks are freed

63b6ec2

Beam search almost works

5781210

Beam search works as in group_beam_searcher.hpp

39072bd

Fixed crash when multiple sequences have finished at different time

6ad737c

yatarkan and others added 28 commits May 13, 2024 15:05

Fix type

4196b0b

Fixed model lists paths.

9887065

Improve trim tensor implementation (openvinotoolkit#423)

07193b6

Ticket: 140109

New shape approach

47f1098

Fix for non utf-8 text.

8870e0e

New shape approach

fb73eb5

Merge pull request #8 from ilya-lavrenov/new-kv-cache-shape-approach

d2aebd9

New shape approach

Merge remote-tracking branch 'upstream/master' into ct-beam-search

e34b294

Migrate to official optimum-intel (openvinotoolkit#439)

2bc9a7f

fix phi3 conversion (openvinotoolkit#440)

234ad87

Fixed __repr__, fixed m_generation_ids property.

c5f74db

Removed not needed imports.

a163972

Changed model lists, separated test files.

34308b0

Minor corrections.

81ff24d

Removed wrong models.

822b4c7

Merge pull request #6 from popovaan/model_tests

447e7fb

Model lists in Python tests

Tests on real models

e18f293

Merge remote-tracking branch 'upstream/master' into ct-beam-search

0632495

Fixed HF calls

3306018

Merge branch 'real-models' into ct-beam-search

b715e15

Upated real models

e7fd50f

Not suitable models removed.

c696c4c

Changed block_size according to latest CPU changes

2daf27b

Merge pull request #11 from ilya-lavrenov/change-block-size

a4bb9f0

Changed block_size according to latest CPU changes

Merge pull request #10 from popovaan/remove_not_suitable_models

6628e72

Removed not suitable models from models list

Changed default num blocks

4d3a4fd

Add test for SDPAToPagedAttention

63c80c0

Add a test to check if SDPAToPagedAttention transformation was performed

Signed-off-by: Andrii Staikov <[email protected]>

github-actions bot added the llm_bench label May 15, 2024

CuriousPanCake closed this May 15, 2024
