
Stable diffusion pipeline #34

Closed
wants to merge 48 commits into from

Conversation

ilya-lavrenov
Owner

No description provided.

@ilya-lavrenov ilya-lavrenov force-pushed the stable-diffusion-generalization branch from c10a65f to 863e231 Compare September 2, 2024 14:12
@ilya-lavrenov ilya-lavrenov changed the title Stable diffusion generalization Stable diffusion pipeline Sep 2, 2024
eaidova and others added 28 commits September 3, 2024 08:49
Bumps [diffusers](https://github.com/huggingface/diffusers) from 0.30.1
to 0.30.2.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/huggingface/diffusers/releases">diffusers's
releases</a>.</em></p>
<blockquote>
<h2>v0.30.2: Update from single file default repository</h2>
<h2>All commits</h2>
<ul>
<li>update runway repo for single_file by <a
href="https://github.com/yiyixuxu"><code>@​yiyixuxu</code></a> in <a
href="https://redirect.github.com/huggingface/diffusers/issues/9323">#9323</a></li>
<li>Fix Flux CLIP prompt embeds repeat for num_images_per_prompt &gt; 1
by <a href="https://github.com/DN6"><code>@​DN6</code></a> in <a
href="https://redirect.github.com/huggingface/diffusers/issues/9280">#9280</a></li>
<li>[IP Adapter] Fix cache_dir and local_files_only for image encoder by
<a href="https://github.com/asomoza"><code>@​asomoza</code></a> in <a
href="https://redirect.github.com/huggingface/diffusers/issues/9272">#9272</a></li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="https://github.com/huggingface/diffusers/commit/f63c12633f154c2a1d79c17f4238fb073133652c"><code>f63c126</code></a>
Release: v0.30.2</li>
<li><a
href="https://github.com/huggingface/diffusers/commit/be5995a8156d9d9967ec34abb30dfa6e0342c33d"><code>be5995a</code></a>
update runway repo for single_file (<a
href="https://redirect.github.com/huggingface/diffusers/issues/9323">#9323</a>)</li>
<li><a
href="https://github.com/huggingface/diffusers/commit/065978474b2131dc578c21f132bd1a1d3407b894"><code>0659784</code></a>
Fix Flux CLIP prompt embeds repeat for num_images_per_prompt &gt; 1 (<a
href="https://redirect.github.com/huggingface/diffusers/issues/9280">#9280</a>)</li>
<li><a
href="https://github.com/huggingface/diffusers/commit/cc1e589537701c780befc8c141a07fc6c1d46914"><code>cc1e589</code></a>
[IP Adapter] Fix <code>cache_dir</code> and
<code>local_files_only</code> for image encoder (<a
href="https://redirect.github.com/huggingface/diffusers/issues/9272">#9272</a>)</li>
<li>See full diff in <a
href="https://github.com/huggingface/diffusers/compare/v0.30.1...v0.30.2">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=diffusers&package-manager=pip&previous-version=0.30.1&new-version=0.30.2)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
This is a port of openvinotoolkit#824 to the master branch

[RunwayML](https://huggingface.co/runwayml) is no longer maintaining a
HuggingFace organization, so the `runwayml/stable-diffusion-v1-5` model is
no longer available for download.
Replace it with a re-uploaded archive copy,
[`botp/stable-diffusion-v1-5`](https://huggingface.co/botp/stable-diffusion-v1-5).
Update the way metrics are collected in llm-bench

CVS-151502 
openvinotoolkit#830
Port openvinotoolkit#823 to
master

CVS-151497

---------

Co-authored-by: Artur Paniukov <[email protected]>
…vinotoolkit#793)

Once KV-cache tensors are exposed from the stateful model, they should
be reshaped to have a static size. The current implementation of the reshape
function assumes that the KV-cache dimension is always 2 and the batch
dimension is always 0. For ChatGLM and Qwen this is not the case.
This PR identifies the KV-cache and batch dimensions by reading the
model's config.json file
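The config-driven dimension lookup described above can be sketched in Python. The `get_kv_dims` helper and its layout table are illustrative assumptions for this sketch, not the repository's actual implementation:

```python
import json

def get_kv_dims(config_path):
    """Derive the KV-cache layout from a model's config.json instead of
    hardcoding batch dim = 0 and sequence (KV-cache) dim = 2.

    The model-type-to-layout mapping below is illustrative only."""
    with open(config_path) as f:
        config = json.load(f)
    model_type = config.get("model_type", "")
    # Most decoder models use [batch, num_heads, seq_len, head_size],
    # but e.g. ChatGLM places the sequence dimension first.
    layouts = {
        "chatglm": {"batch": 1, "seq_len": 0},
        "qwen":    {"batch": 0, "seq_len": 1},
    }
    return layouts.get(model_type, {"batch": 0, "seq_len": 2})
```

A reshape pass would then use the returned indices when fixing the KV-cache tensor shapes, instead of the previously hardcoded 0 and 2.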

---------

Co-authored-by: Zlobin Vladimir <[email protected]>
Co-authored-by: Ilya Lavrenov <[email protected]>
I found that `openvino_genai.PerfMetrics` is not available for NPU, so it
is not possible to accurately measure LLM performance on NPU. I added
part of the code for this.
Before:
<img width="857" alt="image"
src="https://github.com/user-attachments/assets/70b08d7d-0980-4876-a37e-c91433fa32df">
After:
<img width="854" alt="image"
src="https://github.com/user-attachments/assets/492eea0a-117b-4bc8-b742-2fcda56687df">
See warnings from the build tool:
https://github.com/openvinotoolkit/openvino.genai/actions/runs/10768130167
```
[OpenVINO genai extension (cmake + wheel)](https://github.com/openvinotoolkit/openvino.genai/actions/runs/10768130167/job/29856868111#step:11:76)
py_build_cmake.config.load:Name normalized from openvino_genai to openvino-genai
```
…t#843)

- Collect inference-only (IPOT, Inference Per Output Token) time
statistics and calculate total and per-token information
- Bug fixes
- Typo fixes
In `GenerationConfig`, `do_sample` is always false; a bool conversion needs to be added
yatarkan and others added 15 commits September 12, 2024 13:10
Co-authored-by: Alina Kladieva <[email protected]>
Co-authored-by: Zlobin Vladimir <[email protected]>
Co-authored-by: mzegla <[email protected]>
Co-authored-by: Pavel Esir <[email protected]>
Co-authored-by: Pavel Esir <[email protected]>
Co-authored-by: Artur Paniukov <[email protected]>
Co-authored-by: Ekaterina Aidova <[email protected]>
Co-authored-by: Ilya Lavrenov <[email protected]>
Co-authored-by: Mikhail Ryzhov <[email protected]>
Co-authored-by: Trawinski, Dariusz <[email protected]>
Co-authored-by: TolyaTalamanov <[email protected]>
Co-authored-by: Andrei Kochin <[email protected]>
- Add initial support for the GPU plugin
- Fix overflow in vLLM-like scheduling, caused by the `size_t` data type
overflowing
- Add prefix support for the accuracy sample
To avoid a manual update on release branch creation. Same as
openvinotoolkit/openvino#26564
…penvinotoolkit#868)

Bumps [optimum[openvino]](https://github.com/huggingface/optimum) from
1.21.4 to 1.22.0.
<details>
<summary>Commits</summary>
<ul>
<li><a
href="https://github.com/huggingface/optimum/commit/2112e99122d7f23a1da1a9d263fef64301050ea7"><code>2112e99</code></a>
Release: v1.22.0</li>
<li><a
href="https://github.com/huggingface/optimum/commit/e604af32fcd054cdeafcfb5553d02e92e0787fd3"><code>e604af3</code></a>
Add quanto install and instructions (<a
href="https://redirect.github.com/huggingface/optimum/issues/1976">#1976</a>)</li>
<li><a
href="https://github.com/huggingface/optimum/commit/2335ec258132881c8b56fcfb27ad2bd5d09367b6"><code>2335ec2</code></a>
update transformers imports for <code>deepspeed</code> and
<code>is_torch_xla_available</code> (<a
href="https://redirect.github.com/huggingface/optimum/issues/2012">#2012</a>)</li>
<li><a
href="https://github.com/huggingface/optimum/commit/29f23f1fa9dbcb148718ff852a60a495a87471ad"><code>29f23f1</code></a>
Apply deprecated <code>evaluation_strategy</code> (<a
href="https://redirect.github.com/huggingface/optimum/issues/1819">#1819</a>)</li>
<li><a
href="https://github.com/huggingface/optimum/commit/c0d9111775709aac6f1451e59911abe36b4b6c37"><code>c0d9111</code></a>
Fix typo in BetterTransformer's overview docs (<a
href="https://redirect.github.com/huggingface/optimum/issues/2015">#2015</a>)</li>
<li><a
href="https://github.com/huggingface/optimum/commit/1de4e2522ddf40ba449296877fe6de44a7650f0c"><code>1de4e25</code></a>
fix attribute name from <code>inputs_names</code> to
<code>input_names</code> (<a
href="https://redirect.github.com/huggingface/optimum/issues/2010">#2010</a>)</li>
<li><a
href="https://github.com/huggingface/optimum/commit/8cb6832a2797f54ec1221ff5014a81d961016b6b"><code>8cb6832</code></a>
Fix TFLite tests (<a
href="https://redirect.github.com/huggingface/optimum/issues/2007">#2007</a>)</li>
<li><a
href="https://github.com/huggingface/optimum/commit/bb46ebea547a2545c33c36f77067406f687187b8"><code>bb46ebe</code></a>
Modify token classification processor default dataset args (<a
href="https://redirect.github.com/huggingface/optimum/issues/2005">#2005</a>)</li>
<li><a
href="https://github.com/huggingface/optimum/commit/ad98dc944be4308f405ab34e78fa85b16c7d3709"><code>ad98dc9</code></a>
Modify Parallelization Strategy to Make it More General (<a
href="https://redirect.github.com/huggingface/optimum/issues/1988">#1988</a>)</li>
<li><a
href="https://github.com/huggingface/optimum/commit/7cc57e40f84e00f8ebc2849da303e40575fb23b4"><code>7cc57e4</code></a>
Transformers 4.44 support (<a
href="https://redirect.github.com/huggingface/optimum/issues/1996">#1996</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/huggingface/optimum/compare/v1.21.4...v1.22.0">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=optimum[openvino]&package-manager=pip&previous-version=1.21.4&new-version=1.22.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)


Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
When `add_requests()` is executed from multiple threads, the global
`m_counter` can be accessed simultaneously, which results in the same
sequence IDs being assigned to different sequences.
Added a mutex to prevent this situation.

Ticket: 148119
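The race and its fix can be sketched as follows; this is a Python analogue of the C++ mutex fix, and `SequenceCounter` is a hypothetical name, not the actual class:

```python
import threading

class SequenceCounter:
    """Thread-safe monotonically increasing ID generator.

    Without the lock, two threads could read the same counter value
    before either increments it, assigning identical sequence IDs."""

    def __init__(self):
        self._counter = 0
        self._mutex = threading.Lock()

    def next_id(self):
        # Read and increment atomically under the mutex.
        with self._mutex:
            sid = self._counter
            self._counter += 1
            return sid
```

With the lock held across the read-increment pair, every caller observes a distinct ID regardless of thread interleaving.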
Added support for importing blobs for `StaticLLMPipeline`
Add WWB label.

---------

Signed-off-by: Chen, Peter <[email protected]>
Extend `--prompt_file` to support multiple prompt files. You can specify
several files, separated by spaces:
python benchmark.py -mc 1 -ic 128 -m <model> -d CPU -n 3 
-pf ../../../repo-prompts/32_1024/qwen1.5-14b-chat.jsonl
../../../repo-prompts/2048/qwen1.5-14b-chat.jsonl
../../../repo-prompts/4096/qwen1.5-14b-chat.jsonl
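Assuming benchmark.py uses argparse, the space-separated file list shown above could be accepted via `nargs`; a minimal sketch (the real script has many more options than shown here):

```python
import argparse

def parse_args(argv):
    """Parse a hypothetical subset of benchmark.py's options, allowing
    -pf/--prompt_file to take one or more file paths."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "-pf", "--prompt_file", nargs="+", default=None,
        help="one or more prompt files, separated by spaces")
    return parser.parse_args(argv)
```

The benchmark would then iterate over `args.prompt_file` and run each prompt set in turn.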



[qwen-7b-chat_multi_prompt_files.txt](https://github.com/user-attachments/files/17000709/qwen-7b-chat_multi_prompt_files.txt)

[qwen-7b-chat_multi_prompt_files.csv](https://github.com/user-attachments/files/17000710/qwen-7b-chat_multi_prompt_files.csv)

---------

Co-authored-by: Chen Peter <[email protected]>
This is a work-in-progress PR. Todos:
- [x] use WhisperFeatureExtractor for audio preprocessing
- [x] compute `assets/whisper/mel_filters_data.bin` on initialization
- [x] move wav reader to sample utils
- [ ] Longer audio inputs (>30s) give poor-quality results at chunking
borders. Long audio inputs are split into 30s chunks, which leads to a
loss of context at a chunk border. This could be partially solved by
[chunking with stride](https://huggingface.co/blog/asr-chunking).
- [ ] add perf metrics
- [x] update docstrings
- [ ] update documentation
- [x] add python bindings
- [x] add tests
- [ ] add cpp, python samples tests
- [x] fix win build
- [x] fetch `dr_wav.h` with `FetchContent`
- [ ] support different languages, language autodetection
- [ ] support translation
- [ ] support timestamps
- [x] remove constructor with infer requests
- [x] rename pipeline to WhisperPipeline
- [ ] The Whisper pipeline doesn't need a tokenizer; it uses the detokenizer only.
Implement detokenizer-only initialization for `ov::genai::Tokenizer`
- [ ] Check discrete GPU. Integrated GPU works as expected.
- [ ] Investigate use of `RemoteTensor` for GPU
- [ ] Add batch
- [ ] Add sampler, inherit WhisperGenerationConfig from GenerationConfig

Current limitations:
- No resampling during preprocessing. Input raw speech should have a
16 kHz sampling rate
- No normalization during preprocessing. Input raw speech should be
normalized to approximately the [-1, 1] range

Tickets: CVS-147994, CVS-146010, CVS-152522
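Given the limitations above, callers are expected to resample and normalize audio themselves before invoking the pipeline. A hypothetical helper sketching that contract (not part of the GenAI API):

```python
def prepare_raw_speech(samples, sample_rate):
    """Enforce the pipeline's stated input contract: 16 kHz speech
    normalized to roughly [-1, 1]. Resampling itself is out of scope
    here, so a wrong rate is rejected rather than converted."""
    if sample_rate != 16000:
        raise ValueError("resample to 16 kHz before calling the pipeline")
    peak = max(abs(min(samples)), abs(max(samples)))
    if peak == 0:
        return list(samples)  # silence: nothing to scale
    return [s / peak for s in samples]
```

A real caller would resample with an audio library first, then pass the normalized samples to the pipeline.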
@ilya-lavrenov ilya-lavrenov force-pushed the stable-diffusion-generalization branch from 863e231 to 8efce74 Compare September 19, 2024 10:03
@ilya-lavrenov ilya-lavrenov force-pushed the stable-diffusion-generalization branch from 5ae8b5d to d890213 Compare September 19, 2024 13:24
@ilya-lavrenov ilya-lavrenov deleted the stable-diffusion-generalization branch December 6, 2024 11:17