
Stable diffusion pipeline #34

Closed
wants to merge 48 commits into from

Conversation

ilya-lavrenov
Owner

No description provided.

@ilya-lavrenov ilya-lavrenov force-pushed the stable-diffusion-generalization branch from c10a65f to 863e231 Compare September 2, 2024 14:12
@ilya-lavrenov ilya-lavrenov changed the title Stable diffusion generalization Stable diffusion pipeline Sep 2, 2024
eaidova and others added 28 commits September 3, 2024 08:49
Bumps [diffusers](https://github.com/huggingface/diffusers) from 0.30.1
to 0.30.2.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/huggingface/diffusers/releases">diffusers's
releases</a>.</em></p>
<blockquote>
<h2>v0.30.2: Update from single file default repository</h2>
<h2>All commits</h2>
<ul>
<li>update runway repo for single_file by <a
href="https://github.com/yiyixuxu"><code>@​yiyixuxu</code></a> in <a
href="https://redirect.github.com/huggingface/diffusers/issues/9323">#9323</a></li>
<li>Fix Flux CLIP prompt embeds repeat for num_images_per_prompt &gt; 1
by <a href="https://github.com/DN6"><code>@​DN6</code></a> in <a
href="https://redirect.github.com/huggingface/diffusers/issues/9280">#9280</a></li>
<li>[IP Adapter] Fix cache_dir and local_files_only for image encoder by
<a href="https://github.com/asomoza"><code>@​asomoza</code></a> in <a
href="https://redirect.github.com/huggingface/diffusers/issues/9272">#9272</a></li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="https://github.com/huggingface/diffusers/commit/f63c12633f154c2a1d79c17f4238fb073133652c"><code>f63c126</code></a>
Release: v0.30.2</li>
<li><a
href="https://github.com/huggingface/diffusers/commit/be5995a8156d9d9967ec34abb30dfa6e0342c33d"><code>be5995a</code></a>
update runway repo for single_file (<a
href="https://redirect.github.com/huggingface/diffusers/issues/9323">#9323</a>)</li>
<li><a
href="https://github.com/huggingface/diffusers/commit/065978474b2131dc578c21f132bd1a1d3407b894"><code>0659784</code></a>
Fix Flux CLIP prompt embeds repeat for num_images_per_prompt &gt; 1 (<a
href="https://redirect.github.com/huggingface/diffusers/issues/9280">#9280</a>)</li>
<li><a
href="https://github.com/huggingface/diffusers/commit/cc1e589537701c780befc8c141a07fc6c1d46914"><code>cc1e589</code></a>
[IP Adapter] Fix <code>cache_dir</code> and
<code>local_files_only</code> for image encoder (<a
href="https://redirect.github.com/huggingface/diffusers/issues/9272">#9272</a>)</li>
<li>See full diff in <a
href="https://github.com/huggingface/diffusers/compare/v0.30.1...v0.30.2">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=diffusers&package-manager=pip&previous-version=0.30.1&new-version=0.30.2)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
This is a port of openvinotoolkit#824 to the master branch

[RunwayML](https://huggingface.co/runwayml) is no longer maintaining a
HuggingFace organization, so the `runwayml/stable-diffusion-v1-5` model is
no longer available for download.
Replace it with a re-uploaded archive copy,
[`botp/stable-diffusion-v1-5`](https://huggingface.co/botp/stable-diffusion-v1-5).
Update the way metrics are collected in llm-bench

CVS-151502 
openvinotoolkit#830
Port openvinotoolkit#823 to
master

CVS-151497

---------

Co-authored-by: Artur Paniukov <[email protected]>
…vinotoolkit#793)

Once KV-cache tensors are exposed from the stateful model, they should
be reshaped to have a static size. The current implementation of the reshape
function assumes that the KV-cache dimension is always 2 and the batch
dimension is always 0. For ChatGLM and Qwen this is not the case.
This PR identifies the KV-cache and batch dimensions by reading the
model's config.json file
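The config-driven dimension lookup described above can be sketched in Python. The `get_kv_dims` helper and its layout table are illustrative assumptions for this sketch, not the repository's actual implementation:

```python
import json

def get_kv_dims(config_path):
    """Derive the KV-cache layout from a model's config.json instead of
    hardcoding batch dim = 0 and sequence (KV-cache) dim = 2.

    The model-type-to-layout mapping below is illustrative only."""
    with open(config_path) as f:
        config = json.load(f)
    model_type = config.get("model_type", "")
    # Most decoder models use [batch, num_heads, seq_len, head_size],
    # but e.g. ChatGLM places the sequence dimension first.
    layouts = {
        "chatglm": {"batch": 1, "seq_len": 0},
        "qwen":    {"batch": 0, "seq_len": 1},
    }
    return layouts.get(model_type, {"batch": 0, "seq_len": 2})
```

A reshape pass would then use the returned indices when fixing the KV-cache tensor shapes, instead of the previously hardcoded 0 and 2.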

---------

Co-authored-by: Zlobin Vladimir <[email protected]>
Co-authored-by: Ilya Lavrenov <[email protected]>
I found that `openvino_genai.PerfMetrics` is not available for NPU, so it
is not possible to accurately measure LLM performance on NPU. I added
part of the code for this.
Before:
<img width="857" alt="image"
src="https://github.com/user-attachments/assets/70b08d7d-0980-4876-a37e-c91433fa32df">
After:
<img width="854" alt="image"
src="https://github.com/user-attachments/assets/492eea0a-117b-4bc8-b742-2fcda56687df">
See warnings from the build tool:
https://github.com/openvinotoolkit/openvino.genai/actions/runs/10768130167
```
[OpenVINO genai extension (cmake + wheel)](https://github.com/openvinotoolkit/openvino.genai/actions/runs/10768130167/job/29856868111#step:11:76)
py_build_cmake.config.load:Name normalized from openvino_genai to openvino-genai
```
…t#843)

- Collect inference-only (IPOT, Inference Per Output Token) time
statistics and calculate total and per-token information
- Bug fixes
- Typo fixes
In `GenerationConfig`, `do_sample` is always false; a bool conversion needs to be added
yatarkan and others added 15 commits September 12, 2024 13:10
Co-authored-by: Alina Kladieva <[email protected]>
Co-authored-by: Zlobin Vladimir <[email protected]>
Co-authored-by: mzegla <[email protected]>
Co-authored-by: Pavel Esir <[email protected]>
Co-authored-by: Pavel Esir <[email protected]>
Co-authored-by: Artur Paniukov <[email protected]>
Co-authored-by: Ekaterina Aidova <[email protected]>
Co-authored-by: Ilya Lavrenov <[email protected]>
Co-authored-by: Mikhail Ryzhov <[email protected]>
Co-authored-by: Trawinski, Dariusz <[email protected]>
Co-authored-by: TolyaTalamanov <[email protected]>
Co-authored-by: Andrei Kochin <[email protected]>
- Add initial support for the GPU plugin
- Fix overflow in vLLM-like scheduling, caused by the `size_t` data type
overflowing
- Add prefix support for the accuracy sample
To avoid a manual update on release branch creation. Same as
openvinotoolkit/openvino#26564
…penvinotoolkit#868)

Bumps [optimum[openvino]](https://github.com/huggingface/optimum) from
1.21.4 to 1.22.0.
<details>
<summary>Commits</summary>
<ul>
<li><a
href="https://github.com/huggingface/optimum/commit/2112e99122d7f23a1da1a9d263fef64301050ea7"><code>2112e99</code></a>
Release: v1.22.0</li>
<li><a
href="https://github.com/huggingface/optimum/commit/e604af32fcd054cdeafcfb5553d02e92e0787fd3"><code>e604af3</code></a>
Add quanto install and instructions (<a
href="https://redirect.github.com/huggingface/optimum/issues/1976">#1976</a>)</li>
<li><a
href="https://github.com/huggingface/optimum/commit/2335ec258132881c8b56fcfb27ad2bd5d09367b6"><code>2335ec2</code></a>
update transformers imports for <code>deepspeed</code> and
<code>is_torch_xla_available</code> (<a
href="https://redirect.github.com/huggingface/optimum/issues/2012">#2012</a>)</li>
<li><a
href="https://github.com/huggingface/optimum/commit/29f23f1fa9dbcb148718ff852a60a495a87471ad"><code>29f23f1</code></a>
Apply deprecated <code>evaluation_strategy</code> (<a
href="https://redirect.github.com/huggingface/optimum/issues/1819">#1819</a>)</li>
<li><a
href="https://github.com/huggingface/optimum/commit/c0d9111775709aac6f1451e59911abe36b4b6c37"><code>c0d9111</code></a>
Fix typo in BetterTransformer's overview docs (<a
href="https://redirect.github.com/huggingface/optimum/issues/2015">#2015</a>)</li>
<li><a
href="https://github.com/huggingface/optimum/commit/1de4e2522ddf40ba449296877fe6de44a7650f0c"><code>1de4e25</code></a>
fix attribute name from <code>inputs_names</code> to
<code>input_names</code> (<a
href="https://redirect.github.com/huggingface/optimum/issues/2010">#2010</a>)</li>
<li><a
href="https://github.com/huggingface/optimum/commit/8cb6832a2797f54ec1221ff5014a81d961016b6b"><code>8cb6832</code></a>
Fix TFLite tests (<a
href="https://redirect.github.com/huggingface/optimum/issues/2007">#2007</a>)</li>
<li><a
href="https://github.com/huggingface/optimum/commit/bb46ebea547a2545c33c36f77067406f687187b8"><code>bb46ebe</code></a>
Modify token classification processor default dataset args (<a
href="https://redirect.github.com/huggingface/optimum/issues/2005">#2005</a>)</li>
<li><a
href="https://github.com/huggingface/optimum/commit/ad98dc944be4308f405ab34e78fa85b16c7d3709"><code>ad98dc9</code></a>
Modify Parallelization Strategy to Make it More General (<a
href="https://redirect.github.com/huggingface/optimum/issues/1988">#1988</a>)</li>
<li><a
href="https://github.com/huggingface/optimum/commit/7cc57e40f84e00f8ebc2849da303e40575fb23b4"><code>7cc57e4</code></a>
Transformers 4.44 support (<a
href="https://redirect.github.com/huggingface/optimum/issues/1996">#1996</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/huggingface/optimum/compare/v1.21.4...v1.22.0">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=optimum[openvino]&package-manager=pip&previous-version=1.21.4&new-version=1.22.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)


Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
When `add_requests()` is executed from multiple threads, the global
`m_counter` can be accessed simultaneously, which results in the same
sequence IDs being assigned to different sequences.
Added a mutex to prevent this situation.

Ticket: 148119
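The race and its fix can be sketched as follows; this is a Python analogue of the C++ mutex fix, and `SequenceCounter` is a hypothetical name, not the actual class:

```python
import threading

class SequenceCounter:
    """Thread-safe monotonically increasing ID generator.

    Without the lock, two threads could read the same counter value
    before either increments it, assigning identical sequence IDs."""

    def __init__(self):
        self._counter = 0
        self._mutex = threading.Lock()

    def next_id(self):
        # Read and increment atomically under the mutex.
        with self._mutex:
            sid = self._counter
            self._counter += 1
            return sid
```

With the lock held across the read-increment pair, every caller observes a distinct ID regardless of thread interleaving.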
Added support for importing blobs for `StaticLLMPipeline`
Add WWB label.

---------

Signed-off-by: Chen, Peter <[email protected]>
Extend `--prompt_file` to support multiple prompt files. You can specify
several files, separated by spaces:
python benchmark.py -mc 1 -ic 128 -m <model> -d CPU -n 3 
-pf ../../../repo-prompts/32_1024/qwen1.5-14b-chat.jsonl
../../../repo-prompts/2048/qwen1.5-14b-chat.jsonl
../../../repo-prompts/4096/qwen1.5-14b-chat.jsonl
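Assuming benchmark.py uses argparse, the space-separated file list shown above could be accepted via `nargs`; a minimal sketch (the real script has many more options than shown here):

```python
import argparse

def parse_args(argv):
    """Parse a hypothetical subset of benchmark.py's options, allowing
    -pf/--prompt_file to take one or more file paths."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "-pf", "--prompt_file", nargs="+", default=None,
        help="one or more prompt files, separated by spaces")
    return parser.parse_args(argv)
```

The benchmark would then iterate over `args.prompt_file` and run each prompt set in turn.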



[qwen-7b-chat_multi_prompt_files.txt](https://github.com/user-attachments/files/17000709/qwen-7b-chat_multi_prompt_files.txt)

[qwen-7b-chat_multi_prompt_files.csv](https://github.com/user-attachments/files/17000710/qwen-7b-chat_multi_prompt_files.csv)

---------

Co-authored-by: Chen Peter <[email protected]>
This is a work-in-progress PR. Todos:
- [x] use WhisperFeatureExtractor for audio preprocessing
- [x] compute `assets/whisper/mel_filters_data.bin` on initialization
- [x] move wav reader to sample utils
- [ ] Longer audio inputs (>30s) give poor-quality results at chunking
borders. Long audio inputs are split into 30s chunks, which leads to a
loss of context at a chunk border. This could be partially solved by
[chunking with stride](https://huggingface.co/blog/asr-chunking).
- [ ] add perf metrics
- [x] update docstrings
- [ ] update documentation
- [x] add python bindings
- [x] add tests
- [ ] add cpp, python samples tests
- [x] fix win build
- [x] fetch `dr_wav.h` with `FetchContent`
- [ ] support different languages, language autodetection
- [ ] support translation
- [ ] support timestamps
- [x] remove constructor with infer requests
- [x] rename pipeline to WhisperPipeline
- [ ] The Whisper pipeline doesn't need a tokenizer; it uses the detokenizer only.
Implement detokenizer-only initialization for `ov::genai::Tokenizer`
- [ ] Check discrete GPU. Integrated GPU works as expected.
- [ ] Investigate use of `RemoteTensor` for GPU
- [ ] Add batch
- [ ] Add sampler, inherit WhisperGenerationConfig from GenerationConfig

Current limitations:
- No resampling during preprocessing. Input raw speech should have a
16 kHz sampling rate
- No normalization during preprocessing. Input raw speech should be
normalized to approximately the [-1, 1] range

Tickets: CVS-147994, CVS-146010, CVS-152522
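Given the limitations above, callers are expected to resample and normalize audio themselves before invoking the pipeline. A hypothetical helper sketching that contract (not part of the GenAI API):

```python
def prepare_raw_speech(samples, sample_rate):
    """Enforce the pipeline's stated input contract: 16 kHz speech
    normalized to roughly [-1, 1]. Resampling itself is out of scope
    here, so a wrong rate is rejected rather than converted."""
    if sample_rate != 16000:
        raise ValueError("resample to 16 kHz before calling the pipeline")
    peak = max(abs(min(samples)), abs(max(samples)))
    if peak == 0:
        return list(samples)  # silence: nothing to scale
    return [s / peak for s in samples]
```

A real caller would resample with an audio library first, then pass the normalized samples to the pipeline.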
@ilya-lavrenov ilya-lavrenov force-pushed the stable-diffusion-generalization branch from 863e231 to 8efce74 Compare September 19, 2024 10:03
@ilya-lavrenov ilya-lavrenov force-pushed the stable-diffusion-generalization branch from 5ae8b5d to d890213 Compare September 19, 2024 13:24
@ilya-lavrenov ilya-lavrenov deleted the stable-diffusion-generalization branch December 6, 2024 11:17