Merge branch 'releases/2024/6' into DOCS-updating-links-to-GenAI-24.6
kblaszczak-intel authored Jan 14, 2025
2 parents 83524e8 + a05a0aa commit daa580c
Showing 13 changed files with 50 additions and 45 deletions.
1 change: 0 additions & 1 deletion docs/articles_en/about-openvino/performance-benchmarks.rst
@@ -56,7 +56,6 @@ implemented in your solutions. Click the buttons below to see the chosen benchmark

:material-regular:`table_view;1.4em` LLM performance for AI PC

-.. uncomment under veryfication
.. grid-item::

.. button-link:: #
@@ -44,7 +44,7 @@ You select one of the methods by setting the ``--group-size`` parameter to either
.. code-block:: console
:name: group-quant
-optimum-cli export openvino -m TinyLlama/TinyLlama-1.1B-Chat-v1.0 --weight-format int4 --sym --ratio 1.0 --group_size 128 TinyLlama-1.1B-Chat-v1.0
+optimum-cli export openvino -m TinyLlama/TinyLlama-1.1B-Chat-v1.0 --weight-format int4 --sym --ratio 1.0 --group-size 128 TinyLlama-1.1B-Chat-v1.0
.. tab-item:: Channel-wise quantization

@@ -63,12 +63,12 @@ You select one of the methods by setting the ``--group-size`` parameter to either
If you want to improve accuracy, make sure you:

1. Update NNCF: ``pip install nncf==2.13``
-2. Use ``--scale_estimation --dataset=<dataset_name>`` and accuracy aware quantization ``--awq``:
+2. Use ``--scale_estimation --dataset <dataset_name>`` and accuracy aware quantization ``--awq``:

.. code-block:: console
:name: channel-wise-data-aware-quant
-optimum-cli export openvino -m meta-llama/Llama-2-7b-chat-hf --weight-format int4 --sym --group-size -1 --ratio 1.0 --awq --scale-estimation --dataset=wikitext2 Llama-2-7b-chat-hf
+optimum-cli export openvino -m meta-llama/Llama-2-7b-chat-hf --weight-format int4 --sym --group-size -1 --ratio 1.0 --awq --scale-estimation --dataset wikitext2 Llama-2-7b-chat-hf
.. important::
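The same int4 schemes can also be applied through NNCF's Python API — a minimal sketch, assuming an exported OpenVINO IR and NNCF 2.13 (paths are illustrative; ``group_size=-1`` would select the channel-wise variant):

.. code-block:: python

   import nncf
   import openvino as ov

   core = ov.Core()
   # Illustrative path to an exported OpenVINO IR model
   model = core.read_model("TinyLlama-1.1B-Chat-v1.0/openvino_model.xml")

   # Symmetric int4 group quantization, group size 128, all eligible layers (ratio=1.0);
   # group_size=-1 switches to channel-wise quantization
   compressed = nncf.compress_weights(
       model,
       mode=nncf.CompressWeightsMode.INT4_SYM,
       group_size=128,
       ratio=1.0,
   )
   ov.save_model(compressed, "TinyLlama-int4/openvino_model.xml")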
27 changes: 10 additions & 17 deletions docs/articles_en/learn-openvino/llm_inference_guide/genai-guide.rst
@@ -123,7 +123,7 @@ make sure to :doc:`install OpenVINO with GenAI <../../get-started/install-openvi
For more information, refer to the
-`Python sample <https://github.com/openvinotoolkit/openvino.genai/tree/master/samples/python/image_generation>`__
+`Python sample <https://github.com/openvinotoolkit/openvino.genai/tree/releases/2024/6/samples/python/image_generation>`__

.. tab-item:: C++
:sync: cpp
@@ -225,7 +225,7 @@ make sure to :doc:`install OpenVINO with GenAI <../../get-started/install-openvi
}
For more information, refer to the
-`C++ sample <https://github.com/openvinotoolkit/openvino.genai/tree/master/samples/cpp/image_generation/>`__
+`C++ sample <https://github.com/openvinotoolkit/openvino.genai/tree/releases/2024/6/samples/cpp/image_generation/>`__


.. dropdown:: Speech Recognition
@@ -271,7 +271,7 @@ make sure to :doc:`install OpenVINO with GenAI <../../get-started/install-openvi
For more information, refer to the
-`Python sample <https://github.com/openvinotoolkit/openvino.genai/tree/master/samples/python/whisper_speech_recognition/>`__.
+`Python sample <https://github.com/openvinotoolkit/openvino.genai/tree/releases/2024/6/samples/python/whisper_speech_recognition/>`__.

.. tab-item:: C++
:sync: cpp
@@ -323,7 +323,7 @@ make sure to :doc:`install OpenVINO with GenAI <../../get-started/install-openvi
}
For more information, refer to the
-`C++ sample <https://github.com/openvinotoolkit/openvino.genai/tree/master/samples/cpp/whisper_speech_recognition/>`__.
+`C++ sample <https://github.com/openvinotoolkit/openvino.genai/tree/releases/2024/6/samples/cpp/whisper_speech_recognition/>`__.


.. dropdown:: Using GenAI in Chat Scenario
@@ -367,7 +367,7 @@ make sure to :doc:`install OpenVINO with GenAI <../../get-started/install-openvi
For more information, refer to the
-`Python sample <https://github.com/openvinotoolkit/openvino.genai/tree/master/samples/python/text_generation/chat_sample/>`__.
+`Python sample <https://github.com/openvinotoolkit/openvino.genai/tree/releases/2024/6/samples/python/chat_sample/>`__.

.. tab-item:: C++
:sync: cpp
@@ -415,8 +415,7 @@ make sure to :doc:`install OpenVINO with GenAI <../../get-started/install-openvi
For more information, refer to the
-`C++ sample <https://github.com/openvinotoolkit/openvino.genai/tree/master/samples/cpp/text_generation/chat_sample/>`__
-
+`C++ sample <https://github.com/openvinotoolkit/openvino.genai/tree/releases/2024/6/samples/cpp/chat_sample/>`__
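In Python, the chat scenario reduces to roughly the following sketch (the model directory is a placeholder; ``start_chat``/``finish_chat`` preserve and reset the conversation state between ``generate`` calls):

.. code-block:: python

   import openvino_genai as ov_genai

   # "chat_model_dir" is a placeholder for a directory with a converted model
   pipe = ov_genai.LLMPipeline("chat_model_dir", "CPU")

   pipe.start_chat()  # keep the KV-cache and history across turns
   while True:
       prompt = input("> ")
       if prompt == "exit":
           break
       print(pipe.generate(prompt, max_new_tokens=100))
   pipe.finish_chat()  # reset the conversation state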

.. dropdown:: Using GenAI with Vision Language Models

@@ -483,7 +482,7 @@ make sure to :doc:`install OpenVINO with GenAI <../../get-started/install-openvi
For more information, refer to the
-`Python sample <https://github.com/openvinotoolkit/openvino.genai/tree/master/samples/python/visual_language_chat>`__.
+`Python sample <https://github.com/openvinotoolkit/openvino.genai/tree/releases/2024/6/samples/python/visual_language_chat>`__.

.. tab-item:: C++
:sync: cpp
@@ -549,7 +548,7 @@ make sure to :doc:`install OpenVINO with GenAI <../../get-started/install-openvi
For more information, refer to the
-`C++ sample <https://github.com/openvinotoolkit/openvino.genai/tree/master/samples/cpp/visual_language_chat/>`__
+`C++ sample <https://github.com/openvinotoolkit/openvino.genai/tree/releases/2024/6/samples/cpp/visual_language_chat/>`__
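A condensed Python sketch of the visual-language flow (the model directory, image file, and image-tensor layout are assumptions based on the sample):

.. code-block:: python

   import numpy as np
   import openvino as ov
   import openvino_genai as ov_genai
   from PIL import Image

   # Placeholder directory with a converted vision-language model
   pipe = ov_genai.VLMPipeline("vlm_model_dir", "CPU")

   # The sample passes the image as a uint8 ov.Tensor of shape [1, H, W, 3]
   pic = Image.open("cat.png").convert("RGB")
   image = ov.Tensor(np.array(pic, dtype=np.uint8)[None])

   print(pipe.generate("Describe this image.", image=image, max_new_tokens=100))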


|
@@ -803,8 +802,7 @@ runs prediction of the next K tokens, thus repeating the cycle.
For more information, refer to the
-`Python sample <https://github.com/openvinotoolkit/openvino.genai/tree/master/samples/python/text_generation/speculative_decoding_lm/>`__.
-
+`Python sample <https://github.com/openvinotoolkit/openvino.genai/tree/releases/2024/6/samples/python/speculative_decoding_lm/>`__.

.. tab-item:: C++
:sync: cpp
@@ -859,12 +857,7 @@ runs prediction of the next K tokens, thus repeating the cycle.
For more information, refer to the
-`C++ sample <https://github.com/openvinotoolkit/openvino.genai/tree/master/samples/cpp/text_generation/speculative_decoding_lm/>`__
-
-
-
-
+`C++ sample <https://github.com/openvinotoolkit/openvino.genai/tree/releases/2024/6/samples/cpp/speculative_decoding_lm/>`__
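In Python, wiring the draft model into the pipeline looks roughly like this (a sketch; paths are placeholders, and both models must share a tokenizer):

.. code-block:: python

   import openvino_genai as ov_genai

   # Placeholder paths: a small draft model and the larger main model
   draft = ov_genai.draft_model("TinyLlama-1.1B-ov", "CPU")
   pipe = ov_genai.LLMPipeline("Llama-2-7b-ov", "CPU", draft_model=draft)

   config = ov_genai.GenerationConfig()
   config.max_new_tokens = 100
   config.num_assistant_tokens = 5  # K: tokens the draft model proposes per cycle

   print(pipe.generate("What is OpenVINO?", config))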



@@ -98,6 +98,17 @@ Learn more in Loading an LLM with OpenVINO.
optimum-cli export openvino --convert-tokenizer --model TinyLlama/TinyLlama-1.1B-Chat-v1.0 openvino_model
.. note::

+The current Optimum version can convert both the model and tokenizers. To do so, use the
+standard call:
+
+.. code-block:: py
+optimum-cli export openvino --model TinyLlama/TinyLlama-1.1B-Chat-v1.0 openvino_model
Full OpenVINO Text Generation Pipeline
######################################################################

@@ -110,6 +121,7 @@ Use the model and tokenizer converted from the previous step:
import numpy as np
from openvino import compile_model
import openvino_tokenizers
# Compile the tokenizer, model, and detokenizer using OpenVINO. These files are XML representations of the models optimized for OpenVINO
compiled_tokenizer = compile_model("openvino_tokenizer.xml")
@@ -154,7 +166,7 @@ and appends it to the existing sequence.
# Generate new tokens iteratively
for idx in range(prompt_size, prompt_size + new_tokens_size):
# Get output from the model
output = compiled_model(input_dict)["token_ids"]
output = compiled_model(input_dict)[0]
# Update the input_ids with newly generated token
input_dict["input_ids"][:, idx] = output[:, idx - 1]
# Update the attention mask to include the new token
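To map the generated ids back to text, the detokenizer converted alongside the model can be run the same way — a sketch, using the same by-index output access as above:

.. code-block:: python

   # Decode the full sequence (prompt + generated tokens) back to text;
   # the output is taken by index, matching compiled_model(input_dict)[0] above
   compiled_detokenizer = compile_model("openvino_detokenizer.xml")
   text = compiled_detokenizer(input_dict["input_ids"])[0]
   print(text[0])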
@@ -336,7 +336,7 @@ Additional Resources

* `OpenVINO Tokenizers repo <https://github.com/openvinotoolkit/openvino_tokenizers>`__
* `OpenVINO Tokenizers Notebook <https://github.com/openvinotoolkit/openvino_notebooks/tree/latest/notebooks/openvino-tokenizers>`__
-* `Text generation C++ samples that support most popular models like LLaMA 3 <https://github.com/openvinotoolkit/openvino.genai/tree/master/samples/cpp/greedy_causal_lm>`__
+* `Text generation C++ samples that support most popular models like LLaMA 3 <https://github.com/openvinotoolkit/openvino.genai/tree/releases/2024/6/samples/cpp/greedy_causal_lm>`__
* `OpenVINO GenAI Repo <https://github.com/openvinotoolkit/openvino.genai>`__


@@ -354,7 +354,7 @@ To find the optimal weight compression parameters for a particular model, refer
`example <https://github.com/openvinotoolkit/nncf/tree/develop/examples/llm_compression/openvino/tiny_llama_find_hyperparams>`__ ,
where weight compression parameters are being searched from the subset of values.
To speed up the search, a self-designed validation pipeline called
-`WhoWhatBench <https://github.com/openvinotoolkit/openvino.genai/tree/master/llm_bench/python/who_what_benchmark>`__
+`WhoWhatBench <https://github.com/openvinotoolkit/openvino.genai/tree/releases/2024/6/tools/who_what_benchmark>`__
is used. The pipeline can quickly evaluate the changes in the accuracy of the optimized
model compared to the baseline.

@@ -491,7 +491,7 @@ Additional Resources
- `OpenVINO GenAI Repo <https://github.com/openvinotoolkit/openvino.genai>`__
: Repository containing example pipelines that implement image and text generation
tasks. It also provides a tool to benchmark LLMs.
-- `WhoWhatBench <https://github.com/openvinotoolkit/openvino.genai/tree/master/llm_bench/python/who_what_benchmark>`__
+- `WhoWhatBench <https://github.com/openvinotoolkit/openvino.genai/tree/releases/2024/6/tools/who_what_benchmark>`__
- `NNCF GitHub <https://github.com/openvinotoolkit/nncf>`__
- :doc:`Post-training Quantization <quantizing-models-post-training>`
- :doc:`Training-time Optimization <compressing-models-during-training>`
@@ -12,7 +12,7 @@ Such a tensor is called a string tensor and can be passed as input or retrieved

While this section describes basic API to handle string tensors, more practical examples that leverage both
string tensors and OpenVINO tokenizer can be found in
-`GenAI Samples <https://github.com/openvinotoolkit/openvino.genai/tree/master/samples/cpp/greedy_causal_lm>`__.
+`GenAI Samples <https://github.com/openvinotoolkit/openvino.genai/tree/releases/2024/6/samples/cpp/greedy_causal_lm>`__.


Representation
@@ -203,4 +203,4 @@ Additional Resources

* Use `OpenVINO tokenizers <https://github.com/openvinotoolkit/openvino_contrib/tree/releases/2024/0/modules/custom_operations/user_ie_extensions/tokenizer/python>`__ to produce models that use string tensors to work with textual information as pre- and post-processing for the large language models.

-* Check out `GenAI Samples <https://github.com/openvinotoolkit/openvino.genai/tree/master/samples/cpp/greedy_causal_lm>`__ to see how string tensors are used in real-life applications.
+* Check out `GenAI Samples <https://github.com/openvinotoolkit/openvino.genai/tree/releases/2024/6/samples/cpp/greedy_causal_lm>`__ to see how string tensors are used in real-life applications.
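As a quick illustration of the API described in this section, a string tensor can be built directly from Python — a sketch, assuming a NumPy unicode array maps to the OpenVINO string element type on recent releases:

.. code-block:: python

   import numpy as np
   import openvino as ov

   # Create a string tensor from an array of unicode strings
   tensor = ov.Tensor(np.array(["Hello, OpenVINO!", "String tensors hold text."]))
   print(tensor.element_type)  # expected to report the string element type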
2 changes: 1 addition & 1 deletion docs/notebooks/multilora-image-generation-with-output.rst
@@ -210,7 +210,7 @@ generative models as it already includes all the core functionality.

``openvino_genai.Text2ImagePipeline`` class supports inference of
`Diffusers
-models <https://github.com/openvinotoolkit/openvino.genai/blob/master/src/docs/SUPPORTED_MODELS.md#text-2-image-models>`__.
+models <https://github.com/openvinotoolkit/openvino.genai/blob/releases/2024/6/src/docs/SUPPORTED_MODELS.md#text-2-image-models>`__.
For pipeline initialization, we should provide directory with converted
by Optimum Intel pipeline and specify inference device. Optionally, we
can provide configuration for LoRA Adapters using ``adapter_config``.
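In code, initialization and a basic generation call look roughly like this (a sketch; the model directory is a placeholder, and the ``.data[0]`` conversion assumes the pipeline returns a [1, H, W, 3] uint8 tensor, as in the notebook):

.. code-block:: python

   import openvino_genai as ov_genai
   from PIL import Image

   # Placeholder directory with a Diffusers model converted by Optimum Intel
   pipe = ov_genai.Text2ImagePipeline("stable_diffusion_ov", "CPU")

   image_tensor = pipe.generate(
       "a photo of a red sports car, high detail",
       width=512,
       height=512,
       num_inference_steps=20,
   )
   Image.fromarray(image_tensor.data[0]).save("car.png")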
2 changes: 1 addition & 1 deletion docs/notebooks/openvino-tokenizers-with-output.rst
@@ -548,6 +548,6 @@ Links
Types <https://github.com/openvinotoolkit/openvino_tokenizers?tab=readme-ov-file#supported-tokenizer-types>`__
- `OpenVINO.GenAI repository with the C++ example of OpenVINO
Tokenizers
-usage <https://github.com/openvinotoolkit/openvino.genai/tree/master/samples/cpp/greedy_causal_lm>`__
+usage <https://github.com/openvinotoolkit/openvino.genai/tree/releases/2024/6/samples/cpp/greedy_causal_lm>`__
- `HuggingFace Tokenizers Comparison
Table <https://github.com/openvinotoolkit/openvino_tokenizers?tab=readme-ov-file#output-match-by-model>`__
4 changes: 2 additions & 2 deletions docs/notebooks/whisper-asr-genai-with-output.rst
@@ -30,7 +30,7 @@ converts the models to OpenVINO™ IR format. To simplify the user
experience, we will use `OpenVINO Generate
API <https://github.com/openvinotoolkit/openvino.genai>`__ for `Whisper
automatic speech recognition
-scenarios <https://github.com/openvinotoolkit/openvino.genai/blob/master/samples/python/whisper_speech_recognition/README.md>`__.
+scenarios <https://github.com/openvinotoolkit/openvino.genai/tree/releases/2024/6/samples/python/whisper_speech_recognition/README.md>`__.

Installation Instructions
~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -406,7 +406,7 @@ Run inference OpenVINO model with WhisperPipeline


To simplify user experience we will use `OpenVINO Generate
-API <https://github.com/openvinotoolkit/openvino.genai/blob/master/samples/python/whisper_speech_recognition/README.md>`__.
+API <https://github.com/openvinotoolkit/openvino.genai/tree/releases/2024/6/samples/python/whisper_speech_recognition/README.md>`__.
Firstly we will create pipeline with ``WhisperPipeline``. You can
construct it straight away from the folder with the converted model. It
will automatically load the ``model``, ``tokenizer``, ``detokenizer``
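A minimal sketch of that flow (the model directory and audio file are placeholders; ``generate`` expects 16 kHz mono float samples):

.. code-block:: python

   import librosa
   import openvino_genai as ov_genai

   # Placeholder directory with a converted Whisper model
   pipe = ov_genai.WhisperPipeline("whisper-base-ov", "CPU")

   # Whisper models expect 16 kHz mono audio
   raw_speech, _ = librosa.load("sample.wav", sr=16000)
   print(pipe.generate(raw_speech.tolist()))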
4 changes: 2 additions & 2 deletions docs/notebooks/whisper-subtitles-generation-with-output.rst
@@ -21,7 +21,7 @@ GitHub `repository <https://github.com/openai/whisper>`__.
In this notebook, we will use Whisper model with `OpenVINO Generate
API <https://github.com/openvinotoolkit/openvino.genai>`__ for `Whisper
automatic speech recognition
-scenarios <https://github.com/openvinotoolkit/openvino.genai/blob/master/samples/python/whisper_speech_recognition/README.md>`__
+scenarios <https://github.com/openvinotoolkit/openvino.genai/tree/releases/2024/6/samples/python/whisper_speech_recognition/README.md>`__
to generate subtitles in a sample video. Additionally, we will use
`NNCF <https://github.com/openvinotoolkit/nncf>`__ improving model
performance by INT8 quantization. Notebook contains the following steps:
@@ -228,7 +228,7 @@ Whisper model.
whisper_pipeline.png

To simplify user experience we will use `OpenVINO Generate
-API <https://github.com/openvinotoolkit/openvino.genai/blob/master/samples/python/whisper_speech_recognition/README.md>`__.
+API <https://github.com/openvinotoolkit/openvino.genai/tree/releases/2024/6/samples/python/whisper_speech_recognition/README.md>`__.
Firstly we will create pipeline with ``WhisperPipeline``. You can
construct it straight away from the folder with the converted model. It
will automatically load the ``model``, ``tokenizer``, ``detokenizer``
@@ -126,35 +126,38 @@ def process_coveo_meta(meta, url, link):
        if tag_name == 'ovdoctype':
            ET.SubElement(namespace_element, tag_name).text = process_link(link)
        elif tag_name == 'ovcategory' and loc_element is not None:
-            ET.SubElement(namespace_element, tag_name).text = extract_hierarchy(loc_element.text)
+            ET.SubElement(namespace_element, tag_name).text = extract_categories(loc_element.text)
        elif tag_name == 'ovversion':
            ET.SubElement(namespace_element, tag_name).text = tag_value

def process_link(link):
    if '/' in link:
-        return link.split('/')[0].replace("-", " ")
-    return link.split('.html')[0].replace("-", " ")
+        return format_segment(link.split('/')[0].replace("-", " "))
+    return format_segment(link.split('.html')[0].replace("-", " "))

-def extract_hierarchy(link):
+def extract_categories(link):
    path = link.split("://")[-1]
    segments = path.split('/')[1:]
    if segments and segments[-1].endswith('.html'):
        segments = segments[:-1]

    if segments:
        segments = segments[1:]

    if segments and '.' in segments[0]:
        year, *rest = segments[0].split('.')
        if year.isdigit() and len(year) == 4:
            segments[0] = year

    segments = [format_segment(segment) for segment in segments]

-    hierarchy = []
-    for i in range(1, len(segments) + 1):
-        hierarchy.append('|'.join(segments[:i]))
-
-    return ';'.join(hierarchy)
+    if segments:
+        hierarchy = ['|'.join(segments[:i]) for i in range(1, len(segments) + 1)]
+        return ';'.join(hierarchy)
+    return "No category"

def format_segment(segment):
-    if segment == 'c_cpp_api': segment = 'c/c++_api'
+    if segment == 'c_cpp_api': segment = 'C/C++_api'

    return ' '.join(word.capitalize() for word in segment.replace('-', ' ').replace('_', ' ').split())
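As a hand-traced sanity check of the new ``extract_categories`` and ``format_segment`` logic (hypothetical inputs, not part of the commit):

.. code-block:: python

   extract_categories("https://docs.openvino.ai/2024/openvino-workflow/model-optimization.html")
   # -> 'Openvino Workflow'

   extract_categories("https://docs.openvino.ai/2024/openvino-workflow/model-optimization/weight-compression.html")
   # -> 'Openvino Workflow;Openvino Workflow|Model Optimization'

   extract_categories("https://docs.openvino.ai/2024/index.html")
   # -> 'No category'  (nothing left after dropping the version segment)

   format_segment("c_cpp_api")
   # -> 'C/c++ Api'  (note: str.capitalize() lowercases the rest of each word)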
2 changes: 0 additions & 2 deletions docs/sphinx_setup/conf.py
@@ -187,13 +187,11 @@
html_static_path = ['_static']

html_css_files = [
-'css/custom.css',
'css/openvino_sphinx_theme.css',
'css/button.css',
'css/input.css',
'css/textfield.css',
'css/tabs.css',
-'css/coveo_custom.css',
'https://cdn.jsdelivr.net/npm/@splidejs/[email protected]/dist/css/splide.min.css',
]

