forked from openvinotoolkit/openvino.genai
Commit
Generate pipeline (openvinotoolkit#334)
An LLM returns logits that encode the probability of each token. These can be converted to tokens/words with different techniques: greedy decoding, beam search decoding, random sampling, etc. This requires writing user-unfriendly post-processing even for the simplest scenario of greedy decoding. To make life easier, we combined all decoding scenarios into a single function call, where the decoding method and parameters are specified by arguments.

In this PR we provide a user-friendly API for text generation inspired by the `generate` method from the HuggingFace transformers library.

- [x] enable calling tokenizers/detokenizers from LLMPipeline
- [ ] add callback for streaming mode - done partially, need to improve
- [x] rewritten samples with the current approach: [causal_lm/cpp/generate_pipeline/generate_sample.cpp#L73-L83](https://github.com/pavel-esir/openvino.genai/blob/generate_pipeline/text_generation/causal_lm/cpp/generate_pipeline/generate_sample.cpp#L73-L83)
- [x] Multibatch greedy decoding
- [ ] Speculative decoding
- [ ] Grouped Beam Search decoding: ready for batch 1, need to rebase multibatch support after merging openvinotoolkit#349
- [x] Random sampling

Example 1: Greedy search generation
```
LLMPipeline pipe(model_path, device);

// Will try to load the config from generation_config.json,
// but if it is not found, default values for greedy search will be used.
GenerationConfig config = pipe.generation_config();

cout << pipe(prompt, config.max_new_tokens(20));
```

Example 2: TextStreaming mode
```
LLMPipeline pipe(model_path, device);

GenerationConfig config = pipe.generation_config();

auto text_streamer = TextStreamer{pipe};
auto text_streamer_callback = [&text_streamer](std::vector<int64_t>&& tokens, LLMPipeline& pipe){
    text_streamer.put(tokens[0]);
};

pipe(prompt, config.max_new_tokens(20).set_callback(text_streamer_callback));
text_streamer.end();
```

CVS-132907 CVS-137920

---------

Co-authored-by: Wovchena <[email protected]>
Co-authored-by: Ilya Lavrenov <[email protected]>
Co-authored-by: Alexander Suvorov <[email protected]>
Co-authored-by: Yaroslav Tarkan <[email protected]>
Co-authored-by: Xiake Sun <[email protected]>
Co-authored-by: wenyi5608 <[email protected]>
Co-authored-by: Ekaterina Aidova <[email protected]>
Co-authored-by: guozhong wang <[email protected]>
Co-authored-by: Chen Peter <[email protected]>
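For readers unfamiliar with the post-processing the commit message refers to, the following is a minimal sketch of two of the named decoding methods, greedy decoding and random sampling, applied to one vector of logits. It is not the code from this PR: the function names `greedy_pick`, `softmax` and `sample_pick` and the `temperature` parameter are hypothetical and chosen only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical helper: greedy decoding picks the token id with the highest logit.
int64_t greedy_pick(const std::vector<float>& logits) {
    assert(!logits.empty());
    return static_cast<int64_t>(
        std::max_element(logits.begin(), logits.end()) - logits.begin());
}

// Hypothetical helper: softmax with temperature. Subtracting the maximum
// logit before exponentiating keeps std::exp from overflowing.
std::vector<double> softmax(const std::vector<float>& logits, double temperature = 1.0) {
    assert(!logits.empty() && temperature > 0.0);
    const float max_logit = *std::max_element(logits.begin(), logits.end());
    std::vector<double> probs(logits.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < logits.size(); ++i) {
        probs[i] = std::exp((logits[i] - max_logit) / temperature);
        sum += probs[i];
    }
    for (double& p : probs) {
        p /= sum;  // normalise so the probabilities add up to 1
    }
    return probs;
}

// Hypothetical helper: random (multinomial) sampling draws a token id
// in proportion to its softmax probability.
int64_t sample_pick(const std::vector<float>& logits, std::mt19937& rng,
                    double temperature = 1.0) {
    const std::vector<double> probs = softmax(logits, temperature);
    std::discrete_distribution<int> dist(probs.begin(), probs.end());
    return static_cast<int64_t>(dist(rng));
}
```

In a generation loop, one of these functions would be called on the logits of the last position at every step, and the chosen id appended to the sequence before the next model call. The single-call API described in the commit message is meant to hide exactly this kind of loop behind the generation config.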
1 parent
561cde0
commit 9902928
Showing 76 changed files with 5,054 additions and 711 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,98 @@ | ||
name: genai_package | ||
on: pull_request | ||
concurrency: | ||
group: ${{ github.workflow }}-${{ github.head_ref || github.ref_name }} | ||
cancel-in-progress: true | ||
jobs: | ||
ubuntu_genai_package: | ||
strategy: | ||
matrix: | ||
build-type: [Release, Debug] | ||
runs-on: ubuntu-20.04 | ||
env: | ||
CMAKE_BUILD_PARALLEL_LEVEL: null | ||
steps: | ||
- uses: actions/checkout@v4 | ||
with: | ||
submodules: recursive | ||
- uses: actions/setup-python@v4 | ||
with: | ||
python-version: 3.8 | ||
- run: mkdir ./ov/ | ||
- run: curl https://storage.openvinotoolkit.org/repositories/openvino/packages/pre-release/2024.2.0rc1/linux/l_openvino_toolkit_ubuntu20_2024.2.0.dev20240524_x86_64.tgz | tar --directory ./ov/ --strip-components 1 -xz | ||
- run: sudo ./ov/install_dependencies/install_openvino_dependencies.sh | ||
- run: source ./ov/setupvars.sh && cmake -DCMAKE_BUILD_TYPE=${{ matrix.build-type }} -S ./ -B ./build/ | ||
- run: source ./ov/setupvars.sh && cmake --build ./build/ --config ${{ matrix.build-type }} --target package -j | ||
- run: source ./ov/setupvars.sh && cmake --install ./build/ --config ${{ matrix.build-type }} --prefix ov | ||
- run: ov/samples/cpp/build_samples.sh -i ${{ github.workspace }}/s\ pace | ||
if: ${{ 'Release' == matrix.build-type }} # build_samples enforces Release build | ||
- run: source ./ov/setupvars.sh && cmake -DCMAKE_BUILD_TYPE=${{ matrix.build-type }} -S ./ov/samples/cpp/ -B ./samples\ build/ && cmake --build ./samples\ build/ --config ${{ matrix.build-type }} -j && cmake --install ./samples\ build/ --config ${{ matrix.build-type }} --component samples_bin --prefix s\ pace | ||
if: ${{ 'Release' != matrix.build-type }} | ||
- run: source ./ov/setupvars.sh && python -m pip install ./thirdparty/openvino_tokenizers/[transformers] --extra-index-url https://storage.openvinotoolkit.org/simple/wheels/pre-release | ||
- run: source ./ov/setupvars.sh && python -m pip install --upgrade-strategy eager -r ./samples/cpp/requirements.txt | ||
- run: source ./ov/setupvars.sh && optimum-cli export openvino --trust-remote-code --weight-format fp16 --model TinyLlama/TinyLlama-1.1B-Chat-v1.0 TinyLlama-1.1B-Chat-v1.0 | ||
- run: source ./ov/setupvars.sh && timeout 50s ${{ github.workspace }}/s\ pace/samples_bin/greedy_causal_lm ./TinyLlama-1.1B-Chat-v1.0/ "" | ||
|
||
macos_genai_package: | ||
strategy: | ||
matrix: | ||
build-type: [Release, Debug] | ||
runs-on: macos-12 | ||
steps: | ||
- uses: actions/checkout@v4 | ||
with: | ||
submodules: recursive | ||
- uses: actions/setup-python@v4 | ||
with: | ||
python-version: 3.8 | ||
- run: mkdir ./ov/ | ||
- run: curl https://storage.openvinotoolkit.org/repositories/openvino/packages/pre-release/2024.2.0rc2/macos/m_openvino_toolkit_macos_12_6_2024.2.0.dev20240529_x86_64.tgz | tar --directory ./ov/ --strip-components 1 -xz | ||
- run: brew install coreutils scons | ||
- run: source ./ov/setupvars.sh && cmake -DCMAKE_BUILD_TYPE=${{ matrix.build-type }} -S ./ -B ./build/ | ||
- run: source ./ov/setupvars.sh && cmake --build ./build/ --config ${{ matrix.build-type }} --target package -j | ||
- run: source ./ov/setupvars.sh && cmake --install ./build/ --config ${{ matrix.build-type }} --prefix ov | ||
- run: ov/samples/cpp/build_samples.sh -i ${{ github.workspace }}/s\ pace | ||
if: ${{ 'Release' == matrix.build-type }} # build_samples enforces Release build | ||
- run: source ./ov/setupvars.sh && python -m pip install ./thirdparty/openvino_tokenizers/[transformers] --extra-index-url https://storage.openvinotoolkit.org/simple/wheels/pre-release | ||
if: ${{ 'Release' == matrix.build-type }} | ||
- run: source ./ov/setupvars.sh && python -m pip install --upgrade-strategy eager -r ./samples/cpp/requirements.txt | ||
if: ${{ 'Release' == matrix.build-type }} | ||
- run: source ./ov/setupvars.sh && optimum-cli export openvino --trust-remote-code --weight-format fp16 --model TinyLlama/TinyLlama-1.1B-Chat-v1.0 TinyLlama-1.1B-Chat-v1.0 | ||
if: ${{ 'Release' == matrix.build-type }} | ||
- run: source ./ov/setupvars.sh && timeout 50s ${{ github.workspace }}/s\ pace/samples_bin/greedy_causal_lm ./TinyLlama-1.1B-Chat-v1.0/ "" | ||
if: ${{ 'Release' == matrix.build-type }} | ||
|
||
windows_genai_package: | ||
strategy: | ||
matrix: | ||
build-type: [Release, Debug] | ||
runs-on: windows-latest | ||
env: | ||
CMAKE_BUILD_PARALLEL_LEVEL: null | ||
defaults: | ||
run: | ||
shell: cmd | ||
steps: | ||
- uses: actions/checkout@v4 | ||
with: | ||
submodules: recursive | ||
- uses: actions/setup-python@v4 | ||
with: | ||
python-version: 3.8 | ||
- run: curl --output ov.zip https://storage.openvinotoolkit.org/repositories/openvino/packages/pre-release/2024.2.0rc1/windows/w_openvino_toolkit_windows_2024.2.0.dev20240524_x86_64.zip | ||
- run: unzip ov.zip | ||
# Shorten the next setupvars calls. | ||
- run: mklink /D ov w_openvino_toolkit_windows_2024.2.0.dev20240524_x86_64 | ||
- run: call ov\setupvars.bat && cmake -DCMAKE_BUILD_TYPE=${{ matrix.build-type }} -S ./ -B ./build/ | ||
- run: call ov\setupvars.bat && cmake --build ./build/ --config ${{ matrix.build-type }} --target package -j | ||
- run: call ov\setupvars.bat && cmake --install ./build/ --config ${{ matrix.build-type }} --prefix ov | ||
- run: call ov\samples\cpp\build_samples_msvc.bat -i "${{ github.workspace }}/samples_install" | ||
if: ${{ false && 'Release' == matrix.build-type }} # build_samples enforces Release build | ||
- run: call ov\setupvars.bat && python -m pip install ./thirdparty/openvino_tokenizers/[transformers] --extra-index-url https://storage.openvinotoolkit.org/simple/wheels/pre-release | ||
if: ${{ false && 'Release' == matrix.build-type }} | ||
- run: call ov\setupvars.bat && python -m pip install --upgrade-strategy eager -r ./samples/cpp/requirements.txt | ||
if: ${{ false && 'Release' == matrix.build-type }} | ||
- run: call ov\setupvars.bat && optimum-cli export openvino --trust-remote-code --weight-format fp16 --model TinyLlama/TinyLlama-1.1B-Chat-v1.0 TinyLlama-1.1B-Chat-v1.0 | ||
if: ${{ false && 'Release' == matrix.build-type }} | ||
- run: call ov\setupvars.bat && "${{ github.workspace }}/samples_install/samples_bin/greedy_causal_lm" .\TinyLlama-1.1B-Chat-v1.0\ "" | ||
if: ${{ false && 'Release' == matrix.build-type }} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,85 @@ | ||
name: genai_python_lib | ||
on: pull_request | ||
concurrency: | ||
group: ${{ github.workflow }}-${{ github.head_ref || github.ref_name }} | ||
cancel-in-progress: true | ||
jobs: | ||
ubuntu_genai_python_lib: | ||
# A tokenizers' dependency fails to compile on ubuntu-20 in CentOS7 env. | ||
runs-on: ubuntu-22.04 | ||
env: | ||
# A tokenizers' dependency fails to compile with Ninja in CentOS7 env. | ||
CMAKE_GENERATOR: Unix Makefiles | ||
CMAKE_BUILD_PARALLEL_LEVEL: null | ||
steps: | ||
- uses: actions/checkout@v4 | ||
with: | ||
submodules: recursive | ||
- uses: actions/setup-python@v4 | ||
with: | ||
python-version: 3.8 | ||
- run: mkdir ./ov/ | ||
# Install CentOS7 instead of Ubuntu to match PyPI distribution ABI. | ||
- run: curl https://storage.openvinotoolkit.org/repositories/openvino/packages/pre-release/2024.2.0rc1/linux/l_openvino_toolkit_centos7_2024.2.0.dev20240524_x86_64.tgz | tar --directory ./ov/ --strip-components 1 -xz | ||
- run: sudo ./ov/install_dependencies/install_openvino_dependencies.sh | ||
- run: source ./ov/setupvars.sh && cmake -DCMAKE_BUILD_TYPE=Release -S ./ -B ./build/ | ||
- run: source ./ov/setupvars.sh && cmake --build ./build/ --config Release -j | ||
# GitHub Actions already provides what is listed in ./requirements-build.txt but the internal | ||
# build system doesn't. Install ./requirements-build.txt to detect possible conflicts. | ||
- run: source ./ov/setupvars.sh && python -m pip install ./thirdparty/openvino_tokenizers/[transformers] -r ./requirements-build.txt -r ./tests/python_tests/requirements.txt --extra-index-url https://storage.openvinotoolkit.org/simple/wheels/pre-release --upgrade-strategy eager | ||
- run: source ./ov/setupvars.sh && PYTHONPATH=./build/:$PYTHONPATH python -m pytest ./tests/python_tests/test_generate_api.py -m precommit | ||
- run: source ./ov/setupvars.sh && python -m pip install . --config-settings=build-dir="build" --verbose | ||
- run: python -m pytest ./tests/python_tests/test_generate_api.py -m precommit | ||
|
||
macos_genai_python_lib: | ||
runs-on: macos-12 | ||
env: | ||
# A tokenizers' dependency fails to compile with Ninja. | ||
CMAKE_GENERATOR: Unix Makefiles | ||
CMAKE_BUILD_PARALLEL_LEVEL: null | ||
steps: | ||
- uses: actions/checkout@v4 | ||
with: | ||
submodules: recursive | ||
- uses: actions/setup-python@v4 | ||
with: | ||
python-version: 3.8 | ||
- run: mkdir ./ov/ | ||
- run: curl https://storage.openvinotoolkit.org/repositories/openvino/packages/pre-release/2024.2.0rc2/macos/m_openvino_toolkit_macos_12_6_2024.2.0.dev20240529_x86_64.tgz | tar --directory ./ov/ --strip-components 1 -xz | ||
- run: brew install coreutils scons | ||
- run: source ./ov/setupvars.sh && cmake -DCMAKE_BUILD_TYPE=Release -S ./ -B ./build/ | ||
- run: source ./ov/setupvars.sh && cmake --build ./build/ --config Release -j | ||
# GitHub Actions already provides what is listed in ./requirements-build.txt but the internal | ||
# build system doesn't. Install ./requirements-build.txt to detect possible conflicts. | ||
- run: source ./ov/setupvars.sh && python -m pip install ./thirdparty/openvino_tokenizers/[transformers] -r ./requirements-build.txt -r ./tests/python_tests/requirements.txt --extra-index-url https://storage.openvinotoolkit.org/simple/wheels/pre-release --upgrade-strategy eager | ||
- run: source ./ov/setupvars.sh && PYTHONPATH=./build/:$PYTHONPATH python -m pytest ./tests/python_tests/test_generate_api.py -m precommit | ||
- run: source ./ov/setupvars.sh && python -m pip install . --config-settings=build-dir="build" --verbose | ||
- run: python -c "from openvino_genai import LLMPipeline" | ||
- run: python -m pytest ./tests/python_tests/test_generate_api.py -m precommit | ||
|
||
windows_genai_python_lib: | ||
if: false | ||
runs-on: windows-latest | ||
env: | ||
CMAKE_BUILD_PARALLEL_LEVEL: null | ||
defaults: | ||
run: | ||
shell: cmd | ||
steps: | ||
- uses: actions/checkout@v4 | ||
with: | ||
submodules: recursive | ||
- uses: actions/setup-python@v4 | ||
with: | ||
python-version: 3.8 | ||
- run: curl --output ov.zip https://storage.openvinotoolkit.org/repositories/openvino/packages/pre-release/2024.2.0rc1/windows/w_openvino_toolkit_windows_2024.2.0.dev20240524_x86_64.zip | ||
- run: unzip ov.zip | ||
# Shorten the next setupvars calls. | ||
- run: mklink /D ov w_openvino_toolkit_windows_2024.2.0.dev20240524_x86_64 | ||
- run: call ./ov/setupvars.bat && cmake -DCMAKE_BUILD_TYPE=Release -S ./ -B ./build/ | ||
- run: call ./ov/setupvars.bat && cmake --build ./build/ --config Release -j | ||
- run: call ./ov/setupvars.bat && python -m pip install ./thirdparty/openvino_tokenizers/[transformers] -r ./requirements-build.txt -r ./tests/python_tests/requirements.txt --extra-index-url https://storage.openvinotoolkit.org/simple/wheels/pre-release --upgrade-strategy eager | ||
# cmd evaluates variables in a different way. Setting PYTHONPATH before setupvars.bat instead of doing that after solves that. | ||
- run: set "PYTHONPATH=./build/" && call ./ov/setupvars.bat && python -m pytest ./tests/python_tests/test_generate_api.py -m precommit | ||
- run: call ./ov/setupvars.bat && python -m pip install . --config-settings=build-dir="build" --verbose | ||
- run: python -m pytest ./tests/python_tests/test_generate_api.py -m precommit |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -38,4 +38,4 @@ CMakeUserPresets.json | |
# Python-specific | ||
*.?env* | ||
*.pyc | ||
__pycache__ | ||
__pycache__ |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,44 @@ | ||
# Copyright (C) 2018-2024 Intel Corporation | ||
# SPDX-License-Identifier: Apache-2.0 | ||
# | ||
|
||
cmake_minimum_required(VERSION 3.23.0) # The requirement comes from Jinja2Cpp | ||
|
||
# Multi config generators such as Visual Studio ignore CMAKE_BUILD_TYPE. Multi config generators are configured with | ||
# CMAKE_CONFIGURATION_TYPES, but limiting options in it completely removes such build options | ||
get_property(GENERATOR_IS_MULTI_CONFIG_VAR GLOBAL PROPERTY GENERATOR_IS_MULTI_CONFIG) | ||
if(CMAKE_GENERATOR STREQUAL "Ninja Multi-Config") | ||
# 'Ninja Multi-Config' specific, see: | ||
# https://cmake.org/cmake/help/latest/variable/CMAKE_DEFAULT_BUILD_TYPE.html | ||
set(CMAKE_DEFAULT_BUILD_TYPE "Release" CACHE STRING "CMake default build type") | ||
elseif(NOT GENERATOR_IS_MULTI_CONFIG_VAR AND NOT DEFINED CMAKE_BUILD_TYPE) | ||
message(STATUS "CMAKE_BUILD_TYPE is not defined, 'Release' will be used") | ||
# Setting CMAKE_BUILD_TYPE as CACHE must go before project(). Otherwise project() sets its value and set() doesn't take effect | ||
set(CMAKE_BUILD_TYPE Release CACHE STRING "Choose the type of build, options are: None Debug Release RelWithDebInfo MinSizeRel ...") | ||
endif() | ||
|
||
project(OpenVINOGenAI VERSION 2024.2.0.0) | ||
|
||
add_subdirectory(./thirdparty/) | ||
add_subdirectory(src) | ||
add_subdirectory(samples/cpp/beam_search_causal_lm/) | ||
add_subdirectory(samples/cpp/chat_sample/) | ||
add_subdirectory(samples/cpp/greedy_causal_lm/) | ||
add_subdirectory(samples/cpp/multinomial_causal_lm/) | ||
add_subdirectory(samples/cpp/prompt_lookup_decoding_lm/) | ||
add_subdirectory(samples/cpp/speculative_decoding_lm/) | ||
|
||
install(DIRECTORY | ||
./samples/cpp/beam_search_causal_lm | ||
./samples/cpp/chat_sample | ||
./samples/cpp/greedy_causal_lm | ||
./samples/cpp/multinomial_causal_lm | ||
# Don't install prompt_lookup_decoding_lm and speculative_decoding_lm because they don't use openvino_genai library and aren't verified yet. | ||
DESTINATION samples/cpp/ COMPONENT cpp_samples_genai) | ||
install(FILES ./samples/cpp/requirements.txt DESTINATION samples/cpp/ COMPONENT cpp_samples_genai) | ||
install(FILES LICENSE DESTINATION licensing COMPONENT licensing_genai RENAME LICENSE-GENAI) | ||
install(FILES third-party-programs.txt DESTINATION licensing COMPONENT licensing_genai RENAME third-party-programs-genai.txt) | ||
if(MSVC AND NOT DEFINED CPACK_GENERATOR) | ||
set(CPACK_GENERATOR "ZIP") | ||
endif() | ||
include(CPack) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,4 @@ | ||
--extra-index-url https://download.pytorch.org/whl/cpu | ||
torch==2.2.2+cpu | ||
diffusers==0.27.2 | ||
optimum-intel[openvino] @ git+https://github.com/huggingface/optimum-intel.git@fb1b35bef23242d65b2fb057c4a7ac78a7cfd4c3 | ||
optimum-intel[openvino]==1.17.0 |