Improve the query API of llm cache and use vector<uint8_t> as payload object. #1797

Merged
merged 1 commit into from
Mar 6, 2024

Conversation

dashanji

@dashanji dashanji commented Mar 4, 2024

What do these changes do?

  • Improve the query API: users pass only a token list and get back the kv_cache for the longest matching prefix.
  • Use vector<uint8_t> as the payload object.
  • Replace the KV_STATE_WITH_LAYER alias with std::map<int, std::pair<K_STATE, V_STATE>>.
  • Rename Dimension to TensorBytes.
  • Pass std::vector<T> by reference to avoid copying.
  • Print the rax tree to a string for debugging.
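
To make the longest-prefix query concrete, here is a minimal sketch. Every name in it (`ToyPrefixCache`, `Payload`, `LayerKV`, `Update`, `Query`) is hypothetical and not the llm-cache API, and it swaps the radix tree for a plain `std::map` that is probed with successively shorter prefixes. Only the shape of the data follows the description above: byte-vector payloads, a per-layer map of (K, V) pairs, and arguments passed by reference.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Hypothetical names throughout; this is not the vineyard llm-cache API.
using Payload = std::vector<std::uint8_t>;                    // raw tensor bytes
using LayerKV = std::map<int, std::pair<Payload, Payload>>;   // layer -> (K, V)

class ToyPrefixCache {
 public:
  // Store the KV state produced after consuming exactly `tokens`.
  void Update(const std::vector<int>& tokens, const LayerKV& kv) {
    entries_[tokens] = kv;
  }

  // Find the longest cached prefix of `tokens`. On a hit, copy its KV state
  // into `out` and return the prefix length. On a miss, return 0 and leave
  // `out` untouched.
  std::size_t Query(const std::vector<int>& tokens, LayerKV& out) const {
    std::vector<int> prefix(tokens);
    while (!prefix.empty()) {
      auto it = entries_.find(prefix);
      if (it != entries_.end()) {
        out = it->second;
        return prefix.size();
      }
      prefix.pop_back();  // try the next shorter prefix
    }
    return 0;
  }

 private:
  // A radix tree would share common prefixes; a map keeps the sketch short.
  std::map<std::vector<int>, LayerKV> entries_;
};
```

The caller supplies only the token list and learns from the return value how many leading tokens were already cached, so it can resume computation from that position.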

Related issue number

@dashanji dashanji force-pushed the improve-query-api branch from f013b4f to 1edb37c Compare March 4, 2024 10:16
@dashanji dashanji requested review from vegetableysm and sighingnow and removed request for vegetableysm March 4, 2024 10:18

@sighingnow sighingnow left a comment

Client &client could be a member of KVStateCacheBuilder (as well as KVStateCacheBlockBuilder). Don't give every API a Client &client argument.

Otherwise, you can imagine that a user might pass different client objects to the same KVStateCacheBuilder.
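
A minimal sketch of that suggestion, assuming stand-in types: `FakeClient` and `BuilderSketch` are hypothetical and do not mirror the real `Client` or `KVStateCacheBuilder` interfaces. The builder binds one client at construction, so no method takes a client argument and a caller cannot mix clients on the same builder.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a storage client; it only records what it is
// asked to store.
struct FakeClient {
  std::string endpoint;
  std::vector<std::string> log;
  void Put(const std::string& key) { log.push_back(endpoint + ":" + key); }
};

// Hypothetical builder that holds the client as a member. The referenced
// client must outlive the builder.
class BuilderSketch {
 public:
  explicit BuilderSketch(FakeClient& client) : client_(client) {}

  // No client parameter here: every call goes through the bound client.
  void Update(const std::string& key) { client_.Put(key); }
  void Seal() { client_.Put("seal"); }

 private:
  FakeClient& client_;
};
```

Holding a reference keeps the sketch short; it assumes the caller manages the client's lifetime.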

modules/basic/ds/dataframe.cc (resolved)

@sighingnow sighingnow left a comment

Rename K_STATE and V_STATE to LLMKV, or llm_kv_t.

modules/llm-cache/ds/kv_state_cache_block.h (outdated, resolved)
@@ -27,7 +27,7 @@ limitations under the License.

using namespace vineyard; // NOLINT(build/namespaces)

-#define DIMENSION 100
+#define TENSORBYTES 800
#define CAPACITY 1000
#define LAYER 64
#define BLOCK_SIZE 100

Use constexpr int or constexpr size_t for constants in C++.
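
As an illustration of that suggestion, the macros in the hunk could be written as typed constants along these lines. The values are copied from the diff; the `k`-prefixed names, the choice of `std::size_t` versus `int`, and the `TensorBuffer` alias are assumptions made for this sketch, not what the merged code uses.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Values copied from the macros in the diff; names and types are illustrative.
constexpr std::size_t kTensorBytes = 800;
constexpr std::size_t kCapacity = 1000;
constexpr int kLayer = 64;
constexpr std::size_t kBlockSize = 100;

// Unlike a macro, a constexpr constant has a type and obeys scope, yet it
// still works wherever a compile-time value is required, such as an array
// bound or a static_assert.
using TensorBuffer = std::array<std::uint8_t, kTensorBytes>;
static_assert(TensorBuffer{}.size() == kTensorBytes,
              "buffer holds exactly one tensor payload");
```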

@dashanji dashanji force-pushed the improve-query-api branch 2 times, most recently from 114fe52 to a359b88 Compare March 6, 2024 06:30
modules/llm-cache/ds/kv_state_cache.cc (resolved)
modules/llm-cache/ds/kv_state_cache_block.cc (resolved)
modules/llm-cache/radix-tree/radix-tree.cc (outdated, resolved)
@dashanji dashanji force-pushed the improve-query-api branch 4 times, most recently from 663c7a5 to ec2eab6 Compare March 6, 2024 08:03
…ad object.

* Replace the alias of KV_STATE_WITH_LAYER with std::map<int, std::pair<K_STATE, V_STATE>>.
* Use the references of std::vector<T> to avoid copying.
* Print the rax tree to a string for debugging.
* Rename the Dimension with TensorBytes.

Signed-off-by: Ye Cao <[email protected]>
@dashanji dashanji force-pushed the improve-query-api branch from ec2eab6 to 45acb82 Compare March 6, 2024 09:06
@sighingnow sighingnow merged commit 589817f into v6d-io:main Mar 6, 2024
6 checks passed
@sighingnow sighingnow deleted the improve-query-api branch March 6, 2024 09:28