Update v1_user_guide.md
JenZhao committed Feb 27, 2025
1 parent eb6fd2b commit ff71ca6
Showing 1 changed file with 3 additions and 2 deletions.
5 changes: 3 additions & 2 deletions docs/source/getting_started/v1_user_guide.md
@@ -20,10 +20,9 @@ Previous blog post [vLLM V1: A Major Upgrade to vLLM's Core Architecture](https:
- logits_processors
- beam_search

-#### Deprecated KV Cache
+#### Deprecated KV Cache features
- KV Cache swapping
- KV Cache offloading
-- FP8 KV Cache

## Unsupported features

@@ -37,6 +36,8 @@ Previous blog post [vLLM V1: A Major Upgrade to vLLM's Core Architecture](https:
- **Quantization**: For V1, when the CUDA graph is enabled, it defaults to the
piecewise CUDA graph introduced in this [PR](https://github.com/vllm-project/vllm/pull/10058); consequently, FP8 and other quantizations are not supported.

+- **FP8 KV Cache**: FP8 KV Cache is not yet supported in V1.

## Unsupported models

All models with the `SupportsV0Only` tag in the model definition are not supported by V1.