Commit a93da12

chore: upgrade to new vLLM schema
Signed-off-by: paperspace <[email protected]>
1 parent 2e7592c commit a93da12

90 files changed (+34, -1391 lines)

.editorconfig

-3
@@ -7,9 +7,6 @@ charset = utf-8
 indent_style = space
 indent_size = 2
 
-[openllm-client/src/openllm_client/pb/v1/*]
-indent_size = unset
-
 [/node_modules/*]
 indent_size = unset
 indent_style = unset

.gitattributes

-1
@@ -3,7 +3,6 @@
 *_pb2*.pyi linguist-generated=true
 
 # Python sdk
-openllm-python/tests/models/__snapshots__/* linguist-generated=true
 openllm-python/README.md linguist-generated=true
 openllm-python/CHANGELOG.md linguist-generated=true
 openllm-core/src/openllm_core/config/__init__.py linguist-generated=true

.github/workflows/binary-releases.yml

-2
@@ -7,7 +7,6 @@ on:
 branches: [main]
 paths-ignore:
 - '*.md'
-- 'docs/**'
 - 'changelog.d/**'
 - 'assets/**'
 - 'openllm-node/**'
@@ -16,7 +15,6 @@ on:
 branches: [main]
 paths-ignore:
 - '*.md'
-- 'docs/**'
 - 'changelog.d/**'
 - 'assets/**'
 - 'openllm-node/**'

.github/workflows/build-pypi.yml

-6
@@ -18,18 +18,12 @@ on:
 push:
 branches: [main]
 paths-ignore:
-- 'docs/**'
-- 'bazel/**'
-- 'typings/**'
 - '*.md'
 - 'changelog.d/**'
 - 'assets/**'
 pull_request:
 branches: [main]
 paths-ignore:
-- 'docs/**'
-- 'bazel/**'
-- 'typings/**'
 - '*.md'
 - 'changelog.d/**'
 - 'assets/**'

README.md

+16 -8
@@ -760,11 +760,8 @@ Quantization is a technique to reduce the storage and computation requirements f
 
 OpenLLM supports the following quantization techniques
 
-- [LLM.int8(): 8-bit Matrix Multiplication](https://arxiv.org/abs/2208.07339) through [bitsandbytes](https://github.com/TimDettmers/bitsandbytes)
-- [SpQR: A Sparse-Quantized Representation for Near-Lossless LLM Weight Compression
-](https://arxiv.org/abs/2306.03078) through [bitsandbytes](https://github.com/TimDettmers/bitsandbytes)
-- [AWQ: Activation-aware Weight Quantization](https://arxiv.org/abs/2306.00978),
-- [GPTQ: Accurate Post-Training Quantization](https://arxiv.org/abs/2210.17323)
+- [AWQ: Activation-aware Weight Quantization](https://arxiv.org/abs/2306.00978).
+- [GPTQ: Accurate Post-Training Quantization](https://arxiv.org/abs/2210.17323).
 - [SqueezeLLM: Dense-and-Sparse Quantization](https://arxiv.org/abs/2306.07629).
 
 > [!NOTE]
@@ -816,10 +813,21 @@ from llama_index.llms.openllm import OpenLLMAPI
 Spin up an OpenLLM server, and connect to it by specifying its URL:
 
 ```python
-from langchain.llms import OpenLLM
+from langchain.llms import OpenLLMAPI
 
-llm = OpenLLM(server_url='http://44.23.123.1:3000', server_type='http')
-llm('What is the difference between a duck and a goose? And why there are so many Goose in Canada?')
+llm = OpenLLMAPI(server_url='http://44.23.123.1:3000')
+llm.invoke('What is the difference between a duck and a goose? And why there are so many Goose in Canada?')
+
+# streaming
+for it in llm.stream('What is the difference between a duck and a goose? And why there are so many Goose in Canada?'):
+  print(it, flush=True, end='')
+
+# async context
+await llm.ainvoke('What is the difference between a duck and a goose? And why there are so many Goose in Canada?')
+
+# async streaming
+async for it in llm.astream('What is the difference between a duck and a goose? And why there are so many Goose in Canada?'):
+  print(it, flush=True, end='')
 ```
 
 <!-- hatch-fancy-pypi-readme interim stop -->

docs/.eslintrc.cjs

-105
This file was deleted.

docs/README.md

-9
This file was deleted.

docs/components/features/index.tsx

-30
This file was deleted.

docs/components/features/style.module.css

-135
This file was deleted.

docs/components/icons/arrow-right.svg

-11
This file was deleted.
