Commit: Add some scripts to test fio and dfs llm cache on kubernetes cluster. (#1883)
Fixes #1879
Signed-off-by: Ye Cao <[email protected]>
Showing 13 changed files with 864 additions and 0 deletions.
**Dockerfile.master** (+9 lines):

```dockerfile
FROM python:3.10

WORKDIR /

COPY master.py /master.py

RUN pip3 install kubernetes

CMD ["python3", "master.py"]
```
**Dockerfile.worker** (+17 lines):

```dockerfile
FROM ghcr.io/v6d-io/v6d/vineyard-python-dev:latest_x86_64 as builder

FROM python:3.10

WORKDIR /

COPY worker.py /worker.py
COPY --from=builder /tmp/vineyard_llm-0.22.1-py3-none-any.whl vineyard_llm-0.22.1-py3-none-any.whl

RUN apt update && \
    apt install fio -y

RUN pip3 install vineyard /vineyard_llm-0.22.1-py3-none-any.whl && \
    pip3 install networkx==3.1 && \
    pip3 install numpy

CMD ["python3", "worker.py"]
```
**Makefile** (+7 lines):

```makefile
registry = registry.cn-wulanchabu.aliyuncs.com/vineyard
build-images:
	docker build -t ${registry}/fs-llm-master:latest -f ./Dockerfile.master .
	docker build -t ${registry}/fs-llm-worker:latest -f ./Dockerfile.worker .
push-images:
	docker push ${registry}/fs-llm-master:latest
	docker push ${registry}/fs-llm-worker:latest
```
**README.md** (+91 lines):

## Run LLM test on k8s

This document describes how to run the LLM test on a Kubernetes cluster.

### Tokenize the prompt file

Suppose you have a [prompt file](./prompt-samples.txt) that contains the conversation between the user and the chatbot. You can tokenize the prompt file by running the following command:

```bash
$ python tokenize_prompt.py --prompt-file prompt-samples.txt --file-num 1
```

After running the command, you will get a tokenized prompt file named `tokens_0` under the `small_files` directory:

```bash
$ ls small_files
prompts_0.txt  tokens_0
```

You can also set `--file-num` to the number of files you want to split the prompt file into. If the prompt file is too large, splitting it into multiple files lets each file be processed in parallel:

```bash
$ python tokenize_prompt.py --prompt-file prompt-samples.txt --file-num 2
$ ls small_files
prompts_0.txt  prompts_1.txt  tokens_0  tokens_1
```
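
`tokenize_prompt.py` itself is among the 13 changed files but is not shown in this diff. A rough sketch of the splitting behavior described above, with a toy stand-in tokenizer (`toy_tokenize` and `split_and_tokenize` are hypothetical names; the real script presumably uses a model tokenizer):

```python
import os

def toy_tokenize(line):
    # Stand-in tokenizer: map each whitespace-separated word to an integer id.
    return [abs(hash(w)) % 50257 for w in line.split()]

def split_and_tokenize(prompt_file, file_num, out_dir="small_files"):
    """Split a prompt file into `file_num` shards and tokenize each shard."""
    os.makedirs(out_dir, exist_ok=True)
    with open(prompt_file, encoding="utf-8") as f:
        lines = f.readlines()
    # Round-robin the prompts across shards, matching the
    # prompts_i.txt / tokens_i naming shown in the listing above.
    for i in range(file_num):
        shard = lines[i::file_num]
        with open(f"{out_dir}/prompts_{i}.txt", "w", encoding="utf-8") as p:
            p.writelines(shard)
        with open(f"{out_dir}/tokens_{i}", "w", encoding="utf-8") as t:
            for line in shard:
                t.write(" ".join(map(str, toy_tokenize(line))) + "\n")
```

With `file_num=2`, four prompts end up as two files of two lines each, mirroring the `tokens_0 tokens_1` output above.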

At this point, you can upload these token files to the OSS bucket or NAS; refer to [ossutil upload files](https://help.aliyun.com/zh/oss/user-guide/upload-objects-to-oss/?spm=a2c4g.11186623.0.0.4b471c22sHG1EG) or [nas mount](https://help.aliyun.com/zh/nas/user-guide/mount-an-nfs-file-system-on-a-linux-ecs-instance?spm=a2c4g.11186623.0.0.15713eedDgiEYF).

### Build the master and worker images

Before building the master and worker images, you need to build the vineyard-python-dev image first, as it provides the llm-cache pypi package:

```bash
$ cd v6d && make -C docker vineyard-python-dev
```

Then build the master and worker images:

> Make sure the image registry is set correctly.

```bash
$ cd modules/llm-cache/tests/k8s-test
$ make build-images
```

Next, push the images to the registry:

```bash
$ make push-images
```

### Deploy on the k8s cluster

#### Create the OSS volume

Assuming you have put the token files into the OSS bucket, you need to [create the oss secret and oss volume](https://help.aliyun.com/zh/ack/ack-managed-and-ack-dedicated/user-guide/mount-statically-provisioned-oss-volumes#title-hos-c75-12q) first.

#### Create the Distributed FileSystem Volume

The DFS could be NAS or CPFS; refer to [Mount Nas Volume on ACK](https://help.aliyun.com/zh/ack/ack-managed-and-ack-dedicated/user-guide/mount-statically-provisioned-nas-volumes?spm=a2c4g.11186623.0.0.b7c130b7eJHcnf) or [Mount CPFS Volume on ACK](https://help.aliyun.com/zh/ack/ack-managed-and-ack-dedicated/user-guide/statically-provisioned-cpfs-2-0-volumes-1?spm=a2c4g.11186623.0.0.399a22dbapWWsP) to create the volume.
#### Deploy the worker | ||
|
||
After preparing the OSS volume, and DFS volume, you need change the NFS volume name `nas-csi-pvc` to the DFS volume you created before. | ||
|
||
> ** Note: ** The CPU resources is important for the performance of worker, you could adjust the `resources.requests.cpu` to get better performance. | ||
Then deploy the worker by running the following command: | ||
|
||
```bash | ||
$ kubectl apply -f yamls/worker.yaml | ||
``` | ||
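
`yamls/worker.yaml` is not shown in this diff; a hypothetical fragment illustrating the pieces the text refers to (the label matches master.py's default `POD_SELECTOR`, the port matches the one master.py sends to, and `nas-csi-pvc` is the claim name to replace — everything else is assumed):

```yaml
# Hypothetical sketch; the actual yamls/worker.yaml in this commit may differ.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: fs-llm-test-worker
spec:
  selector:
    matchLabels:
      app: fs-llm-test-worker      # matches master.py's default POD_SELECTOR
  template:
    metadata:
      labels:
        app: fs-llm-test-worker
    spec:
      containers:
      - name: worker
        image: registry.cn-wulanchabu.aliyuncs.com/vineyard/fs-llm-worker:latest
        ports:
        - containerPort: 8888      # master.py sends tokens to this port
        resources:
          requests:
            cpu: "8"               # tune for better performance
        volumeMounts:
        - name: dfs-volume
          mountPath: /mnt/dfs
      volumes:
      - name: dfs-volume
        persistentVolumeClaim:
          claimName: nas-csi-pvc   # replace with your DFS volume claim
```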

#### Deploy the master

After deploying the worker, change the `TOKENS_FILE_NUM` environment variable in the `yamls/master.yaml` file to the number of token files you put in the OSS bucket. Also, set the OSS volume claim name `oss-pvc` to the OSS volume you created.

Then deploy the master by running the following command:

```bash
$ kubectl apply -f yamls/master.yaml
```
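
Likewise, `yamls/master.yaml` is not shown; a hypothetical fragment showing where `TOKENS_FILE_NUM` and `oss-pvc` fit (the mount path matches the `/tokens/tokens_{i}` paths master.py reads; the rest is assumed):

```yaml
# Hypothetical sketch; the actual yamls/master.yaml in this commit may differ.
apiVersion: batch/v1
kind: Job
metadata:
  name: fs-llm-test-master
spec:
  template:
    spec:
      serviceAccountName: fs-llm-test-master  # needs permission to list pods
      containers:
      - name: master
        image: registry.cn-wulanchabu.aliyuncs.com/vineyard/fs-llm-master:latest
        env:
        - name: TOKENS_FILE_NUM
          value: "2"               # number of token files in the OSS bucket
        - name: POD_SELECTOR
          value: app=fs-llm-test-worker
        volumeMounts:
        - name: oss-volume
          mountPath: /tokens       # master.py reads /tokens/tokens_{i}
      restartPolicy: Never
      volumes:
      - name: oss-volume
        persistentVolumeClaim:
          claimName: oss-pvc       # the OSS volume created earlier
```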

### Show the result

After running the LLM test, you can check the result by running the following command:

```bash
$ python show_result.py --kubeconfig-path /your/kubeconfig --label-selector your_label_key=your_label_value
```
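
`show_result.py` is also part of this commit but its contents are not shown in this diff. A plausible sketch of its shape, assuming it aggregates metrics from the logs of pods matching the label selector (the `throughput:` log format and both function names are assumptions, not the actual script):

```python
def parse_throughput(log_text):
    """Extract values from assumed 'throughput: <float> tokens/s' log lines."""
    values = []
    for line in log_text.splitlines():
        if line.startswith("throughput:"):
            values.append(float(line.split()[1]))
    return values

def show_results(kubeconfig_path, label_selector):
    """Print the mean throughput reported by each matching pod."""
    # Imported here so the pure parsing helper is usable without a cluster.
    from kubernetes import client, config

    config.load_kube_config(config_file=kubeconfig_path)
    api = client.CoreV1Api()
    pods = api.list_pod_for_all_namespaces(label_selector=label_selector)
    for pod in pods.items:
        log = api.read_namespaced_pod_log(pod.metadata.name,
                                          pod.metadata.namespace)
        values = parse_throughput(log)
        if values:
            print(pod.metadata.name, sum(values) / len(values))
```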

**master.py** (+51 lines):

```python
import socket
import random
import os
import time
from multiprocessing import Pool
from kubernetes import client, config


def get_pod_ips(label_selector):
    """List the IPs of all pods matching the given label selector."""
    config.load_incluster_config()
    api = client.CoreV1Api()
    pods = api.list_pod_for_all_namespaces(label_selector=label_selector)
    return [pod.status.pod_ip for pod in pods.items]


def distribute_prompts(args):
    """Read one token file and send each line to a randomly chosen worker."""
    file_name, server_ips = args
    with open(file_name, 'r', encoding='utf-8') as f:
        token_list = f.readlines()

    for token in token_list:
        server_ip = random.choice(server_ips)
        while True:
            try:
                send_tokens_to_server(server_ip, 8888, token)
                break
            except Exception as e:
                # Retry until the worker is reachable.
                print(f"Error: {e}")
                time.sleep(1)


def send_tokens_to_server(server_address, server_port, tokens):
    """Send one token line to a worker over a short-lived TCP connection."""
    clientsocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    clientsocket.connect((server_address, server_port))
    clientsocket.send(tokens.encode('utf-8'))
    clientsocket.close()


if __name__ == "__main__":
    file_num = int(os.environ.get('TOKENS_FILE_NUM', 16))
    file_names = [f'/tokens/tokens_{i}' for i in range(file_num)]
    pod_selector = os.environ.get('POD_SELECTOR', 'app=fs-llm-test-worker')
    server_ips = get_pod_ips(pod_selector)
    # Process each token file in its own process.
    with Pool(file_num) as p:
        p.map(distribute_prompts,
              [(file_name, server_ips) for file_name in file_names])
```
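
`worker.py` is also added by this commit but not shown here. For context, a minimal sketch of the receiving side master.py talks to — a TCP server that reads one payload per connection on port 8888 — could look like this (`serve_once` and the single-connection structure are illustrative, not the actual worker.py, which presumably loops and feeds the tokens into the llm cache):

```python
import socket
import threading

def serve_once(host="127.0.0.1", port=0):
    """Accept one connection in a background thread and capture its payload.

    Returns (bound_port, result, thread); after the peer connects, sends,
    and closes, join the thread and read result["tokens"].
    """
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind((host, port))          # port 0 picks a free port for testing
    server.listen(1)
    bound_port = server.getsockname()[1]
    result = {}

    def _run():
        conn, _ = server.accept()
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:               # peer closed the connection
                break
            chunks.append(data)
        conn.close()
        server.close()
        result["tokens"] = b"".join(chunks).decode("utf-8")

    t = threading.Thread(target=_run)
    t.start()
    return bound_port, result, t
```

Because master.py opens a fresh connection per token line, closing the connection doubles as the end-of-message signal here.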