Commit
Add some scripts to test fio and dfs llm cache on kubernetes cluster. (#1883)

Fixes #1879

Signed-off-by: Ye Cao <[email protected]>
dashanji authored May 16, 2024
1 parent 44458b4 commit 397d274
Showing 13 changed files with 864 additions and 0 deletions.
9 changes: 9 additions & 0 deletions modules/llm-cache/tests/k8s-test/Dockerfile.master
@@ -0,0 +1,9 @@
FROM python:3.10

WORKDIR /

COPY master.py /master.py

RUN pip3 install kubernetes

CMD ["python3", "master.py"]
17 changes: 17 additions & 0 deletions modules/llm-cache/tests/k8s-test/Dockerfile.worker
@@ -0,0 +1,17 @@
FROM ghcr.io/v6d-io/v6d/vineyard-python-dev:latest_x86_64 as builder

FROM python:3.10

WORKDIR /

COPY worker.py /worker.py
COPY --from=builder /tmp/vineyard_llm-0.22.1-py3-none-any.whl vineyard_llm-0.22.1-py3-none-any.whl

RUN apt update && \
    apt install fio -y

RUN pip3 install vineyard /vineyard_llm-0.22.1-py3-none-any.whl && \
    pip3 install networkx==3.1 && \
    pip3 install numpy

CMD ["python3", "worker.py"]
7 changes: 7 additions & 0 deletions modules/llm-cache/tests/k8s-test/Makefile
@@ -0,0 +1,7 @@
registry = registry.cn-wulanchabu.aliyuncs.com/vineyard
build-images:
	docker build -t ${registry}/fs-llm-master:latest -f ./Dockerfile.master .
	docker build -t ${registry}/fs-llm-worker:latest -f ./Dockerfile.worker .
push-images:
	docker push ${registry}/fs-llm-master:latest
	docker push ${registry}/fs-llm-worker:latest
91 changes: 91 additions & 0 deletions modules/llm-cache/tests/k8s-test/README.md
@@ -0,0 +1,91 @@
## Run llm test on k8s

This document describes how to run the llm test on a Kubernetes cluster.

### Tokenize the prompt file

Suppose you have a [prompt file](./prompt-samples.txt) that contains conversations between the user and the chatbot. You can tokenize the prompt file by running the following command:

```bash
$ python tokenize_prompt.py --prompt-file prompt-samples.txt --file-num 1
```

After running the command, you will get a tokenized prompt file named `tokens_0` under the `small_files` directory.

```bash
$ ls small_files
prompts_0.txt tokens_0
```

Also, you can set `--file-num` to the number of files you want to split the prompt file into. If the prompt file is too large, splitting it into multiple files allows each file to be processed in parallel.

```bash
$ python tokenize_prompt.py --prompt-file prompt-samples.txt --file-num 2
$ ls small_files
prompts_0.txt prompts_1.txt tokens_0 tokens_1
```

At this point, you can upload these token files to the OSS bucket or NAS; refer to [ossutil upload files](https://help.aliyun.com/zh/oss/user-guide/upload-objects-to-oss/?spm=a2c4g.11186623.0.0.4b471c22sHG1EG) or [NAS mount](https://help.aliyun.com/zh/nas/user-guide/mount-an-nfs-file-system-on-a-linux-ecs-instance?spm=a2c4g.11186623.0.0.15713eedDgiEYF).
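
For example, with the ossutil CLI (the bucket name and prefix below are hypothetical placeholders, not values from this repository):

```bash
# Upload the tokenized files recursively; replace the bucket and prefix with your own.
$ ossutil cp -r small_files/ oss://your-bucket/tokens/
```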

### Build the master and worker images

Before building the master and worker images, you need to build the vineyard-python-dev image first, as it provides the llm-cache PyPI package.

```bash
$ cd v6d && make -C docker vineyard-python-dev
```

Then, you can build the master and worker images by running the following command:

> Make sure the image registry is set correctly.
```bash
$ cd modules/llm-cache/tests/k8s-test
$ make build-images
```
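
Since `registry` is a plain Make variable, it can also be overridden on the command line when you push to your own registry (the value below is a hypothetical example):

```bash
$ make build-images registry=registry.example.com/your-namespace
```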

Next, push the images to the registry:

```bash
$ make push-images
```

### Deploy on the k8s cluster

#### Create the OSS volume

Assuming the token files have been uploaded to the OSS bucket, we first need to [create the OSS secret and OSS volume](https://help.aliyun.com/zh/ack/ack-managed-and-ack-dedicated/user-guide/mount-statically-provisioned-oss-volumes#title-hos-c75-12q).
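
As a sketch, the secret holds your AccessKey pair; the key names `akId` and `akSecret` follow the linked ACK document, so double-check them against your CSI plugin version:

```bash
# Hypothetical values; substitute your own AccessKey pair.
$ kubectl create secret generic oss-secret \
    --from-literal=akId=<your-access-key-id> \
    --from-literal=akSecret=<your-access-key-secret>
```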

#### Create the Distributed FileSystem Volume

The DFS could be NAS or CPFS; refer to [Mount NAS Volume on ACK](https://help.aliyun.com/zh/ack/ack-managed-and-ack-dedicated/user-guide/mount-statically-provisioned-nas-volumes?spm=a2c4g.11186623.0.0.b7c130b7eJHcnf) or [Mount CPFS Volume on ACK](https://help.aliyun.com/zh/ack/ack-managed-and-ack-dedicated/user-guide/statically-provisioned-cpfs-2-0-volumes-1?spm=a2c4g.11186623.0.0.399a22dbapWWsP) to create the volume.
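
Whichever you choose, make sure the resulting PVC is `Bound` before deploying the worker (assuming `nas-csi-pvc` is the claim name, as referenced in the next step):

```bash
$ kubectl get pvc nas-csi-pvc
```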

#### Deploy the worker

After preparing the OSS volume and the DFS volume, you need to change the NFS volume name `nas-csi-pvc` to the DFS volume you created before.

> **Note:** The CPU resources are important for worker performance; you can adjust `resources.requests.cpu` to get better performance.

Then deploy the worker by running the following command:

```bash
$ kubectl apply -f yamls/worker.yaml
```
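
You can verify that the worker pods are running before moving on; `app=fs-llm-test-worker` is the default label selector that `master.py` uses to discover the workers:

```bash
$ kubectl get pods -l app=fs-llm-test-worker -o wide
```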

#### Deploy the master

After deploying the worker, you need to change the `TOKENS_FILE_NUM` environment variable in the `yamls/master.yaml` file to the number of token files you put in the OSS bucket. Also, set the OSS volume claim name `oss-pvc` to the OSS volume you created.

Then deploy the master by running the following command:

```bash
$ kubectl apply -f yamls/master.yaml
```
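
To follow the distribution progress, tail the master pod's logs (the pod name below is a placeholder; look it up first):

```bash
$ kubectl get pods
$ kubectl logs -f <master-pod-name>
```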

### Show the result

After running the llm test, you can check the result by running the following command:

```bash
$ python show_result.py --kubeconfig-path /your/kubeconfig --label-selector your_label_key=your_label_value
```
51 changes: 51 additions & 0 deletions modules/llm-cache/tests/k8s-test/master.py
@@ -0,0 +1,51 @@
import socket
import random
import os
import time
from multiprocessing import Pool
from kubernetes import client, config

def get_pod_ips(label_selector):
    # Discover the worker pod IPs via the in-cluster Kubernetes API.
    config.load_incluster_config()
    api = client.CoreV1Api()
    pods = api.list_pod_for_all_namespaces(label_selector=label_selector)
    pod_ip_list = []
    for pod in pods.items:
        pod_ip_list.append(pod.status.pod_ip)
    return pod_ip_list

def distribute_prompts(args):
    file_name, server_ips = args
    # Read the tokenized prompt file line by line.
    token_list = []
    with open(f'{file_name}', 'r', encoding='utf-8') as f:
        while True:
            line = f.readline()
            if not line:
                break
            token_list.append(line)

    # Send each token line to a randomly chosen worker, retrying until it succeeds.
    for token in token_list:
        server_ip = random.choice(server_ips)
        # time.sleep(random.randint(1, 200) / 1000000)
        while True:
            try:
                send_tokens_to_server(server_ip, 8888, token)
                break
            except Exception as e:
                print(f"Error: {e}")
                time.sleep(1)
                continue

def send_tokens_to_server(server_address, server_port, tokens):
    # One short-lived TCP connection per token line.
    clientsocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    clientsocket.connect((server_address, server_port))
    clientsocket.send(tokens.encode('utf-8'))
    clientsocket.close()

if __name__ == "__main__":
    # One process per token file distributes its lines to the workers in parallel.
    file_num = int(os.environ.get('TOKENS_FILE_NUM', 16))
    file_names = [f'/tokens/tokens_{i}' for i in range(file_num)]
    pod_selector = os.environ.get('POD_SELECTOR', 'app=fs-llm-test-worker')
    server_ips = get_pod_ips(pod_selector)
    with Pool(file_num) as p:
        p.map(distribute_prompts, [(file_name, server_ips) for file_name in file_names])
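
The remaining files in this commit, including `worker.py`, are not expanded above. Purely to illustrate the wire protocol that `master.py` speaks (one UTF-8 token line per short-lived TCP connection on port 8888), here is a minimal, hypothetical receiver loop; it is a sketch, not the actual `worker.py`:

```python
import socket

def serve(host="0.0.0.0", port=8888):
    # Accept one connection per token line, mirroring master.py's
    # connect/send/close pattern.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind((host, port))
    server.listen(128)
    while True:
        conn, _ = server.accept()
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # sender closed the socket, so the line is complete
                break
            chunks.append(data)
        conn.close()
        tokens = b"".join(chunks).decode("utf-8")
        # A real worker would feed `tokens` into the llm-cache lookup/update here.
        print(f"received {len(tokens)} characters")

if __name__ == "__main__":
    serve()
```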