Commit 045115a

[PIR] Fix milvus (#10397)
1 parent 5d8b593 commit 045115a

2 files changed, +16 -2 lines changed

slm/applications/neural_search/recall/milvus/README.md

Lines changed: 10 additions & 0 deletions
@@ -91,6 +91,12 @@
 
 ```
 
+Download the dataset and unzip it into the current directory:
+```shell
+wget https://bj.bcebos.com/v1/paddlenlp/data/literature_search_data.zip
+unzip literature_search_data.zip
+```
+
 <a name="向量检索"></a>
 
 ## 5. Vector Search
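For environments without `wget` or `unzip`, the download step added to the README can be approximated in Python. This is a minimal sketch using only the standard library; the function name `download_and_extract` and its parameters are hypothetical and are not part of the commit or of PaddleNLP.

```python
import urllib.request
import zipfile
from pathlib import Path

# URL taken from the README hunk above.
DATA_URL = "https://bj.bcebos.com/v1/paddlenlp/data/literature_search_data.zip"


def download_and_extract(url=DATA_URL, dest_dir="."):
    """Fetch a zip archive (hypothetical helper) and unpack it into dest_dir.

    Returns the list of member names contained in the archive.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    # Name the local archive after the last path segment of the URL.
    archive = dest / url.rsplit("/", 1)[-1]

    # Skip the download when the archive is already present.
    if not archive.exists():
        urllib.request.urlretrieve(url, str(archive))

    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest)
        return zf.namelist()
```

Calling `download_and_extract()` with no arguments should mirror the two shell commands, leaving the archive and its contents in the current directory.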
@@ -141,6 +147,10 @@ python milvus_ann_search.py --data_path milvus/milvus_data.csv \
 * `search`: whether to search the vectors
 * `batch_size`: the number of vectors inserted in one batch
 
+You can also run the script:
+```
+sh scripts/feature_extract.sh
+```
 
 | Data size | Time |
 | ------------ | ------------ |

slm/applications/neural_search/recall/milvus/feature_extract.py

Lines changed: 6 additions & 2 deletions
@@ -23,6 +23,10 @@
 
 from paddlenlp.data import Pad, Tuple
 from paddlenlp.transformers import AutoTokenizer
+from paddlenlp.utils.env import (
+    PADDLE_INFERENCE_MODEL_SUFFIX,
+    PADDLE_INFERENCE_WEIGHTS_SUFFIX,
+)
 
 sys.path.append(".")
 
@@ -59,8 +63,8 @@ def __init__(
         self.max_seq_length = max_seq_length
         self.batch_size = batch_size
 
-        model_file = model_dir + "/inference.get_pooled_embedding.pdmodel"
-        params_file = model_dir + "/inference.get_pooled_embedding.pdiparams"
+        model_file = model_dir + f"/inference.get_pooled_embedding{PADDLE_INFERENCE_MODEL_SUFFIX}"
+        params_file = model_dir + f"/inference.get_pooled_embedding{PADDLE_INFERENCE_WEIGHTS_SUFFIX}"
         if not os.path.exists(model_file):
             raise ValueError("not find model file path {}".format(model_file))
         if not os.path.exists(params_file):
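The effect of swapping the hard-coded extensions for suffix constants can be illustrated with a small standalone sketch. It assumes, without confirmation from this commit, that the model suffix is `.json` when Paddle's PIR mode is active and `.pdmodel` otherwise, and that the weights suffix stays `.pdiparams`. The helper `resolve_inference_files` and the `demo` function are hypothetical; the real values come from `paddlenlp.utils.env`.

```python
import os
import tempfile


def resolve_inference_files(model_dir, model_suffix, weights_suffix,
                            prefix="inference.get_pooled_embedding"):
    """Build the model/params paths for a suffix pair and check they exist."""
    model_file = os.path.join(model_dir, prefix + model_suffix)
    params_file = os.path.join(model_dir, prefix + weights_suffix)
    for path in (model_file, params_file):
        if not os.path.exists(path):
            raise ValueError("file not found: {}".format(path))
    return model_file, params_file


def demo():
    """Simulate a model directory exported with the assumed PIR suffixes."""
    with tempfile.TemporaryDirectory() as model_dir:
        # Assumed export layout: ".json" program file plus ".pdiparams" weights.
        for name in ("inference.get_pooled_embedding.json",
                     "inference.get_pooled_embedding.pdiparams"):
            open(os.path.join(model_dir, name), "w").close()

        # Suffixes matching the export resolve successfully.
        found = resolve_inference_files(model_dir, ".json", ".pdiparams")

        # The previously hard-coded ".pdmodel" extension no longer matches.
        try:
            resolve_inference_files(model_dir, ".pdmodel", ".pdiparams")
            legacy_found = True
        except ValueError:
            legacy_found = False

    return [os.path.basename(p) for p in found], legacy_found
```

Under these assumptions, a directory exported in the newer format would fail the old existence check on the `.pdmodel` path, which appears to be the failure this commit addresses.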
