This repository has been archived by the owner on Nov 13, 2024. It is now read-only.

Commit

Feature milvus2x->milvus2x (#88)
* milvus2x->2x

* support multi-type vector fields & fix issue

* milvus2x migration: when no fields param is given, default to all fields

* connect milvus add timeout

* milvus2x->2x readme

* upgrade milvus-go-sdk
wenhuiZilliz authored Jun 12, 2024
1 parent acd5ee2 commit 421edfe
Showing 39 changed files with 1,408 additions and 209 deletions.
20 changes: 12 additions & 8 deletions README.md
@@ -3,7 +3,7 @@
## Overview

[milvus-migration](https://github.com/zilliztech/milvus-migration) is a data migration tool
for [Milvus](https://milvus.io/) that supports importing **Milvus 0.9.x ~ 1.x** / **faiss** / **elasticsearch7.x+**
for [Milvus](https://milvus.io/) that supports importing **Milvus 0.9.x ~ 1.x** / **faiss** / **elasticsearch7.x+** / **Milvus2.3+**
datas to milvus 2.x.

## Architecture
@@ -31,12 +31,12 @@ datas to milvus 2.x.

- Data Format Support

| Source Data Type | Target Data Type |
|:-------------------------|:-----------------|
| Milvus 0.9.x - 1.x | Milvus 2.x |
| Elasticsearch 7.x-8.x | Milvus 2.x |
| Faiss (Beta) | Milvus 2.x |
| Milvus 2.x (in progress) | Milvus 2.x |
| Source Data Type | Target Data Type |
|:----------------------|:-----------------|
| Milvus 0.9.x - 1.x | Milvus 2.x |
| Elasticsearch 7.x-8.x | Milvus 2.x |
| Faiss (Beta) | Milvus 2.x |
| Milvus 2.3 + | Milvus 2.x |

### How to use this tool?

@@ -74,6 +74,10 @@ Finally, load the numpy files to Milvus 2.x successfully by using the `load` com

Run the command `start` to migration es->2.x. here `start` == `dump && load` cmd

```shell
./milvus-migration start
```
5. Milvus 2.3+ to Milvus2.x migration:
```shell
./milvus-migration start
```
@@ -88,6 +92,7 @@ how to learn more about using migration tool, see examples doc below:
milvux2.x : [migrate_1.x_doc](README_1X.md).
3. faiss -> milvux2.x (
Beta) : [migrate_faiss_doc](README_FAISS.md).
4. milvus2.x -> milvux2.x : [migrate_milvus2x_doc](README_2X.md).

## How to verify migration result
When migration finished, you can use visual tool `Attu` or use Milvus SDK verify your new collection data rows.
@@ -105,7 +110,6 @@ After the Milvus collection Data migration is completed, we can use SDK or `Attu

- [ ] Support Redis to Milvus 2.x
- [ ] Support Mongodb to Milvus 2.x
- [ ] Milvus 2.x to Milvus 2.x
- [ ] Support others datasource to Milvus 2.x
- [ ] Supports binary vectors

87 changes: 87 additions & 0 deletions README_2X.md
@@ -0,0 +1,87 @@
# Milvus Migration: Milvus2.x to Milvus2.x

## Limitations

### Software version

- Supported source Milvus version: 2.3.0+
- Supported target Milvus version: 2.2+


## Milvus2.x to Milvus2.x migration.yaml example

```yaml
dumper:
  worker:
    workMode: milvus2x    # work mode: milvus2x -> milvus2x
    reader:
      bufferSize: 500     # number of rows read from the source Milvus per read

meta:                     # meta part
  mode: config            # 'config' mode means the meta config is read from this config file itself
  version: 2.3.0          # source Milvus version
  collection: src_coll_name   # migrate data from this source collection

source:                   # source Milvus connection info
  milvus2x:
    endpoint: {milvus2x_domain}:{milvus2x_port}
    username: xxxx
    password: xxxxx

target:                   # target Milvus connection info
  milvus2x:
    endpoint: {milvus2x_domain}:{milvus2x_port}
    username: xxxx
    password: xxxxx
```
Place migration.yaml in the configs/ directory; the tool then reads configs/migration.yaml automatically
when you run:
```shell
./milvus-migration start
```

Alternatively, place migration.yaml in any directory and pass its path with the `--config` parameter,
as shown here:

```shell
./milvus-migration start --config=/{YourConfigFilePath}/migration.yaml
```
The migration succeeded when the log prints output like the following:
```html
[INFO] [migration/milvus2x_starter.go:79] ["=================>JobProcess!"] [Percent=100]
[INFO] [migration/milvus2x_starter.go:27] ["[Starter] migration Milvus2X to Milvus2X finish!!!"] [Cost=94.877717375]
[INFO] [starter/starter.go:109] ["[Starter] Migration Success!"] [Cost=94.878243583]
[INFO] [cleaner/none_cleaner.go:17] ["[None Cleaner] not need clean files"] [mode=]
[INFO] [cmd/start.go:32] ["[Cleaner] clean file success!"]
```
To verify the migrated data, use [Attu](https://github.com/zilliztech/attu) to check that the source collection now exists in your target Milvus.

## Additional notes
- If you do not want to migrate all source collection fields to the target Milvus, add a `fields` config in the `meta` part to list the fields to migrate.
- Note: you must migrate at least the primary key field and a vector field to the target Milvus.
```yaml
...
meta:
  #......
  fields:   # optional; migrate only the source collection fields listed below to the target Milvus:
    - name: id
    - name: title_vector
    - name: reading_time
  #......
...
```
- To customize the target collection properties, add the config below to your `meta` part:
```yaml
...
meta:
  #......
  milvus:   # the settings below are optional target collection configuration:
    collection: target_coll_name   # If not set, the source collection name is used.
    closeDynamicField: false       # If not set, the source collection's DynamicField property is used.
    shardNum: 2                    # If not set, the source collection's ShardNum property is used.
    consistencyLevel: Customized   # If not set, the source collection's consistencyLevel property is used.
  #......
...
```
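
The field rule in README_2X.md (an empty `fields` list means all fields; otherwise the list must cover the primary key and a vector field) could be checked up front roughly as follows. This is only a sketch: `fieldInfo` and `checkSelectedFields` are hypothetical names that do not appear in this commit, and the tool's real schema types may look different.

```go
package main

import (
	"errors"
	"fmt"
)

// fieldInfo is a hypothetical, simplified description of a source collection field.
type fieldInfo struct {
	Name       string
	PrimaryKey bool
	Vector     bool
}

// checkSelectedFields returns an error when a non-empty selection does not
// include the primary key and at least one vector field.
func checkSelectedFields(schema []fieldInfo, selected []string) error {
	if len(selected) == 0 {
		return nil // no fields configured: all fields are migrated
	}
	chosen := make(map[string]bool, len(selected))
	for _, name := range selected {
		chosen[name] = true
	}
	hasPK, hasVector := false, false
	for _, f := range schema {
		if !chosen[f.Name] {
			continue
		}
		hasPK = hasPK || f.PrimaryKey
		hasVector = hasVector || f.Vector
	}
	if !hasPK {
		return errors.New("selected fields must include the primary key")
	}
	if !hasVector {
		return errors.New("selected fields must include a vector field")
	}
	return nil
}

func main() {
	schema := []fieldInfo{
		{Name: "id", PrimaryKey: true},
		{Name: "title_vector", Vector: true},
		{Name: "reading_time"},
	}
	fmt.Println(checkSelectedFields(schema, []string{"id", "title_vector"}))
	fmt.Println(checkSelectedFields(schema, []string{"reading_time"}))
}
```

Failing fast on the selection avoids starting a migration that the target Milvus would reject when the collection is created.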
37 changes: 37 additions & 0 deletions core/check/common_check.go
@@ -0,0 +1,37 @@
package check

import (
	"errors"
	"github.com/zilliztech/milvus-migration/core/common"
	"strconv"
	"strings"
)

// verifyCollNameIsOk reports whether collection contains only [A-Za-z0-9_]
// and does not start with a digit.
func verifyCollNameIsOk(collection string) bool {
	// Guard against an empty name: collection[:1] below would panic otherwise.
	if len(collection) == 0 {
		return false
	}
	if strings.Contains(ArabicNumer, collection[:1]) {
		return false
	}
	for i := range collection {
		s := collection[i : i+1]
		if !strings.Contains(LowerAlphabet, s) && !strings.Contains(UpperAlphabet, s) &&
			Underline != s && !strings.Contains(ArabicNumer, s) {
			return false
		}
	}
	return true
}

// verifyShardNum rejects shard numbers above common.MAX_SHARD_NUM.
func verifyShardNum(shardNum int) error {
	if shardNum > common.MAX_SHARD_NUM {
		return errors.New("[Verify Meta file] Milvus shardNum can not > " + strconv.Itoa(common.MAX_SHARD_NUM))
	}
	return nil
}

// verifyMilvusCollName returns an error when collectionName cannot be used as a Milvus collection name.
func verifyMilvusCollName(collectionName string) error {
	if !verifyCollNameIsOk(collectionName) {
		return errors.New("[Verify Meta file] collection name does not match the [A-Z|a-z|0-9|_] format and cannot be used as a Milvus collection name, " +
			"you can set the meta.milvus.collection property to replace it, invalid collection name is:" + collectionName)
	}
	return nil
}
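
The naming rule that `verifyCollNameIsOk` enforces (only `[A-Za-z0-9_]`, no leading digit) can also be written as a single regular expression. The sketch below is an alternative formulation, not the code in this commit; `isValidCollName` and `collNamePattern` are hypothetical names. Unlike the byte-slicing version, it simply returns false for an empty name.

```go
package main

import (
	"fmt"
	"regexp"
)

// collNamePattern encodes the rule: first character is a letter or underscore,
// the remaining characters are letters, digits, or underscores.
var collNamePattern = regexp.MustCompile(`^[A-Za-z_][A-Za-z0-9_]*$`)

// isValidCollName is a hypothetical helper, not part of the commit.
func isValidCollName(name string) bool {
	return collNamePattern.MatchString(name)
}

func main() {
	for _, name := range []string{"src_coll_name", "_tmp", "1st_coll", "my-coll", ""} {
		fmt.Printf("%q valid=%v\n", name, isValidCollName(name))
	}
}
```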
43 changes: 8 additions & 35 deletions core/check/es_check.go
@@ -4,9 +4,10 @@ import (
"errors"
"github.com/milvus-io/milvus-sdk-go/v2/entity"
"github.com/zilliztech/milvus-migration/core/common"
convert "github.com/zilliztech/milvus-migration/core/transform/common"
"github.com/zilliztech/milvus-migration/core/transform/es/convert"
"github.com/zilliztech/milvus-migration/core/type/estype"
"strings"
"github.com/zilliztech/milvus-migration/core/type/milvustype"
)

var LowerAlphabet = "abcdefghijklmnopqrstuvwxyz"
@@ -21,17 +22,17 @@ func VerifyESMetaCfg(metaJson *estype.MetaJSON) error {
return errors.New("[Verify ES Meta file] Index migration Field is empty, IndexName:" + idx.Index)
}
if idx.MilvusCfg == nil {
idx.MilvusCfg = &estype.MilvusCfg{ShardNum: common.MAX_SHARD_NUM}
idx.MilvusCfg = &milvustype.MilvusCfg{ShardNum: common.DEF_SHARD_NUM}
}

err := verifyShardNum(idx)
err := verifyShardNum(idx.MilvusCfg.ShardNum)
if err != nil {
return err
}

// If a custom Milvus collection name is configured, use it as the collection name
if len(idx.MilvusCfg.Collection) > 0 {
err2 := verifyMilvusCollName(idx)
err2 := verifyMilvusCollName(idx.MilvusCfg.Collection)
if err2 != nil {
return err2
}
@@ -52,8 +53,8 @@ func VerifyESMetaCfg(metaJson *estype.MetaJSON) error {
if f.Type == string(esconvert.DenseVector) && f.Dims <= 0 {
return errors.New("[Verify ES Meta file]Index migration dense_vector type Field dims need > 0")
}
if f.MaxLen > 0 && f.MaxLen > esconvert.VarcharMaxLenNum {
return errors.New("[Verify ES Meta file]milvus field max len cannot > " + esconvert.VarcharMaxLen)
if f.MaxLen > 0 && f.MaxLen > convert.VarcharMaxLenNum {
return errors.New("[Verify ES Meta file]milvus field max len cannot > " + convert.VarcharMaxLen)
}
if f.PK {
if idx.InnerPkField != nil {
@@ -65,7 +66,7 @@ func VerifyESMetaCfg(metaJson *estype.MetaJSON) error {
}
if len(idx.MilvusCfg.ConsistencyLevel) > 0 {
// If a ConsistencyLevel is configured:
if _, ok := esconvert.ConsistencyLevelMap[idx.MilvusCfg.ConsistencyLevel]; !ok {
if _, ok := convert.ConsistencyLevelMap[idx.MilvusCfg.ConsistencyLevel]; !ok {
return errors.New("[Verify ES Meta file] ConsistencyLevel value invalid :" + idx.MilvusCfg.ConsistencyLevel)
}
}
@@ -80,31 +81,3 @@ func verifyEsIndexName(idx *estype.IdxCfg) error {
}
return nil
}

func verifyMilvusCollName(idx *estype.IdxCfg) error {
if !verifyCollNameIsOk(idx.MilvusCfg.Collection) {
return errors.New("[Verify ES Meta file] milvus collection name only can contain: [A-Z|a-z|0-9|_] and cannot start with number")
}
return nil
}

func verifyShardNum(idx *estype.IdxCfg) error {
if idx.MilvusCfg.ShardNum > common.MAX_SHARD_NUM {
return errors.New("[Verify ES Meta file] milvus shardNum can not > " + string(common.MAX_SHARD_NUM))
}
return nil
}

func verifyCollNameIsOk(collection string) bool {
if strings.Contains(ArabicNumer, collection[:1]) {
return false
}
for i, _ := range collection {
s := collection[i : i+1]
if !strings.Contains(LowerAlphabet, s) && !strings.Contains(UpperAlphabet, s) &&
Underline != s && !strings.Contains(ArabicNumer, s) {
return false
}
}
return true
}
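
One detail in this hunk: the removed `verifyShardNum` built its message with `string(common.MAX_SHARD_NUM)`, while the new shared version in common_check.go uses `strconv.Itoa`. In Go, converting an integer to `string` yields the character with that code point rather than its decimal text. The sketch below illustrates the difference; `maxShardNum` is a stand-in constant and both helper functions are hypothetical.

```go
package main

import (
	"fmt"
	"strconv"
)

// maxShardNum stands in for common.MAX_SHARD_NUM (16 after this commit).
const maxShardNum = 16

// oldStyleMessage reproduces the effect of string(intValue): the integer is
// treated as a Unicode code point, not formatted as decimal digits.
func oldStyleMessage(n int) string {
	return "shardNum can not > " + string(rune(n))
}

// newStyleMessage formats the integer as decimal text.
func newStyleMessage(n int) string {
	return "shardNum can not > " + strconv.Itoa(n)
}

func main() {
	fmt.Printf("%q\n", oldStyleMessage(maxShardNum))
	fmt.Printf("%q\n", newStyleMessage(maxShardNum))
}
```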
49 changes: 49 additions & 0 deletions core/check/milvus2x_check.go
@@ -0,0 +1,49 @@
package check

import (
	"errors"
	convert "github.com/zilliztech/milvus-migration/core/transform/common"
	"github.com/zilliztech/milvus-migration/core/type/milvus2xtype"
	"github.com/zilliztech/milvus-migration/core/type/milvustype"
)

// VerifyMilvus2xMetaCfg validates the target collection settings of every
// collection in the Milvus 2.x meta config.
func VerifyMilvus2xMetaCfg(metaJson *milvus2xtype.MetaJSON) error {

	for _, coll := range metaJson.CollCfgs {

		//if len(coll.Fields) <= 0 {
		//	return errors.New("[Verify Milvus2x Meta file] Index migration Field is empty, Collection:" + coll.Collection)
		//}

		if coll.MilvusCfg == nil {
			// When not defined, the source collection's shardNum is used.
			coll.MilvusCfg = &milvustype.MilvusCfg{ShardNum: 0}
		}

		err := verifyShardNum(coll.MilvusCfg.ShardNum)
		if err != nil {
			return err
		}

		// If a custom Milvus collection name is configured, use it as the collection name.
		if len(coll.MilvusCfg.Collection) > 0 {
			err := verifyMilvusCollName(coll.MilvusCfg.Collection)
			if err != nil {
				return err
			}
		} else {
			// Otherwise use the source Milvus2x collection name as the collection name.
			err := verifyMilvusCollName(coll.Collection)
			if err != nil {
				return err
			}
		}

		if len(coll.MilvusCfg.ConsistencyLevel) > 0 {
			// If a ConsistencyLevel is configured:
			if _, ok := convert.ConsistencyLevelMap[coll.MilvusCfg.ConsistencyLevel]; !ok {
				return errors.New("[Verify Milvus2x Meta file] ConsistencyLevel value invalid :" + coll.MilvusCfg.ConsistencyLevel)
			}
		}
	}
	return nil
}
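
The check above leaves `ShardNum` at 0 and `Collection` empty when they are not configured, which README_2X.md describes as "use the source collection's value". A minimal sketch of that fallback is shown below. `targetCfg` and `resolveTarget` are hypothetical names, and the commit's actual resolution logic is not visible in this diff.

```go
package main

import "fmt"

// targetCfg is a hypothetical stand-in for the optional meta.milvus settings.
type targetCfg struct {
	Collection string // empty means "use the source collection name"
	ShardNum   int    // 0 means "use the source collection shardNum"
}

// resolveTarget applies the fallback rules described in README_2X.md.
func resolveTarget(srcName string, srcShardNum int, cfg *targetCfg) (string, int) {
	name, shards := srcName, srcShardNum
	if cfg == nil {
		return name, shards
	}
	if cfg.Collection != "" {
		name = cfg.Collection
	}
	if cfg.ShardNum > 0 {
		shards = cfg.ShardNum
	}
	return name, shards
}

func main() {
	name, shards := resolveTarget("src_coll_name", 4, nil)
	fmt.Println(name, shards)

	name, shards = resolveTarget("src_coll_name", 4, &targetCfg{Collection: "target_coll_name", ShardNum: 2})
	fmt.Println(name, shards)
}
```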
8 changes: 8 additions & 0 deletions core/cleaner/base_cleaner.go
@@ -15,6 +15,14 @@ type Clean interface {
}

func NewCleaner(cfg *config.MigrationConfig, jobId string) (*Cleaner, error) {

	if len(cfg.TargetMode) == 0 {
		return &Cleaner{
			jobId:   jobId,
			cleaner: newNoneCleaner(cfg.TargetMode),
		}, nil
	}

	var clr Clean

	switch cfg.TargetMode {
19 changes: 19 additions & 0 deletions core/cleaner/none_cleaner.go
@@ -0,0 +1,19 @@
package cleaner

import (
	"github.com/zilliztech/milvus-migration/internal/log"
	"go.uber.org/zap"
)

type NoneCleaner struct {
	Mode string
}

func newNoneCleaner(mode string) *NoneCleaner {
	return &NoneCleaner{Mode: mode}
}

func (this *NoneCleaner) CleanFiles() error {
	log.Info("[None Cleaner] not need clean files", zap.String("mode", this.Mode))
	return nil
}
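
`NoneCleaner` follows the null-object pattern: when no target mode is configured (as in the milvus2x flow, which writes no intermediate files), the caller still receives a cleaner and can call `CleanFiles` unconditionally. The sketch below shows the shape of that selection in isolation. `stubCleaner` and the lowercase names are hypothetical simplifications, not the repository's types.

```go
package main

import "fmt"

// cleaner mirrors the single-method Clean interface from the diff.
type cleaner interface {
	CleanFiles() error
}

// noneCleaner does nothing; it exists so callers never need a nil check.
type noneCleaner struct{}

func (c *noneCleaner) CleanFiles() error { return nil }

// stubCleaner is a hypothetical placeholder for a cleaner that removes files.
type stubCleaner struct {
	mode    string
	cleaned bool
}

func (c *stubCleaner) CleanFiles() error {
	c.cleaned = true // a real cleaner would delete intermediate files here
	return nil
}

// newCleaner picks the no-op cleaner when no target mode is configured.
func newCleaner(targetMode string) cleaner {
	if targetMode == "" {
		return &noneCleaner{}
	}
	return &stubCleaner{mode: targetMode}
}

func main() {
	for _, mode := range []string{"", "remote"} {
		c := newCleaner(mode)
		fmt.Printf("mode=%q cleaner=%T err=%v\n", mode, c, c.CleanFiles())
	}
}
```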
7 changes: 5 additions & 2 deletions core/common/constant.go
@@ -11,6 +11,7 @@ const (
Faiss DumpMode = "faiss"
Milvus1x DumpMode = "milvus1x"
Elasticsearch DumpMode = "elasticsearch"
Milvus2x DumpMode = "milvus2x"
)

type SourceMode string
@@ -40,15 +41,17 @@ const (

// reader type
const (
MILVUS2X = "milvus2x"
ES = "es"
RV = "rv"
UID = "uid"
FAISS_ID = "faiss-id"
FAISS_DATA = "faiss-data"
)

// current Milvus support max shard num is 2
var MAX_SHARD_NUM = 2
// current Milvus support max shard num: https://milvus.io/docs/v2.3.x/create_collection.md#Create-a-collection-with-the-schema
var MAX_SHARD_NUM = 16
var DEF_SHARD_NUM = 2

var DEBUG = false
var DUMP_SUB_TASK_NUM = 3