[Bug]: The status of collections shows "0% is loaded". #37505

Open
1 task done
allonx opened this issue Nov 7, 2024 · 11 comments
Labels: kind/bug (issues or changes related to a bug), triage/needs-information (indicates an issue needs more information in order to work on it)

Comments

@allonx

allonx commented Nov 7, 2024

Is there an existing issue for this?

  • I have searched the existing issues

Environment

- Milvus version:2.3.15
- Deployment mode(standalone or cluster):standalone
- MQ type(rocksmq, pulsar or kafka):    
- SDK version(e.g. pymilvus v2.0.0rc2):
- OS(Ubuntu or CentOS): ubuntu22.04
- CPU/Memory: 384G
- GPU: 
- Others:

Current Behavior

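Milvus 2.3.15 is deployed with docker compose and holds about 20 million entities. The server room lost power and the UPS failed as well. After Milvus was restarted, the collection status shows "0% is loaded".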

Expected Behavior

No response

Steps To Reproduce

No response

Milvus Log

error_logs.txt

Anything else?

No response

@allonx allonx added kind/bug Issues or changes related a bug needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Nov 7, 2024
@xiaofan-luan
Collaborator

@allonx
Did you try to restart the node?

There seems to be no hard data damage.
Restart Docker and check if it works.
If not, we'd be glad to help further. Contact me at [email protected]

@yanliang567
Contributor

@allonx quick question: did the etcd service return to a healthy running state after power was restored? Could you please share the etcd logs with us?

@yanliang567
Contributor

/assign @allonx

@yanliang567 yanliang567 added triage/needs-information Indicates an issue needs more information in order to work on it. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Nov 8, 2024
@allonx
Author

allonx commented Nov 10, 2024

@allonx quick question: did the etcd service return to a healthy running state after power was restored? Could you please share the etcd logs with us?

etcd_logs.txt

@allonx
Author

allonx commented Nov 10, 2024

@allonx Did you try to restart the node?

There seems to be no hard data damage. Restart Docker and check if it works. If not, we'd be glad to help further. Contact me at [email protected]

I restarted docker-compose again. It doesn't work!

@allonx
Author

allonx commented Nov 10, 2024

logs.tar.gz

@allonx Did you try to restart the node?
There seems to be no hard data damage. Restart Docker and check if it works. If not, we'd be glad to help further. Contact me at [email protected]

I restarted docker-compose again. It doesn't work!

logs.tar.gz

@xiaofan-luan
Collaborator

@allonx
What I need is the full logs from querycoord, datacoord and querynode.
You can collect them with https://github.com/milvus-io/milvus/tree/master/deployments/export-log
Right now it seems that there are many small segments in the cluster.

Maybe some files are broken on object storage or in rocksdb.

Since you are using standalone mode, there is only one copy of all the data,
but we need the error logs to find which data is actually broken.
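
Once the logs are exported to a text file, a small filter can narrow them down to the components named above. The following is only a sketch: it assumes that severity appears as a bracketed token such as `[ERROR]` or `[WARN]` and that the component name appears somewhere in the same line. The function name `filter_component_errors` is made up for illustration and is not part of any Milvus tooling.

```python
import re
from typing import Iterable, Iterator

# Components requested in the thread.
COMPONENTS = ("querycoord", "datacoord", "querynode")

# Assumption: log levels are written as bracketed tokens, e.g. [ERROR].
LEVEL_RE = re.compile(r"\[(ERROR|WARN|FATAL|PANIC)\]")


def filter_component_errors(lines: Iterable[str]) -> Iterator[str]:
    """Yield warning/error lines that mention one of COMPONENTS."""
    for line in lines:
        lowered = line.lower()
        # Keep the line only if it has a severity token AND a component name.
        if LEVEL_RE.search(line) and any(c in lowered for c in COMPONENTS):
            yield line.rstrip("\n")
```

To use it, open the exported log file (for example with `open("milvus.log", encoding="utf-8", errors="replace")`) and pass the file object to `filter_component_errors`; it reads line by line, so large logs do not need to fit in memory.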

@allonx
Author

allonx commented Nov 11, 2024

For Milvus installed with docker-compose, you can use docker compose logs > milvus.log to export the logs.

The file I uploaded (logs.tar.gz) is a compressed archive containing all the logs, exported with the command you provided above.

@xiaofan-luan
Collaborator

@allonx

This is a rocksmq bug that has already been fixed:

#36618

Try upgrading to the latest 2.4 release and see if that fixes it.
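
To check whether the collection actually loads after an upgrade, the load percentage can be polled until it reaches 100%. This is a minimal sketch, not an official procedure. It assumes the progress getter returns a dict shaped like `{'loading_progress': '42%'}`, which is what pymilvus's `utility.loading_progress` is understood to return; `wait_until_loaded`, `parse_percent`, the host, the port and the collection name are illustrative placeholders.

```python
import time
from typing import Callable


def parse_percent(progress: dict) -> int:
    """Convert {'loading_progress': '42%'} to the integer 42."""
    return int(str(progress["loading_progress"]).rstrip("%"))


def wait_until_loaded(
    get_progress: Callable[[], dict],
    timeout_s: float = 600.0,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll get_progress until it reports 100% or the timeout expires."""
    deadline = clock() + timeout_s
    while True:
        if parse_percent(get_progress()) >= 100:
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


# Possible wiring with pymilvus (assumed names, requires a running server):
#   from pymilvus import connections, utility
#   connections.connect(host="localhost", port="19530")
#   wait_until_loaded(lambda: utility.loading_progress("my_collection"))
```

The getter is passed in as a callable so the polling logic can be exercised without a Milvus server; since recovery is reported as slow in this thread, a generous timeout is probably appropriate.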

@xiaofan-luan
Collaborator

rocksmq is slow to recover.
Alternatively, you can remove the rocksdb directory to speed up recovery (this could lose some recently written, near-line data).

@allonx
Author

allonx commented Nov 13, 2024

@allonx

This is a rocksmq bug that has already been fixed:

#36618

Try upgrading to the latest 2.4 release and see if that fixes it.

Yeah, I upgraded Milvus to version 2.4.15 and it seems to be working, but the recovery is slow. How can I remove the rocksdb directory? Where is it located?
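
The thread does not state where the directory lives. One hedged way to look for candidates is to scan the host directory mounted into the Milvus container for folders that look like RocksDB databases, which normally contain a `CURRENT` file and at least one `MANIFEST-*` file. The sketch below only lists candidates and deletes nothing; `find_rocksdb_dirs` is an illustrative name, and any hit should be confirmed against the rocksmq path in your Milvus configuration before it is touched.

```python
import os
from typing import List


def find_rocksdb_dirs(root: str) -> List[str]:
    """Return directories under root that contain RocksDB marker files."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        has_current = "CURRENT" in filenames
        has_manifest = any(name.startswith("MANIFEST-") for name in filenames)
        # Both markers together are a strong hint of a RocksDB database.
        if has_current and has_manifest:
            hits.append(dirpath)
    return sorted(hits)
```

Calling `find_rocksdb_dirs` on the volume directory used by your docker-compose file prints nothing by itself; iterate over the returned list to inspect each path. Stopping the containers and backing up a directory before removing it is the safer order, since the maintainer notes that removal can lose data.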
