GetCollStatResponseWrapper randomly returns 0 size for collections in 2.3.x #1038
Comments
I tried to debug it further, and now I have two identical collections of size 27 (with different names), but |
As we know, when users call insert() to insert entities into a collection, the insert request is passed to Pulsar and consumed asynchronously by the querynode/datanode. The datanode accumulates entities in a memory buffer; once the buffer size exceeds a threshold, the datanode flushes the buffer into a sealed segment. A segment's row count is recorded in Etcd only after the sealed segment has been persisted. So the number returned by MilvusClient.getCollectionStatistics() is not accurate. This is an example of using MilvusClientV2 to get the row count:
|
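To picture why a small collection can report 0, here is a toy model in plain Java of the buffering described above. It is only a sketch and does not use the Milvus SDK: `ToyCollection`, `statisticsRowCount()`, `queryCount()` and the threshold of 1000 rows are hypothetical names and values made up for illustration, and the model assumes that a count query also sees rows that are still in the memory buffer.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: it mimics the flush behaviour described in this thread and
// does not use or reproduce any Milvus SDK or server code.
public class RowCountSketch {

    /** Hypothetical stand-in for a collection whose inserts are buffered before being flushed. */
    static class ToyCollection {
        private final int flushThreshold;                                // rows per automatic flush (made-up value)
        private final List<Integer> sealedSegments = new ArrayList<>();  // row counts of "persisted" segments
        private int bufferedRows = 0;                                    // rows still in the memory buffer

        ToyCollection(int flushThreshold) {
            this.flushThreshold = flushThreshold;
        }

        /** Insert one row; seal the buffer automatically once it reaches the threshold. */
        void insert() {
            bufferedRows++;
            if (bufferedRows >= flushThreshold) {
                flush();
            }
        }

        /** Persist whatever is buffered as a sealed segment (no-op for an empty buffer). */
        void flush() {
            if (bufferedRows > 0) {
                sealedSegments.add(bufferedRows);
                bufferedRows = 0;
            }
        }

        /** Analogue of the statistics call: sees only persisted (sealed) segments. */
        long statisticsRowCount() {
            long total = 0;
            for (int rows : sealedSegments) {
                total += rows;
            }
            return total;
        }

        /** Analogue of a count query (model assumption): sealed segments plus buffered rows. */
        long queryCount() {
            return statisticsRowCount() + bufferedRows;
        }
    }

    public static void main(String[] args) {
        ToyCollection small = new ToyCollection(1000);
        for (int i = 0; i < 27; i++) {
            small.insert();
        }
        System.out.println("statistics before flush: " + small.statisticsRowCount()); // prints 0
        System.out.println("count query:             " + small.queryCount());         // prints 27
        small.flush();
        System.out.println("statistics after flush:  " + small.statisticsRowCount()); // prints 27
    }
}
```

In this model, 27 rows never reach the threshold, so the statistics-style count stays at 0 until something triggers a flush, while the count-style query already reports 27.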
Thank you, @yhmo. We’ll proceed with this approach. Could you also let me know if there are any plans to deprecate |
getCollectionStatistics() is much faster than query("count(*)") because getCollectionStatistics() reads the number directly from Etcd, whereas query() requires the collection to be loaded and iterates over all the segments to sum up the number. Sometimes users only want a rough number and don't intend to load the collection. So I think getCollectionStatistics() should not be marked as deprecated. In the Python SDK, Collection.num_entities is not deprecated either: |
Hi,
The GetCollStatResponseWrapper randomly returns a zero row count for some collections. For others it still works ok, so it's unclear what the reason is.
For example, here is the collection in a format compatible with LangChain:
The real row count:
The Java code that returns 0:
I use SDK 2.3.4, which is tied to LangChain4J.