Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

support concurrent read and write for HNSW #178

Merged
merged 30 commits into from
Jan 3, 2025

Conversation

inabao
Copy link
Collaborator

@inabao inabao commented Dec 2, 2024

No description provided.

jinjiabao.jjb added 3 commits December 2, 2024 15:49
Signed-off-by: jinjiabao.jjb <[email protected]>
Signed-off-by: jinjiabao.jjb <[email protected]>
Signed-off-by: jinjiabao.jjb <[email protected]>
@inabao inabao requested a review from LHT129 December 2, 2024 11:21
@inabao inabao self-assigned this Dec 2, 2024
@inabao inabao requested a review from ShawnShawnYou December 2, 2024 11:22
src/index/hnsw.cpp Show resolved Hide resolved
tests/test_multi_thread.cpp Outdated Show resolved Hide resolved
src/algorithm/hnswlib/visited_list_pool.h Outdated Show resolved Hide resolved
jinjiabao.jjb added 6 commits December 7, 2024 13:46
Signed-off-by: jinjiabao.jjb <[email protected]>
Signed-off-by: jinjiabao.jjb <[email protected]>
Signed-off-by: jinjiabao.jjb <[email protected]>
Signed-off-by: jinjiabao.jjb <[email protected]>
Signed-off-by: jinjiabao.jjb <[email protected]>
@@ -211,7 +211,6 @@ TEST_CASE("HNSW Multi-threading", "[ft][hnsw]") {
->Float32Vectors(data.get() + i * dim)
->Owner(false);
auto add_res = index->Add(dataset);
REQUIRE(add_res.has_value());
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

REQUIRE is not thread-safe. If add_res.has_value() is false, then add_res.value().size() will cause a core dump.

@@ -241,7 +240,7 @@ TEST_CASE("HNSW Multi-threading", "[ft][hnsw]") {

float recall = correct / max_elements;
std::cout << index->GetStats() << std::endl;
REQUIRE(recall == 1);
REQUIRE(recall > 0.96);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously, it was not built concurrently, so the recall rate could be 1. However, a concurrent build cannot achieve such a high recall rate.

jinjiabao.jjb added 2 commits December 9, 2024 00:16
src/algorithm/hnswlib/hnswalg.cpp Show resolved Hide resolved
int level,
const InnerIdType* neighbors,
size_t neigbor_count) {
linklistsizeint* ll_cur = getLinklistAtLevel(internal_id, level);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto, getLinklistAtLevel is not thread-safe

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part is read-only, and subsequent modifications will be placed under the protection of the lock

Copy link
Collaborator

@jiaweizone jiaweizone Dec 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens when one thread reads while another thread modifies ?

I recommend you to complete the revision directly, otherwise the reviewer cannot make a judgment on thread safety

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added the lock

int level,
InnerIdType neighbor,
size_t max_degree) {
linklistsizeint* ll_cur = getLinklistAtLevel(internal_id, level);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

src/algorithm/hnswlib/hnswalg.cpp Show resolved Hide resolved
continue;
for (size_t i = 0; i < cand_neighbors.size(); i++) {
auto id = cand_neighbors[i];
auto& in_edges = getEdges(id, level);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

Signed-off-by: jinjiabao.jjb <[email protected]>
@inabao inabao added the kind/improvement Code improvements (variable/function renaming, refactoring, etc. ) label Dec 16, 2024
jinjiabao.jjb added 2 commits December 22, 2024 21:16
jinjiabao.jjb added 2 commits December 24, 2024 18:46
Signed-off-by: jinjiabao.jjb <[email protected]>
Signed-off-by: jinjiabao.jjb <[email protected]>
const InnerIdType* neighbors,
size_t neigbor_count) {
std::unique_lock lock(link_list_locks_[internal_id]);
linklistsizeint* ll_cur = getLinklistAtLevel(internal_id, level);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getLinklistAtLevel is thread-safe ?
the sub lock link_list_locks_[internal_id] can not protect the data_level0_memory_ concurrent access
eg. data_level0_memory_->Resize()

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function getLinklistAtLevel does not access data_level0_memory_; it accesses the graph, while data_level0_memory_ stores the raw data.

InnerIdType neighbor,
size_t max_degree) {
std::unique_lock lock(link_list_locks_[internal_id]);
linklistsizeint* ll_cur = getLinklistAtLevel(internal_id, level);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

jinjiabao.jjb added 2 commits December 26, 2024 15:10
Copy link

codecov bot commented Dec 31, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

see 39 files with indirect coverage changes

Copy link
Collaborator

@LHT129 LHT129 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Collaborator

@jiaweizone jiaweizone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

src/algorithm/hnswlib/hnswalg.cpp Show resolved Hide resolved
// Locks operations with element by label value
mutable std::shared_mutex resize_mutex_{};

mutable std::shared_mutex global_{};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

global_ could make confused, rename to level_mutex_ or multi_level_mutex_

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment on lines 108 to 109
mutable std::shared_mutex label_lookup_lock_{}; // lock for label_lookup_
vsag::UnorderedMap<LabelType, InnerIdType> label_lookup_;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move these two line up, near other locks

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment on lines 78 to 80
mutable std::shared_mutex resize_mutex_{};

mutable std::shared_mutex global_{};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add more comments about the scope of these locks

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Signed-off-by: jinjiabao.jjb <[email protected]>
Copy link
Collaborator

@wxyucs wxyucs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Signed-off-by: jinjiabao.jjb <[email protected]>
@inabao inabao merged commit b59cec9 into main Jan 3, 2025
10 checks passed
@inabao inabao deleted the enable-concurrency-for-read-and-write branch January 3, 2025 08:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/improvement Code improvements (variable/function renaming, refactoring, etc. ) size/L version/0.13
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants