Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feature!: verify and set default VamanaBuildParameters #96

Merged
merged 44 commits into from
Mar 28, 2025

Conversation

yuejiaointel
Copy link
Contributor

@yuejiaointel yuejiaointel commented Mar 18, 2025

BREAKING CHANGE: Removed the deprecated num_threads argument and added a use_full_search_history argument in VamanaBuildParameters in Python bindings. This change ensures consistency between the Python API and the C++ implementation in include/svs/index/vamana/build_params.h.

Added default value setting and checking based on doc requirement, and added additional tests.

Pin CMake < 4 as a workaround until our dependencies require CMake >= 3.5.

@yuejiaointel yuejiaointel marked this pull request as ready for review March 21, 2025 05:33
@ibhati ibhati requested review from mihaic and dian-lun-lin March 21, 2025 19:32

// Check User set values
// Validate number parameters are positive
if (parameters.alpha < 0.0f) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (parameters.alpha < 0.0f) {
if (parameters.alpha <= 0.0f) {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, fixed!

CATCH_REQUIRE(index.get_max_candidates() == 64);
CATCH_REQUIRE(index.get_full_search_history() == true);
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these tests should go to svs/index/vamana/index.cpp where the logic is implemented. Also, can we add similar tests for dynamic_index?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, moved and added similar tests in dynamic_index_2.cpp

Copy link
Member

@mihaic mihaic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good work, @yuejiaointel! I look forward to not having to care about the default value of alpha. :) I think we can make a couple of improvements. Please see my suggestions.

py::arg("prune_to") = std::numeric_limits<size_t>::max(),
py::arg("num_threads") = std::numeric_limits<size_t>::max(),
py::arg("alpha") = svs::FLOAT_MAX,
py::arg("graph_max_degree") = svs::UNSIGNED_INTEGER_MAX,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are not setting graph_max_degree based on other parameters. The default is always 32. Let's leave it as 32 so users immediately see what the default is.

Suggested change
py::arg("graph_max_degree") = svs::UNSIGNED_INTEGER_MAX,
py::arg("graph_max_degree") = 32,

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thx, renamed it to be GRAPH_MAX_DEGREE_DEFAULT with value 32

py::arg("num_threads") = std::numeric_limits<size_t>::max(),
py::arg("alpha") = svs::FLOAT_MAX,
py::arg("graph_max_degree") = svs::UNSIGNED_INTEGER_MAX,
py::arg("window_size") = svs::UNSIGNED_INTEGER_MAX,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as graph_max_degree.

Suggested change
py::arg("window_size") = svs::UNSIGNED_INTEGER_MAX,
py::arg("window_size") = 64,

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thx, renamed it to be WINDOW_SIZE_DEFAULT with value 64

py::arg("graph_max_degree") = svs::UNSIGNED_INTEGER_MAX,
py::arg("window_size") = svs::UNSIGNED_INTEGER_MAX,
py::arg("max_candidate_pool_size") = svs::UNSIGNED_INTEGER_MAX,
py::arg("prune_to") = svs::UNSIGNED_INTEGER_MAX,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
py::arg("prune_to") = svs::UNSIGNED_INTEGER_MAX,
py::arg("prune_to") = svs::UNSIGNED_INTEGER_PLACEHOLDER,

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thx, renamed!

Copy link
Member

@mihaic mihaic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for accommodating my suggestions. I think the code looks good now. We should also update the docs.

// Maximum values used as default initializers
inline constexpr size_t UNSIGNED_INTEGER_PLACEHOLDER = std::numeric_limits<size_t>::max();
inline constexpr float FLOAT_PLACEHOLDER = std::numeric_limits<float>::max();
inline constexpr float GRAPH_MAX_DEGREE_DEFAULT = 32;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please look at the docs corresponding to the PR and confirm these *_DEFAULT values are visible? Users should be able to see the default value without reading the code. Ideally, there should be numbers visible in the VamanaBuildParameters doc as currently; alternatively, there should be separate doc entries for the *_DEFAULT consts.

if (parameters.alpha == svs::FLOAT_PLACEHOLDER) {
// Check if it's a supported distance type
if (is_L2) {
parameters.alpha = 1.2f;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea with the *_DEFAULT values! (I was only suggesting numerical values). What do you think about adding two more for the two alpha values for maintainability, e.g., ALPHA_MAXIMIZE_DEFAULT and ALPHA_MINIMIZE_DEFAULT? Feel free to say no.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, added!

@mihaic
Copy link
Member

mihaic commented Mar 25, 2025

/intelci

@yuejiaointel
Copy link
Contributor Author

/intelci
set ref=fix_set_default_build_params

@yuejiaointel
Copy link
Contributor Author

/intelci

@yuejiaointel
Copy link
Contributor Author

/intelci
set ref=fix_set_default_build_params

3 similar comments
@yuejiaointel
Copy link
Contributor Author

/intelci
set ref=fix_set_default_build_params

@yuejiaointel
Copy link
Contributor Author

/intelci
set ref=fix_set_default_build_params

@yuejiaointel
Copy link
Contributor Author

/intelci
set ref=fix_set_default_build_params

@@ -159,3 +162,14 @@ inline constexpr bool have_avx512_avx2 = true;
#endif

} // namespace svs::arch

namespace svs {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the good work.

Since some of these parameters are vamana-specific, could we move these parameters to svs::index::vamana namespace? Or change the name to Vamana specific.

For example, GRAPH_MAX_DEGREE_DEFAULT -> VAMANA_GRAPH_MAX_DEGREE_DEFAULT.
ALPHA_*, USE_FULL_SEARCH_HISTORY_DEFAULT, WINDOW_SIZE_DEFAULT should also be changed to Vamana specific

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thx for suggestion, fixed!

@yuejiaointel
Copy link
Contributor Author

/intelci
set ref=fix_set_default_build_params_test_back

@yuejiaointel
Copy link
Contributor Author

/intelci
set ref=fix_set_default_build_params_test_back

@yuejiaointel
Copy link
Contributor Author

/intelci
set ref=fix_set_default_build_params_test_back

Copy link
Member

@mihaic mihaic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The signature for VamanaBuildParameters looks good in the docs! That was my main concern. I am leaving suggestions for removing duplication like the repetition of defaults for the defaults that are shown in the signature and for reStructuredText (vs Markdown) formatting.

@mihaic mihaic changed the title feature: verify and set default VamanaBuildParameters feature!: verify and set default VamanaBuildParameters Mar 28, 2025
Copy link
Member

@mihaic mihaic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for your contribution (and for your patience), @yuejiaointel! SVS users will benefit greatly from the new defaults.

@mihaic
Copy link
Member

mihaic commented Mar 28, 2025

@yuejiaointel, please try changing the CIBuildWheel workflow to use Ubuntu 22.04:
https://github.com/intel/ScalableVectorSearch/actions/runs/14119826155/workflow?pr=96#L33

@mihaic
Copy link
Member

mihaic commented Mar 28, 2025

Ubuntu 24.04 didn't help...

@yuejiaointel
Copy link
Contributor Author

/intelci
set ref=fix_set_default_build_params_test_back

@mihaic mihaic merged commit f2380c1 into intel:main Mar 28, 2025
14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants