[FEA] UMAP API for building with batched NN Descent #6022
cpp/cmake/thirdparty/get_raft.cmake
@@ -82,8 +82,8 @@ endfunction()
# To use a different RAFT locally, set the CMake variable
# CPM_raft_SOURCE=/path/to/local/raft
find_and_configure_raft(VERSION ${CUML_MIN_VERSION_raft}
FORK rapidsai
PINNED_TAG branch-${CUML_BRANCH_VERSION_raft}
FORK jinsolp
Just a reminder this needs to be reverted before this PR is merged
Yep! I'll revert these after the raft PR is merged and the package is released : )
/merge
Description
Adds the following parameter as part of build_kwds:
n_clusters: number of clusters to use when batching. A larger number of clusters reduces GPU memory usage. Defaults to 1 (no batching).
Results show consistent trustworthiness scores with and without batching.
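A minimal sketch of how the new option might be passed, based only on the description above. The helper names batched_umap_params and fit_batched_umap are hypothetical, and the use of build_algo="nn_descent" together with an n_clusters key in build_kwds is an assumption drawn from this PR's title and description; the released cuML API may name things differently.

```python
def batched_umap_params(n_clusters=1):
    """Build UMAP constructor keywords for batched NN Descent (hypothetical helper).

    n_clusters=1 mirrors the stated default, which means no batching.
    """
    if n_clusters < 1:
        raise ValueError("n_clusters must be >= 1")
    return {
        "build_algo": "nn_descent",              # assumption: NN Descent graph build
        "build_kwds": {"n_clusters": n_clusters},  # key name taken from this PR
    }


def fit_batched_umap(X, n_clusters=4):
    """Fit UMAP with batching enabled. Needs a CUDA GPU and cuML installed."""
    from cuml.manifold import UMAP  # imported lazily so the helper above works anywhere

    model = UMAP(**batched_umap_params(n_clusters))
    return model.fit_transform(X)


print(batched_umap_params(4))
```

Raising n_clusters trades some extra bookkeeping for a smaller GPU working set, per the description; the right value depends on the dataset and the available device memory.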
Also note that UMAP can now run on datasets that don't fit on the GPU. Keeping the dataset on the host and enabling batching allows UMAP to run on a 50M x 768 dataset (153 GB).
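The 153 GB figure can be checked with a quick calculation, assuming float32 elements (4 bytes each), which is an assumption consistent with the stated size. The per-batch estimate below is a deliberately crude illustration, not how the implementation accounts for memory: it assumes equally sized clusters and ignores overlap between clusters and the memory used by the graph itself.

```python
def dataset_bytes(n_rows, n_cols, itemsize=4):
    """Raw size of a dense matrix in bytes (itemsize=4 assumes float32)."""
    return n_rows * n_cols * itemsize


def rough_batch_bytes(total_bytes, n_clusters):
    """Crude per-batch size assuming equal clusters (illustrative assumption only)."""
    return -(-total_bytes // n_clusters)  # ceiling division


total = dataset_bytes(50_000_000, 768)
print(f"full dataset: {total / 1e9:.1f} GB")  # full dataset: 153.6 GB

for k in (1, 8, 32):
    print(f"n_clusters={k}: ~{rough_batch_bytes(total, k) / 1e9:.1f} GB per batch")
```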
Notes
This PR in raft needs to be merged before this PR