Commit
Avoid large device allocation in UMAP with nndescent (#6292)
Currently `NNDescent` returns two arrays:

- `graph.graph()`: (n x graph_degree) on host
- `graph.distances()`: (n x graph_degree) on device

Downstream, the rest of UMAP wants both of these to be device arrays of shape (n x n_neighbors). Currently we copy `graph.graph()` to a temporary device array, then slice it and copy the result to the output array `out.knn_indices`.

Ideally we'd force `graph_degree = n_neighbors` to avoid the slicing entirely (and reduce the size of the intermediate results). However, there currently seems to be a bug in `NNDescent` where reducing `graph_degree` to `n_neighbors` causes a significant decrease in result quality. So for now we need to keep the slicing around.

We can avoid allocating the temporary device array, though, by doing the slicing on host instead. This avoids allocating a (n x graph_degree) device array entirely; for large `n` this can be a significant savings (47 GiB on one test problem I was trying).

We should still fix the `graph_degree` issue, but for now this should help unblock running UMAP on very large datasets.

Authors:
- Jim Crist-Harif (https://github.com/jcrist)

Approvers:
- Divye Gala (https://github.com/divyegala)
- William Hicks (https://github.com/wphicks)

URL: #6292
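
The host-side slicing described above can be illustrated with a minimal sketch. This is not the code from the commit: it uses plain Python lists in place of the host and device buffers, the helper names `slice_rows_on_host` and `buffer_bytes` are hypothetical, and it assumes the slice keeps the leading `n_neighbors` columns of each row.

```python
def slice_rows_on_host(graph, n_neighbors):
    """Keep the first n_neighbors entries of every row of a host-side graph.

    `graph` stands in for the (n x graph_degree) host array; the result has
    shape (n x n_neighbors) and is the only thing that would then need to be
    copied to the device. Assumption: the slice keeps the leading columns.
    """
    if any(len(row) < n_neighbors for row in graph):
        raise ValueError("n_neighbors exceeds graph_degree")
    return [row[:n_neighbors] for row in graph]


def buffer_bytes(n_rows, n_cols, itemsize):
    """Size in bytes of a dense (n_rows x n_cols) buffer."""
    return n_rows * n_cols * itemsize


# Toy input: n = 3 rows, graph_degree = 4, n_neighbors = 2.
graph = [
    [0, 3, 2, 1],
    [1, 0, 2, 3],
    [2, 1, 3, 0],
]
knn_indices = slice_rows_on_host(graph, n_neighbors=2)

# Size of the temporary that slicing on host makes unnecessary,
# assuming (hypothetically) 8-byte indices.
avoided = buffer_bytes(len(graph), len(graph[0]), itemsize=8)
```

Because the slice happens before the transfer, only the (n x n_neighbors) result ever needs device memory, and the (n x graph_degree) temporary is never allocated.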