Hi,
I am interested in producing a clustering similar to the one you did with arXiv. I'm working with a set of web pages from Common Crawl (~6M URLs), which I have reduced to embeddings using this. For the arXiv project, how did you decide on the values of `node_embedding_dim`, `neighbor_scale`, and `n_neighbors`? Or, at least, what are reasonable ranges for them, so that I can search in those areas? Currently about 65% of the points end up in no cluster, even with `noise_level=0`.
thanks