Clustering and MPI: multiple copy of same models #219

Open
ichem001 opened this issue Nov 2, 2016 · 2 comments

Comments

@ichem001

ichem001 commented Nov 2, 2016

Input: a set of models to be clustered
Output: clusters that together contain more models than the input
Method: k-means clustering using MPI
Assessment: the total number of models across the output clusters exceeds the number of input models.
It looks as though each process runs its own clustering and the per-process results are then somehow merged together.
Ideally, MPI clustering would have each process take care of one of the K clusters.
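
To illustrate the division of work suggested above, here is a minimal sketch in plain Python. It is not the library's implementation: the function names, the round-robin split, and the `labels` list (one cluster index per model) are all assumptions made for illustration. The point is only that if each cluster is owned by exactly one rank, every model is written exactly once.

```python
from collections import defaultdict


def owned_clusters(n_clusters, rank, n_ranks):
    """Cluster indices owned by `rank` under a round-robin split (hypothetical scheme)."""
    return list(range(rank, n_clusters, n_ranks))


def models_written_by(rank, n_ranks, n_clusters, labels):
    """Map each cluster owned by `rank` to the model ids it would write.

    `labels[i]` is the cluster index assigned to model i (assumed input).
    """
    mine = set(owned_clusters(n_clusters, rank, n_ranks))
    out = defaultdict(list)
    for model_id, cluster in enumerate(labels):
        if cluster in mine:
            out[cluster].append(model_id)
    return dict(out)


if __name__ == "__main__":
    labels = [0, 2, 1, 0, 2, 1, 1]  # made-up assignments for 7 models
    n_ranks, n_clusters = 2, 3
    written = []
    for rank in range(n_ranks):
        for members in models_written_by(rank, n_ranks, n_clusters, labels).values():
            written.extend(members)
    # Every model appears exactly once across all ranks.
    assert sorted(written) == list(range(len(labels)))
```

With this kind of ownership rule, the total number of models written across all ranks equals the number of input models regardless of how many ranks are used.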

@sethaxen
Contributor

sethaxen commented Nov 2, 2016

What do you mean by "more models than you are clustering"? Do you mean models are duplicated in the clusters?

@shruthivis

I have the same problem but have not been able to reproduce it on an example; it could be stochastic. If we start clustering with, for example, 1000 models, then after k-means clustering for some k, the total number of models in the output cluster.0, cluster.1, ... directories is more than 1000. Some models get copied. The problem was not seen when running on a single core, so it appears to be MPI-related.
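
A small diagnostic along these lines might help confirm which models are copied. This is only a sketch under stated assumptions: it assumes each model is stored as a single file inside the `cluster.N` directories and that a duplicated model has byte-identical contents. The function names are hypothetical and not part of any library.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path):
    """SHA-256 of a file's contents, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicate_models(output_dir):
    """Return groups of relative paths whose contents are identical.

    Only files directly inside directories named cluster.* are inspected
    (assumed layout, based on the directory names mentioned above).
    """
    by_digest = defaultdict(list)
    for entry in sorted(os.listdir(output_dir)):
        cluster_dir = os.path.join(output_dir, entry)
        if not (entry.startswith("cluster.") and os.path.isdir(cluster_dir)):
            continue
        for name in sorted(os.listdir(cluster_dir)):
            path = os.path.join(cluster_dir, name)
            if os.path.isfile(path):
                by_digest[file_digest(path)].append(os.path.join(entry, name))
    return [paths for paths in by_digest.values() if len(paths) > 1]


def count_models(output_dir):
    """Total number of files across all cluster.* directories."""
    total = 0
    for entry in os.listdir(output_dir):
        cluster_dir = os.path.join(output_dir, entry)
        if entry.startswith("cluster.") and os.path.isdir(cluster_dir):
            total += sum(
                os.path.isfile(os.path.join(cluster_dir, n))
                for n in os.listdir(cluster_dir)
            )
    return total
```

Comparing `count_models(...)` with the number of input models, and listing the groups from `find_duplicate_models(...)`, for a single-core run and an MPI run on the same input would show whether the extra models are exact copies and which clusters they land in.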
