K-Means Clustering does not cluster #100

Open
ctasoluk opened this issue Aug 1, 2016 · 1 comment

ctasoluk commented Aug 1, 2016

I have been trying simple K-Means clustering with k = 2, and it always puts every point into a single cluster. Here is the data set to be clustered:

/**
* The data to be clustered.
*/
public static final double[][] DATA = { {2617.83}, {5885.6}, {1690.71}, {3162.3}, {2180.97},
{1913.49},{2493.73},{1341.28},{4972.91},{2098.54},{3645.07},{1554.69},{1483.03},{339.25},
{12153.81},{1082.09},{1266.5}
};

Note that when you remove the last element {1266.5}, or place it at a different position in the DATA array, you get two clusters:

*** Cluster 1 ***
[2617.83]
[1690.71]
[3162.3]
[2180.97]
[1913.49]
[2493.73]
[1341.28]
[2098.54]
[3645.07]
[1554.69]
[1483.03]
[339.25]
[1266.5]
[1082.09]
*** Cluster 2 ***
[5885.6]
[4972.91]
[12153.81]

Here is the SimpleKMeans example with the problematic data set, so that you can easily replicate the problem:

import java.util.Arrays;

import org.encog.ml.MLCluster;
import org.encog.ml.data.MLDataPair;
import org.encog.ml.data.MLDataSet;
import org.encog.ml.data.basic.BasicMLData;
import org.encog.ml.data.basic.BasicMLDataPair;
import org.encog.ml.data.basic.BasicMLDataSet;
import org.encog.ml.kmeans.KMeansClustering;

public class SimpleKMeans {

/**
 * The data to be clustered.
 */
public static final double[][] DATA = { {2617.83}, {5885.6}, {1690.71}, {3162.3}, {2180.97},
        {1913.49},{2493.73},{1341.28},{4972.91},{2098.54},{3645.07},{1554.69},{1483.03},{339.25},
        {12153.81},{1082.09},{1266.5}
};

/**
 * The main method.
 * @param args Arguments are not used.
 */
public static void main(final String[] args) {

    final BasicMLDataSet set = new BasicMLDataSet();

    for (final double[] element : SimpleKMeans.DATA) {
        set.add(new BasicMLData(element));
    }

    final KMeansClustering kmeans = new KMeansClustering(2, set);

    kmeans.iteration(100);
    //System.out.println("Final WCSS: " + kmeans.getWCSS());

    // Display the cluster
    int i = 1;
    for (final MLCluster cluster : kmeans.getClusters()) {
        System.out.println("*** Cluster " + (i++) + " ***");
        final MLDataSet ds = cluster.createDataSet();
        final MLDataPair pair = BasicMLDataPair.createPair(
                ds.getInputSize(), ds.getIdealSize());
        for (int j = 0; j < ds.getRecordCount(); j++) {
            ds.getRecord(j, pair);
            System.out.println(Arrays.toString(pair.getInputArray()));
        }
    }
}

}
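
As a cross-check that does not depend on Encog, here is a minimal sketch of plain Lloyd's k-means on the same 17 values. It is not Encog's implementation: the class name KMeans1DSketch, the helper methods, and the deterministic initialization (centroids spread from the smallest to the largest value) are all assumptions made for illustration. Because k-means results depend on the initial centroids, the partition it finds need not match the two-cluster listing above. The point is only that, for this data, a straightforward implementation leaves neither cluster empty.

```java
import java.util.Arrays;

/** Plain Lloyd's k-means on 1-D data. Illustrative sketch only, not Encog's code. */
final class KMeans1DSketch {

    /** Same values as DATA in the report, flattened to one dimension. */
    static final double[] DATA = {
        2617.83, 5885.6, 1690.71, 3162.3, 2180.97, 1913.49, 2493.73, 1341.28, 4972.91,
        2098.54, 3645.07, 1554.69, 1483.03, 339.25, 12153.81, 1082.09, 1266.5
    };

    /** Returns a cluster index in [0, k) for each point. Assumes k >= 2. */
    static int[] cluster(final double[] data, final int k, final int maxIterations) {
        final double[] sorted = data.clone();
        Arrays.sort(sorted);

        // Assumed initialization: spread centroids evenly over the sorted values,
        // from the minimum to the maximum, so the run is deterministic.
        final double[] centroids = new double[k];
        for (int c = 0; c < k; c++) {
            centroids[c] = sorted[c * (sorted.length - 1) / (k - 1)];
        }

        final int[] labels = new int[data.length];
        Arrays.fill(labels, -1);

        for (int iter = 0; iter < maxIterations; iter++) {
            // Assignment step: attach each point to its nearest centroid.
            boolean changed = false;
            for (int i = 0; i < data.length; i++) {
                int best = 0;
                for (int c = 1; c < k; c++) {
                    if (Math.abs(data[i] - centroids[c]) < Math.abs(data[i] - centroids[best])) {
                        best = c;
                    }
                }
                if (labels[i] != best) {
                    labels[i] = best;
                    changed = true;
                }
            }
            if (!changed) {
                break; // converged: no point switched cluster
            }

            // Update step: move each centroid to the mean of its points.
            final double[] sum = new double[k];
            final int[] count = new int[k];
            for (int i = 0; i < data.length; i++) {
                sum[labels[i]] += data[i];
                count[labels[i]]++;
            }
            for (int c = 0; c < k; c++) {
                if (count[c] > 0) { // keep the old centroid if a cluster is empty
                    centroids[c] = sum[c] / count[c];
                }
            }
        }
        return labels;
    }

    /** Counts how many points carry each label. */
    static int[] sizes(final int[] labels, final int k) {
        final int[] count = new int[k];
        for (final int label : labels) {
            count[label]++;
        }
        return count;
    }

    public static void main(final String[] args) {
        final int[] labels = cluster(DATA, 2, 100);
        System.out.println("Cluster sizes: " + Arrays.toString(sizes(labels, 2)));
    }
}
```

With this particular initialization the sketch isolates the outlier 12153.81 in its own cluster, which is a different (but also stable) partition from the one listed above. Either way, both clusters are non-empty, which is what KMeansClustering(2, set) would be expected to produce as well.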

jentfoo commented Sep 12, 2017

I had this issue in 3.3.0 as well, but it seems to be fixed in 3.4. I'm not sure what specifically changed to fix it.
