Results differ from scikit-learn implementation #8

Open
areshytko opened this issue Mar 9, 2017 · 12 comments

@areshytko

areshytko commented Mar 9, 2017

t-SNE is inherently randomized, but not to this degree. This implementation consistently produces different (much worse) results than scikit-learn's Barnes-Hut implementation.

Example on the IRIS dataset:

Scikit-learn with default parameters and learning rate 100:

[image: scikit-learn IRIS embedding]

Multicore t-SNE with default parameters and learning rate 100:

[image: Multicore t-SNE IRIS embedding]

The greater distance of the setosa cluster is also supported by the general statistical properties of the dataset (and by other embedding algorithms), so scikit-learn's results are more consistent with the original manifold structure.
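
A quick way to sanity-check that claim (my own sketch, not from the thread; it assumes scikit-learn and NumPy are installed) is to compare class-centroid distances in the raw IRIS feature space. If setosa's centroid sits much farther from the other two classes than they sit from each other, an embedding that pushes setosa far away is the more faithful one:

```python
# Hedged sketch: compare class-centroid distances on the raw IRIS features.
import numpy as np
from sklearn.datasets import load_iris

iris = load_iris()
X, y = iris.data, iris.target
# one centroid (mean feature vector) per class: setosa, versicolor, virginica
centroids = np.array([X[y == c].mean(axis=0) for c in range(3)])

for i in range(3):
    for j in range(i + 1, 3):
        dist = np.linalg.norm(centroids[i] - centroids[j])
        print(f"{iris.target_names[i]} <-> {iris.target_names[j]}: {dist:.2f}")
```

The setosa-to-versicolor and setosa-to-virginica distances come out about two to three times the versicolor-to-virginica distance, which is consistent with the scikit-learn picture.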

@DmitryUlyanov
Owner

Did you try py_bh_tsne or any other non-scikit-learn package? Do they also produce worse results? There can be implementation differences, different default parameters, and so on. This repo uses py_bh_tsne as its base; I fixed some errors there, but it can still be imperfect. I will give it another try and check the implementation, but I hope the scikit-learn folks improve their t-SNE's efficiency first, making this repo useless (which is how it should be).

@areshytko
Author

areshytko commented Mar 9, 2017

Yes, unfortunately scikit-learn's t-SNE is currently unusable for anything beyond toy datasets like this. And yes, it's strange: the output shows that the algorithm quickly converged to a low error and then made no further progress.


Learning embedding...
Iteration 50: error is 43.405481 (50 iterations in 0.00 seconds)
Iteration 100: error is 44.709520 (50 iterations in 0.00 seconds)
Iteration 150: error is 43.567784 (50 iterations in 0.00 seconds)
Iteration 200: error is 42.564679 (50 iterations in 0.00 seconds)
Iteration 250: error is 1.118502 (50 iterations in 0.00 seconds)
Iteration 300: error is 0.238091 (50 iterations in 0.00 seconds)
Iteration 350: error is 0.117268 (50 iterations in 0.00 seconds)
Iteration 400: error is 0.120770 (50 iterations in 0.00 seconds)
Iteration 450: error is 0.121062 (50 iterations in 0.00 seconds)
Iteration 500: error is 0.121366 (50 iterations in 0.00 seconds)
Iteration 550: error is 0.121098 (50 iterations in 0.00 seconds)
Iteration 600: error is 0.121540 (50 iterations in 0.00 seconds)
Iteration 650: error is 0.121057 (50 iterations in 0.00 seconds)
Iteration 700: error is 0.120856 (50 iterations in 0.00 seconds)
Iteration 750: error is 0.121666 (50 iterations in 0.00 seconds)
Iteration 800: error is 0.121161 (50 iterations in 0.00 seconds)
Iteration 850: error is 0.121708 (50 iterations in 0.00 seconds)
Iteration 900: error is 0.121865 (50 iterations in 0.00 seconds)
Iteration 950: error is 0.122631 (50 iterations in 0.00 seconds)
Iteration 999: error is 0.121577 (50 iterations in 0.00 seconds)
Fitting performed in 0.00 seconds.

By comparison, the MNIST test example progressed slowly but steadily until the last iteration. And IRIS is a simple dataset: it is linearly separable.
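
The "linearly separable" part is easy to verify for setosa at least. A quick check I added (my own, assuming scikit-learn's copy of IRIS): a single threshold on petal length already separates setosa from the other two classes perfectly.

```python
# Hedged check (not from the thread): setosa vs. the rest of IRIS is
# separable by one threshold on petal length (feature index 2).
from sklearn.datasets import load_iris

iris = load_iris()
X, y = iris.data, iris.target
setosa_max = X[y == 0, 2].max()   # longest setosa petal length
others_min = X[y != 0, 2].min()   # shortest non-setosa petal length
print(setosa_max, others_min, setosa_max < others_min)
```

Any threshold between the two printed values splits setosa off with no errors.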

No, I haven't tried other implementations yet.

@DmitryUlyanov
Owner

DmitryUlyanov commented Mar 9, 2017 via email

@shaidams64

I also got a very different result from the sklearn implementation on the MNIST dataset:
Multicore t-SNE:

[screenshot: Multicore t-SNE MNIST embedding]

sklearn t-SNE:

[screenshot: sklearn t-SNE MNIST embedding]

@DmitryUlyanov
Owner

DmitryUlyanov commented Jul 13, 2017

Hi, the picture in the README file is a t-SNE visualization of the MNIST dataset, made with the code from this repository. Here is the code: https://github.com/DmitryUlyanov/Multicore-TSNE/blob/master/python/tests/test.py

@shaidams64

shaidams64 commented Jul 13, 2017

Hey, I loaded the dataset from sklearn and ran MulticoreTSNE on it; would that make a difference?

```python
from sklearn.datasets import load_digits
import matplotlib.pyplot as plt
from MulticoreTSNE import MulticoreTSNE as MultiTSNE

digits2 = load_digits()
m_tsne = MultiTSNE(n_jobs=4, init='pca', random_state=0)
m_y = m_tsne.fit_transform(digits2.data)
plt.scatter(m_y[:, 0], m_y[:, 1], c=digits2.target)
plt.show()
```

@DmitryUlyanov
Owner

I don't know for sure, but the format the digits are stored in can differ, e.g. [0, 1] vs. 0...255. And t-SNE does gradient descent, which may fail if the scaling and learning rate are wrong.

Try the test.py example from above; do you get a pretty image?

@shaidams64

shaidams64 commented Jul 13, 2017

Yes, it works with your example. It appears the scalings of the two datasets are different: the dataset from sklearn is 0...16, but the one in your example is [-1, 1]. So does this version only work with normalized datasets?
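
If that hypothesis is right, rescaling the sklearn digits into [-1, 1] before embedding should be enough. A minimal sketch of the rescaling (my own, not from the thread; the t-SNE call is left commented out so the snippet stands alone):

```python
# Map load_digits() pixel values from 0..16 into [-1, 1] before t-SNE.
from sklearn.datasets import load_digits

digits = load_digits()
X = digits.data                    # pixel values in 0..16
X_scaled = 2.0 * (X / 16.0) - 1.0  # now in [-1, 1]

# from MulticoreTSNE import MulticoreTSNE
# embedding = MulticoreTSNE(n_jobs=4).fit_transform(X_scaled)
print(X_scaled.min(), X_scaled.max())  # -1.0 1.0
```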

@bartimus9

bartimus9 commented Aug 8, 2017

Thank you for putting this together; it is the only multicore t-SNE application I can get to complete successfully. However, my results match shaidams64's. I have an arcsinh-transformed dataset, and an implementation of this method in R (single core) gives good results; the sklearn implementation (Python) on the same dataset returns a very similar result. This multicore implementation runs quickly but produces an indiscernible cloud of points. I have carefully aligned all of the arguments I can, and the result is the same, even when I set MulticoreTSNE to use only one core. Any recommendations on how to fix this?

EDIT: This discussion thread ends with a multicore TSNE implementation that does reproduce my results with Sklearn and Rtsne. lvdmaaten/bhtsne#18

@YubinXie

YubinXie commented Apr 2, 2018

Has this problem been solved in this multicore t-SNE?

@Ryanglambert

Ryanglambert commented Apr 2, 2018 via email

@orihomie

Hi, I'm facing the same problem now: the results of sklearn's t-SNE and yours differ with the same params.

> Yes it works with your example. It appears the scalings are different for the datasets. The dataset from sklearn is 0...16 but the one in your example is [-1,1]. So is this version working only with normalized datasets?

So, if I'm getting it right, normalizing the data should help (to make the results about the same)?
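
That's my reading as well. One hedged way to try it (my own sketch, assuming scikit-learn is available) is to standardize each feature before embedding, as an alternative to the fixed [-1, 1] rescaling mentioned above:

```python
# Hedged alternative to min-max rescaling: center every feature to zero
# mean (unit variance where the feature is not constant) before t-SNE.
from sklearn.datasets import load_digits
from sklearn.preprocessing import StandardScaler

digits = load_digits()
X_std = StandardScaler().fit_transform(digits.data)

# from MulticoreTSNE import MulticoreTSNE
# embedding = MulticoreTSNE(n_jobs=4).fit_transform(X_std)
print(abs(X_std.mean()) < 1e-9)  # every column is centered
```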
