Cannot reproduce results #13

Open
RomuloPaiva01 opened this issue Aug 18, 2020 · 1 comment
Comments

@RomuloPaiva01

RomuloPaiva01 commented Aug 18, 2020

Every time I call fit_transform I get different results.

I noticed that np.random.permutation changes the global random state, so I replaced it with np.random.RandomState(seed=seed).permutation() to avoid that.

I also noticed that np.random.seed(i) is used in run_select_features, but it changes the random state in the same way each time, so I can always restore the random_state that I had beforehand.

Even with those changes, and with the random_state identical after every call to fit_transform, I still end up with different results.
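For reference, a minimal sketch of the first observation, using only NumPy: the module-level np.random.permutation consumes the global legacy state, while a dedicated np.random.RandomState instance leaves it untouched. The helper name global_state_digest and the seed value are assumptions made for illustration; none of this is the library's own code.

```python
import numpy as np


def global_state_digest():
    # Hypothetical helper: turn NumPy's global legacy RNG state into a comparable tuple.
    kind, keys, pos, has_gauss, cached = np.random.get_state()
    return (kind, keys.tobytes(), pos, has_gauss, cached)


np.random.seed(0)
before = global_state_digest()

# A dedicated RandomState draws from its own stream, so the global state is unchanged.
seed = 42  # arbitrary value chosen for this sketch
perm_local = np.random.RandomState(seed=seed).permutation(10)
after_local = global_state_digest()

# The module-level function draws from the global stream and advances it.
perm_global = np.random.permutation(10)
after_global = global_state_digest()

print("global state unchanged by local RandomState:", before == after_local)
print("global state unchanged by np.random.permutation:", before == after_global)
```

Comparing the state before and after a call only shows that the explicit NumPy calls were isolated. It says nothing about components that keep their own generator or are seeded some other way.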

@cod3licious
Owner

Yes, randomness is used in a lot of places in the code, both explicitly in the places you've mentioned and internally (e.g. in some of the models). It is also crucial for the feature selection to use lots of randomness everywhere, to make sure a robust subset of features is selected.

If you find a way to catch all instances where randomness is used and make it possible to pass a single random seed to the model to make the results reproducible, I'd love to accept a pull request! :)
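One possible shape for such a change, as a rough sketch rather than the project's actual code: create a single parent generator from the user-supplied seed and derive a separate child seed for every run, instead of calling np.random.seed(i). The names fit_sketch and run_once and the random_state parameter are hypothetical stand-ins, and the "run" here is just a placeholder permutation.

```python
import numpy as np


def run_once(n_samples, rng):
    # Hypothetical stand-in for one selection run: it uses only the generator it is given.
    return rng.permutation(n_samples)[: n_samples // 2]


def fit_sketch(n_samples, n_runs=3, random_state=None):
    # One parent generator, created from the single user-supplied seed.
    parent = np.random.RandomState(random_state)
    results = []
    for _ in range(n_runs):
        # Derive a separate child seed per run instead of reseeding the global state.
        child_seed = int(parent.randint(0, 2**31 - 1))
        results.append(run_once(n_samples, np.random.RandomState(child_seed)))
    return results


a = fit_sketch(20, random_state=0)
b = fit_sketch(20, random_state=0)
print("same seed, same results:", all(np.array_equal(x, y) for x, y in zip(a, b)))
```

With this pattern the runs still differ from each other, so the selection keeps its randomness, but the whole sequence is fixed by one seed. Any internal model that accepts its own seed argument would need to receive a child seed in the same way for the results to be fully reproducible.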
