The accuracy of random forest is over 0.99 #7

Open
mk123qwe opened this issue Dec 30, 2019 · 1 comment

@mk123qwe

I fit a simple random forest model, like this:
from sklearn.ensemble import RandomForestClassifier
RandomForestClassifier(n_estimators=10, random_state=2019)

The inputs are the high-level features from Table IV in your paper.

[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done 16 out of 16 | elapsed: 3.2min finished
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done 16 out of 16 | elapsed: 1.2s finished
Validation Accuracy: 0.996

@jmduarte
Collaborator

How much of the training data did you use?

I tried using just 1 file (~20k events, which admittedly might not be enough) and only got up to ~80% test accuracy. On the other hand, if I evaluate on the training data, the accuracy is >99%.

How is the validation accuracy defined here?
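
To make the distinction concrete, here is a minimal sketch of scoring the same forest on its own training sample and on a held-out split. It runs on synthetic data from `make_classification`, which is an assumption standing in for the real events. The sample size, feature count, and label-noise setting are illustrative, and this is not the code from either notebook in this thread.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): NOT the dataset discussed in this issue.
# flip_y adds label noise so that a perfect held-out score is not achievable.
X, y = make_classification(
    n_samples=5000, n_features=20, n_informative=5, flip_y=0.1, random_state=2019
)

# Hold out 25% of the events; the forest never sees them during fitting.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=2019
)

# Same settings as the model quoted at the top of this issue.
clf = RandomForestClassifier(n_estimators=10, random_state=2019)
clf.fit(X_train, y_train)

# Accuracy on the training sample: fully grown trees largely memorize it.
train_acc = accuracy_score(y_train, clf.predict(X_train))
# Accuracy on the held-out sample: the number that reflects generalization.
val_acc = accuracy_score(y_val, clf.predict(X_val))

print(f"training accuracy: {train_acc:.3f}")
print(f"held-out accuracy: {val_acc:.3f}")
```

If a reported "validation accuracy" was computed on events that were also used for fitting, it would behave like the first number rather than the second.
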

My code here: https://github.com/jmduarte/HiggsToBBMachineLearning/blob/randomforest/train.ipynb
Binder link: https://mybinder.org/v2/gh/jmduarte/HiggsToBBMachineLearning/randomforest?filepath=train.ipynb

Thanks,
Javier
