The accuracy of random forest is over 0.99 #7

mk123qwe · 2019-12-30T08:32:23Z

I fit a simple random forest model, like this:
from sklearn.ensemble import RandomForestClassifier
RandomForestClassifier(n_estimators=10, random_state=2019)

Input features: the high-level features from TABLE IV in your paper.

[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done 16 out of 16 | elapsed: 3.2min finished
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done 16 out of 16 | elapsed: 1.2s finished
Validation Accuracy: 0.996

jmduarte · 2019-12-30T18:53:49Z

How much of the training data did you use?

I tried using just 1 file (~20k events, so admittedly that might not be enough) and only got up to ~80% test accuracy. On the other hand, if I evaluate on the training data, I get >99% accuracy.
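To illustrate the gap I mean, here is a minimal sketch that scores the same forest on the data it was fit on and on held-out data. The data is synthetic (`make_classification` with label noise), not the actual Higgs-to-bb features, and the split size and noise level are my own assumptions for illustration only.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): 20 features with noisy labels,
# NOT the high-level features from the paper.
X, y = make_classification(
    n_samples=2000,
    n_features=20,
    n_informative=5,
    flip_y=0.2,
    random_state=2019,
)

# Hold out 25% of the events so they are never seen during fitting.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=2019
)

# Same settings as in the first comment of this issue.
clf = RandomForestClassifier(n_estimators=10, random_state=2019)
clf.fit(X_train, y_train)

# Accuracy on the data used for fitting vs. accuracy on held-out data.
train_acc = accuracy_score(y_train, clf.predict(X_train))
test_acc = accuracy_score(y_test, clf.predict(X_test))

print(f"Training accuracy: {train_acc:.3f}")
print(f"Held-out accuracy: {test_acc:.3f}")
```

Fully grown trees can largely memorize their training sample, so the first number tends to be much higher than the second. If the 0.996 above was computed on events that were also used for fitting, that could explain the difference from what I see.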

How is the validation accuracy defined here?

My code here: https://github.com/jmduarte/HiggsToBBMachineLearning/blob/randomforest/train.ipynb
Binder link: https://mybinder.org/v2/gh/jmduarte/HiggsToBBMachineLearning/randomforest?filepath=train.ipynb

Thanks,
Javier
