
Random error in fci.random_forest_error for classification #77

Open · miranov25 opened this issue Oct 25, 2018 · 0 comments
Hello.

I have started using forest-confidence-interval. Thank you for implementing the package.
After several iterations I converged on the following usage:

errors = fci.random_forest_error(clf, k0_training, k0_test, memory_constrained=1, memory_limit=100, calibrate=0)

I have the following comments/suggestions/questions:

Memory

  • Could you use some default upper limit for the evaluation that stays below the available memory?
    • In my example with 50000 rows x 6 columns, the evaluation used >10 GB of memory.
    • I had to stop the process.
  • The memory_limit flag was not working with the default pip install (forestci 0.3).
    • After installation from source, the flag worked properly.
    • Could you update the pip package to a version with working memory limits?
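To see why the default settings can blow up, here is a rough back-of-the-envelope check (my own hypothetical helper, assuming the dominant intermediate is a dense n_train x n_test float64 matrix; the actual internals of forestci may differ):

```python
def estimate_error_matrix_bytes(n_train, n_test, dtype_bytes=8):
    # Size of one dense n_train x n_test float64 intermediate.
    return n_train * n_test * dtype_bytes

# Matching the report: 50000 training rows and a similarly sized test set
size_gb = estimate_error_matrix_bytes(50000, 50000) / 1024**3
print(f"{size_gb:.1f} GB")  # -> 18.6 GB, consistent with the >10 GB observed
```

A default memory_limit derived from such an estimate (or from the machine's available RAM) would avoid the runaway allocation.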

Errors

  • With the calibrate method I got errors O(1000) times larger than without calibration
    (~1.6±0.3 instead of ~0.001).
  • Using calibrate=0, the obtained errors look more realistic (for classification I assumed errors should be < 1).
  • Using the calibrate method I obtained a large spread of error values across repeated runs:
for i in range(5):
    errors = fci.random_forest_error(clf, k0_training, k0_test, memory_constrained=1, memory_limit=100, calibrate=1)
    print(i, errors[0:1000:200])
===> 
(0, array([1.77080289, 1.77080289, 1.77080289, 1.77080289, 1.77080289]))
(1, array([1.60437205, 1.60437205, 1.60437205, 1.60437205, 1.60437205]))
(2, array([1.00765122, 1.00765122, 1.00765122, 1.00765122, 1.00765122]))
(3, array([1.55302694, 1.55302694, 1.55302694, 1.55302694, 1.55302694]))
(4, array([1.36027949, 1.36027949, 1.36027949, 1.36027949, 1.36027949]))
  • I assume the error estimate using calibrate is overestimated. I will check whether the error estimates with calibrate=0 are realistic.
  • Is the problem with my expectation (errors < 1 for classification), or is there a problem with the calibrate method?
  • Have you tried error estimation for classification before?
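To quantify the run-to-run spread above, here is a quick standard-library check on the first error value from each of the five calibrated runs (values copied from the output above):

```python
import statistics

# First element of `errors` from each of the five calibrate=1 runs above
calibrated = [1.77080289, 1.60437205, 1.00765122, 1.55302694, 1.36027949]

print(f"mean={statistics.mean(calibrated):.2f}, stdev={statistics.stdev(calibrated):.2f}")
# -> mean=1.46, stdev=0.29
```

This matches the ~1.6±0.3 figure quoted above; a ~20% relative spread from the calibration step alone seems large for an error estimate.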

Regards
Marian
