
standard error of accuracy estimates #11

Open
oh-data-sci opened this issue Aug 28, 2019 · 0 comments

In your random forest notebook, the function cross_val_metrics contains:

    if print_results:
        for i in range(0, len(scores)):
            print("Cross validation run {0}: {1: 0.3f}".format(i, scores[i]))
        print("Accuracy: {0: 0.3f} (+/- {1: 0.3f})"\
              .format(scores.mean(), scores.std() / 2))
    else:
        return scores.mean(), scores.std() / 2

You halve the standard deviation of the fold scores and present that as the... standard error? Shouldn't the standard deviation be divided by the square root of the number of folds, in this case sqrt(10)? Alternatively, you could just report the standard deviation itself rather than half of it. Better yet, report/return scores.std() * 1.96 / np.sqrt(n_folds) as the half-width of a 95% confidence interval. With 10 folds that scales the standard deviation by 1.96 / sqrt(10) ≈ 0.62 as opposed to 0.5, so the numerical results are not drastically different.
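For concreteness, here is a minimal sketch of the suggested calculation. The helper name summarize_scores and the example scores are made up for illustration and are not from the notebook. It uses the default scores.std() (ddof=0) to match the expression above, and the 1.96 multiplier assumes the fold scores are roughly normally distributed.

```python
import numpy as np


def summarize_scores(scores, z=1.96):
    """Return (mean, standard error, CI half-width) for cross-validation scores.

    Hypothetical helper for illustration; not part of the original notebook.
    """
    scores = np.asarray(scores, dtype=float)
    n_folds = scores.size
    std = scores.std()                # population std (ddof=0), as in the issue text
    std_err = std / np.sqrt(n_folds)  # standard error of the mean over the folds
    half_width = z * std_err          # z = 1.96 gives an approximate 95% interval
    return scores.mean(), std_err, half_width


# Made-up fold accuracies, purely illustrative (10 folds).
example_scores = [0.81, 0.79, 0.84, 0.80, 0.83, 0.78, 0.82, 0.85, 0.80, 0.81]
mean, std_err, half_width = summarize_scores(example_scores)
print("Accuracy: {0:0.3f} (+/- {1:0.3f})".format(mean, half_width))
```

With 10 folds the reported half-width is about 0.62 times the standard deviation, compared with the 0.5 factor in the current code.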
