GitHub - brycehenson/bootstrap_error: A function for using bootstrapping to find the standard error in arbitrary (complicated) data analysis

Bryce M. Henson, Dong K. Shin, Kieran F. Thomas

A MATLAB function that uses bootstrapping to find the standard error in an arbitrary analysis operation. Status: The core functionality provided here is ready for use in other projects. Testing is implemented and passing for the core functionality, which provides error determination.

It takes only a moderate amount of complexity in a data analysis operation (estimation function) before it becomes difficult to determine the error in the result. Bootstrapping/resampling is a powerful statistical method that performs the analysis operation repeatedly on smaller subsets of the data in order to estimate the error in the result of the operation (estimation function) on the full data set. Further, the method is able to work with an analysis operation that only produces meaningful results when performed with many data points (such as a (non)linear fit).

The procedure is reasonably simple. Given some analysis operation (estimation function) A(x) that produces a scalar, and a dataset D:

1. Select a random sample S of length n_samp (with replacement) out of all the data collected (D, with length n_tot).
2. Compute the analysis operation A(S).
3. Repeat steps 1 to 2 many times, saving the result of each analysis operation (on the subset).
4. Calculate the standard deviation across these results and multiply by sqrt(n_samp)/sqrt(n_tot) to estimate the standard error in A(D). (This is known as mean-like scaling.)
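
The four steps above can be sketched in a few lines. This is a minimal illustration in Python using only the standard library; it is not the repository's MATLAB implementation, and the name `bootstrap_se_sketch` and its parameters are hypothetical. It checks the result against the analytic standard error of the mean, for which the mean-like scaling is exact.

```python
import math
import random
import statistics


def bootstrap_se_sketch(data, estimator, n_samp, n_rep=1000, rng=None):
    """Estimate the standard error of estimator(data) by bootstrapping.

    Hypothetical helper for illustration only (not the repository's API).
    """
    rng = rng or random.Random()
    n_tot = len(data)
    results = []
    for _ in range(n_rep):
        # Step 1: draw n_samp points from the data, with replacement.
        sample = rng.choices(data, k=n_samp)
        # Step 2: run the analysis operation on the subsample.
        results.append(estimator(sample))
    # Step 4: spread of the results, rescaled to the full data set
    # (mean-like scaling).
    return statistics.stdev(results) * math.sqrt(n_samp) / math.sqrt(n_tot)


rng = random.Random(0)
data = [rng.gauss(0.0, 1.0) for _ in range(400)]

se_boot = bootstrap_se_sketch(data, statistics.fmean, n_samp=100, n_rep=2000, rng=rng)
se_analytic = statistics.stdev(data) / math.sqrt(len(data))
print(f"bootstrap SE: {se_boot:.4f}   analytic SE of the mean: {se_analytic:.4f}")
```

For the mean the two numbers should agree to within a few percent. For a more complicated estimator there is no analytic value to compare against, which is the case the method is meant for.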

As a test, it is advisable to check that there is no trend in either the output of the function or the estimated standard error as a function of the size of the subset. The above procedure may therefore be repeated at many different fractions of the whole dataset, giving the graph below.


Figure 1: Bias analysis output. This graph can be used to reveal the bias of the estimation function with sample size and how the error in the result scales with sample size. An estimation function is mean-like if the estimated SE in the operation on the whole data set does not change with subsample fraction.
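
The scan over subsample fractions described above can be sketched as follows. Again this is Python using only the standard library, offered as an illustration under assumptions rather than the repository's code; `scan_fractions` and the chosen fractions are hypothetical. For each fraction it records the mean output of the estimator and the SE rescaled to the full data set, which are the two quantities to inspect for a trend.

```python
import math
import random
import statistics


def scan_fractions(data, estimator, fractions, n_rep=1000, rng=None):
    """Repeat the bootstrap at several subsample fractions.

    Returns a list of (fraction, mean estimator output, SE scaled to full data).
    Hypothetical helper for illustration only.
    """
    rng = rng or random.Random()
    n_tot = len(data)
    rows = []
    for frac in fractions:
        n_samp = max(2, round(frac * n_tot))
        # Resample with replacement and apply the estimator n_rep times.
        results = [estimator(rng.choices(data, k=n_samp)) for _ in range(n_rep)]
        mean_est = statistics.fmean(results)
        # Mean-like scaling of the spread to the full data set.
        se_full = statistics.stdev(results) * math.sqrt(n_samp / n_tot)
        rows.append((frac, mean_est, se_full))
    return rows


rng = random.Random(1)
data = [rng.gauss(5.0, 2.0) for _ in range(500)]
rows = scan_fractions(data, statistics.fmean, [0.1, 0.2, 0.5, 1.0], rng=rng)

print("fraction  mean(A(S))  SE scaled to full data")
for frac, mean_est, se_full in rows:
    print(f"{frac:8.2f}  {mean_est:10.4f}  {se_full:10.4f}")
```

If the estimator is unbiased and mean-like, both columns should stay flat across fractions, apart from resampling noise.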

Features

Sampling without replacement

The above uses random sampling with replacement in order to prevent biasing of the standard error estimate. It is, however, possible to use the finite sample correction from L. Isserlis, "On the Value of a Mean as Calculated from a Sample", J. Royal Stat. Soc., Vol. 81, No. 1 (Jan. 1918), pp. 75-81, to correct for the bias when using random sampling without replacement. Both methods are implemented in this work.
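
To illustrate the idea, the sketch below samples without replacement and applies the textbook finite population correction factor sqrt((n_tot - n_samp)/(n_tot - 1)). This is an assumption about the form of the correction: the expression used in the repository may differ in detail. The code is Python with the standard library only, and `bootstrap_se_no_replacement` is a hypothetical name.

```python
import math
import random
import statistics


def bootstrap_se_no_replacement(data, estimator, n_samp, n_rep=1000, rng=None):
    """Bootstrap SE using subsamples drawn without replacement.

    Hypothetical helper; assumes the textbook finite population correction.
    """
    rng = rng or random.Random()
    n_tot = len(data)
    if not 1 < n_samp < n_tot:
        raise ValueError("n_samp must satisfy 1 < n_samp < len(data)")
    # random.Random.sample draws without replacement.
    results = [estimator(rng.sample(data, n_samp)) for _ in range(n_rep)]
    # Drawing without replacement shrinks the spread of the results by this
    # factor, so divide it out before rescaling.
    fpc = math.sqrt((n_tot - n_samp) / (n_tot - 1))
    sd_corrected = statistics.stdev(results) / fpc
    # Mean-like scaling to the full data set.
    return sd_corrected * math.sqrt(n_samp / n_tot)


rng = random.Random(2)
data = [rng.gauss(0.0, 3.0) for _ in range(300)]

se_norep = bootstrap_se_no_replacement(data, statistics.fmean, n_samp=150, n_rep=2000, rng=rng)
se_analytic = statistics.stdev(data) / math.sqrt(len(data))
print(f"without replacement (corrected): {se_norep:.4f}   analytic: {se_analytic:.4f}")
```

Without the division by `fpc`, the estimate at n_samp = n_tot/2 would be too small by roughly a factor of sqrt(2), which is the bias the correction removes.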

Estimated error in the error

If you are studying the error from some analysis operation, you may need to know how significant a change in this error is. This is where it is natural to start worrying about the error in the SE. This code provides two estimates: the first assumes a normal distribution (and is unbiased); the second does not and is slightly biased.
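
As a rough illustration of what such estimates can look like, the sketch below computes the uncertainty of a sample standard deviation in two ways. These are standard textbook approximations chosen for illustration and are an assumption, not necessarily the formulas used in this repository. The first assumes normally distributed values, s/sqrt(2(n-1)). The second uses the sample fourth central moment instead of assuming normality. The code is Python with the standard library only, and the function names are hypothetical.

```python
import math
import random
import statistics


def se_of_std_normal(values):
    """Approximate error in the sample standard deviation, assuming normality."""
    n = len(values)
    return statistics.stdev(values) / math.sqrt(2 * (n - 1))


def se_of_std_moments(values):
    """Approximate error in the sample standard deviation without assuming
    normality, from the fourth central moment (plug-in estimate)."""
    n = len(values)
    mean = statistics.fmean(values)
    s2 = statistics.variance(values)
    mu4 = sum((v - mean) ** 4 for v in values) / n
    # Variance of the sample variance, with moments replaced by estimates.
    var_s2 = max(0.0, (mu4 - (n - 3) / (n - 1) * s2 ** 2) / n)
    # Propagate from s^2 to s: d(s)/d(s^2) = 1 / (2 s).
    return math.sqrt(var_s2) / (2 * math.sqrt(s2))


rng = random.Random(3)
values = [rng.gauss(0.0, 1.0) for _ in range(5000)]

err_normal = se_of_std_normal(values)
err_moments = se_of_std_moments(values)
print(f"normal-theory: {err_normal:.5f}   moment-based: {err_moments:.5f}")
```

In a bootstrap, `values` would be the saved results of the analysis operation, and the error in the SE follows by applying the same sqrt(n_samp)/sqrt(n_tot) scaling to the error in the standard deviation. For normally distributed values the two estimates should nearly coincide; a large difference suggests heavy or light tails.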

To Do

Contributors welcome! There is a lot to do to build this into a powerful tool. Drop me an email.

