Use Monte Carlo methods to estimate error on model parameters #49

Open
dwysocki opened this issue Nov 9, 2014 · 3 comments

@dwysocki
Member

dwysocki commented Nov 9, 2014

There is no generally accepted method for computing standard errors on Lasso coefficients. As a workaround, we might make use of bootstrapping.

Here we can discuss the possibilities for this addition, such as:

  • how to implement
  • which method to utilize
@dwysocki added this to the Science goals milestone Nov 9, 2014
@earlbellinger changed the title from "Use statistical bootstrapping to estimate error on amplitude coefficients" to "Use Monte Carlo methods to estimate error on model parameters" Oct 22, 2015
@earlbellinger
Contributor

Bootstrapping won't work; instead we need to use Monte Carlo. It's easy enough: we repeat the fit --num_perturbations times, each time perturbing every observation's magnitude with normal noise whose standard deviation is that observation's uncertainty. Then we report medians and standard deviations across the perturbed fits for period (if unspecified), amplitudes, phases, fitted magnitudes, and also parameters like Phi_31.
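
A minimal sketch of that loop, for discussion only. The plain least-squares Fourier fit is a stand-in for the real fitting routine, and the names `fit_fourier` and `monte_carlo_errors` are hypothetical, not existing project functions.

```python
import numpy as np

def fit_fourier(phase, mag, order=3):
    """Least-squares Fourier fit (stand-in for the project's actual model fit)."""
    k = np.arange(1, order + 1)
    angles = 2 * np.pi * np.outer(phase, k)
    design = np.hstack([np.ones((len(phase), 1)), np.sin(angles), np.cos(angles)])
    coeffs, *_ = np.linalg.lstsq(design, mag, rcond=None)
    return coeffs  # [mean magnitude, sine coefficients..., cosine coefficients...]

def monte_carlo_errors(phase, mag, mag_err, num_perturbations=100, order=3, seed=0):
    """Refit on perturbed magnitudes; return per-coefficient median and std dev."""
    rng = np.random.default_rng(seed)
    samples = np.empty((num_perturbations, 2 * order + 1))
    for i in range(num_perturbations):
        # Each observation gets normal noise scaled by its own uncertainty.
        perturbed = rng.normal(mag, mag_err)
        samples[i] = fit_fourier(phase, perturbed, order)
    return np.median(samples, axis=0), np.std(samples, axis=0)

# Synthetic light curve, purely illustrative.
phase = np.linspace(0, 1, 200, endpoint=False)
mag = 15.0 + 0.3 * np.sin(2 * np.pi * phase)
mag_err = np.full_like(phase, 0.02)
medians, stds = monte_carlo_errors(phase, mag, mag_err, num_perturbations=200)
```

Derived quantities such as phases or Phi_31 would be computed inside the loop, once per perturbed fit, before taking medians.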

Why params like Phi_31? Because the wrap to [0, 2pi) has to be applied to each perturbed fit before averaging: 1/N sum_N [(Phi_3 - 3 Phi_1) % 2pi] is not equal to [(1/N sum_N Phi_3) - (3/N sum_N Phi_1)] % 2pi.
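
A small numerical check of that inequality, assuming the usual definition Phi_31 = Phi_3 - 3 Phi_1. The phase values are made up and chosen so that Phi_31 straddles the wrap point.

```python
import numpy as np

two_pi = 2 * np.pi

# Hypothetical phases from four perturbed fits (not real data).
phi_1 = np.array([0.10, 0.12, 0.08, 0.11])
phi_3 = np.array([0.28, 0.40, 0.22, 0.35])

# Wrap each perturbed Phi_31 first, then average.
mean_of_wrapped = ((phi_3 - 3 * phi_1) % two_pi).mean()

# Average Phi_1 and Phi_3 first, then combine and wrap once.
wrapped_of_means = (phi_3.mean() - 3 * phi_1.mean()) % two_pi

print(mean_of_wrapped, wrapped_of_means)  # roughly 3.15 versus roughly 0.005
```

The two numbers disagree by about pi, so Phi_31 cannot be reconstructed afterwards from the summary statistics of Phi_1 and Phi_3; it has to be tracked per perturbation.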

@earlbellinger self-assigned this Oct 22, 2015
@dwysocki
Member Author

Just some comments on this:

  • note that we now output the fitted magnitudes to files; they've been removed from the table
    • this is actually very convenient, because instead of outputting (phase, mag) we can output (phase, mag, error)
  • any time an error is omitted (e.g. period is already provided, or --num-perturbations=0), we should put a 0 for its value, instead of omitting it
  • also, keep with naming convention, and call it --num-perturbations
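
A rough sketch of how those points could fit together. Only the flag name --num-perturbations comes from this thread; the parser setup, the helper `error_or_zeros`, and the output layout are assumptions for illustration.

```python
import argparse
import io
import numpy as np

# Hypothetical parser; argparse stores the value as args.num_perturbations.
parser = argparse.ArgumentParser()
parser.add_argument("--num-perturbations", type=int, default=0,
                    help="number of Monte Carlo perturbations (0 disables error estimates)")
args = parser.parse_args(["--num-perturbations", "0"])

def error_or_zeros(samples, size):
    """Std dev across perturbed fits, or zeros when no perturbations were run."""
    if samples is None or len(samples) == 0:
        return np.zeros(size)
    return np.std(samples, axis=0)

phase = np.linspace(0, 1, 5, endpoint=False)
mag = 15.0 + 0.3 * np.sin(2 * np.pi * phase)
mag_samples = None  # would hold one fitted curve per perturbation when the flag is > 0
mag_err = error_or_zeros(mag_samples, phase.size)

# Write (phase, mag, error) rows; a StringIO stands in for the output file.
buffer = io.StringIO()
np.savetxt(buffer, np.column_stack([phase, mag, mag_err]), fmt="%.5f")
```

Writing a 0 rather than dropping the column keeps every output file the same shape regardless of the flag.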

@earlbellinger
Contributor

I like the idea of weighting each model by its goodness of fit. However, R^2 isn't a good choice for this because its value can be negative. Instead we should weight each model by 1/MSE.

We can report weighted medians instead of just medians.

We could also use weighted standard deviations, but that's a judgment call; probably keep them unweighted.
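
One possible weighted median with 1/MSE weights, as a sketch. The helper `weighted_median` is hypothetical and uses the "lower" convention (first sorted value whose cumulative weight reaches half the total); other conventions interpolate instead. The amplitudes and MSEs are made up.

```python
import numpy as np

def weighted_median(values, weights):
    """Lower weighted median: first sorted value reaching half the total weight."""
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    order = np.argsort(values)
    values, weights = values[order], weights[order]
    cumulative = np.cumsum(weights)
    index = np.searchsorted(cumulative, 0.5 * cumulative[-1])
    return values[index]

# Hypothetical amplitudes from three perturbed fits and their mean squared errors.
amplitudes = np.array([0.30, 0.50, 0.52])
mse = np.array([0.01, 1.0, 1.0])

weights = 1.0 / mse  # always positive, unlike R^2
weighted = weighted_median(amplitudes, weights)
unweighted = np.median(amplitudes)
```

Here the one well-fitting model dominates, so the weighted median is 0.30 while the plain median is 0.50.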
