-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Plotting improvements #103
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
There is a lot of coupling when it comes to phased/unphased data, so I had to make a lot of changes in this one commit. The main point of this commit was to deal with temporal-space data instead of phase-space data wherever possible. This was especially important in the `Fourier` class, to prepare for multiperiodic signal support. Now the period needs to be provided to `Fourier` before `Fourier.fit` or `Fourier.transform` is called. To allow for setting the `period` parameter properly when `Fourier` is nested within another estimator, I had to make `Fourier` a subclass of scikit-learn's `BaseEstimator`, which we should have done long ago. Most of the changes in `get_lightcurve` deal with replacing `phase` with `time`. There are also some changes which deal with providing the correct data when dealing with masked arrays. There is also a new piece of code which passes the period into the `Fourier` object, which required a little bit of hackery (see the comments before the call to `predictor.set_params`). The `rephase` function in `plotypus.periodogram` was also modified, such that masked values are still rephased.
Instead of multiplying each index `i` by `2 * pi`, and then dividing every element of the matrix by `period`, I instead define `omega = 2 * pi / period` (as we usually do), and multiply each index `i` by that.
Added a new command line option: `--regressor-options key1 val1 ... keyN valN`. Allows user to pass arbitrary arguments. Old regressor options now ignored (such as `--max-iter`) and will be removed soon. Need to at least pass `fit_intercept False` to retain previous functionality, which will change soon.
All CLI regressor options are now encompassed by the `--regressor-options` option. The old options have been removed completely. Also, a help string has been added to `--regressor-options`.
Now the Fourier design matrix no longer provides the intercept column, and relies on the regressor object to fit the intercept separately. This is already the default behavior of all sklearn regressors (that I'm aware of), and so this simply saves the user from needing to provide `fit_intercept=False` every time.
Since it takes a series of key-value pairs, I made the metavar "KEY VALUE". This makes it look like there are two metavars (desired) when it's really just one with a space inbetween.
Instead of having the `--regressor` option take one of a few hand-chosen values, it now takes the name of _any arbitrary regressor object_, such as the default, `--regressor sklearn.linear_model.LassoLarsIC`. This requires a little more user-knowledge, but opens a world of possibilities. This is accomplished through the new `plotypus.utils.import_name` function, which imports an object from its full name and returns it, without affecting the enclosing namespace.
A number of changes have been made relating to phasing and plotting, in preparation for multiple periods. Before, the `lightcurve.get_lightcurve` function returned an array `phased_data`, which contained the original data, with outliers masked, and with time replaced with phase. Now it has been replaced with the array `data`, which only differs from the input by masking outliers. The function also returned fit values with phase [0, 1). Now it instead returns fit values over the entire time interval, and leaves phasing for later. This does result in unnecessary computation for singly-periodic stars, but it will be necessary for multi-periodic stars, and avoiding the extra computation in the special case would require adding complexity to the code. This array has been renamed from `lightcurve` to `data_smooth`, to emphasize its relation to `data`. Now, instead of plotting the data and fit repeated over 2 periods, we phase the data such that it repeats after 2 periods instead of one. This removes the need to duplicate the arrays, causes the observed data points on each half to be different, and generalizes to multiple periods, where the data _should not_ repeat after 1 period. This also allows us to make the number of cycles before repetition _arbitrary_, and thus a new CLI option `--cycles=2.0` has been added. `cycles` has also been made a keyword argument to all functions which involve rephasing, as well as the `plot_lightcurve` functions. The option to plot _unphased_ data has also been created, although it is currently inaccessible from the CLI. Simply plot data with `period=None`, and it will be left unphased. This is rather ugly when outputting to a small image, and will look much better if we eventually add support for a format which supports interactive scrolling, or simply use a matplotlibrc with a _very_ wide figure size. The unused `--phase-points` CLI option has been removed. A missing comma was added to the list given by the `__all__` attribute of `plotypus.utils`. Some doc strings are out of sync with the changes, though they were out of sync before the changes as well.
Previously we were outputting the fit over the duration of the first period, which meant leaving out a number of phase points. Now we instead output the second period, which is problematic in itself, but is at least an improvement in most real scenarios. Ideally we should just be outputting phase points which are not sensitive to any period.
We replaced the `--periods` option with a general `--parameters` option, long ago, but never updated the user guide. This fixes the Beginner's Guide, but still leaves other pages to be updated.
This guide is out of date, and needs to be updated. I have added a warning for now.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Warning: incomplete, do not merge yet
This introduces the changes requested in #102.