Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wrong results when iterative_prewhitening(prewhiteningorder_snr=True) is used with stopcrit_scargle_snr #24

Open
stefano-rgc opened this issue Sep 28, 2020 · 0 comments

Comments

@stefano-rgc
Copy link

stefano-rgc commented Sep 28, 2020

I may have stumbled on the same bug Siemen and Dominic brought up during the STARBUCKS meetings and decided to post it here.

The bug shows up when the following two setups are used together in the prewhitening function iterative_prewhitening:

  1. prewhiten frequencies in an order that follows the SNR, i.e., parameter prewhiteningorder_snr=True.
  2. stop the prewhitening when the SNR drops below a certain value, i.e., use of the function stopcrit_scargle_snr.

Example

The best way to visualize this issue is to print the SNR value at each step of the prewhitening by inserting the line of code

print(f'SNR = {(ampls/noises)[np.argmax(ampls/noises)]:.3f}') ### NEW LINE ###

below line 276 in find_frequency (which is called by iterative_prewhitening). The change should look like

        noises_old = np.copy(noises)
    else:
        noises = np.interp(freqs,freqs_old,noises_old)
    frequency = freqs[np.argmax(ampls/noises)]                    ### LINE 276 ###
    print(f'SNR = {(ampls/noises)[np.argmax(ampls/noises)]:.3f}') ### NEW LINE ###
else:
    frequency = freqs[np.argmax(ampls)]
if full_output and counter==0:
    freqs_,ampls_ = freqs,ampls

Now, load a light curve of choice or use the one attached here:

import pandas as pd
lc     = pd.read_csv('lc_tic374944608.csv')  
time   = lc.time  
signal = lc.flux                                                   

Finally, run the code below:

import ivs.timeseries.freqanalyse as fa

SNR_limit  = 4.0
SNR_window = 1.0
StopCrit   = (fa.stopcrit_scargle_snr, SNR_limit, SNR_window)
params = fa.iterative_prewhitening(time, signal, prewhiteningorder_snr=True, stopcrit=StopCrit, maxiter=200)

This will print the SNR of the frequencies at every stage of the prewhitening (including the process of refining the frequency grid). The console output will look similar to

SNR = 22.626
SNR = 22.656
SNR = 22.660
SNR = 22.660
SNR = 13.446
SNR = 13.474
SNR = 13.474
SNR = 13.474
SNR = 14.645
SNR = 14.660
SNR = 14.661
SNR = 14.661
SNR = 13.013
SNR = 13.013
...
SNR = 4.046
SNR = 4.046
SNR = 3.892
SNR = 3.892
SNR = 3.893
...
SNR = 3.369
SNR = 3.370
SNR = 3.370
SNR = 3.370
SNR = 3.380
SNR = 3.380
SNR = 3.380
SNR = 3.380
SNR = 3.352
SNR = 3.366
SNR = 3.366

This shows:

  1. the SNR can go below the aimed SNR=4 and
  2. the SNR has no relation with the value stored in params['stopcrit']
>>> params['stopcrit']
array([ 13.40474348,  13.44678306,  14.64631962,  12.61257799,
        12.64298309,  13.81737578,  11.71527848,  11.73468068,
        12.85877578,  14.37025633,  13.36801271,  10.43517399,
        10.4358055 ,  10.55250507,  10.57568668,  10.5923864 ,
         8.45050698,   9.04302943,   8.0771531 ,   8.12639612,
         8.18179814,   8.28015034,   8.78057684,   7.34186741,
         7.33615494,   7.31018856,   7.32047348,   7.32133006,
         7.33448515,   7.77632555,   7.61053638,   7.59566627,
         7.6029366 ,   6.68654245,   6.68784436,   6.67135312,
         6.68573519,   6.67526519,   6.7685099 ,   6.9593664 ,
         5.7687152 ,   5.76961418,   5.77253784,   5.76632453,
         5.76715021,   5.77237361,   5.77487442,   5.78587947,
         5.79101015,   5.80199297,   5.80517593,   5.73848364,
         5.94638784,   5.22299725,   5.242362  ,   5.24254542,
         5.23705819,   5.23723004,   5.24082174,   5.23803633,
         5.24944535,   5.26240406,   5.2590147 ,   5.2572157 ,
         4.67118545,   4.67098396,   4.68356173,   4.67893452,
         4.50099421,   4.5006101 ,   4.28881645,   4.42815387,
         4.43569177,   4.43484652,   4.02039813,   4.02140911,
         4.02093853,   4.02246461,   4.01459534,   4.01289055,
         4.00897009,   4.01626374,   4.01684946,   4.02571564,
         4.02576566,   4.03315131,   4.03151657,   4.03357364,
         4.01409774,   4.02742649,   4.03630451,   4.0399484 ,
         4.03936324,   4.04654839,   4.0483281 ,   4.04889985,
         4.05170285,   4.03186587,   4.034091  ,   4.02836794,
         4.02060923,   4.02102296,   4.01439998,   4.03126013,
         4.02632311,   4.02690732,   4.03358535,   4.03667749,
         4.0314516 ,   4.03188031,   4.0314285 ,   4.0306712 ,
         4.03018952,   4.0351546 ,   4.04085326,   4.03599139,
         4.03327408,   4.03159533,   4.03190789,   4.03191836,
         4.02406664,   4.0326358 ,   4.12642581,   3.77142806])
>>> params['freq'].size
124

Explanation

The function stopcrit_scargle_snr below is meant to return True or False depending on whether the stopping criterion is met or not. As the function's name suggests, the halt happens when the SNR value drops below crit_value and the function returns the tuple(True,value).

def stopcrit_scargle_snr(times,signal,modelfunc,allparams,pergram,crit_value,width=6.):
    """
    Stop criterium based on signal-to-noise ratio.
    """
    width = width/2.
    argmax = np.argmax(pergram[1]) #1
    ampls = pergram[1]
    start = max(0,pergram[0][argmax]-width)
    stop = min(pergram[0][argmax]+width,pergram[0][-1])
    if start==0:
        stop += width-pergram[0][argmax]
    if stop==pergram[0][-1]:
        start = pergram[0][-1]-pergram[0][argmax]+width
    ampls = ampls[(start<=pergram[0]) & (pergram[0]<=stop)]
    value = pergram[1][argmax]/ampls.mean()
    return value<crit_value,value

The problem is, at each iteration, the frequency ν returned by iterative_prewhitening(prewhiteningorder_snr=True) is such that maximizes the SNR in the periodogram, while the frequency ν' used by stopcrit_scargle_snr to compute the SNR is such that maximizes the amplitude in the periodogram (see #1 above).

This implies that ν is not always equal to ν' and, therefore, the SNR in the output does not necessarily correspond to the associated frequency in the same output. For instance, the frequencies stored in params['freq'] in the example above do not necessarily relate to the SNR stored in params['stopcrit'].

Solution

We have to modify the function stopcrit_scargle_snr so that it calculates the SNR of the frequency selected by the prewhitening process and not always default to the frequency with the highest periodogram's amplitude.

Below is an example of such new code:

def NEW_stopcrit_scargle_snr(times,signal,modelfunc,allparams,pergram,crit_value,width=6.):
    """
    Stop criterium based on signal-to-noise ratio.
    """
    width = width/2.
    freqs = pergram[0] # frequency grid
    ampls = pergram[1] # periodogram's amplitudes

    freq = allparams['freq'][-1] # Last frequency in the iterative prewhitening
    ind = np.abs(freqs-freq).argmin() # Get grid's indice closest to the frequency
    
    # Clip the frequency grid
    start = max(0,freqs[ind]-width)
    stop = min(freqs[ind]+width,freqs[-1])
    if start==0:
        stop += width-freqs[ind]
    if stop==freqs[-1]:
        start = freqs[-1]-freqs[ind]+width
    clipped_ampls = ampls[(start<=freqs) & (freqs<=stop)]
    
    value = ampls[ind]/clipped_ampls.mean()
    return value<crit_value,value

Testing the solution

We will just repeat the code in section Example using NEW_stopcrit_scargle_snr instead of stopcrit_scargle_snr.

import ivs.timeseries.freqanalyse as fa

SNR_limit    = 4.0
SNR_window   = 1.0
NEW_StopCrit = (NEW_stopcrit_scargle_snr, SNR_limit, SNR_window)
params = fa.iterative_prewhitening(time, signal, prewhiteningorder_snr=True, stopcrit=NEW_StopCrit, maxiter=200)

the console output will look similar to

SNR = 22.626
SNR = 22.656
SNR = 22.660
SNR = 22.660
SNR = 13.446
SNR = 13.474
SNR = 13.474
SNR = 13.474
SNR = 14.645
SNR = 14.660
SNR = 14.661
SNR = 14.661
SNR = 13.013
SNR = 13.013
...
SNR = 4.046
SNR = 4.046
SNR = 3.892
SNR = 3.892
SNR = 3.893

This shows:

  1. the SNR does not go below the aimed SNR=4 but halts after crossing that threshold and
  2. the SNR are the same values stored in params['stopcrit']
>>> params['stopcrit']
array([ 22.62772635,  13.44678306,  14.64631962,  13.0099008 ,
        12.64298309,  13.81737578,  11.95378747,  11.73468068,
        12.85877578,  14.37025633,  13.36801271,  11.49104876,
        11.34068947,  10.75950652,  10.59325395,  10.5923864 ,
         9.68459626,   9.04302943,   8.81164341,   9.03136256,
         8.81812226,   9.14385704,   8.78057684,   8.55852621,
         8.38600352,   8.1462341 ,   8.13629635,   7.9063102 ,
         7.82175271,   7.77632555,   7.68812351,   7.96437803,
         7.6029366 ,   7.4474554 ,   7.0331079 ,   6.83559477,
         6.77586296,   6.67526519,   6.7685099 ,   6.9593664 ,
         6.67973248,   6.4832323 ,   6.14747587,   6.39048214,
         6.20770705,   6.01226832,   5.84869113,   5.82001952,
         5.83489705,   6.07535406,   5.80517593,   5.73848364,
         5.94638784,   5.66314269,   5.5250279 ,   5.51133535,
         5.41815014,   5.2881503 ,   5.14268841,   4.8694191 ,
         4.85398674,   4.8750638 ,   4.78881205,   5.2572157 ,
         4.76446398,   4.68501405,   4.73088918,   4.67893452,
         4.63948586,   4.5006101 ,   4.48703331,   4.48560261,
         4.46868011,   4.43484652,   4.23184198,   4.19014372,
         4.14076642,   4.08372717,   4.0494421 ,   4.08648646,
         4.14701003,   4.15841565,   4.00241717,   4.30104471,
         4.01527914,   4.02158804,   4.03000781,   3.89200165])
>>> params['freq'].size
88

Note as well that this time the prewhitening returns 88 frequencies instead of 124.

Final note <--------------------------

Even if the parameter prewhiteningorder_snr=True is used, the extracted frequencies will not necessarily come sorted by SNR, as it can be seen in the output for params['stopcrit'] above. I think this happens because the noise level is calculated with the residuals after each iteration, meaning that subsequent frequencies may have their SNR increased because the windows function of the previous frequency has been removed.

Useful links

  1. Source code iterative_prewhitening
  2. Source code stopcrit_scargle_snr

lc_tic374944608.zip

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant