Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Discrepancy between stored best fit spectrum and reconstructed spectrum from predict ? #298

Open
themiyan opened this issue Oct 12, 2023 · 14 comments

Comments

@themiyan
Copy link

The stored MAP spectrum in .h5 file seems to have slightly different properties compared to the one that can be reproduced by the max_theta values.

In the screenshot, the cyan photometry and blue observed JWST/NIRSpec PRISM spectrum is fit with prospector. The stored best fit spectrum is shown by the brown spectrum. Clearly there are some balmer Ines that's are not in the observed spectrum.

If I reconstruct the best-fit spectrum at a higher R using the stored MAP theta values I get the green spectrum. Then when I resample the spectrum back to the NIRSpec wavelength resolution using spectres I get the red spectrum, which is still a good fit to the observed spectrum and does not have the strong abs lines.

The theta values do match. For example:
from the full chain:
[-0.48803794, 0.4424453 , 11.43844738, 0.04454762, 0.15767742, 0.03109789, 0.06980556, -0.06348596, -3.00673872]
from the stored best-values dic in .h5 file:
[-0.48803794, 0.4424453 , 11.43844738, 0.04454762, 0.15767742, 0.03109789, 0.06980556, -0.06348596, -3.00673872]
If I read the .h5 file using h5py

from prospect.plotting.utils import best_sample
best_sample(h5_read['sampling'])
[-0.48803794,  0.4424453 , 11.43844738,  0.04454762,  0.15767742,
        0.03109789,  0.06980556, -0.06348596, -3.00673872]

So the possibility is that the sps object is different. I don't see any reason why reader.get_sps(out_res) would give an incorrect sps object.

{'verbose': True,
 'debug': False,
 'outfile': array('jwst_2565_', dtype='<U10'),
 'output_pickle': False,
 'optimize': False,
 'min_method': array('lm', dtype='<U2'),
 'min_opts': array({}, dtype=object),
 'nmin': 10,
 'emcee': True,
 'nwalkers': 512,
 'niter': 1024,
 'nburn': array([16, 32, 64]),
 'interval': 0.25,
 'restart_from': array('', dtype='<U1'),
 'initial_disp': 0.1,
 'dynesty': False,
 'nested_bound': array('multi', dtype='<U5'),
 'nested_sample': array('unif', dtype='<U4'),
 'nested_walks': 48,
 'nested_nlive_init': 400,
 'nested_nlive_batch': 200,
 'nested_dlogz_init': 0.05,
 'nested_maxcall': 50000000,
 'nested_maxiter': 1000000,
 'nested_maxbatch': 10,
 'nested_bootstrap': 0,
 'nested_target_n_effective': 10000,
 'objid': 7329,
 'object_redshift': 3.2074616457461644,
 'output_pickles': False,
 'do_powell': False,
 'ftol': 5e-06,
 'maxfev': 5000,
 'do_levenberg': True,
 'nested_method': array('rwalk', dtype='<U5'),
 'nested_weight_kwargs': {'pfrac': 1.0},
 'add_neb': False,
 'sps_libraries': array([b'mist', b'miles', b'DL07  '], dtype='|S6'),
 'param_file': array('/Users/wnanayakkara/Dropbox/JWST/JWST-QuGals/prospector/qu_obs200_ZF7329/with_spec_trimmedspec_TN/run_on_data.py',
       dtype='<U112'),
 'component': -1,
 'lnwavegrid': None,
 'zred': array(3.20746165),
 'mass': array([4.61848953e+07, 9.72336623e+07, 1.00080804e+08, 1.89665341e+08,
        3.28789160e+08, 7.74708867e+08, 2.72903316e+11]),
 'logzsol': array(-0.48803794),
 'dust2': array(0.4424453),
 'sfh': array(0.),
 'imf_type': array(2),
 'dust_type': array(2.),
 'logmass': array(11.43844738),
 'agebins': array([[0.        , 7.4772    ],
        [7.4772    , 8.        ],
        [8.        , 8.30873508],
        [8.30873508, 8.61747016],
        [8.61747016, 8.92620524],
        [8.92620524, 9.23494032],
        [9.23494032, 9.30552139]]),
 'logsfr_ratios': array([ 0.04454762,  0.15767742,  0.03109789,  0.06980556, -0.06348596,
        -3.00673872]),
 'smoothtype': array('lsf', dtype='<U3'),
 'fftsmooth': array(True)}

VS the param file here:
run_on_data.py.txt

Screenshot 2023-10-12 at 20 38 30

Any ideas why this is happening? Which best fit spectrum should one trust?

@themiyan
Copy link
Author

@jrleja I've tested the log probabilities as you suggested and the regenerated values from your test_lnprobfn function agrees with the stored max value to ~ the 6th decimal point. So it's more or less the same I think.

Going a bit deeper, I think the issue here is actually on the interpolation technique. The sps objects make identical spectra, but when I do it myself I use Adam's Spectres https://github.com/ACCarnall/spectres to resample the FSPS SSP to the observed NIRSpec grid. In prospector this is done by numpy.interp function

smspec = np.interp(outwave, wa, sa, left=0, right=0)

This seems to be the source of discrepancy. See the figure below:

Screenshot 2023-11-17 at 16 39 35

If one computes the chi2 for the fits:
for h5 stored spec: Reduced Chi-squared: 6.826009644018405
for regenerated spec with spectres: Reduced Chi-squared: 3.298083815923359
for regenerated spec with bumpy: 6.678531843420945

So there is a clear change in the goodness.

I wonder if this will have follow on effects if the ssp_basis uses the numpy also when evaluating the full MC grid.

For instruments like NIRSpec given the wavelength bins are highly non-linear (specially in the PRISM mode), I assume spectres would provide a better interpolation over numpy (but I haven't tested this though)?

@bd-j
Copy link
Owner

bd-j commented Nov 17, 2023

I think the issue is the undersampling of the LSF by the pixels, not the non-linearity of the bins. This will require implementing a different smoothing algorithm within the likelihood call.

@themiyan
Copy link
Author

Yes, under sampling of the LSF ~ 2 microns is severe for NIRSpec/CLEAR.

Changing smoothing.py smooth_lsf_fft function to Spectres does seem to fix the issue within itself for now. However, this is quite slow, ~ order of mag slower, adding ~ 100 min extra run time for a 512*1024 run. Spectres has an inbuilt numba version which gives comparable or faster than speeds to numpy. This seems to be reasonable.

See the three different interpolations here for a given chain:
Screenshot 2023-11-23 at 11 09 38

The diff with numpy for both is relatively clear.
Screenshot 2023-11-23 at 11 12 10

There are also differences between the algorithms within 'spectres' but to a very lower degree, which is insignificant.

Screenshot 2023-11-23 at 11 12 10

I'll create a pull request for you to checkout, but perhaps something in built rather than relying on 3rd party Spectres might be preferable.

@karlglazebrook
Copy link

@bd-j @jrleja

Just to add my oar in here. We have confirmed that our fitting issues were due to the use ofnp.interpin the prospector code for the resampling step (this happens after LSF smoothing, that part is done correctly). This kind of linear interpolation really does not work in going from high-res models with lots of fine lines to low res NIRSPEC prism data.

As @themiyan mentioned we changed to spectres.spectres for this (one line change), which does binning up correctly by integrating pixels, and that completely solved all our issues with fitting (including unrealistically small errors and poor chi^2 on the best fit).

I expect you don't want to have a spectres package dependency in prospector, and it is slow, so you are going to need some low level C code that does this kind of rebinning correctly.

This will affect anyone using Prospector to fit NIRSPEC PRISM spectra, (UNCOVER etc.) so it is probably worth informing people of this issue if you know they are working on this.

@bd-j
Copy link
Owner

bd-j commented Dec 1, 2023

Hi @themiyan, @karlglazebrook

Thanks for bringing this up. I've taken a closer look, and have a few thoughts.

To clarify, I think in principle np.interp can work pretty well even for very low resolution spectroscopy as long as the instrument LSF, convolved with the LOSVD and before pixelization, is well sampled by the pixels (at least 2 pixels per LSF FWHM, and ideally > 3 though see Robertson 17 and Law 21 for caveats.) One can test this by convolving the FSPS/MILES spectra with a Gaussian LSF with sigma=5000 km/s on a fine grid and comparing the results of spectres and np.interp for pixels spaced every 3000km/s or so.

However, I do think the nominal "R" quoted in Jdox files is a pre-flight estimate based simply on FWHM=2.2 nominal "pixels", so the actual instrumental LSF due to the optics or dispersing element might be narrower such that undersampling/pixelization play a significant role. Unfortunately the pre-pixel instrumental LSF is not quoted anywhere as far as I can tell, and likely depends on the light profile of the source within the slit (e.g. de Graaf et al 23). Indeed, I am curious about what exactly you mean by 'this happens after LSF smoothing, that part is done correctly'.

Anyway, if I conservatively assume that the instrumental LSF is a Gaussian with FWHM = 1.0 pixel at every wavelength (so pretty undersampled, but not infinitely so), smooth a MILES-library 2 Gyr old solar metallicity SSP by this LSF (within FSPS, using StellarPopulation.set_lsf as in the examples at https://github.com/bd-j/exspect), and by a LOSVD with sigma=50km/s, and then interpolate to the "WAVELENGTH" column of the Jdox dispersion file, I can compare that to what you get if you instead use spectres -- with the same "WAVELENGTH" vector -- on the LSF smoothed spectrum. I find sub-percent differences, even around absorption lines (see figures below). I imagine these differences would be larger if there was an extremely strong emission line, since undersampling must play some role, but it does not appear significant for this spectrum. I do note that if you only use spectres, and do not first smooth by a Gaussian instrumental LSF, you get a significantly different result than using spectres on the LSF smoothed version. However, this would correspond to massive undersampling, where the 'instrumental' resolution was entirely due to pixelization and thus basically a square kernel (modulo drizzling or resampling during processing). I do not believe this to be the actual case for NIRSPec prism, but again I am not aware of a publicly available analysis of the prism LSF that considers pixelization and the optical dispersion separately.

FWIW sedpy.observate.rebin implements a version of pixel binning - this is vectorized such that it has O(NM) complexity instead of ~O(N+M) like spectres, but the time constant is smaller such that the execution times are about equal for NM ~1e6. I am incorporating an undersampling treatment in the next version of prospector, which treats instrumental and physical dispersion separately and more clearly, and will allow for detector pixelization effects as well.

Hope this helps. It's possible I've made a mistake somewhere; the code to make the below figures is in a gist here:
https://gist.github.com/bd-j/0c02fbda0054940efaadc038d64ec462

smoothed_spectra
smoothed_spectrum_zoom
smoothed_difference

@karlglazebrook
Copy link

Hi @bd-j
I believe the issue here is not so much 'interpolation' but that the code using np.interp() to do the binning up step. In the case of coarse pixels linear interpolation produces errors, even with 2 pixel sampling. When we replace that with spectres, that effectively does the pixel integral, we get correct results.

I can't see the pixels in your above plots, the lines are too smooth, so I don't think you have accounted for the binning ups step.

@bd-j
Copy link
Owner

bd-j commented Dec 4, 2023

Hi @karlglazebrook

How you would define 'coarse' pixels, since there might be computational advantages to only doing the pixel integral when really necessary? What do you mean by 'correct results'; do you have a ground truth, e.g. a PNe or well characterized star observed with nirspec prism?

Re binning up, I have explicitly used spectres to bin the LSF-smoothed spectra for comparison in the code gist that I linked above. I've updated that code to show a 0.5 Gyr SSP with stronger Balmer lines, using larger pixels corresponding to the "DLDS" column of the nirspec prism dispersion curves on JDox , and plotting with ax.step instead of ax.plot for visualization of the pixels. However, for a Gaussian instrumental LSF with FWHM=1.5 pixels the difference between using np.interp and spectres is still <~ 1%, see attached plots.

Please let me know if you notice an error in the code in that gist, or a discrepancy in assumptions compared to your own tests. Thanks

smoothed_spectra
smoothed_spectrum_zoom
smoothed_difference

@karlglazebrook
Copy link

We definitely found we got the wrong outputs until we switched the code from np.interp to spectres. You could see by eye that the binning was incorrect. @themiyan can you provide a worked example for @bd-j ?

@karlglazebrook
Copy link

Hmm I see we have not posted the worked example. Sorry about that! To answer your questions:

By 'coarse' I simply mean the low resolution of NIRSPEC PRISM, we are smoothing and binning to match our data and both are a f(λ).

I will bug @themiyan - but if you look at the top (Oct 12 original post) you can see our example spectrum where spectres gives different results (red vs brown). This is for an older spectrum than 0.5 Gyr which may explain why you are not seeing it.

What is truth? Well clearly the higher resolution when binned correctly. What is 'correct'? For us is was a significant effect you could see by eye, and by eye we determined that it was clearly spectres that was doing more correct binning from the higher resolution model. For an independent check I used yet another rebinning code, this one didn't do variable pixel size but I approximated as constant around the break and it agreed with spectres in that region.

This all made sense to us as (1) linear interpolation can only ever be an approximation to integration across pixels. (2) we got much better and more stable fits with correct binning.

I will ask @themiyan to post the actual data and code we used. e.g. high res spectrum, result of prospector binning, result of spectres binning.

@bd-j
Copy link
Owner

bd-j commented Jan 6, 2024

Hi, thanks for the additional info. It would be helpful to see a worked example - happy to continue via email if you don't feel like sharing data on github. I still would like to know what exactly you mean by smoothing the data (in addition to binning? By what LSF?)

I agree that binning the native MILES resolution (R~2000) spectrum gives a different (and much more plausible) answer than just interpolating to the PRISM pixel centers. However, I think that unless some Gaussian-ish smoothing corresponding to the instrumental LSF (pre-pixelization) is applied, then this answer will also not be correct, potentially at the several 10s of percent level. And, once you have applied a Gaussian-ish instrumental LSF smoothing that is reasonable for PRISM, the difference between interpolating and binning is very small (sub-percent).

@themiyan
Copy link
Author

themiyan commented Jan 7, 2024

Sorry for the delay. As suspected you both are correct. @bd-j 's code also works as intended and I think the issue was on how the lsf was handled.

Taking a step back, the original issue was that the 'best-fit' spectrum saved by Prospector had some absorption features which were not observable in the observed spectrum.
To investigate that further, we regenerated the spectrum using best-fit values stored in the h5 file, using a finer wavelength grid (np.arange(900, 5e5, 10) so at a higher R). Then we resampled that spectrum to the NIRSpec observed wavelength grid using numpy. The resulting spectrum was similar to what was stored in the h5 file. Then we did the resampling using spectres which made a spectrum which looked more similar (visually + also better chi2) to the observed spectrum. This is what we posted initially here . When we did the resampling we didn't care about the LSF.

Now let's do this again with LSF. Grey thick line is the observed spectrum and the red and black is what we discussed above. Numpy clearly has some features but more or less does follow the pink high-R spectrum. So it's not wrong (because pink is the ground truth), it just looks more different to the observed spectrum compared to what spectres gives.

So if we now regenerate the higher-R spectrum (np.arange(900, 5e5, 10)) but with LSF (as handled by prospector, I don't do any smoothing myself but provide the sigma to lsf smoothing similar to fitting process) I get green and blue which are very similar + is also a good fit to the observed spectrum.

Screenshot 2024-01-08 at 10 14 33

This demonstrate this clearly: comp with observed spectrum
Screenshot 2024-01-08 at 10 20 13
comp with numpy interpolation
Screenshot 2024-01-08 at 10 20 51

So this is good. Regardless of whether we use numpy and spectres, when a higher-R spectrum is made and convolved with an LSF and resampled to a lower R NIRspec wavelength grid, the result is similar.
If we disregard the LSF and just resample a high-R spectrum to a lower R non-linear NIRSpec resolution, then Spectres gives better chi2 fit.

So now, if we go to the original issue of why the stored best-fit in the h5 file looked incorrect compared to observed spectrum, I think this is probably due to lsf smoothing not applied to the saved spectrum. It should have been but for some reason it wasn't.

BTW @bd-j prospect/utils/smoothing.py
lines such as if sigma is not None: had to be changed to if sigma != np.array(None): (like here)
otherwise it seems these conditions are not evaluated correctly.

@themiyan
Copy link
Author

themiyan commented Jan 7, 2024

The notebook for plots is interp_lsf_tests.ipynb here

@themiyan
Copy link
Author

themiyan commented Jan 8, 2024

OK. I think I understand why the hdf5 file 'best-fit' had the non-lsf smoothed spectrum stored.

The smoothing applied here is based on a wavelength dependent LSF. So for smoothing I removed sigma_smooth which defines a constant velocity smoothing applied to the data and instead provided fftsmooth params with a lsf function which provides the delta_lambda smoothing per wavelength bin.

In the past I did del model_params['sigma_smooth'] which meant that condition

do_smooth = (('sigma_smooth' in self.params) and
was not met.

Later on I generalised it as:

    if smooth:
        print("Spectral smoothing is applied")
        # ## spectral smoothing
        model_params = {**model_params ,  **TemplateLibrary['spectral_smoothing']}

        model_params['sigma_smooth']['isfree']= False
        model_params['sigma_smooth']['init'] = None
        model_params['sigma_smooth']['prior'] = None 
        
        model_params['smoothtype']['init']='lsf'

        model_params['fftsmooth'] = {'N': 1, 'isfree': False, 'init': True}

so for newer runs 'sigma_smooth' key is identified and smoothing is applied through lsf.

Ideally a seperate generalised keyword should be implemented in ssp_basis so any form of smoothing triggers the smoothing function.
This should explain why the stored spectra was different.

So the only thing outstanding is if a user attempts to fit a spectrum without a smoothing function, in low-R instruments like NIRSpec PRISM there can be differences in the binned spectrum based on whether numpy or spectres is used like shown by the red and black spectra in the previous post.

@bd-j
Copy link
Owner

bd-j commented Jul 5, 2024

Hi @themiyan, apologies for the very delayed response. I agree that one needs to do smoothing of the model spectrum by the instrumental resolution (or technically by the difference between the instrumental resolution and the stellar library resolution) before comparing to data. Depending on the sampling of the instrumental LSF this may need to be done even in addition to pixelization/binning, even with low-R instruments.

Re the smoothing, in more recent versions of prospector the smoothing is not handled by SSPBasis().get_spectrum() but instead within SpecModel.predict_spec() via the SpecModel.smoothspec() method here. In practice adding the "sigma_smooth" parameter with value None will cause the wavelength dependent LSF to be used, at least for the stellar continuum. Having the parameter be present but None also overrides the default 100km/s smoothing.

However, if nebemlineinspec is False, e.g. for marginalization, the emission lines are treated differently from the stars, and their width is determined by the "eline_sigma" parameter (in km/s, can be a vector giving different values for different lines), as noted at https://prospect.readthedocs.io/en/latest/nebular.html#fitting-emission-line-fluxes. The relevant code is here. This emission line width should take into account physical and instrumental effects.

Because of the complexities of the emission line marginalization, and including library, physical, and instrumental effects self-consistently but flexibly for nebular lines and for stars, the smoothing treatments have been overhauled in the v2.0 branch to hopefully make all this a bit more explicit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants