Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Potentially incorrect resolution matrix in coadds #2372

Closed
segasai opened this issue Sep 16, 2024 · 8 comments
Closed

Potentially incorrect resolution matrix in coadds #2372

segasai opened this issue Sep 16, 2024 · 8 comments
Labels

Comments

@segasai
Copy link
Contributor

segasai commented Sep 16, 2024

Hi,

I was trying to get to the bottom of something that does not quite make sense to me when trying to use the resolution matrix when fitting the data.

Specifically I'm seeing a few spectra with some masked pixels, where a naive application of resolution matrix leads to incorrect results next to masked pixels.

Here's the illustration (I am using jura)
Object with TARGETID 39633233108796099 (main/bright/ hpx 10032)
I note this object only has 1 single exposure in main/bright hence I would have expected the data in spectra- and coadd- be the same.
However when I look at the single exposure spectrum vs coadd -- I see the difference with a few pixels masked out

image

That is a bit surprising to me given I'd expected to be data be exactly the same in the coadd vs spectra when there is just one exposure, but I assume there is some additional masking going on..

However what I really cannot understand is when I try to model this data and use the resolution matrix.
Here I assume my spectrum is just flat and and then apply the resolution matrix.
The plot shows the coadded and single exposure plot and the 'model' obtained using the corresponding resolution matrices.

image

( The resolution matrix is applied as desispec.resolution.Resolution(RES)@np.ones(2881) *100)

Here I notice that "around" the masked region (i.e. pixel values 180-182) the predicted model values (blue curve) are below 100. Which is not correct in my understanding.

In my mind I can see two explanations, either I incorrectly interpret the resolution matrix, or there is something wrong in the masking of the resolution matrix .

I note there is this line of code

trdata[i,r]=np.sum((spectra.ivar[b][jj]*spectra.resolution_data[b][jj,r]),axis=0) # not sure applying mask is wise here

with an interesting comment...

Any suggestion/explanation or am I missing something obvious ?

Below is my code to reproduce the plots (on nersc)

import astropy.io.fits as pyfits
import matplotlib.pyplot as plt
import desispec.resolution
import numpy as np

path = '/global/cfs/cdirs/desi/spectro/redux/jura/healpix/main/bright/100/10032/'
D = pyfits.getdata(f'{path}/coadd-main-bright-10032.fits', 'Z_FLUX')
SD = pyfits.getdata(f'{path}/spectra-main-bright-10032.fits.gz', 'Z_FLUX')
T = pyfits.getdata(f'{path}/coadd-main-bright-10032.fits', 'FIBERMAP')
ST = pyfits.getdata(f'{path}/spectra-main-bright-10032.fits.gz', 'FIBERMAP')
RES = pyfits.getdata(f'{path}/coadd-main-bright-10032.fits', 'Z_RESOLUTION')
SRES = pyfits.getdata(f'{path}/spectra-main-bright-10032.fits.gz',
                      'Z_RESOLUTION')
xid = np.nonzero(T['TARGETID']==39633233108796099)[0][0]
sxid = np.nonzero(ST['TARGETID']==39633233108796099)[0][0]
# xid, sxid = 1299, 1651  # Row number for our object of interest
plt.plot(D[xid], label='Single exposure spectrum', color='black')
plt.plot(SD[sxid], color='red', label='Coadded spectrum')
plt.plot((desispec.resolution.Resolution(SRES[sxid]) @ np.ones(2881) * 100),
         color='violet',
         label='Single exposure model')
plt.plot((desispec.resolution.Resolution(RES[xid]) @ np.ones(2881) * 100),
         color='blue',
         label='Coadded model')
plt.legend()
plt.xlabel('Pixel')
plt.xlim(120,210)
@segasai
Copy link
Contributor Author

segasai commented Sep 17, 2024

Further checking the github. I think this is connected to #1389 .

Also, I checked and the redrock model (rrmodel) behaves similarly to my blue curve above (i.e. the model dips below what's expected next to the masked region)
image

I do believe now this is a bug that leads to artificially higher chi-squares. (This is how I noticed this, when I got chi-squares larger when using the resolution matrix vs using naive Gaussian convolution approach).

Another illustration of the issue
It shows the (data - redrock_model)*sqrt(IVAR) vs distance from masked region (in pixels) :
image
This is based on the whole coadd-main-bright-10032.fits file

It shows a clear bias next to the mask, as seen above.

@segasai
Copy link
Contributor Author

segasai commented Sep 18, 2024

I didn't get any replies here yet, but I am not sure this calculation for the resolution matrix
of the coadd

trdata[i,r]=np.sum((spectra.ivar[b][jj]*spectra.resolution_data[b][jj,r]),axis=0) # not sure applying mask is wise here

is accurate (or at least when I tried to derive it I get something different)

I think this is correct way of getting the resolution matrix of the coadd.

https://github.com/desihub/desispec/compare/resol_fix?expand=1

--- a/py/desispec/coaddition.py
+++ b/py/desispec/coaddition.py
@@ -564,8 +564,16 @@ def coadd(spectra, cosmics_nsig=None, onetile=False) :
 
             tivar[i]=np.sum(ivarjj,axis=0)
             tflux[i]=np.sum(ivarjj*spectra.flux[b][jj],axis=0)
+            ww = spectra.resolution_data[b].shape[1]//2
+            # resolution kernel width
+            npix = spectra.resolution_data[b].shape[2]
             for r in range(spectra.resolution_data[b].shape[1]) :
-                trdata[i,r]=np.sum((spectra.ivar[b][jj]*spectra.resolution_data[b][jj,r]),axis=0) # not sure applying mask is wise here
+                l1, l2 = max(r - ww, 0) , min(npix + (r - ww), npix)
+                l1x, l2x = l1 - (r - ww), l2 - (r - ww)
+                norm = np.sum(spectra.ivar[b][jj, l1:l2] ,axis=0)
+                trdata[i, r, l1x:l2x] = np.sum(spectra.ivar[b][jj, l1:l2] *
+                                    spectra.resolution_data[b][jj, r, l1x:l2x],
+                                               axis=0) / (norm + (norm == 0))
             bad=(tivar[i]==0)
             if np.sum(bad)>0 :
                 tivar[i][bad] = np.sum(spectra.ivar[b][jj][:,bad],axis=0) # if all masked, keep original ivar
@@ -574,8 +582,6 @@ def coadd(spectra, cosmics_nsig=None, onetile=False) :
             if np.sum(ok)>0 :
                 tflux[i][ok] /= tivar[i][ok]
             ok=(tivar_unmasked>0)
-            if np.sum(ok)>0 :
-                trdata[i][:,ok] /= tivar_unmasked[ok]
             if spectra.mask is not None :
                 tmask[i]      = np.bitwise_and.reduce(spectra.mask[b][jj],axis=0)
         spectra.flux[b] = tflux

This is basically weighting the resolution matrix by this kind of matrix where sigma_i are the variances in i-th pixel.

image

The existing code does this kind of weighting

image

The fix definitely solves the original problem.
I haven't tested it in other environment yet. I'd welcome any suggestions to easily test if my suggestion is incorrect or not.

@segasai segasai changed the title Potentially incorrect resolution matrix around masked pixels ? Potentially incorrect resolution matrix in the spectral coadds Sep 18, 2024
@segasai segasai changed the title Potentially incorrect resolution matrix in the spectral coadds Potentially incorrect resolution matrix in coadds Sep 18, 2024
@segasai
Copy link
Contributor Author

segasai commented Sep 18, 2024

I have verified that with fixed code essentially all the chi^2 reported by Redrock improve.
I've taken M92 tertiary data that I had around and then done the coadd using the original code and using my fixed code.
The histogram below shows the new chi^2 from redrock table minus old chi^2

image

Basically the absolute majority of chi-squares improved. For 1% of cases the improvement is more than 500, for 16% improvement is more than 30 in chi^2. The median is zero.
Thus I would say this is actually a significant effect.

@moustakas
Copy link
Member

@segasai I haven't had a chance to test this yet, but thank you for the work.

Pinging @jdbuhler apropos desihub/fastspecfit#181.

@segasai
Copy link
Contributor Author

segasai commented Sep 19, 2024

Thank you @moustakas !

Also I went further to DESI-doc-1056 and I still never quite follow exactly Eqns 9-10.
But basically if
$M_i$ are DESI extracted spectra (i is the exposure number), $C_i$ diagonal covariance matrix, $R_i$ resolution matrix, and $S$ 'true/noiseless' spectrum.
Then $E[M_i] = R_i S $
Then if we do inverse variance weighted average of extracted spectra (which is what the current code does )
$$\bar {M} = (\sum_i C_i^{-1})^{-1} \sum_i C_i^{-1} M_i $$
The expected value is
$$E[\bar{M}] = (\sum C_i^{-1})^{-1} {\sum C_i^{-1} R_i S } $$
Therefore the equivalent resolution matrix of the stack is
$$\hat{R} = (\sum C_i^{-1})^{-1} {\sum C_i^{-1} R_i } $$
Multiplying by the diagonal matrix $C_i^{-1}$ means multiplying each j-th row of $R_i$ by $1/\sigma_j^2$.
And because of the way the resolution matrix is stored (via diagonals) j-th row will be (w+j -k) row for k-th pixel (w is the half 'width' of the resolution). that exactly leads to the new corrected code. The old code essentially meant the multiplication of each j-th column of $R_i$ by $1/\sigma_j^2$.

@jdbuhler
Copy link

@segasai I haven't had a chance to test this yet, but thank you for the work.

Pinging @jdbuhler apropos desihub/fastspecfit#181

I don't think this impacts #181 - building the resolution matrix for the coadded spectrum happens before we get our hands on the data. Unless you're calling the coaddition code yourself from fastspec?

@segasai
Copy link
Contributor Author

segasai commented Sep 23, 2024

I've restacked over the weekend a random subset of kibo: (with a mix of sv/main/dark/bright)
And compared the results of restacks and redshifts of those and it confirms what I've seen before

Improvement in chi^2 (negative is better):
image
( roughly 1% of objects get chi^2 improved by ~ 200)

and in terms of redshift change, they change by more than 100km/s in 0.3% cases.

@moustakas
Copy link
Member

Fixed in #2377.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants