JP-3576: Implement the Ramp Fitting Likelihood Algorithm #278

kmacdonald-stsci · 2024-08-07T18:23:51Z

This PR implements the likelihood algorithm for ramp fitting developed by Timothy Brandt in Optimal Fitting and Debiasing for Detectors Read Out Up-the-Ramp.

Checklist

added entry in CHANGES.rst (either in Bug Fixes or Changes to API)
updated relevant tests
updated relevant documentation
updated relevant milestone(s)
added relevant label(s)

codecov · 2024-08-07T18:31:08Z

Codecov Report

Attention: Patch coverage is 84.85915% with 129 lines in your changes missing coverage. Please review.

Project coverage is 84.69%. Comparing base (d85f859) to head (375f310).
Report is 78 commits behind head on main.

Files with missing lines	Patch %	Lines
tests/test_ramp_fitting_likely_fit.py	82.99%	67 Missing ⚠️
src/stcal/ramp_fitting/likely_algo_classes.py	72.56%	31 Missing ⚠️
src/stcal/ramp_fitting/likely_fit.py	91.26%	29 Missing ⚠️
src/stcal/ramp_fitting/ramp_fit.py	81.81%	2 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #278      +/-   ##
==========================================
+ Coverage   84.63%   84.69%   +0.06%     
==========================================
  Files          41       44       +3     
  Lines        7628     8501     +873     
==========================================
+ Hits         6456     7200     +744     
- Misses       1172     1301     +129

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

docs/stcal/ramp_fitting/description.rst

src/stcal/ramp_fitting/likely_fit.py

t-brandt

Comments are all minor docstring edits apart from one check to make sure that definitions of read noise are consistently referring to single read (=CDS/sqrt(2)) for all algorithms. I think the answer is yes, but I want to be completely certain.

src/stcal/ramp_fitting/likely_fit.py

src/stcal/ramp_fitting/ramp_fit.py

CHANGES.rst

melanieclarke · 2024-09-03T15:06:00Z

@kmacdonald-stsci - in this comment from @t-brandt:
#278 (comment)
he suggests leaving jump as is, and just ignoring previously set jump flags.

It looks to me like the PR on JWST is skipping jump:
https://github.com/spacetelescope/jwst/blob/2452fddf897c617c5b066f791054863c24e36938/jwst/pipeline/calwebb_detector1.py#L87

I'm just starting to look at these code changes, but did the code here get updated to expect previously set jump flags? Or did we decide to skip jump in the detector1 pipeline when algorithm=LIKELY after all?

melanieclarke

I think this needs quite a bit of clean up before it's ready to merge.

Most of it is code style related and documentation clarification, but I have a few significant structural concerns:

Will jump be skipped in detector1 or no? See previous comment - we may need to run jump for 1/f purposes, and ignore the input flags here.
We should add parameter passing structures for the detection threshold - currently nsig, but should be rejection_threshold, I think? And we need a test for that branch of the code.
There is code related to a pedestal calculation that looks unused and untested. If we don't plan to offer it, we should remove it.

Also - I'd like to test this on real data, but will wait for the issue with skipping jump to be settled.

docs/stcal/ramp_fitting/description.rst

src/stcal/ramp_fitting/ramp_fit.py

tests/test_ramp_fitting_likly_fit.py

kmacdonald-stsci · 2024-09-04T13:06:04Z

I think this needs quite a bit of clean up before it's ready to merge.

Most of it is code style related and documentation clarification, but I have a few significant structural concerns:

Will jump be skipped in detector1 or no? See previous comment - we may need to run jump for 1/f purposes, and ignore the input flags here.

We should add parameter passing structures for the detection threshold - currently nsig, but should be rejection_threshold, I think? And we need a test for that branch of the code.

There is code related to a pedestal calculation that looks unused and untested. If we don't plan to offer it, we should remove it.

Also - I'd like to test this on real data, but will wait for the issue with skipping jump to be settled.

Of course. As has been discussed many times, this is simply the initial implementation of the likelihood algorithm for ramp fitting. The implementation right now is to make the algorithm available for those interested in investigating it, which includes investigating the pedestal. The details for full integration into the existing pipeline have not been completely resolved, such as the separation of jump detection so it can be used as the preceding step for 1/f before running ramp fitting.

zacharyburnett · 2024-09-05T12:33:16Z

would this need a corresponding PR in jwst if it were included here?

kmacdonald-stsci · 2024-09-05T12:58:11Z

would this need a corresponding PR in jwst if it were included here?

This does have a corresponding JWST PR here: spacetelescope/jwst#8631

melanieclarke

Getting closer, thanks for all the updates!

A few lingering clean up issues still, noted below.

docs/stcal/ramp_fitting/description.rst

src/stcal/ramp_fitting/likely_fit.py

melanieclarke · 2024-09-09T17:41:14Z

Testing this on some real data now to verify the various options are working as expected.

I see significant differences between running with standard jump detection turned on in the pipeline and with it set to skip. With jump on and algorithm=LIKELY, the output is very similar to standard ramp fitting. With jump off and algorithm=LIKELY, the results from ramp fitting are ~~very different (and very poor)~~ a little different.

@kmacdonald-stsci - I expected that the likelihood algorithm should be ignoring any previously set jump flags, so running with and without jump detection on should have identical results, but that's not the case. Can you please investigate?

~~@t-brandt - I'm not sure what output is expected to look like from the likelihood algorithm, but have you tested the implementation on this branch with real data?~~

Edit: reinstalling and testing again, I can't reproduce the very poor results I first saw with LIKELY + no jump - I assume something was off in my environment. However, I do still see differences between running with and without standard jump on, so that needs investigation.

src/stcal/ramp_fitting/likely_fit.py

kmacdonald-stsci · 2024-09-11T15:06:28Z

Testing this on some real data now to verify the various options are working as expected.

I see significant differences between running with standard jump detection turned on in the pipeline and with it set to skip. With jump on and algorithm=LIKELY, the output is very similar to standard ramp fitting. With jump off and algorithm=LIKELY, the results from ramp fitting are ~~very different (and very poor)~~ a little different.

@kmacdonald-stsci - I expected that the likelihood algorithm should be ignoring any previously set jump flags, so running with and without jump detection on should have identical results, but that's not the case. Can you please investigate?

~~@t-brandt - I'm not sure what output is expected to look like from the likelihood algorithm, but have you tested the implementation on this branch with real data?~~

Edit: reinstalling and testing again, I can't reproduce the very poor results I first saw with LIKELY + no jump - I assume something was off in my environment. However, I do still see differences between running with and without standard jump on, so that needs investigation.

Please send me the data you are using and the specific pixels you are noticing as different.

drlaw1558 · 2024-09-12T17:29:39Z

Finally had a chance to test this out a little. A few quick comments:

The likelihood routine may be doing a better job of eliminating CR showers in MIRI data, in at least some cases (jw01236004001_02101_00001_mirifulong).
As for @melanieclarke , I see some small differences in results with jump on/off for both jw01236004001_02101_00001_mirifulong and jw01523003001_03102_00001_mirifushort_uncal.fits. E.g., pixel (X,Y)=(547,576) 1-indexed in the jw1523 exposure.
I'm seeing an issue when processing jw01264014001_02101_00001_mirifushort_uncal.fits with
2024-09-12 12:28:56,238 - stpipe.Detector1Pipeline.ramp_fit - WARNING - /Users/dlaw/jwcode/stcal_jwst/stcal/src/stcal/ramp_fitting/likely_fit.py:288: RuntimeWarning: invalid value encountered in multiply
2024-09-12 12:28:56,238 - stpipe.Detector1Pipeline.ramp_fit - WARNING - new_cts[i_d2] = np.sum(result.countrate_two_omit * droptwo, axis=0)[i_d2]
Similarly with jw01247667001_02101_00001_mirifulong_uncal.fits, but for both i_d1 and i_d2 on lines 286 and 288.
I'm also seeing an issue for jw01247667001_02101_00001_mirifulong_uncal.fits with the combination of individual integration data into the final rate value. This is a bright multi-int data set, and many integrations have NaNs in them which isn't necessarily surprising. However, any NaNs in a single integration are producing NaNs in the final rate image as well as if they're being mean-combined rather than nanmean-combined? See snake-like features near the top left of rate image created from this exposure. Reminds me a lot of JP-3004 but for NaNs.
Question perhaps for @t-brandt ; one of the appeals of this approach was runtime, which we expected to be a few seconds or so (depending on the observation of course). Actual runtime in some of the examples above though is factor of 10 longer than the C-based OLS code. We shouldn't optimize right now (let's study science performance gains first), but what's your opinion on the parameter space available for such speed gains in future?

t-brandt · 2024-09-12T18:17:36Z

For @drlaw1558: this ramp fitting approach was never designed to outperform an efficient implementation of the Fixsen algorithm (which has a cost linear in the number of groups, with a small prefactor). This approach retains the linear scaling albeit with a larger prefactor. So if both the C implementation and this one are perfectly implemented, I would expect the C version to run faster by a factor of at least 5. This approach would offer significant performance gains if the Fixsen approach were implemented very inefficiently, as was previously the case. The appeal to the likelihood-based approach is the science gain (no bias, optimal SNR, optimal CR sensitivity), especially for long ramps, combined with perfect scaling with ramp size and a modest performance hit--a fixed factor--relative to the Fixsen algorithm. My expectation is that this approach should take 5-10 seconds for 10 groups full frame with CR flagging.

tapastro · 2024-09-17T15:42:57Z

Just wanted to check in, @kmacdonald-stsci : have you had a chance to investigate the files that David mentioned above? We're hoping to get this in soon, i.e. prior to CoB Thursday, so I want to make sure we address these issues found during testing.

kmacdonald-stsci · 2024-09-18T11:21:51Z

Just wanted to check in, @kmacdonald-stsci : have you had a chance to investigate the files that David mentioned above? We're hoping to get this in soon, i.e. prior to CoB Thursday, so I want to make sure we address these issues found during testing.

No. I am still investigating the files Melanie gave me.

kmacdonald-stsci · 2024-09-18T15:00:38Z

Finally had a chance to test this out a little. A few quick comments:

The likelihood routine may be doing a better job of eliminating CR showers in MIRI data, in at least some cases (jw01236004001_02101_00001_mirifulong).

As for @melanieclarke , I see some small differences in results with jump on/off for both jw01236004001_02101_00001_mirifulong and jw01523003001_03102_00001_mirifushort_uncal.fits. E.g., pixel (X,Y)=(547,576) 1-indexed in the jw1523 exposure.

I'm seeing an issue when processing jw01264014001_02101_00001_mirifushort_uncal.fits with
2024-09-12 12:28:56,238 - stpipe.Detector1Pipeline.ramp_fit - WARNING - /Users/dlaw/jwcode/stcal_jwst/stcal/src/stcal/ramp_fitting/likely_fit.py:288: RuntimeWarning: invalid value encountered in multiply
2024-09-12 12:28:56,238 - stpipe.Detector1Pipeline.ramp_fit - WARNING - new_cts[i_d2] = np.sum(result.countrate_two_omit * droptwo, axis=0)[i_d2]
Similarly with jw01247667001_02101_00001_mirifulong_uncal.fits, but for both i_d1 and i_d2 on lines 286 and 288.

I'm also seeing an issue for jw01247667001_02101_00001_mirifulong_uncal.fits with the combination of individual integration data into the final rate value. This is a bright multi-int data set, and many integrations have NaNs in them which isn't necessarily surprising. However, any NaNs in a single integration are producing NaNs in the final rate image as well as if they're being mean-combined rather than nanmean-combined? See snake-like features near the top left of rate image created from this exposure. Reminds me a lot of JP-3004 but for NaNs.

Question perhaps for @t-brandt ; one of the appeals of this approach was runtime, which we expected to be a few seconds or so (depending on the observation of course). Actual runtime in some of the examples above though is factor of 10 longer than the C-based OLS code. We shouldn't optimize right now (let's study science performance gains first), but what's your opinion on the parameter space available for such speed gains in future?

I fixed the jump detection differences and suppressed the warnings you were seeing.

The integrations get combined here:

https://github.com/kmacdonald-stsci/stcal/blob/f97d568c4fa5bcb9bebe4b058e231e19557fc068/src/stcal/ramp_fitting/likely_fit.py#L355

@t-brandt is the behavior @drlaw1558 notes above expected with respect to NaN's?

drlaw1558 · 2024-09-18T16:27:18Z

Based on the code link, I think the issue is the np.median and np.sum statements. np.median would probably replace easily with np.nanmedian, although the np.sum calls would be a little trickier. np.nansum returns 0 for all-NaN inputs which isn't desirable either, so we'd want something that ignores NaNs if present but return NaN if all inputs are NaN.

melanieclarke · 2024-09-18T16:31:59Z

Based on the code link, I think the issue is the np.median and np.sum statements. np.median would probably replace easily with np.nanmedian, although the np.sum calls would be a little trickier. np.nansum returns 0 for all-NaN inputs which isn't desirable either, so we'd want something that ignores NaNs if present but return NaN if all inputs are NaN.

^^^
For the nansum, I usually do a check for all-NaN values in the input and restore them after. E.g.

all_nan = np.all(np.isnan(components), axis=0)
nan_summed = np.nansum(components, axis=0)
nan_summed[all_nan] = np.nan

melanieclarke

Reviewing again, I think all my comments on the code and docs clean up have been addressed, except for one more documentation clarification, below.

Thanks for fixing the jump flags and excessive warnings! I'm testing again on my data examples now.

I do think we should address the issue David pointed out with the NaNs in integration combination, but assuming local tests look good, we should be okay to get this in after that.

docs/stcal/ramp_fitting/description.rst

… product to account for NaN's.

melanieclarke · 2024-09-18T19:21:42Z

For my local tests, I am still seeing some differences between running with and without the jump step on, but in a lot fewer pixels. Diffs are now in ~2000 pixels total, <1% of the image size.

It looks like they might be snowball related? E.g. file jw02301006001_03101_00001_nrs1_rate.fits, near pixel x,y=493,44. Top is skipping jump, bottom is not skipping jump (default behavior). Left to right are SCI, ERR, DQ extensions.

t-brandt · 2024-09-18T20:13:56Z

I am not sure if this comment is still relevant, but my suggestion for the NaNs when combining multiple integrations is to replace NaN values with zeros in the count rate and in the inverse variance. Then, they will receive zero weight when adding the to valid measurements, and if there are no valid measurements, you will wind up with 0/0.

kmacdonald-stsci · 2024-09-19T12:00:49Z

For my local tests, I am still seeing some differences between running with and without the jump step on, but in a lot fewer pixels. Diffs are now in ~2000 pixels total, <1% of the image size.

It looks like they might be snowball related? E.g. file jw02301006001_03101_00001_nrs1_rate.fits, near pixel x,y=493,44. Top is skipping jump, bottom is not skipping jump (default behavior). Left to right are SCI, ERR, DQ extensions.

My testing is showing zero differences in the science array and the variance arrays.

melanieclarke · 2024-09-19T14:00:06Z

My testing is showing zero differences

Hmm, okay, I'll install again in a clean environment and test again.

Have you or @drlaw1558 checked that the NaN combination issues with jw01247667001_02101_00001_mirifulong_uncal.fits are now resolved?

src/stcal/ramp_fitting/likely_fit.py

…thm.

melanieclarke · 2024-09-19T19:11:25Z

I still see the same differences after pulling and reinstalling, but even if it isn't just me, they are small, and there is a workaround -- just don't run the jump step, as is recommended. I think we can leave that for future work.

I also tried reducing jw01247667001_02101_00001_mirifulong_rate.fits with your latest changes, and I don't see excessive NaNs, so I think that's fixed too.

melanieclarke

All my comments have been addressed. Thanks for all the hard work on getting this ready!

kmacdonald-stsci requested a review from a team as a code owner August 7, 2024 18:23

github-actions bot added documentation Improvements or additions to documentation ramp_fitting testing labels Aug 7, 2024

kmacdonald-stsci requested review from t-brandt, melanieclarke and tapastro August 7, 2024 18:24

kmacdonald-stsci changed the title ~~Jp 3576~~ JP-3576: Implement the Ramp Fitting Likelihood Algorithm Aug 7, 2024

t-brandt reviewed Aug 9, 2024

View reviewed changes

docs/stcal/ramp_fitting/description.rst Outdated Show resolved Hide resolved

t-brandt reviewed Aug 9, 2024

View reviewed changes

src/stcal/ramp_fitting/likely_fit.py Outdated Show resolved Hide resolved

t-brandt reviewed Aug 9, 2024

View reviewed changes

src/stcal/ramp_fitting/likely_fit.py Show resolved Hide resolved

t-brandt reviewed Aug 9, 2024

View reviewed changes

src/stcal/ramp_fitting/likely_fit.py Outdated Show resolved Hide resolved

t-brandt reviewed Aug 9, 2024

View reviewed changes

src/stcal/ramp_fitting/likely_fit.py Outdated Show resolved Hide resolved

t-brandt reviewed Aug 9, 2024

View reviewed changes

src/stcal/ramp_fitting/likely_fit.py Show resolved Hide resolved

t-brandt approved these changes Aug 14, 2024

View reviewed changes

zacharyburnett reviewed Aug 30, 2024

View reviewed changes

CHANGES.rst Outdated Show resolved Hide resolved

melanieclarke requested changes Sep 3, 2024

View reviewed changes

melanieclarke mentioned this pull request Sep 3, 2024

JP-3658: JWST Accommodation of the Ramp Fitting Likelihood Algorithm spacetelescope/jwst#8631

Merged

8 tasks

kmacdonald-stsci force-pushed the jp_3576 branch from dd8ba53 to 9469506 Compare September 4, 2024 17:53

melanieclarke reviewed Sep 9, 2024

View reviewed changes

src/stcal/ramp_fitting/likely_fit.py Outdated Show resolved Hide resolved

kmacdonald-stsci added 2 commits September 12, 2024 03:41

First pass at adding the maximum likelihood algorithm to ramp fitting.

ed89b83

Cleaning up test.

23684aa

kmacdonald-stsci added 3 commits September 12, 2024 03:43

Updating code based on code review.

cd90b7f

Updating documentation based on code review.

6a2abb0

Updating docs for likelihood algorithm in ramp fitting.

9c7e438

kmacdonald-stsci force-pushed the jp_3576 branch from 0ef30be to 9c7e438 Compare September 12, 2024 07:45

Suppressing warnings in likelihood fit computations.

f3ba0eb

Updating the jump handling for the likelihood ramp fitting algorithm.

f97d568

melanieclarke reviewed Sep 18, 2024

View reviewed changes

docs/stcal/ramp_fitting/description.rst Show resolved Hide resolved

kmacdonald-stsci added 2 commits September 18, 2024 14:06

Updating the calculations collapsing the rateints product to the rate…

fe59fd6

… product to account for NaN's.

Updating the likelihood algorithm description documentation.

d900a80

drlaw1558 reviewed Sep 19, 2024

View reviewed changes

src/stcal/ramp_fitting/likely_fit.py Outdated Show resolved Hide resolved

drlaw1558 reviewed Sep 19, 2024

View reviewed changes

src/stcal/ramp_fitting/likely_fit.py Outdated Show resolved Hide resolved

drlaw1558 reviewed Sep 19, 2024

View reviewed changes

src/stcal/ramp_fitting/likely_fit.py Outdated Show resolved Hide resolved

kmacdonald-stsci added 2 commits September 19, 2024 13:00

Changing sum to nansum.

7a07043

Updating the documentation for ramp fitting wrt the likelihood algori…

375f310

…thm.

melanieclarke approved these changes Sep 19, 2024

View reviewed changes

tapastro approved these changes Sep 19, 2024

View reviewed changes

tapastro merged commit 448413b into spacetelescope:main Sep 19, 2024
24 of 26 checks passed

stscijgbot-jp mentioned this pull request Sep 20, 2024

Implement option to use likelihood-based jump and ramp fitting methods in stcal spacetelescope/jwst#8361

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

JP-3576: Implement the Ramp Fitting Likelihood Algorithm #278

JP-3576: Implement the Ramp Fitting Likelihood Algorithm #278

kmacdonald-stsci commented Aug 7, 2024

codecov bot commented Aug 7, 2024 •

edited

Loading

t-brandt left a comment

melanieclarke commented Sep 3, 2024 •

edited

Loading

melanieclarke left a comment •

edited

Loading

kmacdonald-stsci commented Sep 4, 2024

zacharyburnett commented Sep 5, 2024

kmacdonald-stsci commented Sep 5, 2024

melanieclarke left a comment

melanieclarke commented Sep 9, 2024 •

edited

Loading

kmacdonald-stsci commented Sep 11, 2024

drlaw1558 commented Sep 12, 2024

t-brandt commented Sep 12, 2024

tapastro commented Sep 17, 2024

kmacdonald-stsci commented Sep 18, 2024

kmacdonald-stsci commented Sep 18, 2024

drlaw1558 commented Sep 18, 2024

melanieclarke commented Sep 18, 2024

melanieclarke left a comment

melanieclarke commented Sep 18, 2024

t-brandt commented Sep 18, 2024

kmacdonald-stsci commented Sep 19, 2024

melanieclarke commented Sep 19, 2024

melanieclarke commented Sep 19, 2024

melanieclarke left a comment

JP-3576: Implement the Ramp Fitting Likelihood Algorithm #278

JP-3576: Implement the Ramp Fitting Likelihood Algorithm #278

Conversation

kmacdonald-stsci commented Aug 7, 2024

codecov bot commented Aug 7, 2024 • edited Loading

Codecov Report

t-brandt left a comment

Choose a reason for hiding this comment

melanieclarke commented Sep 3, 2024 • edited Loading

melanieclarke left a comment • edited Loading

Choose a reason for hiding this comment

kmacdonald-stsci commented Sep 4, 2024

zacharyburnett commented Sep 5, 2024

kmacdonald-stsci commented Sep 5, 2024

melanieclarke left a comment

Choose a reason for hiding this comment

melanieclarke commented Sep 9, 2024 • edited Loading

kmacdonald-stsci commented Sep 11, 2024

drlaw1558 commented Sep 12, 2024

t-brandt commented Sep 12, 2024

tapastro commented Sep 17, 2024

kmacdonald-stsci commented Sep 18, 2024

kmacdonald-stsci commented Sep 18, 2024

drlaw1558 commented Sep 18, 2024

melanieclarke commented Sep 18, 2024

melanieclarke left a comment

Choose a reason for hiding this comment

melanieclarke commented Sep 18, 2024

t-brandt commented Sep 18, 2024

kmacdonald-stsci commented Sep 19, 2024

melanieclarke commented Sep 19, 2024

melanieclarke commented Sep 19, 2024

melanieclarke left a comment

Choose a reason for hiding this comment

codecov bot commented Aug 7, 2024 •

edited

Loading

melanieclarke commented Sep 3, 2024 •

edited

Loading

melanieclarke left a comment •

edited

Loading

melanieclarke commented Sep 9, 2024 •

edited

Loading