WIP JP-3610: Memory usage with Detector1 pipeline #8588
Conversation
It looks like there are some unrelated changes to `emicorr` in here.

Also, it would be less disruptive to the code if we could still call your function in a `with` statement. Looking closer, I think that might be possible as written, since the datamodel already has `__enter__` and `__exit__` definitions. Can we try that?
jwst/lib/basic_utils.py (Outdated)

```python
    model : datamodel
        The datamodel object
    """
    if not isinstance(input, str):
```
Should we maybe check for a JwstDataModel here, instead of just not-a-string?
Yes, that's a good idea; I will implement this and the `with`.
done
jwst/lib/basic_utils.py (Outdated)

```python
from stdatamodels.jwst.datamodels import dqflags
import numpy as np


def use_datamodel(input):
```
What about updating `JwstStep._datamodels_open` and using it (or `Step.open_model`) instead of adding this utility function? That would have the added benefit of applying a similar fix (avoiding unnecessary model clones triggered by `datamodels.open(Model)`) to the calls that stpipe makes to open models to determine references.
One other benefit to updating and using `Step.open_model` would be to allow the opening code to check if the step is part of a pipeline (where `Step.parent` will return the pipeline). This might be too "fancy" (and harder to make sense of), but the general idea would be to replace the following (found in many of the steps)

```python
with datamodels.open(input) as input:
```

with

```python
with self.open_model(input) as input:
```

and implement `_datamodels_open` in a way that:

- when `Step.parent` is None, mimic the old behavior (create a clone when a model is provided)
- when `Step.parent` is not None, only open the model if a string (or path) is provided
- optionally provide a model class (like `RampModel`) to use when opening the model (or to create a clone when the input is not an instance of the class)
- if a model was not opened (or a clone made), return the model wrapped in a `nullcontext` so that the input is not closed when the step finishes

Dropping the old behavior (cloning the input) is worth considering because clones share data arrays (which are often updated in steps).
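The bullets above could be sketched roughly as follows. `FakeModel` and `open_path` are lightweight stand-ins for a real datamodel and `datamodels.open`, and this `datamodels_open` is a hypothetical illustration of the proposed behavior, not the actual stpipe implementation:

```python
from contextlib import nullcontext


class FakeModel:
    """Stand-in for a JWST datamodel."""
    def copy(self):
        return FakeModel()

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        return False


def open_path(path):
    # stand-in for datamodels.open(path)
    return FakeModel()


def datamodels_open(input, parent=None):
    """Sketch of the proposed _datamodels_open behavior."""
    if isinstance(input, str):
        # a path: always open it; the with statement may close it
        return open_path(input)
    if parent is None:
        # standalone step: mimic the old behavior and clone the model
        return input.copy()
    # inside a pipeline: hand the caller's model back wrapped in a
    # nullcontext so the step's with statement does not close it
    return nullcontext(input)
```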
I implemented your suggestions, thank you!
Thanks! Looking at `_datamodels_open` for this PR:
https://github.com/spacetelescope/jwst/pull/8588/files#diff-9a8d6de889fab1db67280357ed291190cbcf4fa320977c44c922db8cc78976a6R28-R33
the items described in the bullet points above are missing.
> - when `Step.parent` is None, mimic the old behavior (create a clone when a model is provided)
> - when `Step.parent` is not None, only open the model if a string (or path) is provided

I don't see any check of `Step.parent`. This means every step now modifies the input when provided a `DataModel` (even when run outside the pipeline).
> - optionally provide a model class (like `RampModel`) to use when opening the model (or to create a clone when the input is not an instance of the class)

I don't see any option for a model class, but given the limited use of this behavior (forcing a model type), it might be cleaner to just have this in detector 1.
> - if a model was not opened (or a clone made), return the model with a `nullcontext` so that the input is not closed when the step finishes

This one is quite important, as most steps use a `with` statement. If a `nullcontext` isn't used, the steps will now close the input model when they finish. Since this is the same as the output model, the next step in the pipeline will receive a closed model. This can cause crashes for various model/step combinations, as the full contents of a model might not be read when the file is opened.
The function that does this is `copy_datamodel(input, step_parent)`; it checks if `self.parent` is None and creates a copy, otherwise it returns the input.
In other words, the procedure now is to use `use_datamodel(input, model_class)` to make sure the model is the desired class, then make a copy or not, based on whether the step is part of a pipeline, with `copy_datamodel(input, step_parent)`.
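A sketch of that two-step procedure, with lightweight stand-ins — `Ramp`, `use_datamodel`, and `copy_datamodel` here are illustrative, not the PR's actual code:

```python
class Ramp:
    """Stand-in for RampModel."""
    def copy(self):
        return Ramp()


def use_datamodel(value, model_class=Ramp):
    """Hypothetical: ensure the input is an instance of model_class."""
    if isinstance(value, model_class):
        return value
    return model_class()  # real code would open/convert via datamodels


def copy_datamodel(model, step_parent):
    """Hypothetical: clone only when the step runs standalone."""
    if step_parent is None:
        return model.copy()
    return model


# standalone step: the user's model is cloned, so their input stays untouched
m = Ramp()
standalone = copy_datamodel(use_datamodel(m), step_parent=None)
assert standalone is not m

# inside a pipeline: the same object flows through, avoiding a copy
pipeline = copy_datamodel(use_datamodel(m), step_parent=object())
assert pipeline is m
```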
```python
result = charge_migration.charge_migration(input_model, signal_threshold)
result.meta.cal_step.charge_migration = 'COMPLETE'

input_model.close()
```
When `input` is a model, this will close the input file. This can cause issues for certain files (if the input file is ASDF and lazily loaded).
I think this is addressed by using the `with` suggested by Melanie; that change is already in the works.
Thanks! I see that some of the steps are updated to use `with`. When the input is not cloned, the `__exit__` of the `with` statement will close the file (if the model is the only one holding file references to the input file). These references are used to track when the file can be safely closed. When a model is cloned, it increments the reference to the file, so that both the clone and the original model need to be freed before the file is closed. This cloning and file reference counting is what allowed these steps to use a `with` context without closing the file too early. If we want to remove the cloning and keep the file references up to date, we'll either have to remove the `with` context usage or provide a way for a datamodel to be used in a `with` statement without requiring a clone. One option here might be to use a `nullcontext` when the model is not cloned.
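The `nullcontext` idea can be demonstrated with a stand-in model: a plain `with model:` closes the un-cloned input when the block ends, while `with nullcontext(model):` leaves it open. `FileBackedModel` is illustrative only:

```python
from contextlib import nullcontext


class FileBackedModel:
    """Stand-in for a datamodel that closes its file on __exit__."""
    def __init__(self):
        self.closed = False

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        self.close()
        return False

    def close(self):
        self.closed = True


# without a clone, a plain with statement closes the input when the step
# finishes -- the next step in the pipeline would receive a closed model
model = FileBackedModel()
with model as m:
    pass
assert model.closed

# wrapping the un-cloned model in nullcontext keeps it open afterwards
model2 = FileBackedModel()
with nullcontext(model2) as m:
    assert m is model2
assert not model2.closed
```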
```
@@ -28,19 +29,21 @@ class ChargeMigrationStep(Step):
    def process(self, input):

        # Open the input data model
        with datamodels.RampModel(input) as input_model:
```
Any idea why this was using `RampModel(input)` instead of `datamodels.open(input)`? Are there times when this input might not already be a `RampModel`?
There are several other steps modified in this PR that are similar (and the same questions apply).
I believe you can use a `RampModel` up to the ramp step, but if we use the `use_datamodel` function then it does not matter anymore, since it uses `datamodels.open()`.
@braingram The calwebb_detector1 pipeline, as well as its individual steps, must have original 4-D data ramps to work with, which is a `RampModel`. Forcing the calwebb_detector1 pipeline module to load its original input into a `RampModel` (as opposed to using a generic open and letting datamodels decide the appropriate model based on the `DATAMODL` keyword) is necessary, because the "uncal" files created by level-1b processing are in the form of a `Level1bModel`. So we have to force it into a `RampModel`. Whether it's also necessary to do that for individual steps, like `charge_migration`, is debatable. In the normal flow, as part of the calwebb_detector1 pipeline, the input would already be in the form of a `RampModel`, thanks to calwebb_detector1 forcing that. But there's also the possibility of someone trying to run an individual step on an "uncal" file, outside of the pipeline. Hence it could be argued that again forcing it into a `RampModel` is safest.
Thanks! So the steps (which may receive a `Level1bModel` or `RampModel`) could check the model input type (`isinstance(model, RampModel)`) and only convert it to a `RampModel` if it's not already one.
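That check-then-convert pattern could look something like this; `RampModel`, `Level1bModel`, and `ensure_ramp` are lightweight stand-ins for illustration, not the stdatamodels classes or the PR's code:

```python
class RampModel:
    """Stand-in for the stdatamodels RampModel."""
    def __init__(self, source=None):
        self.source = source


class Level1bModel:
    """Stand-in for the level-1b 'uncal' model."""


def ensure_ramp(model):
    """Only convert when the input is not already a RampModel."""
    if isinstance(model, RampModel):
        return model  # no copy, no re-open
    # real code would convert via datamodels.RampModel(model)
    return RampModel(model)


ramp = RampModel()
assert ensure_ramp(ramp) is ramp  # already a ramp: passed through untouched

uncal = Level1bModel()
assert isinstance(ensure_ramp(uncal), RampModel)  # 'uncal' input: converted
```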
This is what the function does now.
Codecov Report

Attention: Patch coverage is

Additional details and impacted files

```
@@            Coverage Diff             @@
##             main    #8588      +/-   ##
==========================================
+ Coverage   61.86%   61.91%   +0.04%
==========================================
  Files         377      377
  Lines       38911    39009      +98
==========================================
+ Hits        24071    24151      +80
- Misses      14840    14858      +18
```
The steps updated in this PR will now modify their inputs (rather than creating new models) when called with a model. For example, if I run:

```python
>>> model = datamodels.open("some_file.fits")
>>> step = DQInitStep()
>>> result = step(model)
>>> result is model
False
>>> model.meta.cal_step.dq_init
None
```

Then the error I ran into was

Is there any mention in the documentation about whether steps modify their inputs?
We should aim to keep the input unchanged after it is provided to a step or pipeline; any improvements to memory usage that do not violate this principle should be split off from this PR. This PR will be closed.
Resolves JP-3610
Closes #8454
This PR substantially improves running time and memory usage for the Detector 1 pipeline. It is a workaround for the detailed investigation required in the datamodels repo to permanently fix the issue.
Checklist for PR authors (skip items if you don't have permissions or they are not applicable)
- `CHANGES.rst` entry within the relevant release section

How to run regression tests on a PR