Add self-calibration uniformity metrics. #395

Merged · 7 commits · Jun 18, 2024

Conversation

@erykoff (Contributor) commented Mar 7, 2024

This PR adds self-calibration uniformity metrics. This is a slow metric (it takes around 70-80 minutes for one band, one year from the 3.3 baseline sim).

The input data file for test stars is currently available at https://slac.stanford.edu/~erykoff/desc/uniformity/monster_stars_uniformity_i15-18_sampled.parquet . This file is taken from the Monster version of Gaia DR3, so it has Gaia DR3 positions plus photometry bolted together from different surveys, cross-calibrated, and translated into the LSST passbands. So far it has grizy; u is coming. It was downsampled to 1.5 million stars from 15th to 18th mag in i so that this test runs in a reasonable amount of time and memory.

The metric assumes this file is in the maf subdirectory of the data path, but that can obviously be changed (and we'll have to put it into the distribution). Right now it's in the parquet format, which I quite like, but to read it I had to add pyarrow to the requirements. If that is unacceptable I can change the format.

The metric is a bit strange because it runs on the whole sky (it can't be run over individual slices). It also outputs a healpix map at the end with the calibration offsets; if this is undesirable in any way I can do something else. It produces metrics for the self-calibration scatter over the full footprint and over the high-galactic-latitude (|b| > 30) area, plus the fraction of 4-sigma-outlier nside=128 pixels for both the full sky and the high-latitude region.

When I was testing it I ran the following:

from rubin_sim.maf.maf_contrib.selfcal_uniformity_metric import PhotometricSelfCalUniformityMetric
from rubin_sim import maf

# Select one band and one year of visits from the baseline simulation.
filter_name = "r"
year = 1
year_name = f"year{year}"
constraint = f"filter = '{filter_name}' and night >= {(year - 1) * 365} and night < {year * 365}"
run_name = f"baseline_3.3_{year_name}"

# The metric runs on the whole sky at once, so it pairs with a UniSlicer
# (a single "slice" containing all of the selected visits).
metric = PhotometricSelfCalUniformityMetric(run_name=run_name)
slicer = maf.UniSlicer()
bundle = maf.MetricBundle(
    metric,
    slicer,
    constraint=constraint,
    run_name=run_name,
)

baseline = "baseline_v3.3_10yrs.db"
bgroup_baseline = maf.MetricBundleGroup([bundle], baseline, out_dir="./")
bgroup_baseline.run_all()

# Print out metrics...
print(bundle.metric_values)

@rhiannonlynne rhiannonlynne self-requested a review March 19, 2024 01:10
@rhiannonlynne (Member) commented:
Hi @erykoff so sorry for the delay in getting to your PR (just want to make sure we're giving it the time it deserves).

Sigh, my first request was going to be to ask you to run isort but that's failing elsewhere (so it's ok to skip that problem).
My second request though is to run black, because that looks like it's still failing on the new code.

Third, we'll need to update the version of rubin_sim_data/maf to make this work so I'll get your data into our download and then let you know about how to update the data_dict stuff here to grab the new version. (I'm assuming you may need it for the unit tests, but then again, maybe you don't).

Fourth - I think you've got the right way to run things by grabbing all of the visits with a UniSlicer ... but I just had an idea for how to deal with the output.
Can you return the healpix map as an array, as the actual returned metric_value? You can do this: set the metric_dtype to "object" and just shove an array into it -- the TgapsMetric class (a BaseMetric subclass) is an example of this pattern.
Then you could write custom summary metrics (which look like regular metrics in terms of the API, but where data_slice == all of the metric_values, aka the healpix array of values) and output those summary values like normal summary stats. But also, if we actually save the metric output (instead of dumping it, like we often do but shouldn't do as much as we do), then the map will be saved automatically with the right run name and everything, and we'd have it available as part of the "normal way of doing things". In principle, we could also do a little rearranging of things and pipe the output into the plotting functions so we make the plot of the map each time too.
What do you think? What you're doing right now isn't really wrong (obviously, because you can), but I think this other approach might be more in line with the general philosophy of MAF and could be a better model for other users down the line.
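The object-dtype pattern being suggested can be sketched without any of the MAF machinery (names here are illustrative, not the actual rubin_sim API): the metric stores the whole healpix map in a single object-dtype slot, and a summary function then reduces that array to a scalar, just like a summary metric would.

```python
import numpy as np

# With a UniSlicer there is a single slice point, so the metric values
# array has length 1; dtype=object lets that one slot hold a full map.
npix = 12 * 128**2  # number of pixels in an nside=128 healpix map
offsets = np.full(npix, np.nan)  # NaN = unobserved pixels
offsets[:1000] = np.random.default_rng(7).normal(0.0, 5.0, 1000)

metric_values = np.empty(1, dtype=object)
metric_values[0] = offsets

# A "summary metric" consumes metric_values[0] as its data_slice and
# reduces it to a scalar, e.g. the scatter over observed pixels.
def summary_scatter(map_values):
    good = map_values[np.isfinite(map_values)]
    return float(np.std(good))

print(summary_scatter(metric_values[0]))  # close to the input sigma of 5
```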

@erykoff erykoff force-pushed the u/erykoff/uniformity_metrics branch from 55d974a to a083cda on May 1, 2024 16:31
@erykoff (Contributor, Author) commented May 2, 2024

@rhiannonlynne And now there was an unreasonably long delay on my end to get back to you on this.

Thank you for your very helpful comments.

First of all, I have no idea what is going on with the ruff checks. The doc lines are apparently too long everywhere, and this is not code that I have touched, so I don't know how other PRs are passing. Any advice welcome.

I implemented the change to not do the manual output of the map, but rather return the map as one of the outputs. However, the format is a bit wonky since I couldn't figure out how to do a reduce function on it (maybe I don't need to). This would just be for further debugging/analysis anyway, I think. Or certainly I don't know how to make a map out of it automatically.

If you can let me know where to put in the data file that would also be great.

Note that there aren't any specific tests for this code, since I didn't know how to test it other than running it locally (which isn't really great). At the same time, it's hard to test in a reasonable amount of time without crafting special input data, which is itself nontrivial.

@rhiannonlynne rhiannonlynne changed the base branch from main to tickets/OPSIM-1158 June 18, 2024 00:40
@rhiannonlynne (Member) commented Jun 18, 2024

I'll merge this to a branch so I can just go ahead and rebase and add the location of the filename. Sound good?

Edit - I'm sorry, I was being dumb ... I didn't realize this was already a branch of OUR repo, and not a fork.
I guess it's policy to merge things from a ticket with a JIRA number, so maybe this is fine anyway.

@rhiannonlynne rhiannonlynne merged commit f9ee6ac into tickets/OPSIM-1158 Jun 18, 2024
6 of 9 checks passed