
Bright Neighbor Validation #2

Closed
wants to merge 16 commits into from

Conversation

@imedan (Contributor) commented Apr 2, 2024

@albireox I wanted to start a draft pull request so we can comment on what I have so far for the bright neighbor validation. The current implementation relies on having the bright neighbor HEALPix indices saved in a file that is pointed to by the environment variable BN_HEALPIX. It then uses the R.A. and Decl. of the ToOs to check whether they fall in the bright neighbor regions on the sky for a given design_mode. Currently, for the sample file docs/sample.csv, the check for one design_mode takes ~1 minute.

A couple things that may need to be added are:

  • Account for proper motions
  • Account for offsets
  • Speed up index search
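
The membership test described above amounts to checking whether each target's HEALPix index is in the set loaded from the BN_HEALPIX file. A minimal sketch, assuming a plain array of pixel indices (in practice the target indices would come from something like healpy.ang2pix on the ToO coordinates):

```python
import numpy as np

def flag_bright_neighbors(target_pix, bn_pix):
    """True where a target's HEALPix index falls in a bright-neighbor pixel."""
    return np.isin(target_pix, bn_pix)

# bn_pix would be loaded from the file pointed to by $BN_HEALPIX;
# target_pix would come from e.g. healpy.ang2pix(nside, ra, dec, lonlat=True).
bn_pix = np.array([10, 42, 97])
target_pix = np.array([5, 42, 100, 10])
print(flag_bright_neighbors(target_pix, bn_pix).tolist())  # [False, True, False, True]
```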

Let me know what you think.

codecov bot commented Apr 2, 2024

Codecov Report

Attention: Patch coverage is 0%, with 24 lines in your changes missing coverage. Please review.

Project coverage is 88.7%. Comparing base (0912bce) to head (cd7056a).
Report is 1 commit behind head on main.

❗ Current head cd7056a differs from the pull request's most recent head 764d437. Consider uploading reports for the commit 764d437 to get more accurate results.

Files                 Patch %   Lines
src/too/validate.py   0.0%      24 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##            main      #2     +/-   ##
=======================================
- Coverage   95.2%   88.7%   -6.5%     
=======================================
  Files          8       9      +1     
  Lines        327     351     +24     
  Branches      64      66      +2     
=======================================
  Hits         311     311             
- Misses         4      28     +24     
  Partials      12      12             


@imedan (Contributor, Author) commented Apr 4, 2024

I have now added offsetting and proper motion correction to the bright neighbor check. A couple things that have changed as a result of this:

  • I have added can_offset to the data model and to the sample.csv.
  • It should be noted I am working under the assumption that if pmra is not Null, then the epoch value must be set. I have updated this in sample.csv.
  • The code now depends on mugatu, but I cannot add it as a dependency. This is because mugatu depends on a tagged version of sdssdb, so poetry does not allow combining that requirement with the sdssdb branch this project requires. This should be fixable down the road, but for now you need to install mugatu separately for the code to run.
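
A rough sketch of the proper-motion correction, under the assumption stated above (if pmra is not null then epoch is set) and taking pmra to already include the cos(dec) factor, i.e. mu_alpha*:

```python
import numpy as np

def epoch_propagate(ra, dec, pmra, pmdec, epoch, epoch_obs):
    """Propagate ICRS coordinates (deg) by proper motion (mas/yr).

    Assumes pmra includes the cos(dec) factor (mu_alpha*), and that
    `epoch` is always set whenever pmra is not null.
    """
    dt = epoch_obs - epoch  # years between catalog epoch and observation
    dec_new = dec + pmdec * dt / 3.6e6  # mas/yr * yr -> deg
    ra_new = ra + pmra * dt / 3.6e6 / np.cos(np.radians(dec))
    return ra_new, dec_new

# 3600 mas/yr over 10 years shifts R.A. by 0.01 deg at the equator.
print(epoch_propagate(180.0, 0.0, 3600.0, 0.0, 2016.0, 2026.0))  # (180.01, 0.0)
```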

One other thing to figure out. The offsetting function relies on knowing what observatory the observation will be at. Is this something we will know at the validation stage? Or is it simpler to just run the validation for each observatory?

@albireox (Member)

Hi @imedan,

I have finally gotten to the point in testing the validation where I have some relevant comments. Instead of merging this branch directly, I merged/copy-pasted/modified your additions into this file in the dev branch:

https://github.com/sdss/too/blob/dev/src/too/validate.py

The main reason for this is that I wanted to avoid depending on Mugatu for now, as it requires some additional dependencies to build Kaiju and only works with Python <= 3.10 (which is what Kaiju allows). Happy to revisit that if we can make Mugatu not depend on Kaiju and work with the latest Pythons.

In any case, what I did was to copy some functions from Mugatu along with a trimmed-down version of DesignMode. I also changed the inputs a bit, mostly for my own convenience when calling these functions, and because in too I have been trying not to use the default connection that sdssdb provides and to allow it to be overridden (probably not very relevant for production, but important for development). To make this work I also had to make two changes:

  • Here I had to comment out this block, as I think it does the same as magnitude_array (although it seems you already deleted that?)
  • Here I think ev_boss should be ev_apogee; that was causing a shape mismatch.

The non-critical but useful change I made is to use polars lazy query options to speed things up. The BN validation was taking some 10 minutes to run, which is not terrible. I tried to parallelise it but ran into issues and ended up giving up. The slowdown is 90% due to reading the large pickle files for some design modes. On one side, I think pickle is not ideal here because it's uncompressed, and the HEALPix arrays in those files are quite compressible. I created Parquet files from the pickle files and the size decreased by a factor of 4-5 (something like numpy.savez_compressed also works at the same level). But if one does not load the entire file into memory and instead opens it with polars.scan_parquet and then queries the pixels that match, that is done in a few seconds (because of whatever ultra-efficient magic the polars query planner does in the background).

Currently the version in dev seems to work fine and I can validate targets (see my implementation of the call here).

Let me know what you think and whether we should just merge those changes to main, or you want to port them back here or to Mugatu.

@imedan (Contributor, Author) commented Apr 23, 2024

So, I think it does make sense (for now) to just copy over the mugatu functions to the validate.py file for development purposes, for the reasons you stated. I do think we should not do this for the production version of the code, though, because if these functions need to change for some reason, it will be annoying to update them in two separate places. But, we can cross that bridge when we get there...

Other than this, the changes you made all look good to me! For the two errors you caught, I had corrected those locally but forgot to push them to the remote branch, so thanks for catching them! Also, the polars query seems nice if it speeds things up. Similarly, the polars.scan_parquet magic sounds great to me; the large size of those pickle files was annoying...

I also looked at your implementation of validating the targets. It looks mostly good to me. I guess it is still missing differentiating between the observatories for the targets? This is a change we will have to make, as the bright neighbor radius is not the same at APO and LCO.

Other than that last comment, this looks good to merge to main to me. I think it makes the most sense to just merge the dev branch to main and we can work out the mugatu dependency issues later.

@albireox (Member)

Sounds good, thanks for looking at the dev branch. And yes, the observatory is hardcoded right now. Since each field is associated with an observatory, I think the easiest is to match targets to the FoV of each field and then use that observatory.

That reminds me that I wanted to ask about a detail of how the function should be called. Is the idea (as I wrote it right now) that we check every target against all the design modes that match its bright/dark sky brightness mode, and that if any fail we reject that target? Or was your idea that we only check the design mode of the design in which the ToO target is replacing a science target? I think the latter cannot be done except on the mountain.

@imedan (Contributor, Author) commented Apr 24, 2024

Since each field is associated with an observatory, I think the easiest is to match targets to the FoV of each field and then use that observatory.

Yes, I think this would make the most sense and we should be able to do this ahead of time. Should I add a function to validate.py that is able to do this check? I think I would know how to go about doing this.
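
The FoV check being proposed could be sketched as follows. The FoV radii and the field tuple layout here are placeholder assumptions, not the actual APO/LCO values:

```python
import numpy as np

def angular_sep(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula)."""
    ra1, dec1, ra2, dec2 = map(np.radians, (ra1, dec1, ra2, dec2))
    h = (np.sin((dec2 - dec1) / 2) ** 2
         + np.cos(dec1) * np.cos(dec2) * np.sin((ra2 - ra1) / 2) ** 2)
    return np.degrees(2 * np.arcsin(np.sqrt(h)))

FOV_RADIUS = {"APO": 1.5, "LCO": 1.0}  # deg; placeholder values

def match_observatory(ra, dec, fields):
    """Return the observatory of the first field whose FoV contains the target.

    `fields` is an iterable of (field_ra, field_dec, observatory) tuples.
    """
    for fra, fdec, obs in fields:
        if angular_sep(ra, dec, fra, fdec) <= FOV_RADIUS[obs]:
            return obs
    return None  # target is outside every field's FoV
```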

Is the idea (as I wrote it right now) that we check every target for all the design modes that match their bright/dark sky brightness mode, and that if any fail we reject that target?

This is close to my intention. You are right that you need to check every target against the design modes that match its bright/dark sky brightness mode (so if "bright", you only need to check design modes with "bright" in them). The second half isn't quite correct, though. For example, it is possible that a ToO could be valid in dark_plane but not dark_faint; the intention would be that the ToO could be observed in dark_plane designs but not dark_faint ones. So, I really think we need to store the validation results for each target and for each design mode. Then, on the mountain, for a specific design we should be able to select 1) targets that are within the FOV of the field and 2) targets that are valid for the design mode of that specific design. Does that sound right to you?

@albireox (Member)

Ah yes, I had not thought of storing those results. That makes total sense.

@albireox (Member) commented May 7, 2024

This is all integrated in the main branch now and I think working fine, so I'm going to close this PR.

@albireox closed this May 7, 2024