Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement multi-MTL procedure for the DESI extension #833

Open
wants to merge 19 commits into
base: main
Choose a base branch
from

Conversation

geordie666
Copy link
Contributor

@geordie666 geordie666 commented Feb 3, 2025

This PR updates the MTL framework to use multiple MTLs for the DESI extension.

Features include:

  • New target classes for the extension.
  • New ledger types for DARK1B and BRIGHT1B, which can be updated with redshifts in the usual fashion.
  • Readers based on io.read_mtl_ledger() that can handle two MTLs.
    • The MTLs are merged on the basis of the highest PRIORITY + SUBPRIORITY targets in each MTL, and three new columns MTL_CONTAINS, MTL_WANTED and MTL_HIGHEST are added to the combined MTL, indicating how the multiple MTLs compared targets that had the same TARGETID.

From Eddie's careful summary of discussions between him, me, David S., Stephen and Segev:

Let me propose the following new fiberassign header keywords...

  • MTL1 ... MTLN - ~paths to the MTLs from which the targets were pulled

and new fields in the FIBERASSIGN extension for each target:

  • HIGHEST: bitmask where bit i is 1 if MTLi is the MTL that had the highest priority & subpriority for this target (?)
  • WANTED: bitmask where bit i is 1 if MTLi has priority > DONE for this target, or priority = DONE if the highest priority over all MTLs for this target is DONE.
  • CONTAINS: bitmask where bit i is 1 if MTLi contains the target

Then we can update the each MTL using the targets from a tile that have (keyword & 2**i) != 0. Option 2 corresponds to HIGHEST, option 3 corresponds to WANTED, and option 1 corresponds to CONTAINS. I think all of the code options we discussed amount to which one of these three fields we use ... but it seems valuable to include all three columns in any case, and that will let us change our mind in later DRs as to what goes in the coadds.

@geordie666 geordie666 added the WIP Work in Progress; don't merge yet label Feb 3, 2025
@geordie666
Copy link
Contributor Author

@araichoor: I haven't added the hooks into the base read_mtl_ledger() function yet for the higher-level readers (such as read_mtl_in_hp() and read_targets_in_tiles()). But the output from read_mtl_ledger() should be sufficient information to begin updating fiberassign for the new schema.

You can obtain an example output merged MTL for trial updates to fiberassign with these two lines of code:

filelist = ["/global/cfs/cdirs/desi/users/adamyers/surveyops/mtl/main/dark/mtl-dark-hp-6920.ecsv", "/pscratch/sd/a/adamyers/blub/dr9/2.8.0.dev5597/mtl/main/dark1b/mtl-dark1b-hp-6920.ecsv"]
mtlmerged, [prog1, prog2] = read_mtl_ledger(filelist)

The outputs are:

  • mtlmerged: The merged MTL where duplicate targets (on TARGETID) are merged on the basis of the higher priority target.
    • This merged MTL contains the new MTL_CONTAINS, MTL_WANTED and MTL_HIGHEST columns described above.
  • [prog1, prog2]: The programs or observing conditions that correspond to the input MTLs in filelist. So, for my example prog1 will be output as DARK and prog2 as DARK1B. I think we'll need to store these in the fiberassign header in addition to MTL1 and MTL2, as they can be used to look up the bitmasks in the MTL_CONTAINS, MTL_WANTED and MTL_HIGHEST columns. For example:
from desitarget.targetmask import obsconditions
obsconditions["DARK1B"]
Out[]: 1024

@coveralls
Copy link

coveralls commented Feb 3, 2025

Coverage Status

coverage: 52.763% (-0.3%) from 53.051%
when pulling 812fcb3 on ADM-extension
into a9ee318 on main.

@araichoor
Copy link

I m giving a look at that, and I ve a question.
if I m correct, when reading the ledgers with read_mtl_ledger(), the extra columns MTL_HIGHEST,MTL_CONTAINS,MTL_WANTED will be in the output only if two filenames are provided, right?
ie those won t be there for the main/backup and secondary ledgers.
if that s the case, I just wanted to check if that is intentional or not.

I guess that s fine from the fiberassign point of view it s ok:

  • for the secondary ledgers, as those will be merged with the main bright/dark tiles, those columns will be there; I could either fill them with meaningful values or with "" (empty strings);
  • for the main backup ledgers, I could either do nothing (hence the fiberassign files won t have those columns), or add columns with empty values.

Those last points are relevant for the soon-to-come twin PR in fiberassign, sorry; but those two PRs will be quite intertwined!

@geordie666
Copy link
Contributor Author

That was intentional, for backward compatibility.

But, my latest updates on this branch have a maketwostyle kwarg here:

https://github.com/desihub/desitarget/blob/ADM-extension/py/desitarget/io.py#L2983-L2985

That kwarg will write the new columns for a single MTL ledger, too.

Perhaps I should propagate that kwarg to read_mtl_ledger() so that you can always force the new columns to be added?

@araichoor
Copy link

thanks.
in fiberassign, I read the ledgers through read_targets_in_tiles(..., mtl=True, ...): https://github.com/desihub/fiberassign/blob/41275635f3171055027bb06bdd0310ccd2d9ee00/py/fiberassign/fba_launch_io.py#L1370-L1378
so you d have to propagate the "hooks" up to this kind of function, I guess, right?
as you said, the "two ledgers" thing needs to be propagated to those; for the maketwostyle option, I let it up to you, as I can also add those columns in fiberassign.

@geordie666
Copy link
Contributor Author

geordie666 commented Feb 18, 2025

@araichoor: I think the functionality should exist now to read two ledgers directly using read_targets_in_tiles(..., mtl=True, ...). Try, for example:

from astropy.table import Table
from desitarget.io import read_targets_in_tiles

filelist = ["/global/cfs/cdirs/desi/users/adamyers/surveyops/mtl/main/dark/", "/pscratch/sd/a/adamyers/blub/dr9/2.8.0.dev5597/mtl/main/dark1b/"]
alltiles = Table.read("tiles-main.ecsv")
tiles = alltiles[1550:1551]

And then for one ledger:

read_targets_in_tiles(filelist[0], tiles=tiles, mtl=True)
read_targets_in_tiles(filelist[0], tiles=tiles, mtl=True, maketwostyle=True)

Or for two ledgers:

read_targets_in_tiles(filelist, tiles=tiles, mtl=True)

I'd appreciate any stress-testing you can do of this new functionality!

@araichoor
Copy link

first, quick remark: I guess the syntax in your example in the previous message rather is read_targets_in_tiles(filelist[0], ...), instead of read_targets_in_tiles(..., hpdirname=filelist[0], ...), i.e. hpdirname is the first, non-optional argument, right?

then, from a quick check, looks like you re not handling the case where hpdirname is a single ledger (ie path to a file, not a folder):

>>> d = read_targets_in_tiles(filelist[0], tiles=tiles, mtl=True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/global/homes/r/raichoor/software_dev/desitarget_desi1b/py/desitarget/io.py", line 4067, in read_targets_in_tiles
    if mtlmode or mtl:
UnboundLocalError: local variable 'mtlmode' referenced before assignment

here in the code:

desitarget/py/desitarget/io.py

Lines 4060 to 4064 in d1ecc10

# ADM check some ways for which we must be reading MTLs.
if isinstance(hpdirname, list):
mtlmode=True
elif os.path.isdir(hpdirname):
mtlmode=True

also I guess you d want to initialize with mtlmode = False before that if ... check, right?

@araichoor
Copy link

@geordie666, I ve a question: in fiberassign, I ve a sequence like:

# read ledgers
d = read_targets_in_tiles(mtldir[s], ...)

# read targets to grab additional columns
targ = read_targets_in_tiles(targdir[s], columns=columns, ...)

# add the columns to the ledgers
d = match_ledger_to_targets(d, targ)

the current PR version correctly handles the first step to read 1 or 2 ledgers -- great (btw, I ve not made thorough tests, but it works for my test case if I just edit the mtlmode thing as mentioned in the previous message).

I m wondering, for the second step, ie read the static targets files:

  • I could handle that in fiberassign: ie read the two static files, and for targets in both, retain the one with corresponding to the MTL_HIGHEST; btw would that be the correct thing to do?
  • would be better/more robust if that kind of thing was done read_targets_in_tiles() itself? but that would mean propagate in there info like MTL_HIGHEST, so maybe it is a pain / not desirable.

for what is worth, the list of added columns is here: https://github.com/desihub/fiberassign/blob/main/py/fiberassign/assign.py#L943-L996
I don t know if those values could differ for a given target both in e.g. dark and dark1b...

thanks.

@geordie666
Copy link
Contributor Author

Sorry for my tardiness @araichoor: I'm just getting back to this after daily operations. I'll make your suggested (correct, thanks!) update in my current branch to handle the mtlmode.

I think the point with reading the static files is that the TARGETID for the primary targets is always unique to the (currently) DR9 imaging, because the imaging RELEASE is embedded in the TARGETID. So, I don't think the quantities can change for the same TARGETID for, e.g., DARK and DARK1B. Consequently, it's always safe for fiberassign to read whatever combination of static target files is complete.

In other words, each TARGETID always maps to a unique object in the target files, even if each unique object in the target files doesn't necessarily map to the same TARGETID.

The MTL_HIGHEST, etc. quantities should only care about things like priorities and numbers of observations, which are a feature of the (dynamic) MTLs rather than the (static) target files.

@geordie666
Copy link
Contributor Author

Also, I updated the syntax in my previous message to match your (correct) interpretation of the function calls.

@araichoor
Copy link

oh, right, I forgot that TARGETID encodes the release, etc; thanks!
I ll handle that in fiberassign then.

@araichoor
Copy link

hi @geordie666,

during my development tests for fiberassign, I notice that there are some duplicates between the dark1b primary ledgers and the secondary ledgers.
that breaks part of the fiberassign code.

I find that 533/158,514 TARGETIDs in /pscratch/sd/a/adamyers/blub/dr9/2.8.0.dev5597/mtl/main/dark1b/mtl*ecsv also are in /global/cfs/cdirs/desi/survey/ops/surveyops/trunk/mtl/main/secondary/dark/mtl*ecsv.

I then checked for the dark ledgers in /global/cfs/cdirs/desi/survey/ops/surveyops/trunk/mtl/main/, and I find zero duplicate between the primary (100,087,469 rows) and the secondary (17,014,441 rows) ledgers; hopefully in some sense, as otherwise fiberassign would break in operations..

so I assume those duplicates between dark1b and secondary/dark is not expected, right?
is it just because these dark1b ledgers set is only for development, and some further steps have not been run? or is it a more fundamental issue?

@geordie666
Copy link
Contributor Author

Good spot, @araichoor. Unfortunately, I think that's a fundamental problem. We decided to allow the secondary targets to have primary TARGETIDs that weren't already covered in the Main Survey. But now we have new primary objects, they can have a TARGETID that duplicates those permitted secondary targets.

I'll have to think about what to do, here. Could we limit secondary targets to only be part of the survey they were designed for (i.e. DARK secondaries are only allowed on DARK tiles and DARK1B secondaries are only allowed on DARK1B tiles)? Would that still break fiberassign?

@araichoor
Copy link

ok I see.

first, just checking: do we expect to have {dark1b,bright1b} secondaries?
so far, I ve not enabled that in the fiberassign PR, I think... but I surely could if that s what we want.

then, "Could we limit secondary targets to only be part of the survey they were designed for" => I think that s doable in fiberassign; will test when nersc is back up.

@geordie666
Copy link
Contributor Author

I hadn't intended to have dark1b, bright1b secondaries, but the problem you found that we can't have DARK1B primaries mix with DARK secondaries made me thing about that option.

I think the other option is some other complicated merging process. But, that's awkward, because you currently merge DARK primaries and secondaries in fiberassign but we'd switch to having to do that in the desitarget multi-MTL process. With additional MTLs that the code doesn't expect.

@araichoor
Copy link

thanks again for the clarification.

double-checking: the suggestion would be (taking dark, dark1b here; same reasoning for bright, bright1b):

  • dark:
    • primary : draw from dark ledgers
    • secondary: draw from dark ledgers
  • dark1b:
    • primary : draw from dark and dark1b ledgers
    • secondary: draw from dark1b ledgers
      right?

that sounds surely doable in fiberassign, will do!

@araichoor
Copy link

in that context, what about ToO?
is there any thought to have about those? e.g. would or do we expect things to be fine?
(btw, from looking at fiberassign.fba_lauch_io.create_too(), I think I do not cut on observing conditions... if we d want to change that, we d have to be careful about backwards-reproducibility).

@geordie666
Copy link
Contributor Author

This is my suggestion, yes:

dark:
    primary : draw from dark ledgers
    secondary: draw from dark ledgers
dark1b:
    primary : draw from dark and dark1b ledgers
    secondary: draw from dark1b ledgers
    right?

I think the ToOs are always OK. They have no concept of program/observing conditions and their TARGETIDs are always distinct from any other primary/secondary TARGETID.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
WIP Work in Progress; don't merge yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants