Kibo sv3/dark healpix 43218 failure #2354

Closed
sbailey opened this issue Sep 5, 2024 · 4 comments

@sbailey

sbailey commented Sep 5, 2024

Script /global/cfs/cdirs/desi/spectro/redux/kibo/run/scripts/healpix/sv3/dark/425/zpix-sv3-dark-42503-47650.slurm failed while processing healpix 43218. From /global/cfs/cdirs/desi/spectro/redux/kibo/healpix/sv3/dark/432/43218/logs/spectra-sv3-dark-43218.log:

ERROR:util.py:163:runcmd: Traceback (most recent call last):
ERROR:util.py:163:runcmd: File "/global/common/software/desi/perlmutter/desiconda/20240425-2.2.0/code/desispec/0.66.2/lib/python3.10/site-packages/desispec/util.py", line 146, in runcmd
    result = cmd(*args)
ERROR:util.py:163:runcmd: File "/global/common/software/desi/perlmutter/desiconda/20240425-2.2.0/code/desispec/0.66.2/lib/python3.10/site-packages/desispec/scripts/group_spectra.py", line 186, in main
    frames.append(_read_framefile(*rdargs))
ERROR:util.py:163:runcmd: File "/global/common/software/desi/perlmutter/desiconda/20240425-2.2.0/code/desispec/0.66.2/lib/python3.10/site-packages/desispec/scripts/group_spectra.py", line 76, in _read_framefile
    log.warning(f"Frame {filename} had no objects in healpix {args.healpix}. Continuing")
ERROR:util.py:163:runcmd: NameError: name 'args' is not defined
CRITICAL:util.py:172:runcmd: FAILED desispec.scripts.group_spectra.main

Other healpix are processing fine, so it looks like we have a bug in some corner case, perhaps when a frame overlaps a healpix but doesn't actually have any targets on that healpix.

@sbailey sbailey added the crash label Sep 5, 2024
@araichoor

Isn't the error simply due to the fact that no args variable is defined inside the _read_framefile() function?

log.warning(f"Frame {filename} had no objects in healpix {args.healpix}. Continuing")

It looks like this was introduced in commit 1184b14, at the same time as the _read_framefile() function, I guess when the code was moved out of main().

For what it's worth:

  • the "problematic" exposure that likely triggers the warning is expid=87354 from 20210505 (tileid=1), for which _read_framefile() should return an empty frame; the other two exposures from tileid=1 each have only 5 or fewer spectra in the pixel;
  • this pixel ran fine in jura.
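
To illustrate why the crash only shows up for this one pixel, here is a minimal sketch of the failure mode. It is not the desispec code: `read_framefile_sketch` and its arguments are hypothetical stand-ins. Python resolves the name `args` only when the line executes, so the bug stays hidden until a frame with no targets in the pixel reaches the warning branch.

```python
import logging

log = logging.getLogger("group_spectra_sketch")


def read_framefile_sketch(filename, healpix, targets_in_pixel):
    """Hypothetical stand-in for _read_framefile(); not the desispec implementation."""
    if len(targets_in_pixel) == 0:
        # Bug: 'args' is a local variable of main(), not of this function.
        # The NameError is raised while building the message, and only when
        # this rarely used branch is actually reached.
        log.warning(f"Frame {filename} had no objects in healpix {args.healpix}. Continuing")
        return None
    return targets_in_pixel


# Common case: the frame has targets in the pixel, so the buggy line never runs.
ok = read_framefile_sketch("frame-a.fits", 43218, [1, 2, 3])

# Corner case: an empty frame reaches the warning and crashes instead of warning.
try:
    read_framefile_sketch("frame-b.fits", 43218, [])
    error = None
except NameError as err:
    error = str(err)
```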

@sbailey

sbailey commented Sep 6, 2024

Reading the traceback more closely, the irony is that a typo in the warning message itself caused the exception: healpix vs. args.healpix. D'oh!
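
A sketch of the kind of correction described here, assuming the function already receives the pixel number as a `healpix` argument (as "healpix vs. args.healpix" suggests). The function name and signature are hypothetical, and this is not necessarily what the actual fix looks like.

```python
import logging

log = logging.getLogger("group_spectra_fix_sketch")


def read_framefile_fixed(filename, healpix, targets_in_pixel):
    """Hypothetical corrected stand-in for _read_framefile()."""
    if len(targets_in_pixel) == 0:
        # Use the function's own 'healpix' argument instead of main()'s 'args'.
        log.warning(f"Frame {filename} had no objects in healpix {healpix}. Continuing")
        return None
    return targets_in_pixel


# Exercising the empty-frame branch directly is what would have caught the typo.
result = read_framefile_fixed("frame-b.fits", 43218, [])
```

A test that calls the function with an empty selection is enough to cover the branch; a static checker that reports undefined names (pyflakes, for example) would typically flag this kind of typo as well.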

@sbailey

sbailey commented Sep 6, 2024

The underlying reason this appeared in the first place is documented in #2359: stuck positioners are getting assigned different TARGET_RA, TARGET_DEC on different exposures of the same tile. The healpix bookkeeping code assumes the TARGET_RA, TARGET_DEC -> HPIX mapping is the same for all exposures and just uses the first exposure, leading to a discrepancy on subsequent exposures.
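
The discrepancy can be sketched with a toy example. Everything here is invented for illustration: `toy_pixel` is a hypothetical square-cell binning rather than a real HEALPix lookup, and the fiber numbers and coordinates are made up. It shows how bookkeeping built from the first exposure can list a pixel for a later exposure that has no targets there.

```python
def toy_pixel(ra, dec, cell_deg=1.0):
    """Hypothetical stand-in for a HEALPix lookup: bins the sky into square cells."""
    return (int(ra // cell_deg), int(dec // cell_deg))


# fiber -> (TARGET_RA, TARGET_DEC) for two exposures of the same tile (invented values).
# Fiber 102 plays the stuck positioner whose coordinates differ between exposures.
exposure_1 = {101: (149.20, 2.30), 102: (150.95, 2.40)}
exposure_2 = {101: (149.20, 2.30), 102: (151.05, 2.40)}

wanted = (150, 2)

# Bookkeeping built from the first exposure only, then assumed valid for all exposures.
pixels_from_first = {toy_pixel(ra, dec) for ra, dec in exposure_1.values()}
exposure_2_listed = wanted in pixels_from_first

# What the second exposure actually contains in that pixel.
fibers_in_exp2 = [f for f, (ra, dec) in exposure_2.items() if toy_pixel(ra, dec) == wanted]

# The exposure is listed for the pixel but contributes no targets: the empty-frame case.
empty_frame_case = exposure_2_listed and len(fibers_in_exp2) == 0
```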

I'll let #2359 track how we got into that situation in the first place, and close this after fixing the warning crash (upcoming).

@sbailey

sbailey commented Sep 6, 2024

Fixed in commit e8256df

@sbailey sbailey closed this as completed Sep 6, 2024