expand FITS sniffer to recognize spectra from lightcurves and images? #115

Open
volodymyrss opened this issue Jun 24, 2024 · 9 comments

@volodymyrss
Contributor

No description provided.

@dsavchenko dsavchenko self-assigned this Jul 10, 2024
@dsavchenko
Collaborator

dsavchenko commented Jul 10, 2024

I started to work on this, but it's not clear how to arrive at a general enough definition of a lightcurve or a spectrum.

What we currently call a "spectrum" in our tools is some tabular representation of an SED and is only loosely defined.
The more established OGIP standard spectrum is rather a composite datatype. That is another level of complexity, but it will be based on FITS, so it's the next step.

As for lightcurves, there is no standard at all; the format is very instrument-dependent, as far as I understand.

Classifying a file as an image is more or less straightforward: if any extension (not necessarily the primary one) is an image extension, we can say that it's a FITS image. A basic implementation works for this case (currently only locally, I will push it soon).
The complication is that the file may still contain other data products in other extensions, and it may consist of several images. But maybe that's not a big deal.
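
To make the image rule above concrete, here is a minimal standard-library sketch; it is not the implementation mentioned in the comment. The names `read_header`, `n_pixels`, `data_size` and `has_image` are hypothetical, the value parsing is deliberately naive (string values containing `/` would be truncated), and random-groups primary HDUs are not handled.

```python
import io

BLOCK = 2880  # FITS files are organised in 2880-byte blocks
CARD = 80     # each header card is 80 ASCII characters


def read_header(stream):
    """Read one HDU header as {keyword: raw value string}; None at end of file."""
    header = {}
    while True:
        block = stream.read(BLOCK)
        if len(block) < BLOCK:
            return None  # end of file, or a truncated header
        for i in range(0, BLOCK, CARD):
            card = block[i:i + CARD].decode("ascii", errors="replace")
            key = card[:8].strip()
            if key == "END":
                return header
            if card[8:10] == "= ":
                # Naive parsing: drop the comment after "/", then quotes and padding
                header[key] = card[10:].split("/")[0].strip().strip("'").strip()


def n_pixels(header):
    """Product of the NAXISn values, or 0 when the HDU has no data array."""
    naxis = int(header.get("NAXIS", 0))
    if naxis == 0:
        return 0
    count = 1
    for i in range(1, naxis + 1):
        count *= int(header[f"NAXIS{i}"])
    return count


def data_size(header):
    """Bytes occupied by the data unit after this header, padded to whole blocks."""
    nbytes = (abs(int(header.get("BITPIX", 8))) // 8
              * int(header.get("GCOUNT", 1))
              * (int(header.get("PCOUNT", 0)) + n_pixels(header)))
    return -(-nbytes // BLOCK) * BLOCK  # round up to a multiple of BLOCK


def has_image(stream):
    """True if the primary HDU or any IMAGE extension holds a non-empty array."""
    first = True
    try:
        while True:
            header = read_header(stream)
            if header is None:
                return False
            if (first or header.get("XTENSION") == "IMAGE") and n_pixels(header) > 0:
                return True
            stream.seek(data_size(header), io.SEEK_CUR)  # skip the data unit
            first = False
    except (KeyError, ValueError):
        return False  # malformed header: do not claim the file
```

A call such as `has_image(open(path, "rb"))` only reads headers and seeks past the data, so it stays cheap on large files. The function stops at the first image it finds, which is why the other products in the same file remain invisible to this check.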

Also to note: tools can set metadata, including the type, so in some cases it may be enough to just add a datatype with subclass="True" to the registry.

@dsavchenko
Collaborator

Additionally, it will be useful to extend the metadata in some cases. One useful example is extracting column names from the table(s) to automatically populate the options of "column name" tool inputs.
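
As a rough illustration of the column-name idea, the sketch below pulls the names out of a table extension header. It assumes the header is already available as a plain dict of keyword to value; `column_names` and the `col{i}` fallback are hypothetical and not part of any existing tool.

```python
def column_names(header):
    """Return the TTYPEn values of a FITS table header, in column order."""
    n_fields = int(header.get("TFIELDS", 0))
    names = []
    for i in range(1, n_fields + 1):
        name = header.get(f"TTYPE{i}")
        # TTYPEn is optional in FITS, so fall back to a positional name
        names.append(name.strip() if name else f"col{i}")
    return names


# Example header of a lightcurve-like table (values are made up)
example = {"XTENSION": "BINTABLE", "TFIELDS": 3,
           "TTYPE1": "TIME", "TTYPE2": "RATE", "TTYPE3": "ERROR"}
print(column_names(example))
```

The resulting list could then be stored as dataset metadata and offered as the choices of a "column name" input.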

@volodymyrss
Contributor Author

There is an OGIP standard for LCs; we use it at least for the INTEGRAL LC export.
It's used fairly widely in the timing community (pulsars, binaries, GRBs to a degree).
It's quite nuanced because of FRACEXP, time references, etc., so a standard is justified.
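
For files that do follow the OGIP conventions, a sniffer could in principle rely on the classification keywords instead of guessing from column names. The sketch below assumes the header is a plain dict and that, as far as I recall from the OGIP documents, `HDUCLASS` is `OGIP` and `HDUCLAS1` is `LIGHTCURVE` or `SPECTRUM`; the function name and the returned labels are hypothetical.

```python
def classify_ogip(header):
    """Return 'lightcurve', 'spectrum', or None for a single extension header."""
    if str(header.get("HDUCLASS", "")).strip().upper() != "OGIP":
        return None  # not declared as an OGIP product
    hduclas1 = str(header.get("HDUCLAS1", "")).strip().upper()
    # Keyword values assumed from the OGIP timing and PHA conventions
    return {"LIGHTCURVE": "lightcurve", "SPECTRUM": "spectrum"}.get(hduclas1)


print(classify_ogip({"HDUCLASS": "OGIP", "HDUCLAS1": "LIGHTCURVE"}))
```

Free-format tables without these keywords return `None` here, so they would still need either explicit tool metadata or a looser heuristic.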

When a file has an image and an LC, it would seem that some form of union type is applicable. Does it exist in Galaxy?

@dsavchenko
Collaborator

Thank you, I didn't know about this document. Our tools don't follow it, though; e.g. the HESS lightcurve is produced as a free-format table.

Is gammapy able to produce lightcurves in the standard format?
We may also need to improve the lightcurve data product in oda_api to follow the standard.

@dsavchenko
Collaborator

> When a file has an image and an LC, it would seem that some form of union type is applicable. Does it exist in Galaxy?

It's the other way around: what exists is the composite datatype, one dataset that consists of several files. What you propose we would most likely have to implement ourselves on top of FITS; I haven't seen anything like it. I'm not even sure how it would fit into Galaxy's dataset logic.

@volodymyrss
Contributor Author

> Thank you, I didn't know about this document. Our tools don't follow it, though; e.g. the HESS lightcurve is produced as a free-format table.

> Is gammapy able to produce lightcurves in the standard format?

Don't know, hopefully. In any case, they are working on some data models, probably for DL5 too.

> We may also need to improve the lightcurve data product in oda_api to follow the standard.

Indeed, let's do that.

@volodymyrss
Contributor Author

> When a file has an image and an LC, it would seem that some form of union type is applicable. Does it exist in Galaxy?

> It's the other way around: what exists is the composite datatype, one dataset that consists of several files. What you propose we would most likely have to implement ourselves on top of FITS; I haven't seen anything like it. I'm not even sure how it would fit into Galaxy's dataset logic.

The principle would be that a dataset can be one of a few selected types, but not an arbitrary type. That would allow linking it to the next stages accordingly. Maybe @bgruening knows?

@bgruening
Collaborator

Galaxy datatypes do support subclassing or nesting. A datatype class in Python can be inherited, and Galaxy then knows the inheritance chain. Does this help?

@dsavchenko
Collaborator

Thank you @bgruening for answering. I see that datatypes may be nested. So if our FITS file contains only, e.g., an image, it may be assigned a FITSImage datatype, and so on.
The problem we are discussing is that one FITS file may sometimes contain different data in different "extensions", e.g. image data in the primary extension and a lightcurve in another one. So imagine we have three tools: image analysis, time series analysis and spectrum analysis. This example file will be suitable as input for the first two tools, but not for the third one. As far as I understand, this scenario can't be cleanly handled with the Galaxy data model (or we end up defining plenty of datatypes for the different combinations).
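
The three-tool scenario can be sketched outside Galaxy to show where the trouble comes from. This is only an illustration under assumed names (`PRODUCTS`, `TOOL_INPUT`, `accepting_tools`, `combined_datatypes` are all hypothetical and not Galaxy APIs): matching on the set of products in a file is trivial, whereas giving each dataset exactly one datatype requires one datatype per combination.

```python
from itertools import combinations

# Product kinds a single FITS file may carry in its extensions
PRODUCTS = ("image", "lightcurve", "spectrum")

# Each tool needs one kind of product as input
TOOL_INPUT = {
    "image analysis": "image",
    "time series analysis": "lightcurve",
    "spectrum analysis": "spectrum",
}


def accepting_tools(file_products):
    """Tools whose required product is present in the file."""
    return sorted(tool for tool, needed in TOOL_INPUT.items()
                  if needed in file_products)


def combined_datatypes():
    """Datatypes needed if every non-empty combination gets its own type."""
    return [combo
            for size in range(1, len(PRODUCTS) + 1)
            for combo in combinations(PRODUCTS, size)]


# The example file: image in the primary HDU, lightcurve in an extension
print(accepting_tools({"image", "lightcurve"}))
print(len(combined_datatypes()))
```

With three product kinds there are already seven non-empty combinations, and the count doubles (plus one) with each new kind, which is the "plenty of datatypes" concern in the comment above.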


3 participants