Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NX_UINT not supported in the NeXusParser by NOMADs, NX_POSINT (likely) supported but if so buggy #534

Open
mkuehbach opened this issue Jan 30, 2025 · 0 comments

Comments

@mkuehbach
Copy link
Collaborator

mkuehbach commented Jan 30, 2025

I thought using the option of having constrained subspaces of integer as provided by NeXus' NX_UINT, NX_POSINT over the plain NX_INT is useful. Consequently, I use it in almost all NeXus-related work for cases where I know we deal with countable objects of which I cannot have less than none. However, didnt realize two facts: NX_POSINT and even less so NX_UINT are used in NeXus at all, even worse, support for parsing concepts NX_UINT and NX_POSINT
is currently not support by NOMAD.
Specifically, attempting to parse these via the nomad NeXus parser plugin of pynxtools fails in ultimately line130ish in nomad/nomad/metainfo/data_type.py::convertible_from, where the function just raises NotImplemented()

Interestingly, only a few NIAC appdefs (NXxbase, NXlauetof, and NXsns*, and NXdetector, as examples) use one of NX_POSINT or NX_UINT. Almost no classes, and no appdefs designed by FAIRmat other than NXapm* and NXem* use NX_POSINT, NX_UINT at all thats why the issue remained in NOMAD, causing a substantial number of warnings.

A naive fix in this function convertible_from is to use:
return True

Tried out, yields the immediate effect that all warnings related to above-mentioned use of instances are removed and instead the respective data processed correctly. The respective quantities get indexed as expected without having to make changes on the NeXus side and (some of them I test so far) can be found by search.

But there is likely a good reason why MarkusS as a computer scientist likely better stated that such conversion is not general enough NotImplemented because there is not technical implementation for when using the default integer types that restricts the domain and interval stronger than at most from int to uint.
So assume now you pass an uint64 exposing the range, i.e. uint64::max, mapping on int64 silently as could happen now when just returning True would overflow, not all uint64 are correctly mappable on int64 would need more bitwidth...
Likewise NX_POSINT could use uint64 but not 0 BUT also one could write out a neXus file with a custom
uint128 which is still matching an NX_POSINT but that is currently not supported by NOMAD.

The point is, NeXus is here stricter and more general as is NOMAD, but only for ints and hence we need to make a choice in which cases we return True, instead of just blindly returning whenever true.

A dummy NeXus NXtestme_overflow with all types could be used to address and fix this issue, which I would like to discuss
@sanbrock , maybe you also no more about why back then the existence of NX_UINT, and NX_POSINT was certainly realized but maybe not considered with equal priority.

Alternatively we could refactor all occurrences at least on our side in NeXus for NXapm* and NXem* easily by going to NX_INT, but I dislike this suggestion because what we have already implemented conveys a valid and substantiated set of constraints, an often also useful semantic constraint that as you say "I know that this number must only be zero or something positive but not negative as I cannot have a negative number of clusters in my hand in reality". As NeXus is currently sloppy on NX_FLOAT this issue does not come up.

Not a fan of an academic discussion whether this is a bug or not, as NOMAD replies NotImplemented, although however, NOMAD in all test cases so far was processing instance data with NX_POSINT, NX_UINT concepts correctly when setting return True.

@mkuehbach mkuehbach changed the title NX_UINT not supported in the NeXusParser by NOMAD and NX_POSINT (likely) supported but if so incorrectly NX_UINT not supported in the NeXusParser by NOMADs, NX_POSINT (likely) supported but if so buggy Jan 30, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant