Verification #133

Closed
wants to merge 81 commits into from
Changes from 6 commits
Commits
f44b925
Adds verification cli
domna Jun 14, 2023
4739463
Simple working verification
domna Jun 22, 2023
de27590
Don't replace non-variadic group names
domna Jun 22, 2023
712dbd4
Happyfy linting
domna Jun 22, 2023
0ae9671
Adds support for bytes NX_class attributes
domna Jun 22, 2023
09d3b4b
Autoformatting
domna Jun 22, 2023
43c65a9
Cleanup
domna Jul 4, 2023
fcd1c43
Adds nexus unit registry
domna Jul 5, 2023
4200c71
Fixes linting
domna Jul 5, 2023
16b070f
Sets defs to latest fairmat
domna Jul 5, 2023
1671a13
Adds basic unit check
domna Jul 5, 2023
20efda4
Check general validity of units
domna Jul 5, 2023
cca87bc
Resolve also parents for units
domna Jul 5, 2023
9dfbd03
Merge commit '0c69581b014d0ef7a65e54e9cc8a2e25916c26c8' into verifica…
domna Feb 5, 2024
5c2dd4e
autoformat
domna Feb 5, 2024
2f520cc
Merge commit '8bd900e8c520dacc67ef7b644d29dba1d5fe221e' into verifica…
domna Feb 5, 2024
529331f
Adds missing import
domna Feb 5, 2024
7c95311
Merge branch 'master' into verification
domna Feb 5, 2024
33cb7fd
Merge branch 'master' into verification
domna Feb 5, 2024
13e2670
Update to latest definitions
domna Feb 5, 2024
7413bae
Allow more genaral uppercase notation in nx_namefit
domna Feb 7, 2024
9ccb7b4
Add proper unit retrieval in validation
domna Feb 7, 2024
0c6fb6c
Lower debug level
domna Feb 7, 2024
e2a167a
Add counts to units
domna Feb 7, 2024
199024a
Fix namefitting
domna Feb 7, 2024
8f8df03
Adds support for NX_TRANSFORMATION
domna Feb 7, 2024
c44f5b8
Fix units in example data and tests
domna Feb 8, 2024
42904b9
Fix NOT IN SCHEMA for mpes example
domna Feb 8, 2024
cac78c6
Fix uppercase attribute namefit
domna Feb 8, 2024
06190b7
Keep uppercase parts in hdf names
domna Feb 9, 2024
49a7e1f
Fix upper/lower notation for example
domna Feb 9, 2024
11892d4
Re-enable empty-required-field test
domna Feb 9, 2024
4105ba5
don't use removeprefix
domna Feb 9, 2024
2d352cf
Fix empty-required-field test
domna Feb 9, 2024
7b1ec45
Properly check error logs
domna Feb 9, 2024
7c449cb
Catch errors for validate data dict
domna Feb 9, 2024
9312b3e
Fix required lone group in template
domna Feb 9, 2024
53c1849
Removes unecessary function
domna Feb 9, 2024
16e2d37
Adds proper uppercase matching to path in data dict check
domna Feb 9, 2024
45ee476
Cleans unit attributes
domna Feb 9, 2024
a139a40
Fix typing
domna Feb 12, 2024
f98994a
Fix local linting
domna Feb 12, 2024
e9ecd30
Update definitions
domna Feb 12, 2024
9a98967
Update nexus version file
domna Feb 12, 2024
c4ef94f
Updates generated eln file
domna Feb 12, 2024
1cf6a50
Updates reference files
domna Feb 12, 2024
00b7637
Do file checks in verification cli
domna Feb 12, 2024
52fca4e
Don't fail if definition is not present
domna Feb 12, 2024
08a3b81
Updates definitions
domna Feb 12, 2024
6d11276
Merge branch 'master' into verification
domna Feb 23, 2024
d4655fe
Add required under optional in group
domna Apr 18, 2024
1755347
rename to field
domna Apr 18, 2024
debcf51
Fix other tests
domna Apr 18, 2024
d632535
Check required field provided
domna Apr 18, 2024
b4686fc
Fix all_required_children_are_set
domna Apr 19, 2024
d761f37
Merge branch 'master' into fix-required-under-optional
domna Apr 19, 2024
10b1c44
Fix tests
domna Apr 19, 2024
d4dc235
Use if checks instead of try..except
domna Apr 23, 2024
1c68848
Add routine to check required fields for repeating groups
domna Apr 24, 2024
feb973e
Delete temporary file
domna Apr 24, 2024
17eb061
Fix path in data dict test
domna Apr 24, 2024
d83a6b8
Fix tests
domna Apr 24, 2024
b3a0f1b
Cleanup
domna Apr 24, 2024
0c9a0f4
Remove debugging line
domna Apr 24, 2024
dcb4d9b
Add collector class
domna Apr 24, 2024
56bf3a6
Remove commented lines
domna Apr 24, 2024
a0ae259
Check validation return type and logging
domna Apr 24, 2024
46f122b
Add tests for repeating groups
domna Apr 24, 2024
2b0144f
Fix report of variadic groups set to all None
domna Apr 25, 2024
9e29d5d
Merge branch 'master' into verification
domna Apr 25, 2024
59106f2
Merge branch 'fix-required-under-optional' into verification
domna Apr 25, 2024
617d86f
Add validity report at the end
domna Apr 25, 2024
bd98fad
Add validation logging for units
domna Apr 25, 2024
2529789
Fixes undocumented units and reporting of all none required groups
domna Apr 26, 2024
f7a64db
Use dict paths everywhere
domna Apr 26, 2024
59b1798
Add pint to dependencies
domna Apr 26, 2024
e6dad7c
Catch and log undefined units
domna Apr 26, 2024
7734256
Add unit checks for nx transformations
domna Apr 26, 2024
b55ebcc
Log wrong transformation_type
domna Apr 26, 2024
1261e6a
Merge branch 'master' into verification
domna Apr 26, 2024
485ffd9
Renaming
domna Apr 26, 2024
139 changes: 139 additions & 0 deletions pynxtools/dataconverter/verify.py
@@ -0,0 +1,139 @@
"""Verifies a nxs file"""
import os
import sys
from typing import Dict, Optional, Union
import xml.etree.ElementTree as ET
import logging
from h5py import File, Dataset, Group
import click

from pynxtools.dataconverter import helpers
from pynxtools.dataconverter.template import Template
from pynxtools.nexus import nexus

logger = logging.getLogger(__name__)

DEBUG_TEMPLATE = 9
logger.setLevel(logging.INFO)
logger.addHandler(logging.StreamHandler(sys.stdout))


def _replace_group_names(class_map: Dict[str, str], path: str):
    for class_path, nx_class in class_map.items():
        if f"/{class_path}/" in path or path.startswith(f"{class_path}/"):
            path = path.replace(f"{class_path}/", f"{nx_class}[{class_path}]/")
    return path


def _clean_str_attr(
    attr: Optional[Union[str, bytes]], encoding="utf-8"
) -> Optional[str]:
    if attr is None:
        return attr
    if isinstance(attr, bytes):
        return attr.decode(encoding)
    if isinstance(attr, str):
        return attr

    # Report the offending type in the error message.
    raise TypeError(
        f"Invalid type {type(attr)} for attribute. "
        "Should be either None, bytes or str."
    )


def _get_def_map(file: str) -> Dict[str, str]:
    def_map: Dict[str, str] = {}
    with File(file, "r") as h5file:
        for entry_name, dataset in h5file.items():
            if _clean_str_attr(dataset.attrs.get("NX_class")) == "NXentry":
                definition = h5file[f"/{entry_name}/definition"][()].decode("utf8")
                # Record every NXentry found in the file.
                def_map[entry_name] = definition
                logger.debug("Reading entry '%s': '%s'", entry_name, definition)

    return def_map


@click.command()
@click.argument("file")
def verify(file: str):
    """Verifies a nexus file"""
    def_map = _get_def_map(file)

    if not def_map:
        logger.info("Could not find any valid entry in file %s", file)
    ref_template = Template()
    data_template = Template()
    class_map: Dict[str, str] = {}
    entry_path = "/"

    for entry, nxdl in def_map.items():
        ref_template = Template()
        data_template = Template()
        class_map = {}
        entry_path = f"/ENTRY[{entry}]"

        definitions_path = nexus.get_nexus_definitions_path()
        nxdl_path = os.path.join(
            definitions_path, "contributed_definitions", f"{nxdl}.nxdl.xml"
        )
        if not os.path.exists(nxdl_path):
            nxdl_path = os.path.join(
                definitions_path, "applications", f"{nxdl}.nxdl.xml"
            )
        if not os.path.exists(nxdl_path):
            raise FileNotFoundError(f"The nxdl file, {nxdl}, was not found.")

        nxdl_root = ET.parse(nxdl_path).getroot()

        helpers.generate_template_from_nxdl(nxdl_root, ref_template)
        logger.log(DEBUG_TEMPLATE, "Reference template: %s", ref_template)

        def collect_entries(name: str, dataset: Union[Group, Dataset]):
            clean_name = _replace_group_names(class_map, name)
            if isinstance(dataset, Group) and (
                nx_class := _clean_str_attr(dataset.attrs.get("NX_class"))
            ):
                entry_name = name.rsplit("/", 1)[-1]
                clean_nx_class = nx_class[2:].upper()

                is_variadic = True
                clean_name = _replace_group_names(class_map, name)
                for ref_entry in ref_template:
                    if ref_entry.startswith(f"{entry_path}/{clean_name}"):
                        is_variadic = False
                        break

                if is_variadic:
                    class_map[entry_name] = clean_nx_class
                    logger.debug("Adding class %s to %s", clean_nx_class, entry_name)

            if isinstance(dataset, Dataset):
                logger.debug("Adding field %s/%s", entry_path, clean_name)
                if isinstance(read_data := dataset[()], bytes):
                    read_data = read_data.decode("utf-8")
                data_template[f"{entry_path}/{clean_name}"] = read_data

            for attr_name, val in dataset.attrs.items():
                if attr_name == "NX_class":
                    continue
                logger.debug(
                    "Adding attribute %s/%s/@%s", entry_path, clean_name, attr_name
                )
                data_template[f"{entry_path}/{clean_name}/@{attr_name}"] = val

        with File(file, "r") as h5file:
            h5file[f"/{entry}"].visititems(collect_entries)

        logger.debug("Class map: %s", class_map)
        logger.log(DEBUG_TEMPLATE, "Processed template %s", data_template)
        helpers.validate_data_dict(ref_template, Template(data_template), nxdl_root)

        logger.info(
            "The entry `%s` in file `%s` is a valid file"
            " according to the `%s` application definition.",
            entry,
            file,
            nxdl,
        )
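
Reviewer note: to make the path rewriting in `verify.py` easier to follow, here is a minimal standalone sketch of what `_replace_group_names` and the `nx_class[2:].upper()` step do together. The function names `replace_group_names` and `class_prefix`, and the group names `my_instrument` and `detector_1`, are made up for illustration and are not part of this PR; the sketch has no h5py or pynxtools dependency.

```python
from typing import Dict, Union


def replace_group_names(class_map: Dict[str, str], path: str) -> str:
    """Rewrite plain HDF5 group names to the NX_CLASS[name] template notation."""
    for class_path, nx_class in class_map.items():
        if f"/{class_path}/" in path or path.startswith(f"{class_path}/"):
            path = path.replace(f"{class_path}/", f"{nx_class}[{class_path}]/")
    return path


def class_prefix(nx_class_attr: Union[str, bytes]) -> str:
    """Turn an NX_class attribute such as b"NXdetector" into "DETECTOR"."""
    if isinstance(nx_class_attr, bytes):
        # h5py may hand back attributes as bytes, so decode them first.
        nx_class_attr = nx_class_attr.decode("utf-8")
    # Drop the leading "NX" and uppercase the rest.
    return nx_class_attr[2:].upper()


# Hypothetical group names; in verify.py the map is filled by collect_entries.
class_map = {
    "my_instrument": class_prefix(b"NXinstrument"),
    "detector_1": class_prefix("NXdetector"),
}
converted = replace_group_names(class_map, "my_instrument/detector_1/data")
print(converted)  # INSTRUMENT[my_instrument]/DETECTOR[detector_1]/data
```

The replacement is substring-based on `name/`, so a group whose name is the tail of another group's name (for example `detector` next to `my_detector`) could be rewritten in both places once the first check matches. That may be worth a test case.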
1 change: 1 addition & 0 deletions pyproject.toml
@@ -70,6 +70,7 @@ dev = [
read_nexus = "pynxtools.nexus.nexus:main"
dataconverter = "pynxtools.dataconverter.convert:convert_cli"
nyaml2nxdl = "pynxtools.nyaml2nxdl.nyaml2nxdl:launch_tool"
verify_nexus = "pynxtools.dataconverter.verify:verify"

[tool.setuptools.package-data]
pynxtools = ["definitions/**/*.xml", "definitions/**/*.xsd"]
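
Reviewer note: the new `verify_nexus` script entry uses the usual `module:attribute` form. As a rough sketch of how such a spec maps to a callable, the snippet below resolves one by hand with `importlib`. The helper `load_entry_point` is hypothetical, and a standard-library target (`json:dumps`) stands in for `pynxtools.dataconverter.verify:verify` so the example runs without pynxtools installed; installers generate their own wrapper scripts, so this is only an approximation of the lookup.

```python
import importlib


def load_entry_point(spec: str):
    """Resolve a 'package.module:attribute' spec to the object it names."""
    module_name, _, attr_path = spec.partition(":")
    obj = importlib.import_module(module_name)
    # Walk dotted attribute paths such as "SomeClass.method".
    for part in attr_path.split("."):
        obj = getattr(obj, part)
    return obj


# Stand-in target from the standard library, not the real verify command.
dumps = load_entry_point("json:dumps")
result = dumps({"entry": "valid"})
print(result)  # {"entry": "valid"}
```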