Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ENH] Encode mutually exclusive extensions in schema with sub-lists #1492

Open
wants to merge 27 commits into
base: master
Choose a base branch
from

Conversation

Remi-Gau
Copy link
Collaborator

fixes #1487

- .raw
- .ave
- .mrk
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.mrk seems to only exist for markers suffix

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, some context in this thread: #638 (comment) and in particular this reponse (i.e., why .mrk is only for the markers suffix, but .sqd can also be used for the brain data): #653 (comment)

text += "\n" + entity_text

text = text.replace("SPEC_ROOT", utils.get_relpath(src_path))
return text


def _make_entity_definition(entity, entity_info):
def _make_definition_for_entity(entity_info):
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just removing some dead code

@@ -218,6 +218,7 @@ def make_filename_template(
src_path=None,
n_dupes_to_combine=6,
pdf_format=False,
include_legend=True,
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mostly added to simplify testing

@@ -162,7 +162,11 @@ def _sanitize_extension(ext: str) -> str:

def _stem_rule(rule: bst.types.Namespace):
stem_regex = re.escape(rule.stem)
ext_match = "|".join(_sanitize_extension(ext) for ext in rule.extensions)
if isinstance(rule.extensions[0], list) and len(rule.extensions) == 1:
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

started trying to fix the failures of schema code but I think I should let grown ups deal with this

Comment on lines 187 to 189
class Group:
def __init__(self, extensions):
self.extensions = extensions
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably not the best way to mock a group of suffixes...

@Remi-Gau
Copy link
Collaborator Author

I would personally consider the rendering fixed and OK but I will let others be the final judge of this.

Example here:
https://bids-specification--1492.org.readthedocs.build/en/1492/modality-specific-files/electroencephalography.html#landmark-photos-_photoextension

Screenshot from 2023-05-20 18-53-32

@Remi-Gau
Copy link
Collaborator Author

@effigies
I am done with the rendering fixes and adding doc + type hints / refactoring.

Will let other fixes to you.

@Remi-Gau
Copy link
Collaborator Author

on second though this rule should phrased somewhere in the specification and not just in the schema, no?

@effigies effigies added the schema-structure Changes to the fundamental organization/structure of the YAML schema. Minor version release. label May 26, 2023
Remi-Gau and others added 4 commits May 29, 2023 15:14
effigies
effigies previously approved these changes Jul 2, 2023
Copy link
Collaborator

@effigies effigies left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want @rwblair's go ahead before merging, as this is going to require an update to the validator.

@effigies effigies changed the title [FIX] avoid having one data file with different extensions [ENH] Encode mutually exclusive extensions in schema with sub-lists Jul 2, 2023
@effigies effigies requested a review from rwblair July 2, 2023 13:22
rwblair
rwblair previously approved these changes Jul 3, 2023
Copy link
Member

@rwblair rwblair left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Square bracket and array-like for mutual exclusion of extensions is fine.

Be nice to have more explicit documentation about it in the schema readme.

@sappelhoff
Copy link
Member

xref #1560

@Remi-Gau
Copy link
Collaborator Author

Remi-Gau commented Jul 27, 2023

  • Remove digitizer from this PR and let's solve it in 1560

@Remi-Gau
Copy link
Collaborator Author

test failures started appearing after this commit:

53c4fd3


extensions = []
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

needed to add this to make some test pass
@effigies WDYT?

# Combine exts when there are many, but keep JSON separate
if ".json" in extensions:
extensions = [".<extension>", ".json"]
extensions = []
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also this horror had to be added

@effigies
Copy link
Collaborator

@Remi-Gau I don't know if we want to try to work on this in the nearish future, but it will need merge conflicts to be resolved before we go further.

@Remi-Gau
Copy link
Collaborator Author

will try to fix

@effigies
Copy link
Collaborator

effigies commented Aug 8, 2024

Okay, I got tests passing, although this does seem to be failing to render a bunch of extensions. Can look into it.

Before I do, though,I wonder if what we instead want is mutually exclusive lists of valid groups (instead of lists of mutually exclusive groups):

extensions:
  - [.nii.gz, .json]
  - [.nii, .json]

...

extensions:
  - [.eeg, .vhdr, .vmrk, .json]
  - [.set, .fdt, .json]
  - [.edf, .json],
  - [.bdf, .json]

Sure, it's duplicative, but it's not as ambiguous as the current proposal, where you could have a .eeg and .fdt file next to each other with no complaints.

We could go further and start referencing extension sets:

extensionsets:
  brainvision:
    required: [.eeg],
    optional: [.vhdr, .vmrk],
  nifti:
    oneOf:
      - required: [.nii]
      - required: [.nii.gz]
  sidecar:
    optional: [.json]
extensions:
  - nifti
  - sidecar

extensions:
  - [edf, brainvision, eeglab, biosemi]
  - sidecar

# OR

extensions:
  - [nifti, sidecar]

extensions:
  - [edf, sidecar]
  - [brainvision, sidecar]
  - [eeglab, sidecar]
  - [biosemi, sidecar]

@Remi-Gau
Copy link
Collaborator Author

Remi-Gau commented Aug 8, 2024

yeah I like the idea of the extension groups: may allow some refactoring

@effigies
Copy link
Collaborator

effigies commented Aug 8, 2024

Inline extension groups (as in the first proposal) or explicit (one of the latter approaches)?

@Remi-Gau
Copy link
Collaborator Author

Remi-Gau commented Aug 8, 2024

definitely explicit

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
exclude-from-changelog This item will not feature in the automatically generated changelog schema-structure Changes to the fundamental organization/structure of the YAML schema. Minor version release.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] lack of recommendations for datafiles that differ only by extension
5 participants