Output paths from reader tasks get moved and rewritten #751

Open
@ghisvail

I have implemented two tasks which read files from a BIDS dataset, with the following signatures:

from pathlib import Path

from pydra import Workflow
from pydra.mark import annotate, task

@task
@annotate({"return": {
    "dataset_description": dict,
    "participant_ids": list[str],
    "session_ids": list[str],
}})
def read_bids_dataset(dataset_path: Path):
    ...

@task
@annotate({"return": {"files": list[Path]}})
def read_bids_files(
    dataset_path: Path,
    participant_id: str,
    session_id: str,
    datatype: str,
    suffix: str,
    extension: str,
):
    ...

# Build workflow composing the two tasks above
def build_bids_reader(bids_queries: dict, **kwargs) -> Workflow:
    ...

If I sequence both tasks manually, I get the list of BIDS files from the source path as expected.

If I compose them in a workflow, I still get the BIDS files but moved to the workflow working directory.

I have never witnessed that behavior before, and I believe it may be a regression compared to versions of Pydra prior to 0.23. In my opinion, the results obtained from sequential task execution and from the workflow should be equivalent. Besides, copying the BIDS files can become a big problem if the dataset is huge in terms of the number of participant / session combinations, or if the queried modality involves large-volume data, such as DWI.
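
To make the discrepancy concrete, here is a minimal sketch of the check I would expect to pass for both execution modes. It only uses the standard library; the helper name `relocated_files` and the example paths are hypothetical and not part of Pydra or of my actual code. It reports which returned paths no longer live under the source dataset.

```python
from pathlib import Path


def relocated_files(files, dataset_path):
    """Return the paths in `files` that are not located under `dataset_path`."""
    root = Path(dataset_path).resolve()
    moved = []
    for f in files:
        try:
            # relative_to raises ValueError when the file is outside the root
            Path(f).resolve().relative_to(root)
        except ValueError:
            moved.append(Path(f))
    return moved


# Hypothetical outputs: one path still in the dataset, one in a working directory
sequential_out = [Path("/data/bids/sub-01/ses-01/dwi/sub-01_ses-01_dwi.nii.gz")]
workflow_out = [Path("/scratch/wf/sub-01_ses-01_dwi.nii.gz")]

print(relocated_files(sequential_out, "/data/bids"))
print(relocated_files(workflow_out, "/data/bids"))
```

With equivalent results, the list would be empty for the sequential run and for the workflow run alike.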

A quick debug session suggests that this area of the code may be the cause.

Labels: bug, wontfix

Status: Backlog