-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cell type annotation: Harmony/KNN workflow #836
base: main
Are you sure you want to change the base?
Conversation
* cellranger mkgtf component working and tested * removed comments * changelog entry added * test unique attribute in result * multiple attribute par added * removed unused packages * use pytest, multiple attributes tested --------- Co-authored-by: DriesSchaumont <[email protected]>
Co-authored-by: Dries Schaumont <[email protected]>
* cellranger mkgtf component working and tested * removed comments * changelog entry added * test unique attribute in result * multiple attribute par added * removed unused packages * use pytest, multiple attributes tested --------- Co-authored-by: DriesSchaumont <[email protected]>
6bcfac3
to
e2049f1
Compare
- name: "--theta" | ||
type: double | ||
description: | | ||
Diversity clustering penalty parameter. Specify for each variable in group.by.vars. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When reading this, I am not sure that I know what group.by.vars
means here? Is it related to another argument?
`distance` (weight points by the inverse of their distance) | ||
- name: "--n_neighbors" | ||
type: integer | ||
default: 15 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add min
?
roles: [ author, maintainer ] | ||
- __merge__: /src/authors/weiwei_schultz.yaml | ||
roles: [ contributor ] | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add the test_dependencies
to info
?
| map {id, state -> | ||
def new_state = state + ["workflow_output": state.output] | ||
[id, new_state] | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| map {id, state -> | |
def new_state = state + ["workflow_output": state.output] | |
[id, new_state] | |
} | |
| map {id, state -> | |
def new_state = state + ["workflow_output": state.output] | |
[id, new_state] | |
} |
// add id as _meta join id to be able to merge with source channel and end of workflow | ||
| map{ id, state -> | ||
def new_state = state + ["_meta": ["join_id": id]] | ||
[id, new_state] | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is _meta
required here? I think the number of input and output events from this workflow are the same and that the IDs of the events match?
} | ||
| view {"After adding join_id: $it"} | ||
// Add 'query' id to .obs columns of query dataset | ||
| add_id.run( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add_id
and duplicate_obs
could be performed in parallel here by:
- splitting the input channel into two channels: one for the reference and one for the query
- performing
add_id
andduplicate_obs
- joining the channel back together
And since you have add_id
and duplicate_obs
twice, please add a key
argument to .run
(e.g. key: "add_id_query"
and key: "add_id_reference"
). This makes sure that the process names remain unique.
|
||
assert "rna" in list(input_mudata.mod.keys()), "Input should contain rna modality." | ||
assert all(key in list(input_mudata.mod["rna"].obsm) for key in expected_obsm), f"Input mod['rna'] obs columns should be: {expected_obsm}, found: {input_mudata.mod['rna'].obsm.keys()}." | ||
assert all(key in list(input_mudata.mod["rna"].obs) for key in expected_obs), f"Input mod['rna'] obs columns should be: {expected_obs}, found: {input_mudata.mod['rna'].obs.keys()}." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be possible to add the contents of the newly created columns? (for example that the predictions are in fact labels and the probabilities are floats?)
Changelog
Workflow for harmony integration followed by KNN label transfer for cell type annotation.
Issue ticket number and link
Closes #xxxx (Replace xxxx with the GitHub issue number)
Checklist before requesting a review
I have performed a self-review of my code
Conforms to the Contributor's guide
Check the correct box. Does this PR contain:
Proposed changes are described in the CHANGELOG.md
CI tests succeed!