Adding interaction terms to the design matrix #181
Yay, tests pass! @khalilouardini and @BorisMuzellec, you're up!
Hi @jeandut and @khalilouardini, thanks for this PR! It's nice that you went all the way to supporting recursive interactions like "a:b:c". I experimented a bit with your code, though, and there seem to be a few remaining issues.
I get the following design:
The columns seem to contain the intended values, but:
Let me know if you need help with this :)

EDIT: I think that in the example above, issue 2 is due to the fact that the design is of the form "~a + a:b", which creates a redundancy that probably wouldn't be there if it were just "~a:b".
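The redundancy in a design of the form "~a + a:b" can be illustrated with a small, dependency-free sketch. Everything here is hypothetical (the two-level factors `a` and `b`, the balanced sample layout, and the naive encoding with one interaction column per non-reference cell); it is not the encoding used in this PR, only an example of how a main effect plus a full set of interaction dummies loses a rank.

```python
from fractions import Fraction
from itertools import product


def matrix_rank(rows):
    """Rank of a matrix (list of rows) via exact Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        # Find a row at or below `rank` with a non-zero entry in this column.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [x - factor * y for x, y in zip(m[r], m[rank])]
        rank += 1
    return rank


# Hypothetical balanced metadata: a in {x, y}, b in {p, q}, two samples per cell.
samples = [(a, b) for a, b in product("xy", "pq") for _ in range(2)]


def design_row(a, b, interaction_cells):
    """Intercept, main effect a[y], then one dummy per listed (a, b) cell."""
    main = [1, int(a == "y")]
    interaction = [int((a, b) == cell) for cell in interaction_cells]
    return main + interaction


# Naive encoding: every non-reference cell gets an interaction column.
naive_cells = [("x", "q"), ("y", "p"), ("y", "q")]
# Reduced encoding: one interaction column dropped, since a[y] = (y,p) + (y,q).
reduced_cells = [("x", "q"), ("y", "q")]

naive = [design_row(a, b, naive_cells) for a, b in samples]
reduced = [design_row(a, b, reduced_cells) for a, b in samples]

print(len(naive[0]), matrix_rank(naive))      # 5 columns, rank 4
print(len(reduced[0]), matrix_rank(reduced))  # 4 columns, rank 4
```

In this toy layout the `a[y]` column equals the sum of the two `(y, ·)` interaction columns, so dropping one of them restores full column rank.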
See my comment above
@BorisMuzellec see the modifications I made. LGTM, but it's hard to be sure without test data available.
Thanks @jeandut for the code updates. The variable names look fine now :).

Matrix rank

We still have the rank issue with designs of the form `"~ factor1 + factor1:factor2"` though. E.g., in the same example as above,
we have the following design:
which does not have full column rank, because

In comparison, the design matrix output by DESeq2 only has the following columns

I think that when there are interaction terms in the design, we need to check whether those variables are also present on their own, and if so remove an additional column.

In-place modification

On a side note, the present code modifies the metadata that is being passed (it adds columns). E.g.:
It would be better to avoid this, e.g. by adding an

I'm not a huge fan of adding too many dependencies, but I'm starting to wonder if we could save ourselves some pain by relying on formulaic, as suggested in #125...
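One way to avoid the in-place modification described above is to do all scratch work on a copy of the metadata. The following is only a minimal sketch with pandas: `with_interaction_column`, the column names `condition` and `group`, and the `_`-joined labels are all hypothetical and not taken from this PR.

```python
import pandas as pd


def with_interaction_column(metadata: pd.DataFrame, factor_a: str, factor_b: str) -> pd.DataFrame:
    """Hypothetical helper: return a copy of `metadata` with an extra
    combined column for the interaction, leaving the input untouched."""
    out = metadata.copy()  # scratch columns are added to the copy only
    out[f"{factor_a}:{factor_b}"] = (
        out[factor_a].astype(str) + "_" + out[factor_b].astype(str)
    )
    return out


# Made-up metadata, for illustration only.
metadata = pd.DataFrame(
    {"condition": ["A", "A", "B", "B"], "group": ["X", "Y", "X", "Y"]}
)
augmented = with_interaction_column(metadata, "condition", "group")

print(list(metadata.columns))   # the caller's frame keeps its original columns
print(list(augmented.columns))  # the copy carries the extra column
```

The caller's DataFrame is left as it was passed in, at the cost of one extra copy of the metadata, which is usually small compared with the count matrix.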
Has this attempt to introduce interaction terms been abandoned? I was hoping not to have to resort to R to get interaction designs to work, since they are a very important part of DESeq2 in R, and I was quite surprised this was not part of the original pyDESeq2. Is this specifically hard to implement in Python for some reason? Just curious...
Hi, I was wondering if this will be implemented soon? I have done all my analysis in Python but was asked for some interaction terms, and I was wondering whether I should switch entirely to R, or whether this will be implemented soon. Thanks
Hi @Marwansha, normally this PR is pretty much finished but, as the changes are substantial, we wanted to spend some extra time reviewing it before releasing it (we are even thinking of doing a pre-release). Fingers crossed this will be merged soonish.
I'll be happy to do some analysis using this branch. Is there any specific concern you guys have in mind? :)
Here is a quick try on my data:

build_design_matrix(
    metadata=rnaseq_data_wt.obs,
    design_factors = '`Treatment`+`Time`+`Treatment:Time`',
    ref_level=[('Treatment','DMSO'),('Time','8hr')]
)

I'm trying to have "Treatment" and "Time" as co-variables but I ran into an error while setting up the
Full Error:
Thank you for the feedback! Two things I would specifically be looking for are:
Do you have supporting data for an MWE of your error?
This PR has several purposes:
- interaction terms `a:b:...:z` for `design_factors`, and formulas based on a combination of single design factors and of such interaction terms, `a + b + a:b:...:z` (no support yet for other more complex structures or alternative syntax enabled by formulaic's grammar, such as `~ C(X, contr.treatment("x"))`, `a * b`, `contr.poly`, ...)
- `design_matrix` to a dds dataset
- `pyproject.toml` syntax (linters complained)

Completion milestones:
- `design_factors` (kind of, as end2end integration tests are passing)