Feature request #62
Comments
Thanks! Great idea! MCFs are definitely easier to interpret than p-values for GSEA. That R package looks neat, but the KO calls are already made in the kegg_diamond rule so we just need the table and algorithm that links the KOs to pathways and computes the MCF, then we're there! I'll look into a way of integrating that.
Hey,
The R package MetQy does not do the KO calling, so it is good that you have another program that does that for you. I use KoFamScan in my pipelines, but I am sure kegg_diamond does the trick too.
MetQy takes a dataframe with semicolon-separated KOs per bin:
Bin1 K00001;K00032;K24233
Bin2 K22001;K32231
Etc.
NB: these are lists of gene K-numbers, not pathway KO-numbers.
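The input layout above can be sketched as follows. This is a minimal Python sketch, assuming the KO calls are available as (bin, KO) pairs; the function name `kos_per_bin` and that pair format are assumptions for illustration and are not taken from MetQy or kegg_diamond.

```python
from collections import defaultdict


def kos_per_bin(pairs):
    """Collapse (bin, KO) pairs into one semicolon-separated KO string per bin."""
    grouped = defaultdict(set)
    for bin_id, ko in pairs:
        grouped[bin_id].add(ko)  # a set drops repeated KO calls within a bin
    return {bin_id: ";".join(sorted(kos)) for bin_id, kos in grouped.items()}


# Hypothetical long-format KO calls, reusing the K-numbers from the example above.
pairs = [
    ("Bin1", "K00001"),
    ("Bin1", "K00032"),
    ("Bin1", "K24233"),
    ("Bin1", "K00001"),  # duplicate call, collapsed by the set
    ("Bin2", "K22001"),
    ("Bin2", "K32231"),
]

table = kos_per_bin(pairs)
for bin_id, kos in table.items():
    print(f"{bin_id}\t{kos}")
```

Written out tab-separated like this, each row has the bin name followed by its gene K-numbers, matching the two-column layout shown above.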
It takes about 10-15 minutes for 150 bins on my laptop, but it will be fast on the Threadripper, I suppose.
…-M
From: Carl Mathias Kobel ***@***.***>
Sent: Wednesday, 11 October 2023 16:29
To: cmkobel/assemblycomparator2 ***@***.***>
Cc: Magnus Øverlie Arntzen ***@***.***>; Author ***@***.***>
Subject: Re: [cmkobel/assemblycomparator2] Feature request (Issue #62)
Thanks! Great idea! MCFs are definitely easier to interpret than p-values for GSEA. That R package looks neat, but the KO calls are already made in the kegg_diamond rule so we just need the table and algorithm that links the KOs to pathways and computes the MCF, then we're there! I'll look into a way of solving that.
This will be solved by adding gapseq, which calculates pathway completion fractions. It is well maintained and very powerful. We are currently waiting for r-chnosz to be published on conda-forge so that gapseq can be published on bioconda and then finally added to asscom2.
Since you asked for feedback...
What about implementing calculations of module completion fractions (mcf)? These are values between 0 and 1 indicating to what extent a bin has the genes required to complete a given reaction, e.g., 'denitrification' or 'methanogenesis'.
This can be done with the MetQy package in R (I have code if you want) and it would complement your output nicely.
I attach an example output for some of my samples with 150 bins.
MetQy_mcf.pdf
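To illustrate the idea of a completion value between 0 and 1, here is a deliberately simplified Python sketch. It assumes a module can be written as a list of steps, each satisfied by any one of several alternative KOs. That is an assumption for illustration only and is not MetQy's algorithm, since real KEGG module definitions are nested logical expressions. The function name `completion_fraction`, the toy module, and the ID `K99999` are all hypothetical.

```python
def completion_fraction(bin_kos, module_steps):
    """Fraction of module steps for which the bin carries at least one KO.

    bin_kos: set of gene K-numbers found in the bin.
    module_steps: list of sets; each set holds the alternative KOs for one step.
    """
    if not module_steps:
        return 0.0
    satisfied = sum(1 for alternatives in module_steps if bin_kos & alternatives)
    return satisfied / len(module_steps)


# Hypothetical three-step module (not a real KEGG module definition).
toy_module = [
    {"K00001"},            # step 1
    {"K00032", "K00033"},  # step 2: either KO is enough
    {"K99999"},            # step 3: made-up ID, absent from both bins
]

# Bins parsed from the semicolon-separated input format.
bin1 = set("K00001;K00032;K24233".split(";"))
bin2 = set("K22001;K32231".split(";"))

mcf_bin1 = completion_fraction(bin1, toy_module)
mcf_bin2 = completion_fraction(bin2, toy_module)
print(f"Bin1\t{mcf_bin1:.2f}")
print(f"Bin2\t{mcf_bin2:.2f}")
```

In this toy case Bin1 covers two of the three steps and Bin2 covers none, so the values fall at 2/3 and 0.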