Understanding _ratio.txt and _BAF.txt for counting informative markers #146

Open
kmavrommatis opened this issue Jul 18, 2024 · 1 comment

kmavrommatis commented Jul 18, 2024

Hi,
I am trying to understand the contents of the files FREEC produces so that I can count the number of SNPs used as markers for each segment.
My intention is to pass the output of FREEC to GISTIC for a cohort analysis.

Based on my understanding, the file _BAF.txt contains the information for each of the SNPs found in the public SNP database, e.g. dbSNP.
The file _ratio.txt contains the information for each exon region (target interval).
I noticed that the script FREEC_ratio2Absolute.R seems to use the _ratio.txt file to produce an output similar to Absolute, but its Num_probes column corresponds to the number of exon regions, not the number of SNPs.

If we want to count the number of SNPs (markers) that were used for each segment call, we should count the number of informative SNPs in the _BAF.txt file, i.e. the SNPs that have uncertainty > -1. Is this correct?
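
For illustration, the counting described above could be sketched roughly as follows. This is only a sketch under assumptions: it assumes _BAF.txt is tab-separated with a header containing the columns Chromosome, Position and uncertainty, and that the segments are already available as (chromosome, start, end) tuples. The function name `count_informative_snps` and the example data are hypothetical and not part of FREEC.

```python
import csv
import io
from collections import Counter


def count_informative_snps(baf_file, segments):
    """Count SNPs with uncertainty > -1 inside each (chrom, start, end) segment.

    Assumes a tab-separated file with 'Chromosome', 'Position' and
    'uncertainty' columns (an assumption about the _BAF.txt layout).
    """
    counts = Counter()
    reader = csv.DictReader(baf_file, delimiter="\t")
    for row in reader:
        # Skip SNPs flagged as uninformative (uncertainty of -1).
        if float(row["uncertainty"]) <= -1:
            continue
        chrom = row["Chromosome"]
        pos = int(row["Position"])
        # Assign the SNP to the first segment that contains its position.
        for seg in segments:
            seg_chrom, start, end = seg
            if chrom == seg_chrom and start <= pos <= end:
                counts[seg] += 1
                break
    return {seg: counts[seg] for seg in segments}


# Made-up data standing in for a real _BAF.txt file.
example_baf = io.StringIO(
    "Chromosome\tPosition\tBAF\tuncertainty\n"
    "1\t100\t0.50\t0.01\n"
    "1\t200\t0.00\t-1\n"
    "1\t300\t0.48\t5.2\n"
    "1\t400\t0.52\t0.0\n"
    "2\t10\t1.00\t-1\n"
)
example_segments = [("1", 50, 250), ("1", 251, 500), ("2", 1, 1000)]
snp_counts = count_informative_snps(example_baf, example_segments)
for seg, n in snp_counts.items():
    print(seg, n)
```

With a real file, `open(path, newline="")` would replace the `io.StringIO` object. Chromosome names must match between the two inputs (e.g. "1" vs "chr1"), and for large files an interval index would be faster than the linear scan over segments used here.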

Thanks in advance for your help

Contributor

valeu commented Jul 20, 2024

Hi, SNPs are only used for BAF estimates, not to call CNAs. For CNA calling, each exon (or window for WGS) is used as one point, not each SNP. This is why I output the number of exons to Absolute.
And to me, all your points seem to be correct.
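
To make the "one point per exon or window" idea concrete, here is a rough sketch of counting those points per segment from a _ratio.txt-like file. It rests on assumptions that are not confirmed in this thread: that the file is tab-separated with Chromosome, Start and MedianRatio columns, and that a segment is a run of consecutive rows on the same chromosome sharing the same MedianRatio value. The function name `count_points_per_segment` is hypothetical, and this is not necessarily how FREEC_ratio2Absolute.R computes Num_probes.

```python
import csv
import io


def count_points_per_segment(ratio_file):
    """Group consecutive rows into segments and count the rows in each.

    Assumes 'Chromosome', 'Start' and 'MedianRatio' columns, and that rows
    of one segment share an identical MedianRatio string (both assumptions).
    """
    segments = []
    reader = csv.DictReader(ratio_file, delimiter="\t")
    for row in reader:
        chrom = row["Chromosome"]
        start = int(row["Start"])
        median = row["MedianRatio"]
        last = segments[-1] if segments else None
        if last and last["chrom"] == chrom and last["median_ratio"] == median:
            # Same segment: extend it and count one more exon/window.
            last["end"] = start
            last["num_points"] += 1
        else:
            # New chromosome or new median ratio: start a new segment.
            segments.append({
                "chrom": chrom,
                "start": start,
                "end": start,  # start of the last window seen so far
                "median_ratio": median,
                "num_points": 1,
            })
    return segments


# Made-up data standing in for a real _ratio.txt file.
example_ratio = io.StringIO(
    "Chromosome\tStart\tRatio\tMedianRatio\tCopyNumber\n"
    "1\t1000\t0.98\t1.0\t2\n"
    "1\t2000\t1.03\t1.0\t2\n"
    "1\t3000\t1.47\t1.5\t3\n"
    "2\t1000\t1.52\t1.5\t3\n"
)
ratio_segments = count_points_per_segment(example_ratio)
for seg in ratio_segments:
    print(seg["chrom"], seg["start"], seg["end"], seg["num_points"])
```

Note that `end` here is only the start coordinate of the last window in the segment, since this sketch does not assume an end column is present.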
