Gender detection #89

Open
liza-alpinia opened this issue Mar 29, 2022 · 10 comments
Comments

@liza-alpinia

Hi!

We have analyzed about 1400 samples using your algorithm and came across an interesting behavior. The gender of all the samples had previously been verified using a different method. We created a reference containing 500 male and 500 female samples. For about 2% of the samples, the gender was determined incorrectly. In these samples the z-score on the X chromosome is about 0, yet the "Gender based on --yfrac (or manually overridden by --gender)" column shows M.
I don't understand on what basis gender is determined.

@lennartraman

Hi,

Gender is determined based on the number of Y reads. The gender algorithm is used only to create internal male and female references, for which it works fine. The algorithm is, however, not sophisticated enough to reliably call gender in individual cases. For NIPT, it can 'fail' if the fetal fraction is too low; for non-NIPT cases I've seen it fail in 'older patients', who often suffer from loss-of-Y.

Hope this helps,

Best, Lennart

(duplicate of #81)
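
As a rough illustration of the "number of Y reads" idea described above, the sketch below computes the share of reads on chromosome Y and compares it to a cutoff. It is not WisecondorX's actual code: the function names, the per-chromosome count dictionary, and the cutoff value are all assumptions made for the example.

```python
def y_fraction(read_counts):
    """Share of all counted reads that fall on chromosome Y.

    read_counts: dict mapping chromosome name -> read count (hypothetical input).
    """
    total = sum(read_counts.values())
    if total == 0:
        raise ValueError("no reads counted")
    return read_counts.get("Y", 0) / total


def call_gender(read_counts, cutoff):
    """Return 'M' if the Y fraction exceeds the cutoff, else 'F'.

    The cutoff is an illustrative assumption; a real value has to be
    derived from samples of known gender processed the same way.
    """
    return "M" if y_fraction(read_counts) > cutoff else "F"


# Made-up counts, for illustration only.
example_counts = {"1": 9000, "X": 900, "Y": 100}
print(call_gender(example_counts, cutoff=0.005))
```

A single cutoff like this is sensitive to anything that adds or removes Y reads, which is consistent with the failure modes mentioned above (low fetal fraction, loss-of-Y).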

@liza-alpinia
Author

liza-alpinia commented Mar 29, 2022

image

We have a NIPT sample with a fetal fraction of 19 percent for which the gender is determined incorrectly (identified as male, but the correct gender is female). When we look at the plot, we see that the number of X chromosomes is 2 (X z-score = 0.5). Is it possible to use the z-score for gender verification?

@lennartraman

WisecondorX is a general CNV tool, meaning it can be used to analyze NIPT (using the --nipt flag) and non-NIPT samples (whole blood, lymphocytes, tumor samples, ...).

I'm not sure I understand the problem here. Is that a confirmed male or a female sample? If that would be a male NIPT sample with a fetal fraction of 19%, we would expect a drop at chromosome X, which isn't the case, so I highly doubt it's truly male.
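
The expected "drop at chromosome X" can be estimated with a simple mixture calculation. The sketch below assumes that read depth scales linearly with copy number, that the mother carries two X copies, and that the result is expressed relative to a two-copy (female) baseline; the function name and parameters are made up for this example.

```python
import math


def expected_x_ratio(fetal_fraction, fetal_x_copies=1, maternal_x_copies=2):
    """Expected chrX depth relative to a two-copy baseline.

    Assumes cfDNA is a mix of maternal DNA (1 - f) and fetal DNA (f),
    each contributing reads in proportion to its X copy number.
    """
    mixed_copies = ((1 - fetal_fraction) * maternal_x_copies
                    + fetal_fraction * fetal_x_copies)
    return mixed_copies / 2.0


ratio = expected_x_ratio(0.19)   # male fetus, 19% fetal fraction
print(round(ratio, 3))           # 0.905
print(round(math.log2(ratio), 3))
```

Under these assumptions, a male fetus at 19% fetal fraction would lower chrX depth by roughly 9.5%, whereas a female fetus would leave the ratio at 1.0.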

@liza-alpinia
Author

liza-alpinia commented Mar 29, 2022

Sorry, I didn't describe the problem in enough detail, corrected the message above.

The instrument determined that this sample (NIPT sample) is male (the results are in the screenshot), but it is confirmed female.
Both attached images belong to 1 sample.

image

image

@liza-alpinia
Author

liza-alpinia commented Mar 29, 2022

In the results table, the tool reports "gender based on --yfrac (or manually overridden by --gender) column : M (male)", but the plot shows that this is not the case (female).

@lennartraman

That's strange indeed. I do expect 1-2% of the fetal gender predictions to be wrong, as the algorithm was not developed to call fetal gender in individual cases. However, for a 19% fetal fraction case, I absolutely expect it to work. I'm guessing the other wrong predictions are mostly cases of low fetal fraction? A couple of possible scenarios for this particular case:

  • sample label mismatch
  • fetal fraction determination failed
  • a biological reason: XXY case, vanishing male twin, ...

It might be worth it to take a close look at the raw reads and confirm the presence of excessive Y reads by manual scripting.

Hope this helps,

Lennart
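
One way to do the manual check suggested above is to count alignments per chromosome directly from SAM text. The sketch below is only an illustration: the function name is made up, and it assumes standard tab-separated SAM records in which field 2 is the FLAG and field 3 is the reference name.

```python
from collections import Counter


def count_reads_per_chromosome(sam_lines):
    """Count mapped primary alignments per reference name.

    sam_lines: iterable of SAM text lines (header lines are skipped).
    """
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):        # SAM header line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:             # not a complete alignment record
            continue
        flag = int(fields[1])
        # Skip unmapped (0x4), secondary (0x100) and supplementary (0x800).
        if flag & 0x4 or flag & 0x100 or flag & 0x800:
            continue
        counts[fields[2]] += 1
    return counts
```

The lines could come, for example, from piping `samtools view` output into a script that calls this function on `sys.stdin`. Comparing the chrY count of the suspicious sample against confirmed male and female samples processed the same way shows whether the excess Y reads are really there.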

@liza-alpinia
Author

Thanks a lot for the tips!
I will use your advice to investigate the result. If I can figure out what the error could be, I'll let you know!
Could you perhaps recommend an algorithm for determining gender for NIPT samples?

@lennartraman

lennartraman commented Mar 30, 2022

We've been using this script, which is compatible with the 5kb .npz files from the convert stage. However, note that it is optimized for our laboratory & computational setup. If you would like to gain more insight on how it was created, I would suggest you read our PREFACE paper.

Lennart

(also see #88)

@liza-alpinia
Author

Hi!

I studied the FASTQ files. It turned out that a large number of reads map to multiple locations; in addition, my reads are only 35 nucleotides long, which also complicates the situation. Filtering helped fix the problem.
I will try to test the code you offered.

Thanks for your help!
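
The exact filter used is not stated in this thread. One common approach, shown here purely as an assumption, is to drop alignments with low mapping quality (MAPQ), since many aligners give low MAPQ to reads that align equally well to several places. The function names and the threshold of 30 are illustrative, not values taken from this thread.

```python
def passes_mapq(sam_line, min_mapq=30):
    """True if a SAM record is mapped and its MAPQ is at least min_mapq.

    Assumes standard SAM columns: FLAG is field 2, MAPQ is field 5.
    """
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])
    mapq = int(fields[4])
    if flag & 0x4:                      # unmapped read
        return False
    return mapq >= min_mapq


def filter_confident(sam_lines, min_mapq=30):
    """Keep only alignment records that pass the MAPQ filter.

    Header lines (starting with '@') are dropped in this sketch.
    """
    return [line for line in sam_lines
            if not line.startswith("@") and passes_mapq(line, min_mapq)]
```

The same kind of filter is available on the command line through the `-q` option of `samtools view`. With 35-nucleotide reads, a filter like this may remove a noticeable share of alignments in repetitive regions, so it is worth checking how many reads remain afterwards.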

@liza-alpinia
Author

Hi!

I have a follow-up question: below what fetal fraction do the results of the algorithm become unreliable?

Best, Liza
