questions about Q-values in MSGF+ #114

Open
Linkous-02 opened this issue Dec 24, 2020 · 4 comments

@Linkous-02

I ran into some trouble understanding how MSGF+ calculates Q-values:

I am running MSGF+ on one sample consisting of 10 fraction files. Each time I get the .tsv output, I delete the spectra whose Q-values are lower than 0.01 and the PSMs that map to decoy entries, so that only the spectra that failed the Q-value filter remain. I then run MSGF+ again on the remaining spectra with the same parameters and databases.

According to the explanation in the MSGF+ paper ("For a threshold t, report the FDR as Ndecoy/Ntarget where Ntarget (Ndecoy) is the number of target (decoy) PSMs with spectral E-values equal or smaller than t"), every round I run on the remaining spectra should ideally still yield some PSMs with FDR < 0.01. However, in the third round the Q-values of all PSMs were higher than 0.01.

So I wonder whether MSGF+ calculates Q-values slightly differently from the formula in the MSGF+ paper.

I would be very grateful if someone could give me some pointers.
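
For reference, the formula quoted from the paper can be written out as below. The first line is the quoted definition. The second line is the commonly used definition of a q-value (the smallest FDR over all thresholds that still accept a PSM with SpecEValue s); it is an assumption here, since nothing in this thread confirms that MS-GF+ computes Q-values in exactly this way.

```latex
\mathrm{FDR}(t) = \frac{N_{\mathrm{decoy}}(t)}{N_{\mathrm{target}}(t)},
\qquad
N_{\mathrm{target}}(t) = \#\{\text{target PSMs with SpecEValue} \le t\},
\quad
N_{\mathrm{decoy}}(t) = \#\{\text{decoy PSMs with SpecEValue} \le t\}

q(s) = \min_{t \ge s} \mathrm{FDR}(t)
```

As a purely illustrative calculation, a threshold that accepts 200 target PSMs and 2 decoy PSMs gives FDR = 2/200 = 0.01.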

@alchemistmatt
Collaborator

alchemistmatt commented Dec 24, 2020

Your method of analysis is something I have never seen applied, and is, frankly, a bit dubious. I don't think you can trust the Q-Values on the searches after you removed the high confidence spectra and decoy proteins. As for how Q-Values are computed, please see either of these two Excel files, which demonstrate how to manually compute the Q-Values. I suggest you take the results from each of your searches and manually compute Q-Values as shown in these files, and compare to what MS-GF+ is reporting. Admittedly, the Q-Values in the Excel files don't exactly match what MS-GF+ reports, but they're close.
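
As a rough illustration of the manual check suggested above, the sketch below recomputes Q-values from a tab-separated result table and lists them next to the reported ones. It follows the Ndecoy/Ntarget formula quoted from the paper, plus a running minimum so that Q-values never decrease as scores get worse. It is not taken from the Excel files mentioned above. The column names (`ScanNum`, `Protein`, `SpecEValue`, `QValue`) and the `XXX_` decoy prefix are assumptions about the .tsv layout, and the rows are toy values, not real MS-GF+ output.

```python
import csv
import io
from itertools import groupby

DECOY_PREFIX = "XXX_"  # assumption: decoy protein names carry this prefix

# Toy rows (scan, protein, SpecEValue, reported QValue); not real MS-GF+ output.
TOY_ROWS = [
    ("1", "sp|P1", "1e-20", "0.0"),
    ("2", "sp|P2", "1e-15", "0.0"),
    ("3", "XXX_sp|P3", "1e-12", "0.25"),
    ("4", "sp|P4", "1e-10", "0.25"),
    ("5", "sp|P5", "1e-9", "0.25"),
    ("6", "XXX_sp|P6", "1e-8", "0.5"),
]
SAMPLE_TSV = "ScanNum\tProtein\tSpecEValue\tQValue\n" + "\n".join(
    "\t".join(row) for row in TOY_ROWS
)


def manual_q_values(psms):
    """psms: list of (spec_e_value, is_decoy); returns q-values in input order."""
    order = sorted(range(len(psms)), key=lambda i: psms[i][0])
    n_target = n_decoy = 0
    fdr_at = {}  # SpecEValue threshold -> Ndecoy / Ntarget at that threshold
    for score, group in groupby(order, key=lambda i: psms[i][0]):
        for i in group:  # PSMs tied at the same SpecEValue are counted together
            if psms[i][1]:
                n_decoy += 1
            else:
                n_target += 1
        fdr_at[score] = min(1.0, n_decoy / n_target) if n_target else 1.0
    q_values = [1.0] * len(psms)
    running_min = 1.0
    for i in reversed(order):  # walk from the worst score to the best
        running_min = min(running_min, fdr_at[psms[i][0]])
        q_values[i] = running_min
    return q_values


def compare(tsv_text):
    """Return (scan, reported QValue, manual q-value) for every row."""
    rows = list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))
    psms = [
        (float(r["SpecEValue"]), r["Protein"].startswith(DECOY_PREFIX))
        for r in rows
    ]
    manual = manual_q_values(psms)
    return [(r["ScanNum"], float(r["QValue"]), q) for r, q in zip(rows, manual)]


if __name__ == "__main__":
    for scan, reported, manual in compare(SAMPLE_TSV):
        print(f"scan {scan}: reported={reported:.4f} manual={manual:.4f} "
              f"diff={abs(reported - manual):.4f}")
```

Real result files may list several PSMs per spectrum or several proteins per PSM, so a real comparison would first have to decide how to reduce those to one target-or-decoy call per spectrum. This sketch does not attempt that.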

@Linkous-02
Author

Thanks for your reply. To clarify, I ran MSGF+ this way because I am curious about the spectra that are filtered out by the target-decoy strategy. I found by chance that the Q-values in the output did not fit the formula in the article, so I would like to understand why.

I also computed Q-values manually in the file attached below, but the manually computed Q-values seem to be very different from the output.

I would be very grateful if you could give me some clue.

[Attached image: Qvalue]

@alchemistmatt
Collaborator

Those SpecEValues are fairly weak (i.e. not good) and you have negative MSGFScore values. Something odd is going on. As I said earlier, you can only compute Q-values using the Reverse / Forward method when you search the entire, unfiltered original .mzML file.

@MihirMongia

Hi Linkous-02, did you ever manage to resolve this issue? I am a beginner with MSGF+ and I am also having trouble replicating the Q-values.
