
Very different results compared to diann::diann_maxlfq results #6

Open
Mengbo-Li opened this issue Oct 11, 2022 · 3 comments

Comments

@Mengbo-Li

Mengbo-Li commented Oct 11, 2022

Dear Vadim,

As the DIA-NN manual suggests, iq is recommended as an alternative to diann-r for obtaining MaxLFQ intensities. However, if I run the same example as given in diann_maxlfq(), it gives very different results from iq::maxLFQ():

library(dplyr)  # provides filter()

df <- data.frame(File.Name = c("A","A","A","B","B","B"),
                 Protein.Names = rep("ALB", 6),
                 Precursor.Id = rep(c("PEPTIDE","EPTIDEP","PTIDEPE"), 2),
                 Precursor.Normalised = c(20, 10, 5, 25, 12, NA)) |>
  filter(!is.na(Precursor.Normalised))

diann::diann_maxlfq(df) |> log2()

X <- matrix(c(df$Precursor.Normalised, NA), nrow = 3) |> log2()
colnames(X) <- LETTERS[1:2]
rownames(X) <- df$Precursor.Id[1:3]

iq::maxLFQ(X)$estimate

with outputs

> iq::maxLFQ(X)$estimate
3.492680 3.785161

> diann::diann_maxlfq(df) |> log2()
4.336651 4.629134

I understand that the two implementations are very different, but I wonder which implementation I should use as the "truth" or the benchmark in this case.

Many thanks,
Mengbo

@vdemichev
Owner

With 'iq', please use the fast_MaxLFQ function; see the syntax and data preparation requirements in the respective manual.
The results are not expected to be identical.

@Mengbo-Li
Author

Mengbo-Li commented Oct 11, 2022

Yes, I do not expect identical results, but as you can see here, the average log2-intensity for this example protein is very different between the two methods. I have also tried a much larger dataset, and the discrepancy in average log2-intensities between the two methods is quite large there too. With iq we observed a more compressed range of average log2-intensities, and with diann-r a wider range.
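For the small example at the top of the thread, the size of the discrepancy can be checked with plain arithmetic on the reported numbers. The sketch below only re-uses the four estimates printed above and does not call either package; the variable names are made up for illustration.

```r
# Estimates copied from the outputs reported above (log2 scale, samples A and B).
iq_est    <- c(A = 3.492680, B = 3.785161)
diann_est <- c(A = 4.336651, B = 4.629134)

# Between-sample difference (B minus A) under each method.
iq_diff    <- diff(iq_est)
diann_diff <- diff(diann_est)
print(c(iq = unname(iq_diff), diann = unname(diann_diff)))

# Per-sample offset between the two methods.
offset <- diann_est - iq_est
print(offset)
```

On these numbers, both methods give a B minus A difference of about 0.29, and the diann-r estimates sit about 0.84 log2 units above the iq estimates in both samples. So for this example the two outputs agree on the relative quantity and differ by a roughly constant shift. The thread does not establish where that shift comes from, or whether the same pattern holds for the larger dataset.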

@Mengbo-Li
Author

And with the larger dataset, I used iq::process_long_format(), so fast_MaxLFQ was the function that was called.
