NA values in model #10

Open
lisiarend opened this issue Jan 16, 2023 · 1 comment

Comments

lisiarend commented Jan 16, 2023

Hello,

I am running limma on a proteomics dataset with 5 conditions, each with 3 samples. limma reports "Partial NA coefficients for 209 probe(s)" when I execute:

library(limma)

# Specify comparisons
comparisons <- c("B-A", "C-A", "D-A", "E-A", "C-B", "D-B", "E-B", "D-C", "E-C", "E-D")

# Create design matrix
groupsM <- as.factor(condition)
designM <- model.matrix(~0+groupsM)
colnames(designM) <- levels(groupsM)

# Fit lm
fit <- lmFit(data, designM)

# Create contrasts
contr <- makeContrasts(contrasts = comparisons, levels = colnames(coef(fit)))

# Contrast fit and ebayes
fit2 <- contrasts.fit(fit, contr)
ebfit <- eBayes(fit2, trend=TRUE)

Next I call the DEqMS function spectraCounteBayes(fit), which produces a warning (because of the NAs in the model ebfit):

library(data.table)            # as.data.table()
library(SummarizedExperiment)  # rowData()

prot <- rownames(fit$coefficients)
rowdata <- as.data.table(rowData(se))
PSMs <- data.frame("Razor + unique peptides" = rowdata$Razor...unique.peptides)
rownames(PSMs) <- rowdata$Protein.IDs
fit$count <- PSMs[prot, "Razor...unique.peptides"]
fit_DEqMS <- DEqMS::spectraCounteBayes(fit) # model variance

And the warning is:

Warning message:
In y.pred - digamma(df/2) :
longer object length is not a multiple of shorter object length

I looked into the spectraCounteBayes function and saw that the problem arises because y.pred (from the loess model) is shorter than the other vectors. My question is: would it be possible to call loess(logVAR ~ x, span = 0.75, na.action = stats::na.exclude) with the na.action parameter, so that y.pred has the same length as the coefficients, gamma, etc. from my limma model?
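The length difference can be reproduced outside DEqMS. The following is a minimal sketch on simulated data, not on the dataset from this issue: `x` stands in for the log2 counts and `logVAR` for the log variances, and both are made-up values. It only illustrates how `fitted()` on a loess model behaves under the default `na.action` and under `stats::na.exclude`.

```r
# Simulated stand-ins (assumptions, not values from this issue)
set.seed(1)
x <- runif(30, 0, 8)        # plays the role of log2(count)
logVAR <- rnorm(30)         # plays the role of log(sigma^2)
logVAR[c(4, 17)] <- NA      # proteins without a variance estimate

# Default na.action (na.omit): rows with NA are dropped from the fit,
# so the fitted vector is shorter than the input
fit_omit <- loess(logVAR ~ x, span = 0.75)
length(fitted(fit_omit))    # 28

# na.exclude: fitted() pads the dropped rows with NA,
# so the fitted vector lines up with the input again
fit_excl <- loess(logVAR ~ x, span = 0.75, na.action = stats::na.exclude)
y.pred <- fitted(fit_excl)
length(y.pred)              # 30
```

With `na.exclude`, the padded positions are `NA`, so any later arithmetic on `y.pred` yields `NA` for those proteins instead of silently recycling values.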

Since I am calling limma with all comparisons together, I don't know how I would ensure that more than 2 non-missing values are present for each two-group comparison. I was therefore looking for another solution, and the na.action parameter appears to be a good one.

What do you think?

Best,
Lis

Owner

yafeng commented Jan 30, 2023

Hi @lisiarend
Sorry for the late reply. I was off work for Chinese New Year.
You are right, this warning is probably caused by the missing values. Your solution looks OK to me. For now you need to modify the R source file "DEqMS.R" to add this option. I will consider adding it in the next update.
Another way to avoid this is to keep only proteins that have at least two values in every condition. You may lose some proteins, but it solves the issue. I would recommend this approach if the loss of proteins is minor.
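The filtering approach could look roughly like the sketch below. It assumes `data` is a proteins x samples matrix and `condition` gives the group of each column, as in the code earlier in this thread; the values here are simulated, and the names `n_obs`, `keep` and `data_filtered` are only illustrative.

```r
# Simulated stand-ins for the objects used earlier in this thread
set.seed(1)
condition <- rep(c("A", "B", "C", "D", "E"), each = 3)
data <- matrix(rnorm(20 * 15), nrow = 20,
               dimnames = list(paste0("P", 1:20), NULL))
data[1, 1:2] <- NA   # P1: only one value left in condition A
data[2, 4] <- NA     # P2: two values left in condition B

# Count non-missing values per protein within each condition
n_obs <- sapply(split(seq_along(condition), condition),
                function(cols) rowSums(!is.na(data[, cols, drop = FALSE])))

# Keep proteins with at least two values in every condition
keep <- rowSums(n_obs >= 2) == ncol(n_obs)
data_filtered <- data[keep, , drop = FALSE]
nrow(data_filtered)  # 19
```

The filtered matrix would then be passed to lmFit in place of `data`; comparing `nrow(data)` with `nrow(data_filtered)` shows how many proteins are lost.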

Best,
Yafeng
