Create QFeatures object from any sets of features/peptides/proteins tables #164

Open
annaquaglieri16 opened this issue Jun 17, 2022 · 0 comments

Hi,

I'm sharing a simple example that shows how to use the data structure and functionality of QFeatures starting from any pre-defined set of features/peptides/proteins tables. This is useful when you want to import into R the summarised and aggregated output of any software, e.g. MaxQuant, PD, etc. I wanted to test this as an alternative to starting from a PSM table and then generating the subsequent aggregations, as explained in the QFeatures documentation.

Below I create example peptide and protein tables (I'll leave out the PSMs for simplicity). For both peptides and proteins I generate a matrix of "intensities" and the corresponding row data holding the respective peptide/protein IDs. The peptide matrix has 10 peptides, which map to 3 proteins.

library(QFeatures)
set.seed(123)  # make the simulated intensities reproducible
wide_peptides <- matrix(c(rnorm(10), rnorm(10), rnorm(10)), nrow=10)
rownames(wide_peptides) <- paste0("Peptide",1:10)
colnames(wide_peptides) <- c("sample1", "sample2", "sample3")

rowdata_peptides <- DataFrame(PeptideID = rownames(wide_peptides), 
                              Protein.id = c(rep("Protein1", 2), rep("Protein2", 4), rep("Protein3", 4)))

wide_proteins <- matrix(c(rnorm(3), rnorm(3), rnorm(3)), nrow = 3)
rownames(wide_proteins) <- paste0("Protein",1:3)
colnames(wide_proteins) <- c("sample1", "sample2", "sample3")
rowdata_proteins <- DataFrame(ProteinID = rownames(wide_proteins))
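With real software output, the starting point is usually a single table that mixes annotation and intensity columns rather than a ready-made matrix. The following is a minimal sketch of how such a table could be split into the matrix and row data used here. The `peptide_table` data frame, its column names and its values are hypothetical and only stand in for an exported file.

```r
library(QFeatures)

# Hypothetical peptide-level export; in practice this would be read from
# the software's output file, e.g. with read.delim().
peptide_table <- data.frame(
    Sequence   = c("AAK", "CCR", "DDK"),
    Protein.id = c("Protein1", "Protein1", "Protein2"),
    sample1    = c(10.1, 11.3, 9.8),
    sample2    = c(10.4, 11.0, 9.5))

sample_cols <- c("sample1", "sample2")
annot_cols  <- setdiff(names(peptide_table), sample_cols)

# The quantitative columns become the assay matrix ...
intensities <- as.matrix(peptide_table[, sample_cols])
rownames(intensities) <- peptide_table$Sequence

# ... and the remaining columns become the row data, with matching row names.
annotations <- DataFrame(peptide_table[, annot_cols],
                         row.names = peptide_table$Sequence)

se_peptides <- SummarizedExperiment(intensities, rowData = annotations)
se_peptides
```

The same split would apply to a protein-level table, which gives the second SummarizedExperiment.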

We can now create separate SummarizedExperiment objects for the peptides and proteins. This is the minimal information needed to create a QFeatures object containing 2 unlinked assays. Unlinked means that I cannot, for example, directly subset both assays by requesting all features that come from a particular protein id.

se1 <- SummarizedExperiment(wide_peptides, rowdata_peptides)
se2 <- SummarizedExperiment(wide_proteins, rowdata_proteins)

## Sample annotation (colData)
cd <- DataFrame(row.names = colnames(wide_proteins))

el <- list(peptides = se1, proteins = se2)
hl <- QFeatures(el, colData = cd)
hl
An instance of class QFeatures containing 2 assays:
 [1] peptides: SummarizedExperiment with 10 rows and 3 columns 
 [2] proteins: SummarizedExperiment with 3 rows and 3 columns 

However, we can easily link the assays through the protein ids using QFeatures::addAssayLink, which is applied automatically under the hood when creating aggregations with QFeatures::aggregateFeatures.

hl_linked <- addAssayLink(hl,
             from = "peptides",
             to  = "proteins",
             varFrom = "Protein.id",
             varTo = "ProteinID")
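To confirm that the link was registered, the stored link can be retrieved with `assayLink()`. This is a minimal, self-contained sketch that rebuilds the same toy objects in a more compact form. The `samples` helper and the seed are my additions, and the printed form of the link may differ between QFeatures versions.

```r
library(QFeatures)

set.seed(1)  # only to make the toy intensities reproducible
samples <- paste0("sample", 1:3)

# Same toy data as above: 10 peptides mapped to 3 proteins
wide_peptides <- matrix(rnorm(30), nrow = 10,
                        dimnames = list(paste0("Peptide", 1:10), samples))
wide_proteins <- matrix(rnorm(9), nrow = 3,
                        dimnames = list(paste0("Protein", 1:3), samples))
rowdata_peptides <- DataFrame(
    PeptideID  = rownames(wide_peptides),
    Protein.id = rep(paste0("Protein", 1:3), times = c(2, 4, 4)))
rowdata_proteins <- DataFrame(ProteinID = rownames(wide_proteins))

hl <- QFeatures(
    list(peptides = SummarizedExperiment(wide_peptides,
                                         rowData = rowdata_peptides),
         proteins = SummarizedExperiment(wide_proteins,
                                         rowData = rowdata_proteins)),
    colData = DataFrame(row.names = samples))

hl_linked <- addAssayLink(hl, from = "peptides", to = "proteins",
                          varFrom = "Protein.id", varTo = "ProteinID")

# Retrieve the link stored for the "proteins" assay; printing it shows
# which assay it originates from.
al <- assayLink(hl_linked, "proteins")
al
```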

Now the two assays are linked by the protein id. I can, for example, subset both assays simply by querying for one protein id, and I can use all the other functionalities of QFeatures.

protein_example <- hl_linked["Protein1", , ]
protein_example
An instance of class QFeatures containing 2 assays:
 [1] peptides: SummarizedExperiment with 2 rows and 3 columns 
 [2] proteins: SummarizedExperiment with 1 rows and 3 columns 
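As a sanity check on the subset, one can look at the row data of the retained peptides and confirm that they all belong to the queried protein. This is a minimal, self-contained sketch that rebuilds the same toy objects in a more compact form. The `samples` helper and the seed are my additions and are not required by QFeatures.

```r
library(QFeatures)

set.seed(1)  # only to make the toy intensities reproducible
samples <- paste0("sample", 1:3)

wide_peptides <- matrix(rnorm(30), nrow = 10,
                        dimnames = list(paste0("Peptide", 1:10), samples))
wide_proteins <- matrix(rnorm(9), nrow = 3,
                        dimnames = list(paste0("Protein", 1:3), samples))
rowdata_peptides <- DataFrame(
    PeptideID  = rownames(wide_peptides),
    Protein.id = rep(paste0("Protein", 1:3), times = c(2, 4, 4)))
rowdata_proteins <- DataFrame(ProteinID = rownames(wide_proteins))

hl <- QFeatures(
    list(peptides = SummarizedExperiment(wide_peptides,
                                         rowData = rowdata_peptides),
         proteins = SummarizedExperiment(wide_proteins,
                                         rowData = rowdata_proteins)),
    colData = DataFrame(row.names = samples))

hl_linked <- addAssayLink(hl, from = "peptides", to = "proteins",
                          varFrom = "Protein.id", varTo = "ProteinID")

# Subset by protein id, then inspect each assay of the result
protein_example <- hl_linked["Protein1", , ]
rowData(protein_example[["peptides"]])   # peptides kept by the subset
rownames(protein_example[["proteins"]])  # the queried protein
```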

Thanks a lot to @lgatto for the support and for the initial simple example that got me started, and to all the developers of the package!

Anna
