CensusTMT2MSstatsTMT

Converter from Census TMT output file to the input of MSstatsTMT.

The input is the PSM-level Census output file containing TMT intensity information. Since version 1.0.6, this tool supports multiple input files (see the description of the input option in the command line usage).
The tool reads the input file(s) and generates a peptide-level text file that can be used with MSstatsTMT. An R script that installs the libraries required by MSstatsTMT is included, along with an example of the R commands needed to run the MSstatsTMT analysis on the output generated by this converter.

This tool can be run from the command line, or with a graphical interface by passing the -gui parameter (already included in the batch files START_win.bat and START_mac_linux.sh).
The GUI version is built automatically from the command line version and gives graphical access to the same command line options (by implementing the class CommandLineProgramGuiEnclosable).

Both versions are available to download at: http://sealion.scripps.edu/CensusTMT2MSstatsTMT/

Command line options:

gui version usage: java -jar CensusTMT2MSstatsTMT -gui  
  
command line usage: java -jar CensusTMT2MSstatsTMT -i [input file] -an [annotation file]

 -an,--annotation <arg>      Path to the experimental design file.
 -d,--decoy <arg>            [OPTIONAL] Remove decoy hits. Decoy hits
                             will have this prefix in their accession
                             number. If not provided, no decoy filtering
                             will be used.
 -i,--input <arg>            Path to the input file(s). It can refer to
                             multiple files by using the wildcard '*',
                             e.g. '/path/to/my/files/census*.out'
 -m,--minPeptides <arg>      [OPTIONAL] Minimum number of peptides+charge
                             per protein. If not provided, even proteins
                             with 1 peptide will be quantified.
 -ps,--psm_selection <arg>   [OPTIONAL] What to do with multiple PSMs of
                             the same peptide (SUM, AVERAGE or HIGHEST).
                             If not provided, HIGHEST will be chosen.
 -r,--raw                    [OPTIONAL] Use of raw intensity. If not
                             provided, normalized intensity will be used.
 -u,--unique                 [OPTIONAL] Use only unique peptides. If not
                             provided, all peptides will be used.
Contact Salvador Martinez-Bartolome at salvador at scripps.edu for more help
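
To make the -ps / --psm_selection option more concrete, here is a minimal sketch of how several PSMs of the same peptide could be collapsed into one intensity value. It is not the converter's actual code: the class PsmSelectionSketch and the method collapse are hypothetical names, and the sketch assumes the rule is applied to the intensities of a single channel (the real tool may instead select a whole PSM when HIGHEST is used).

```java
import java.util.List;

// Illustrative sketch only; names are hypothetical, not taken from the converter.
final class PsmSelectionSketch {

    /** Mirrors the three values accepted by -ps / --psm_selection. */
    enum PsmSelection { SUM, AVERAGE, HIGHEST }

    /**
     * Collapses the intensities of several PSMs of the same peptide
     * (assumed here to come from one channel) into a single value.
     */
    static double collapse(List<Double> psmIntensities, PsmSelection mode) {
        if (psmIntensities.isEmpty()) {
            throw new IllegalArgumentException("at least one PSM intensity is required");
        }
        double sum = 0.0;
        double highest = Double.NEGATIVE_INFINITY;
        for (double value : psmIntensities) {
            sum += value;
            highest = Math.max(highest, value);
        }
        switch (mode) {
            case SUM:
                return sum;
            case AVERAGE:
                return sum / psmIntensities.size();
            case HIGHEST:
                return highest;
            default:
                throw new IllegalStateException("unknown mode: " + mode);
        }
    }

    public static void main(String[] args) {
        // Made-up intensities for three PSMs of one peptide.
        List<Double> intensities = List.of(100.0, 250.0, 400.0);
        for (PsmSelection mode : PsmSelection.values()) {
            System.out.println(mode + " -> " + collapse(intensities, mode));
        }
    }
}
```

With the default (HIGHEST), only the largest value contributes, whereas SUM and AVERAGE use every PSM.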

For more information about the annotation file, see http://msstats.org/msstatstmt/

The annotation file is a COMMA-SEPARATED file (CSV) containing the information about the experimental design.
The file should have the following columns:

Column	Explanation
Run	MS run ID. It should correspond to the column Filename in the census out file.
Channel	Labeling information (126, … 131). It should contain only numbers, chosen so that, when sorted, they correspond to the TMT-6plex, TMT-10plex or TMT-11plex channels in the census file (*).
Condition	Condition (e.g. Healthy, Cancer, Time0). If the channel has no sample, enter Empty under Condition. If the channel is a normalization channel in the MS run, enter Norm under Condition.
Mixture	Mixture of samples labeled with different TMT reagents, which can be analyzed in a single mass spectrometry experiment.
TechRepMixture	Technical replicate of one mixture. One mixture may have multiple technical replicates. For example, if TechRepMixture = 1, 2 are the two technical replicates of one mixture, then they should have the same Mixture value.
Fraction	Fraction ID. One technical replicate of one mixture may be fractionated into multiple fractions to increase the analytical depth. In that case, one technical replicate of one mixture corresponds to multiple fractions. For example, if Fraction = 1, 2, 3 are three fractions of the first technical replicate of one TMT mixture of biological subjects, then they should have the same TechRepMixture and Mixture values.
BioReplicate	Unique ID for the biological subject. If the channel has no sample, enter Empty under BioReplicate.

(*) It doesn't matter which numbers you use as Channel. The only requirement is that there are as many distinct numbers as channels in the TMT-plex you used in Census. The values in the Channel column are sorted and mapped, in order, to the sorted channels in the census file. For example:

Channel in annotation file	TMT channel in census.out
126	126.127726
127.12	127.124761
127.13	127.131081
128.12	128.128116
128.13	128.134436
129.131	129.131471
129.137	129.13779
130.13	130.134825
130.14	130.141145
131	131.13818
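
The sort-and-pair rule described in the note above can be sketched as follows. This is only an illustration under stated assumptions: the class ChannelMappingSketch and the method mapChannels are hypothetical names, and the sketch assumes the values are compared numerically rather than as text.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; names are hypothetical, not taken from the converter.
final class ChannelMappingSketch {

    /**
     * Sorts both lists in ascending numeric order and pairs them position by
     * position: smallest annotation value with smallest census channel, and so on.
     */
    static Map<Double, Double> mapChannels(List<Double> annotationChannels,
                                           List<Double> censusChannels) {
        if (new HashSet<>(annotationChannels).size() != annotationChannels.size()) {
            throw new IllegalArgumentException("Channel values must be distinct");
        }
        if (annotationChannels.size() != censusChannels.size()) {
            throw new IllegalArgumentException(
                "Need as many Channel values as TMT channels in the census file");
        }
        List<Double> sortedAnnotation = new ArrayList<>(annotationChannels);
        List<Double> sortedCensus = new ArrayList<>(censusChannels);
        Collections.sort(sortedAnnotation);
        Collections.sort(sortedCensus);

        // LinkedHashMap keeps the sorted order when the mapping is printed.
        Map<Double, Double> mapping = new LinkedHashMap<>();
        for (int i = 0; i < sortedAnnotation.size(); i++) {
            mapping.put(sortedAnnotation.get(i), sortedCensus.get(i));
        }
        return mapping;
    }

    public static void main(String[] args) {
        // Values from the table above, with the annotation values given out of order.
        List<Double> annotation = List.of(131.0, 126.0, 127.12, 127.13, 128.12,
                128.13, 129.131, 129.137, 130.13, 130.14);
        List<Double> census = List.of(126.127726, 127.124761, 127.131081, 128.128116,
                128.134436, 129.131471, 129.13779, 130.134825, 130.141145, 131.13818);
        mapChannels(annotation, census)
                .forEach((a, c) -> System.out.println(a + " -> " + c));
    }
}
```

Because only the sorted order matters, Channel values such as 1, 2, 3 would map to the census channels in the same way as 126, 127.12, 127.13.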

Here are two examples of annotation files:

annotation file example 1
This example corresponds to a single TMT 10-plex (1 mixture, no fractionation) where the first channel, 126.127726, is a normalization channel in the MS run. There are 6 experimental conditions, and each channel is a biological replicate.

annotation file example 2
This example corresponds to a single TMT 6-plex (1 mixture), with 8 fractions (one per MS run), 3 biological replicates and 2 experimental conditions.
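
As a rough illustration of a design like example 2, the sketch below generates the lines of a comma-separated annotation file with one row per run and channel. Everything specific in it is an assumption made for illustration: the class AnnotationFileSketch, the run names, the condition labels, the BioReplicate IDs and the use of 1 as the Mixture and TechRepMixture values are hypothetical and are not taken from the example files. The column order follows the table above; whether the converter requires that order is not stated here.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; names and values are hypothetical.
final class AnnotationFileSketch {

    static final String HEADER =
            "Run,Channel,Condition,Mixture,TechRepMixture,Fraction,BioReplicate";

    /**
     * Builds the CSV lines for one mixture with one technical replicate,
     * where each run is one fraction and every run uses the same channels.
     */
    static List<String> buildLines(List<String> runs, int[] channels,
                                   String[] conditions, String[] bioReplicates) {
        if (channels.length != conditions.length
                || channels.length != bioReplicates.length) {
            throw new IllegalArgumentException(
                "channels, conditions and bioReplicates must have the same length");
        }
        List<String> lines = new ArrayList<>();
        lines.add(HEADER);
        for (int f = 0; f < runs.size(); f++) {
            for (int c = 0; c < channels.length; c++) {
                lines.add(String.join(",",
                        runs.get(f),                 // Run: must match Filename in the census file
                        String.valueOf(channels[c]), // Channel
                        conditions[c],               // Condition
                        "1",                         // Mixture (single mixture assumed)
                        "1",                         // TechRepMixture (single replicate assumed)
                        String.valueOf(f + 1),       // Fraction: one per run
                        bioReplicates[c]));          // BioReplicate
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        List<String> runs = new ArrayList<>();
        for (int f = 1; f <= 8; f++) {
            runs.add("fraction_" + f); // hypothetical run names
        }
        int[] channels = {126, 127, 128, 129, 130, 131};
        String[] conditions = {"Healthy", "Healthy", "Healthy", "Cancer", "Cancer", "Cancer"};
        String[] bioReplicates = {"H1", "H2", "H3", "C1", "C2", "C3"};
        buildLines(runs, channels, conditions, bioReplicates).forEach(System.out::println);
    }
}
```

The Condition and BioReplicate of a channel stay the same across all fractions; only Run and Fraction change from one run to the next.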
