Converter from Census TMT output file to the input of MSstatsTMT.
The input file is the PSM-level census output file with TMT intensities information. From version 1.0.6, this tool supports multiple input files (see input file description in the command line usage).
This tool will read the input file and will generate a peptide-level text file that can be used with MSstatsTMT.
In here, we have included a R script to install the required libraries to use with MSstatsTMT. Also here, you will find an example of the R commands you will need to execute to perform the analysis with MSstatsTMT with the output generated by this converter.
This tool can be executed in command line, or with a graphical interface, when using parameter -gui (included in batch files START_win.bat and START_mac_linux.sh).
The GUI version is built automatically from the command line version to give graphical support to the options in the command line (implementing class CommandLineProgramGuiEnclosable).
Both versions are available to download at: http://sealion.scripps.edu/CensusTMT2MSstatsTMT/
Command line options:
gui version usage: java -jar CensusTMT2MSstatsTMT -gui
command line usage: java -jar CensusTMT2MSstatsTMT -i [input file] -an [annotation file]
-an,--annotation <arg> Path to the experimental design file.
-d,--decoy <arg> [OPTIONAL] Remove decoy hits. Decoys hits
will have this prefix in their accession
number. If not provided, no decoy filtering
will be used.
-i,--input <arg> Path to the input file(s). It can refer to multiple files by using wildcard '*', i.e: '/path/to/my/files/census*.out'
-m,--minPeptides <arg> [OPTIONAL] Minimum number of peptides+charge
per protein. If not provided, even proteins
with 1 peptide will be quantified
-ps,--psm_selection <arg> [OPTIONAL] What to do with multiple PSMs of
the same peptide (SUM, AVERAGE or HIGHEST).
If not provided, HIGHEST will be choosen.
-r,--raw [OPTIONAL] Use of raw intensity. If not
provided, normalized intensity will be used.
-u,--unique [OPTIONAL] Use only unique peptides. If not
provided, all peptides will be used.
Contact Salvador Martinez-Bartolome at salvador at scripps.edu for more help
To know more about the annotation file, go to http://msstats.org/msstatstmt/
The annotation file is a COMMA-SEPARATED file (CSV) containing the information about the experimental design.
The file should have the following columns:
Column | Explanation |
---|---|
Run | MS run ID. It should correspond to the column Filename in census out file. |
Channel | Labeling information (126, … 131). It should only numbers and be defined in a way that being sorted correspond to either TMT-6plex, TMT-10plex or TMT-11plex in the census file (*) . |
Condition | Condition (ex. Healthy, Cancer, Time0). If the channel doesn’t have sample, please add Empty under Condition. If the channel is a normalization channel in the MS run, add Norm under Condition |
Mixture | Mixture of samples labeled with different TMT reagents, which can be analyzed in a single mass spectrometry experiment. |
TechRepMixture | Technical replicate of one mixture. One mixture may have multiple technical replicates. For example, if TechRepMixture = 1, 2 are the two technical replicates of one mixture, then they should match with same Mixture value. |
Fraction | Fraction ID. One technical replicate of one mixture may be fractionated into multiple fractions to increase the analytical depth. Then one technical replicate of one mixture should correspond to multiple fractions. For example, if Fraction = 1, 2, 3 are three fractions of the first technical replicate of one TMT mixture of biological subjects, then they should have same TechRepMixture and Mixture value. |
BioReplicate | Unique ID for biological subject. If the channel doesn’t have sample, please add Empty under BioReplicate. |
(*) It doesn't matter which numbers you state as Channel. The only requirement is to be as many different numbers as the TMT-plex you used in Census. The map between this column and the channels in the input file will be done by sorting the values on the channel column and mapping them to the sorted channels in the census file. For example:
Channel in annotation file | TMT channel in census.out |
---|---|
126 | 126.127726 |
127.12 | 127.124761 |
127.13 | 127.131081 |
128.12 | 128.128116 |
128.13 | 128.134436 |
129.131 | 129.131471 |
129.137 | 129.13779 |
130.13 | 130.134825 |
130.14 | 130.141145 |
131 | 131.13818 |
Here you have a couple of examples of annotation files:
annotation file example 1
This example corresponds to a single TMT 10-plex (1 mixture, with no fractionations) where the first channel 126.127726 is a normalization channel in the MS run. There are 6 experimental conditions, without fractionation. Each channel is a biological replicate.
annotation file example 2
This example corresponds to a single TMT 6-plex (1 mixture), with 8 fractions (one per MS runs), 3 biological replicates and 2 experimental conditions.