RNA-Seq by Expectation-Maximization (RSEM). This tool estimates gene and isoform expression levels, outputting normalized read counts, FPKM, and TPM. It runs in about 1.5 - 4 hours and costs $0.60 - $1.50 on a spot instance.
tools/rsem_calc_expression.cwl
genomeDir
: RSEM reference tarball. This can be created by running the prepare reference tool
bam
: Aligned transcriptome BAM
outFileNamePrefix
: String to prepend to output file names
strandedness
: 'none' refers to non-strand-specific protocols. 'forward' means all (upstream) reads are derived from the forward strand. 'reverse' means all (upstream) reads are derived from the reverse strand
paired-end
: If input is paired-end, add this flag
num_threads
: Number of threads to use
append_names
: If available, append the gene/transcript name to the gene/transcript ID
estimate_rspd
: Set this option if you want to estimate the read start position distribution (RSPD) from the data
fragment_length_max
: Maximum read/insert length allowed
gene_out
: Gene expression estimation
isoform_out
: Transcript isoform level expression estimation
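As a rough illustration of how the inputs above fit together, the following sketch builds a job file for `tools/rsem_calc_expression.cwl` and serializes it as JSON. The input names come from the list above, but their types (File objects, a boolean for `paired-end`, an integer for `num_threads`) are assumptions, as are the file names and the helper `build_calc_expression_job`. Check the CWL file for the actual types before using it.

```python
import json

# Allowed values as described for the strandedness input above.
STRANDEDNESS = ("none", "forward", "reverse")


def build_calc_expression_job(genome_dir, bam, prefix,
                              strandedness="none", paired_end=False,
                              num_threads=8):
    """Build a CWL job object (hypothetical helper, input types assumed)."""
    if strandedness not in STRANDEDNESS:
        raise ValueError(f"strandedness must be one of {STRANDEDNESS}")
    return {
        # Assumed to be File inputs: the RSEM reference tarball and the BAM.
        "genomeDir": {"class": "File", "path": genome_dir},
        "bam": {"class": "File", "path": bam},
        "outFileNamePrefix": prefix,
        "strandedness": strandedness,
        # Assumed to be a boolean flag and an integer, respectively.
        "paired-end": paired_end,
        "num_threads": num_threads,
    }


# Hypothetical file names, for illustration only.
job = build_calc_expression_job(
    genome_dir="RSEM_GENCODE27.tar.gz",
    bam="sample1.Aligned.toTranscriptome.out.bam",
    prefix="sample1",
    strandedness="reverse",
    paired_end=True,
)
job_json = json.dumps(job, indent=2)
print(job_json)
```

If the assumed types match the tool definition, the printed JSON could be saved to a file and passed to a CWL runner together with the tool, for example `cwltool tools/rsem_calc_expression.cwl job.json`.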
tools/rsem_prepare_reference.cwl
This tool creates the reference that RSEM requires.
You should only need to run it once for each combination of gene model and FASTA reference.
reference_fasta
: Reference fasta file
reference_gtf OR reference_gtf
: gene model definitions
reference_name
: Output file prefix. Recommended format: RSEM_<SOURCE><Version>/
rsem_reference
: RSEM reference tarball
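The recommended `reference_name` format can be sketched in a few lines. The example below composes a name following `RSEM_<SOURCE><Version>/` and places it in a job object for `tools/rsem_prepare_reference.cwl`. The helper names, the file names, and the assumption that `reference_fasta` and `reference_gtf` are File inputs are illustrative only and not taken from the tool definition.

```python
import json


def rsem_reference_name(source, version):
    """Compose a prefix in the recommended RSEM_<SOURCE><Version>/ format."""
    return f"RSEM_{source}{version}/"


def build_prepare_reference_job(fasta, gtf, source, version):
    """Build a CWL job object (hypothetical helper, input types assumed)."""
    return {
        # Assumed to be File inputs.
        "reference_fasta": {"class": "File", "path": fasta},
        "reference_gtf": {"class": "File", "path": gtf},
        "reference_name": rsem_reference_name(source, version),
    }


# Hypothetical file names and annotation source, for illustration only.
ref_job = build_prepare_reference_job(
    fasta="GRCh38.primary_assembly.genome.fa",
    gtf="gencode.v27.primary_assembly.annotation.gtf",
    source="GENCODE",
    version="27",
)
print(json.dumps(ref_job, indent=2))
```

Encoding the annotation source and version in the prefix makes it easier to tell reference tarballs apart, since a new one is only needed when the gene model or FASTA reference changes.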