Code for reproducing the results in the second version of the preprint "Accurate quantification of single-nucleus and single-cell RNA-seq transcripts"

SHSOHMP_2024

Code for reproducing the figures and results in the preprint "Accurate quantification of single-nucleus and single-cell RNA-seq transcripts" by Kristján Eldjárn Hjörleifsson, Delaney Sullivan, Nikhila Swarna, Conrad Oakes, Guillaume Holley, Páll Melsted and Lior Pachter.

(Note: In this repo, D-list is often referred to as "offlist".)

Note about human reference genome

The human reference genome (FASTA+GTF) used in all analyses is available directly at https://github.com/pachterlab/SHSOHMP_2024/releases under the filename human_CR_3.0.0.tar.gz.

Introduction

Follow the steps below to reproduce the results of the preprint. All paths are set relative to the SHSOHMP_2024 directory.

main_path="$(pwd)/SHSOHMP_2024"
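# Each assignment to kallisto overwrites the previous one; keep only the line for the version you need.
kallisto="$main_path/kallisto_0.48.0/kallisto"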
kallisto="$main_path/kallisto_0.50.0/kallisto"
kallisto="$main_path/kallisto_0.50.1/kallisto"
bustools="$main_path/bustools/build/src/bustools"
cellranger7="$main_path/cellranger/cellranger-7.0.1/cellranger"
salmon="$main_path/salmon-latest_linux_x86_64/bin/salmon"
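
Every later step depends on these paths, so it may help to confirm that each one points at an executable before continuing. The following is a minimal sketch; check_tool is a hypothetical helper that is not part of this repository, and it only inspects whichever of the variables above happen to be set.

```shell
# Hypothetical helper: report whether a configured path is an executable file.
check_tool() {
    if [ -x "$1" ] && [ ! -d "$1" ]; then
        echo "ok: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Check the variables defined above; unset ones are skipped.
for tool in "${kallisto:-}" "${bustools:-}" "${cellranger7:-}" "${salmon:-}"; do
    if [ -n "$tool" ]; then
        check_tool "$tool" || echo "  (install it or fix the path before continuing)"
    fi
done
```

Running this again after the "Download software" steps should show which tools still need attention.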

Download software

kallisto

version 0.48.0

cd $main_path
wget https://github.com/pachterlab/kallisto/releases/download/v0.48.0/kallisto_linux-v0.48.0.tar.gz
tar -xzvf kallisto_linux-v0.48.0.tar.gz
mv kallisto kallisto_0.48.0

version 0.50.0

cd $main_path
wget https://github.com/pachterlab/kallisto/releases/download/v0.50.0/kallisto_linux-v0.50.0.tar.gz
tar -xzvf kallisto_linux-v0.50.0.tar.gz
mv kallisto kallisto_0.50.0

version 0.50.1

cd $main_path
wget https://github.com/pachterlab/kallisto/releases/download/v0.50.1/kallisto_linux-v0.50.1.tar.gz
tar -xzvf kallisto_linux-v0.50.1.tar.gz
mv kallisto kallisto_0.50.1
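
The three download blocks above differ only in the version number, so they can be expressed as one loop. This is a sketch of that idea rather than a replacement for the commands above; kallisto_url is a hypothetical helper, and the loop prints the commands instead of running them.

```shell
# Hypothetical helper: build the release URL for a kallisto version,
# following the pattern of the three wget commands above.
kallisto_url() {
    echo "https://github.com/pachterlab/kallisto/releases/download/v$1/kallisto_linux-v$1.tar.gz"
}

# Dry run: print the commands for each version instead of executing them.
for v in 0.48.0 0.50.0 0.50.1; do
    echo "wget $(kallisto_url "$v")"
    echo "tar -xzvf kallisto_linux-v$v.tar.gz"
    echo "mv kallisto kallisto_$v"
done
```

Removing the echo wrappers would run the same steps as the blocks above, assuming the current directory is $main_path.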

bustools

version 0.43.2

cd $main_path
rm -rf bustools
git clone -b v0.43.2 https://github.com/BUStools/bustools
cd bustools && mkdir -p build && cd build
cmake .. && make

kb-python

version 0.28.0

cd $main_path
yes|python -m pip uninstall kb-python
python -m pip install kb_python==0.28.0

Cell Ranger

Note: Cell Ranger must be installed manually. The version used is:

  • Cell Ranger v7.0.1 (Released August 18, 2022. Downloaded October 7, 2022)

salmon-alevin-fry

salmon version 1.10.0; alevin-fry version 0.8.2; pyroe 0.9.3; simpleaf 0.15.1

cd $main_path
wget https://github.com/COMBINE-lab/salmon/releases/download/v1.10.0/salmon-1.10.0_linux_x86_64.tar.gz && tar -xzvf salmon-1.10.0_linux_x86_64.tar.gz
export RUSTUP_HOME=${main_path}/.rustup/
export CARGO_HOME=${main_path}/.cargo/
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
./.cargo/bin/cargo install --version 0.8.2 --force alevin-fry
./.cargo/bin/cargo install --version 0.15.1 --force simpleaf
yes|python -m pip uninstall pyroe
python -m pip install pyroe==0.9.3

simpleaf configuration:

export ALEVIN_FRY_HOME="$main_path/af_home"
simpleaf set-paths \
--salmon $(pwd)/salmon-latest_linux_x86_64/bin/salmon

simpleaf workflow get --name 10x-chromium-3p-v3 -o af10xv3

cellCounts

Open up an R session and then run:

if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

BiocManager::install("Rsubread")

Note: The version used is Rsubread_2.12.3.

STARsolo simulations

Navigate to STARsoloManuscript and run the scripts there.

Note: Run the STARsoloManuscript scripts before proceeding, because the indices and the links to the program binaries created there are used downstream. At a minimum, complete the sections "Create symlinks to executables", "Create indices", and "Mouse genome prep".

Human datasets

Single-cell

wget https://s3-us-west-2.amazonaws.com/10x.files/samples/cell-exp/6.1.0/20k_PBMC_3p_HT_nextgem_Chromium_X/20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs.tar
tar -xvf 20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs.tar

Single-nucleus

wget https://cf.10xgenomics.com/samples/cell-exp/7.0.0/5k_human_jejunum_CNIK_3pv3/5k_human_jejunum_CNIK_3pv3_fastqs.tar
tar -xvf 5k_human_jejunum_CNIK_3pv3_fastqs.tar

Mouse datasets

Single-cell

wget https://s3-us-west-2.amazonaws.com/10x.files/samples/cell-exp/4.0.0/SC3_v3_NextGem_SI_Neuron_10K/SC3_v3_NextGem_SI_Neuron_10K_fastqs.tar
tar -xvf SC3_v3_NextGem_SI_Neuron_10K_fastqs.tar

Single-nucleus

wget https://s3-us-west-2.amazonaws.com/10x.files/samples/cell-exp/7.0.0/5k_mouse_lung_CNIK_3pv3/5k_mouse_lung_CNIK_3pv3_fastqs.tar
tar -xvf 5k_mouse_lung_CNIK_3pv3_fastqs.tar

Spatial

wget https://s3-us-west-2.amazonaws.com/10x.files/samples/spatial-exp/2.1.0/CytAssist_11mm_FFPE_Mouse_Embryo/CytAssist_11mm_FFPE_Mouse_Embryo_fastqs.tar
tar -xf CytAssist_11mm_FFPE_Mouse_Embryo_fastqs.tar && rm CytAssist_11mm_FFPE_Mouse_Embryo_fastqs.tar && mv fastqs/* ./ && rmdir fastqs

Generate count matrix for datasets using kallisto

kb count --kallisto STARsoloManuscript/exe/kallisto_0.50.1 --bustools STARsoloManuscript/exe/bustools_0.43.2 -t 20 -x 10XV3 \
    --workflow nac --sum=total -i STARsoloManuscript/genomes/index/kallisto_0.50.1/human_CR_3.0.0/nac_offlist_1/index.idx \
    -g STARsoloManuscript/genomes/index/kallisto_0.50.1/human_CR_3.0.0/nac_offlist_1/g \
    -c1 STARsoloManuscript/genomes/index/kallisto_0.50.1/human_CR_3.0.0/nac_offlist_1/c1 \
    -c2 STARsoloManuscript/genomes/index/kallisto_0.50.1/human_CR_3.0.0/nac_offlist_1/c2 \
    -o ./matrices_human_20k_PBMC/ --overwrite --verbose \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L001_R1_001.fastq.gz \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L001_R2_001.fastq.gz \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L002_R1_001.fastq.gz \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L002_R2_001.fastq.gz \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L003_R1_001.fastq.gz \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L003_R2_001.fastq.gz \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L004_R1_001.fastq.gz \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L004_R2_001.fastq.gz
kb count --kallisto STARsoloManuscript/exe/kallisto_0.50.1 --bustools STARsoloManuscript/exe/bustools_0.43.2 -t 20 -x 10XV3 \
    --workflow nac --sum=total -i STARsoloManuscript/genomes/index/kallisto_0.50.1/human_CR_3.0.0/nac_offlist_1/index.idx \
    -g STARsoloManuscript/genomes/index/kallisto_0.50.1/human_CR_3.0.0/nac_offlist_1/g \
    -c1 STARsoloManuscript/genomes/index/kallisto_0.50.1/human_CR_3.0.0/nac_offlist_1/c1 \
    -c2 STARsoloManuscript/genomes/index/kallisto_0.50.1/human_CR_3.0.0/nac_offlist_1/c2 \
    -o ./matrices_human_5k_jejunum_nuclei/ --overwrite --verbose \
    5k_human_jejunum_CNIK_3pv3_fastqs/5k_human_jejunum_CNIK_3pv3_S1_L001_R1_001.fastq.gz \
    5k_human_jejunum_CNIK_3pv3_fastqs/5k_human_jejunum_CNIK_3pv3_S1_L001_R2_001.fastq.gz \
    5k_human_jejunum_CNIK_3pv3_fastqs/5k_human_jejunum_CNIK_3pv3_S1_L002_R1_001.fastq.gz \
    5k_human_jejunum_CNIK_3pv3_fastqs/5k_human_jejunum_CNIK_3pv3_S1_L002_R2_001.fastq.gz \
    5k_human_jejunum_CNIK_3pv3_fastqs/5k_human_jejunum_CNIK_3pv3_S1_L003_R1_001.fastq.gz \
    5k_human_jejunum_CNIK_3pv3_fastqs/5k_human_jejunum_CNIK_3pv3_S1_L003_R2_001.fastq.gz \
    5k_human_jejunum_CNIK_3pv3_fastqs/5k_human_jejunum_CNIK_3pv3_S1_L004_R1_001.fastq.gz \
    5k_human_jejunum_CNIK_3pv3_fastqs/5k_human_jejunum_CNIK_3pv3_S1_L004_R2_001.fastq.gz
kb count --kallisto STARsoloManuscript/exe/kallisto_0.50.1 --bustools STARsoloManuscript/exe/bustools_0.43.2 -t 20 -x 10XV3 \
    --workflow nac --sum=total -i STARsoloManuscript/genomes/index/kallisto_0.50.1/mouse/nac_offlist_1/index.idx \
    -g STARsoloManuscript/genomes/index/kallisto_0.50.1/mouse/nac_offlist_1/g \
    -c1 STARsoloManuscript/genomes/index/kallisto_0.50.1/mouse/nac_offlist_1/c1 \
    -c2 STARsoloManuscript/genomes/index/kallisto_0.50.1/mouse/nac_offlist_1/c2 \
    -o ./matrices_mouse_10k_neuron/ --overwrite --verbose \
    SC3_v3_NextGem_SI_Neuron_10K_fastqs/SC3_v3_NextGem_SI_Neuron_10K_S1_L002_R1_001.fastq.gz \
    SC3_v3_NextGem_SI_Neuron_10K_fastqs/SC3_v3_NextGem_SI_Neuron_10K_S1_L002_R2_001.fastq.gz \
    SC3_v3_NextGem_SI_Neuron_10K_fastqs/SC3_v3_NextGem_SI_Neuron_10K_S1_L003_R1_001.fastq.gz \
    SC3_v3_NextGem_SI_Neuron_10K_fastqs/SC3_v3_NextGem_SI_Neuron_10K_S1_L003_R2_001.fastq.gz \
    SC3_v3_NextGem_SI_Neuron_10K_fastqs/SC3_v3_NextGem_SI_Neuron_10K_S1_L004_R1_001.fastq.gz \
    SC3_v3_NextGem_SI_Neuron_10K_fastqs/SC3_v3_NextGem_SI_Neuron_10K_S1_L004_R2_001.fastq.gz
kb count --kallisto STARsoloManuscript/exe/kallisto_0.50.1 --bustools STARsoloManuscript/exe/bustools_0.43.2 -t 20 -x 10XV3 \
    --workflow nac --sum=total -i STARsoloManuscript/genomes/index/kallisto_0.50.1/mouse/nac_offlist_1/index.idx \
    -g STARsoloManuscript/genomes/index/kallisto_0.50.1/mouse/nac_offlist_1/g \
    -c1 STARsoloManuscript/genomes/index/kallisto_0.50.1/mouse/nac_offlist_1/c1 \
    -c2 STARsoloManuscript/genomes/index/kallisto_0.50.1/mouse/nac_offlist_1/c2 \
    -o ./matrices_mouse_5k_lung/ --overwrite --verbose \
    5k_mouse_lung_CNIK_3pv3_fastqs/5k_mouse_lung_CNIK_3pv3_S4_L001_R1_001.fastq.gz \
    5k_mouse_lung_CNIK_3pv3_fastqs/5k_mouse_lung_CNIK_3pv3_S4_L001_R2_001.fastq.gz \
    5k_mouse_lung_CNIK_3pv3_fastqs/5k_mouse_lung_CNIK_3pv3_S4_L002_R1_001.fastq.gz \
    5k_mouse_lung_CNIK_3pv3_fastqs/5k_mouse_lung_CNIK_3pv3_S4_L002_R2_001.fastq.gz \
    5k_mouse_lung_CNIK_3pv3_fastqs/5k_mouse_lung_CNIK_3pv3_S4_L003_R1_001.fastq.gz \
    5k_mouse_lung_CNIK_3pv3_fastqs/5k_mouse_lung_CNIK_3pv3_S4_L003_R2_001.fastq.gz
kb count --kallisto STARsoloManuscript/exe/kallisto_0.50.1 --bustools STARsoloManuscript/exe/bustools_0.43.2 -t 20 -x VISIUM \
    --strand=unstranded --workflow nac --sum=total -i STARsoloManuscript/genomes/index/kallisto_0.50.1/mouse/nac_offlist_1/index.idx \
    -g STARsoloManuscript/genomes/index/kallisto_0.50.1/mouse/nac_offlist_1/g \
    -c1 STARsoloManuscript/genomes/index/kallisto_0.50.1/mouse/nac_offlist_1/c1 \
    -c2 STARsoloManuscript/genomes/index/kallisto_0.50.1/mouse/nac_offlist_1/c2 \
    -o ./matrices_mouse_ffpe/ --overwrite --verbose \
    CytAssist_11mm_FFPE_Mouse_Embryo_fastqs/CytAssist_11mm_FFPE_Mouse_Embryo_S1_L004_R1_001.fastq.gz \
    CytAssist_11mm_FFPE_Mouse_Embryo_fastqs/CytAssist_11mm_FFPE_Mouse_Embryo_S1_L004_R2_001.fastq.gz
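
In the kb count commands above, the FASTQ files are listed lane by lane, with each R1 file followed by its R2 mate. The sketch below shows one way to generate such a list from a directory. fastq_pairs is a hypothetical helper that is not part of this repository, and it assumes the _R1_001.fastq.gz / _R2_001.fastq.gz naming seen above.

```shell
# Hypothetical helper: print the FASTQ files in a directory in lane order,
# with each R1 file immediately followed by its R2 mate.
# R1 files without a matching R2 file (and any index reads) are skipped.
fastq_pairs() (
    for r1 in "$1"/*_R1_001.fastq.gz; do
        r2="${r1%_R1_001.fastq.gz}_R2_001.fastq.gz"
        if [ -f "$r1" ] && [ -f "$r2" ]; then
            printf '%s\n%s\n' "$r1" "$r2"
        fi
    done
)

# Demonstration on empty placeholder files in a temporary directory.
demo=$(mktemp -d)
touch "$demo/sample_S1_L001_R1_001.fastq.gz" "$demo/sample_S1_L001_R2_001.fastq.gz" \
      "$demo/sample_S1_L002_R1_001.fastq.gz" "$demo/sample_S1_L002_R2_001.fastq.gz"
fastq_pairs "$demo"
rm -r "$demo"
```

The output could be passed to kb count through an unquoted command substitution, which relies on the paths containing no whitespace.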

Filter barcodes at a UMI threshold of >= 500. The threshold is applied to the "total" count matrix; the other count matrices reuse the barcodes retained from the "total" matrix.

./filter.sh matrices_human_20k_PBMC 500
./filter.sh matrices_human_5k_jejunum_nuclei 500
./filter.sh matrices_mouse_10k_neuron 500
./filter.sh matrices_mouse_5k_lung 500
./filter.sh matrices_mouse_ffpe 500
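
The contents of filter.sh are not shown here. As an illustration of what a UMI threshold on barcodes amounts to, the sketch below sums the counts per row of a Matrix Market coordinate file and prints the rows that reach the threshold. It is not the repository's implementation: rows_passing_threshold is a hypothetical helper, and it assumes that rows correspond to barcodes.

```shell
# Hypothetical helper (not the repository's filter.sh): print the 1-based row
# indices of a Matrix Market coordinate file whose summed counts reach a threshold.
rows_passing_threshold() {
    awk -v t="$2" '
        /^%/ { next }                        # skip the banner and comment lines
        !seen_size { seen_size = 1; next }   # first remaining line holds the dimensions
        { sum[$1] += $3 }                    # accumulate counts per row
        END { for (r in sum) if (sum[r] >= t) print r }
    ' "$1" | sort -n
}

# Tiny made-up matrix: row 1 sums to 600, row 2 to 50, row 3 to 500.
demo=$(mktemp)
cat > "$demo" <<'EOF'
%%MatrixMarket matrix coordinate integer general
3 2 4
1 1 400
1 2 200
2 1 50
3 2 500
EOF
rows_passing_threshold "$demo" 500   # prints 1 and 3, one per line
rm -f "$demo"
```

The printed indices would then be mapped to barcode names through the barcode list that accompanies the matrix.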

Next, reuse the script from the simulations (where the output matrix was compared against the simulated truth matrix) to compare our nascent/mature/ambiguous/etc. matrices. Everything is in the mtx_comparisons.sh file.

./mtx_comparisons.sh matrices_human_20k_PBMC
./mtx_comparisons.sh matrices_human_5k_jejunum_nuclei
./mtx_comparisons.sh matrices_mouse_10k_neuron
./mtx_comparisons.sh matrices_mouse_5k_lung
./mtx_comparisons.sh matrices_mouse_ffpe
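
The filtering and comparison steps take the same five output directories, so they can be driven from a single list. The sketch below assumes that each dataset can be filtered and compared independently of the others. run and process_datasets are hypothetical helpers, and by default the commands are only printed.

```shell
# Set DRY_RUN=0 to execute the commands instead of printing them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print a command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "$*"; else "$@"; fi
}

# Filter each dataset at the UMI threshold of 500, then run the comparisons on it.
process_datasets() {
    for d in "$@"; do
        run ./filter.sh "$d" 500
        run ./mtx_comparisons.sh "$d"
    done
}

process_datasets matrices_human_20k_PBMC matrices_human_5k_jejunum_nuclei \
    matrices_mouse_10k_neuron matrices_mouse_5k_lung matrices_mouse_ffpe
```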

The final analysis is produced in the matrix_comparisons.ipynb Python notebook.

Runtime and memory benchmarks of kallisto (and mapping comparisons)

Note: kb-python already uses the 10xv3 prepackaged on-list.

Note: After the following commands are run, the analysis_dlist_performance.ipynb python notebook contains the final plots.

mkdir -p performance_comparisons/out/

Human 20k PBMC

nac + offlist:

cmd1="kb count --kallisto STARsoloManuscript/exe/kallisto_0.50.1 --bustools STARsoloManuscript/exe/bustools_0.43.2 -t "
cmd2=" -x 10XV3 \
    --workflow nac -i STARsoloManuscript/genomes/index/kallisto_0.50.1/human_CR_3.0.0/nac_offlist_1/index.idx \
    -g STARsoloManuscript/genomes/index/kallisto_0.50.1/human_CR_3.0.0/nac_offlist_1/g \
    -c1 STARsoloManuscript/genomes/index/kallisto_0.50.1/human_CR_3.0.0/nac_offlist_1/c1 \
    -c2 STARsoloManuscript/genomes/index/kallisto_0.50.1/human_CR_3.0.0/nac_offlist_1/c2 \
    -o ./performance_comparisons/out/nac_offlist-20kb_PBMC/ --overwrite \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L001_R1_001.fastq.gz \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L001_R2_001.fastq.gz \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L002_R1_001.fastq.gz \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L002_R2_001.fastq.gz \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L003_R1_001.fastq.gz \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L003_R2_001.fastq.gz \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L004_R1_001.fastq.gz \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L004_R2_001.fastq.gz"

/usr/bin/time -v $cmd1 16 $cmd2 2> performance_comparisons/16_nac_offlist-20kb_PBMC_1.txt
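
Because 2> redirects all of standard error, the log file holds the messages from kb count as well as the report from /usr/bin/time -v. The sketch below pulls the wall-clock time and peak memory out of such a log by matching the labels that GNU time prints. summarize_time_log is a hypothetical helper and is not how the notebook reads these files.

```shell
# Hypothetical helper: extract wall-clock time and peak memory from a GNU time -v log.
summarize_time_log() {
    awk -F': ' '
        /Elapsed \(wall clock\) time/ { elapsed = $NF }   # h:mm:ss or m:ss
        /Maximum resident set size/   { rss = $NF }       # kilobytes
        END { printf "elapsed=%s max_rss_kb=%s\n", elapsed, rss }
    ' "$1"
}

# Summarize the log written by the command above, if it exists.
log=performance_comparisons/16_nac_offlist-20kb_PBMC_1.txt
if [ -f "$log" ]; then
    summarize_time_log "$log"
fi
```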

nac (no offlist):

cmd1="kb count --kallisto STARsoloManuscript/exe/kallisto_0.50.1 --bustools STARsoloManuscript/exe/bustools_0.43.2 -t "
cmd2=" -x 10XV3 \
    --workflow nac -i STARsoloManuscript/genomes/index/kallisto_0.50.1/human_CR_3.0.0/nac_1/index.idx \
    -g STARsoloManuscript/genomes/index/kallisto_0.50.1/human_CR_3.0.0/nac_1/g \
    -c1 STARsoloManuscript/genomes/index/kallisto_0.50.1/human_CR_3.0.0/nac_1/c1 \
    -c2 STARsoloManuscript/genomes/index/kallisto_0.50.1/human_CR_3.0.0/nac_1/c2 \
    -o ./performance_comparisons/out/nac-20kb_PBMC/ --overwrite \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L001_R1_001.fastq.gz \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L001_R2_001.fastq.gz \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L002_R1_001.fastq.gz \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L002_R2_001.fastq.gz \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L003_R1_001.fastq.gz \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L003_R2_001.fastq.gz \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L004_R1_001.fastq.gz \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L004_R2_001.fastq.gz"

/usr/bin/time -v $cmd1 16 $cmd2 2> performance_comparisons/16_nac-20kb_PBMC_1.txt

standard + offlist:

cmd1="kb count --kallisto STARsoloManuscript/exe/kallisto_0.50.1 --bustools STARsoloManuscript/exe/bustools_0.43.2 -t "
cmd2=" -x 10XV3 \
    --workflow standard -i STARsoloManuscript/genomes/index/kallisto_0.50.1/human_CR_3.0.0/standard_offlist_1/index.idx \
    -g STARsoloManuscript/genomes/index/kallisto_0.50.1/human_CR_3.0.0/standard_offlist_1/g \
    -o ./performance_comparisons/out/standard_offlist-20kb_PBMC/ --overwrite \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L001_R1_001.fastq.gz \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L001_R2_001.fastq.gz \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L002_R1_001.fastq.gz \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L002_R2_001.fastq.gz \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L003_R1_001.fastq.gz \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L003_R2_001.fastq.gz \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L004_R1_001.fastq.gz \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L004_R2_001.fastq.gz"

/usr/bin/time -v $cmd1 16 $cmd2 2> performance_comparisons/16_standard_offlist-20kb_PBMC_1.txt

standard (no offlist):

cmd1="kb count --kallisto STARsoloManuscript/exe/kallisto_0.50.1 --bustools STARsoloManuscript/exe/bustools_0.43.2 -t "
cmd2=" -x 10XV3 \
    --workflow standard -i STARsoloManuscript/genomes/index/kallisto_0.50.1/human_CR_3.0.0/standard_1/index.idx \
    -g STARsoloManuscript/genomes/index/kallisto_0.50.1/human_CR_3.0.0/standard_1/g \
    -o ./performance_comparisons/out/standard-20kb_PBMC/ --overwrite \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L001_R1_001.fastq.gz \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L001_R2_001.fastq.gz \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L002_R1_001.fastq.gz \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L002_R2_001.fastq.gz \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L003_R1_001.fastq.gz \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L003_R2_001.fastq.gz \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L004_R1_001.fastq.gz \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L004_R2_001.fastq.gz"

/usr/bin/time -v $cmd1 16 $cmd2 2> performance_comparisons/16_standard-20kb_PBMC_1.txt
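
The four log files above appear to follow the pattern threads_index-dataset_run.txt. That reading is inferred from the file names, and treating the trailing number as a run index is an assumption. Under it, the names can be generated rather than typed; bench_log is a hypothetical helper.

```shell
# Hypothetical helper: build a log path from thread count, index type,
# dataset label and run index (the last being an assumed meaning of "_1").
bench_log() {
    echo "performance_comparisons/${1}_${2}-${3}_${4}.txt"
}

# Print the log path for each of the four index configurations used above.
for cfg in nac_offlist nac standard_offlist standard; do
    bench_log 16 "$cfg" 20kb_PBMC 1
done
```

The same helper would cover the other datasets by swapping the dataset label, for example 5kb_jejunum or 10kb_neuron.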

Human 5k jejunum

nac + offlist:

cmd1="kb count --kallisto STARsoloManuscript/exe/kallisto_0.50.1 --bustools STARsoloManuscript/exe/bustools_0.43.2 -t "
cmd2=" -x 10XV3 \
    --workflow nac -i STARsoloManuscript/genomes/index/kallisto_0.50.1/human_CR_3.0.0/nac_offlist_1/index.idx \
    -g STARsoloManuscript/genomes/index/kallisto_0.50.1/human_CR_3.0.0/nac_offlist_1/g \
    -c1 STARsoloManuscript/genomes/index/kallisto_0.50.1/human_CR_3.0.0/nac_offlist_1/c1 \
    -c2 STARsoloManuscript/genomes/index/kallisto_0.50.1/human_CR_3.0.0/nac_offlist_1/c2 \
    -o ./performance_comparisons/out/nac_offlist-5kb_jejunum/ --overwrite \
    5k_human_jejunum_CNIK_3pv3_fastqs/5k_human_jejunum_CNIK_3pv3_S1_L001_R1_001.fastq.gz \
    5k_human_jejunum_CNIK_3pv3_fastqs/5k_human_jejunum_CNIK_3pv3_S1_L001_R2_001.fastq.gz \
    5k_human_jejunum_CNIK_3pv3_fastqs/5k_human_jejunum_CNIK_3pv3_S1_L002_R1_001.fastq.gz \
    5k_human_jejunum_CNIK_3pv3_fastqs/5k_human_jejunum_CNIK_3pv3_S1_L002_R2_001.fastq.gz \
    5k_human_jejunum_CNIK_3pv3_fastqs/5k_human_jejunum_CNIK_3pv3_S1_L003_R1_001.fastq.gz \
    5k_human_jejunum_CNIK_3pv3_fastqs/5k_human_jejunum_CNIK_3pv3_S1_L003_R2_001.fastq.gz \
    5k_human_jejunum_CNIK_3pv3_fastqs/5k_human_jejunum_CNIK_3pv3_S1_L004_R1_001.fastq.gz \
    5k_human_jejunum_CNIK_3pv3_fastqs/5k_human_jejunum_CNIK_3pv3_S1_L004_R2_001.fastq.gz"

/usr/bin/time -v $cmd1 16 $cmd2 2> performance_comparisons/16_nac_offlist-5kb_jejunum_1.txt

nac (no offlist):

cmd1="kb count --kallisto STARsoloManuscript/exe/kallisto_0.50.1 --bustools STARsoloManuscript/exe/bustools_0.43.2 -t "
cmd2=" -x 10XV3 \
    --workflow nac -i STARsoloManuscript/genomes/index/kallisto_0.50.1/human_CR_3.0.0/nac_1/index.idx \
    -g STARsoloManuscript/genomes/index/kallisto_0.50.1/human_CR_3.0.0/nac_1/g \
    -c1 STARsoloManuscript/genomes/index/kallisto_0.50.1/human_CR_3.0.0/nac_1/c1 \
    -c2 STARsoloManuscript/genomes/index/kallisto_0.50.1/human_CR_3.0.0/nac_1/c2 \
    -o ./performance_comparisons/out/nac-5kb_jejunum/ --overwrite \
    5k_human_jejunum_CNIK_3pv3_fastqs/5k_human_jejunum_CNIK_3pv3_S1_L001_R1_001.fastq.gz \
    5k_human_jejunum_CNIK_3pv3_fastqs/5k_human_jejunum_CNIK_3pv3_S1_L001_R2_001.fastq.gz \
    5k_human_jejunum_CNIK_3pv3_fastqs/5k_human_jejunum_CNIK_3pv3_S1_L002_R1_001.fastq.gz \
    5k_human_jejunum_CNIK_3pv3_fastqs/5k_human_jejunum_CNIK_3pv3_S1_L002_R2_001.fastq.gz \
    5k_human_jejunum_CNIK_3pv3_fastqs/5k_human_jejunum_CNIK_3pv3_S1_L003_R1_001.fastq.gz \
    5k_human_jejunum_CNIK_3pv3_fastqs/5k_human_jejunum_CNIK_3pv3_S1_L003_R2_001.fastq.gz \
    5k_human_jejunum_CNIK_3pv3_fastqs/5k_human_jejunum_CNIK_3pv3_S1_L004_R1_001.fastq.gz \
    5k_human_jejunum_CNIK_3pv3_fastqs/5k_human_jejunum_CNIK_3pv3_S1_L004_R2_001.fastq.gz"

/usr/bin/time -v $cmd1 16 $cmd2 2> performance_comparisons/16_nac-5kb_jejunum_1.txt

standard + offlist:

cmd1="kb count --kallisto STARsoloManuscript/exe/kallisto_0.50.1 --bustools STARsoloManuscript/exe/bustools_0.43.2 -t "
cmd2=" -x 10XV3 \
    --workflow standard -i STARsoloManuscript/genomes/index/kallisto_0.50.1/human_CR_3.0.0/standard_offlist_1/index.idx \
    -g STARsoloManuscript/genomes/index/kallisto_0.50.1/human_CR_3.0.0/standard_offlist_1/g \
    -o ./performance_comparisons/out/standard_offlist-5kb_jejunum/ --overwrite \
    5k_human_jejunum_CNIK_3pv3_fastqs/5k_human_jejunum_CNIK_3pv3_S1_L001_R1_001.fastq.gz \
    5k_human_jejunum_CNIK_3pv3_fastqs/5k_human_jejunum_CNIK_3pv3_S1_L001_R2_001.fastq.gz \
    5k_human_jejunum_CNIK_3pv3_fastqs/5k_human_jejunum_CNIK_3pv3_S1_L002_R1_001.fastq.gz \
    5k_human_jejunum_CNIK_3pv3_fastqs/5k_human_jejunum_CNIK_3pv3_S1_L002_R2_001.fastq.gz \
    5k_human_jejunum_CNIK_3pv3_fastqs/5k_human_jejunum_CNIK_3pv3_S1_L003_R1_001.fastq.gz \
    5k_human_jejunum_CNIK_3pv3_fastqs/5k_human_jejunum_CNIK_3pv3_S1_L003_R2_001.fastq.gz \
    5k_human_jejunum_CNIK_3pv3_fastqs/5k_human_jejunum_CNIK_3pv3_S1_L004_R1_001.fastq.gz \
    5k_human_jejunum_CNIK_3pv3_fastqs/5k_human_jejunum_CNIK_3pv3_S1_L004_R2_001.fastq.gz"

$cmd1 16 $cmd2

standard (no offlist):

cmd1="kb count --kallisto STARsoloManuscript/exe/kallisto_0.50.1 --bustools STARsoloManuscript/exe/bustools_0.43.2 -t "
cmd2=" -x 10XV3 \
    --workflow standard -i STARsoloManuscript/genomes/index/kallisto_0.50.1/human_CR_3.0.0/standard_1/index.idx \
    -g STARsoloManuscript/genomes/index/kallisto_0.50.1/human_CR_3.0.0/standard_1/g \
    -o ./performance_comparisons/out/standard-5kb_jejunum/ --overwrite \
    5k_human_jejunum_CNIK_3pv3_fastqs/5k_human_jejunum_CNIK_3pv3_S1_L001_R1_001.fastq.gz \
    5k_human_jejunum_CNIK_3pv3_fastqs/5k_human_jejunum_CNIK_3pv3_S1_L001_R2_001.fastq.gz \
    5k_human_jejunum_CNIK_3pv3_fastqs/5k_human_jejunum_CNIK_3pv3_S1_L002_R1_001.fastq.gz \
    5k_human_jejunum_CNIK_3pv3_fastqs/5k_human_jejunum_CNIK_3pv3_S1_L002_R2_001.fastq.gz \
    5k_human_jejunum_CNIK_3pv3_fastqs/5k_human_jejunum_CNIK_3pv3_S1_L003_R1_001.fastq.gz \
    5k_human_jejunum_CNIK_3pv3_fastqs/5k_human_jejunum_CNIK_3pv3_S1_L003_R2_001.fastq.gz \
    5k_human_jejunum_CNIK_3pv3_fastqs/5k_human_jejunum_CNIK_3pv3_S1_L004_R1_001.fastq.gz \
    5k_human_jejunum_CNIK_3pv3_fastqs/5k_human_jejunum_CNIK_3pv3_S1_L004_R2_001.fastq.gz"

$cmd1 16 $cmd2

Mouse 10k neuron

nac + offlist:

cmd1="kb count --kallisto STARsoloManuscript/exe/kallisto_0.50.1 --bustools STARsoloManuscript/exe/bustools_0.43.2 -t "
cmd2=" -x 10XV3 \
    --workflow nac -i STARsoloManuscript/genomes/index/kallisto_0.50.1/mouse/nac_offlist_1/index.idx \
    -g STARsoloManuscript/genomes/index/kallisto_0.50.1/mouse/nac_offlist_1/g \
    -c1 STARsoloManuscript/genomes/index/kallisto_0.50.1/mouse/nac_offlist_1/c1 \
    -c2 STARsoloManuscript/genomes/index/kallisto_0.50.1/mouse/nac_offlist_1/c2 \
    -o ./performance_comparisons/out/nac_offlist-10kb_neuron/ --overwrite \
    SC3_v3_NextGem_SI_Neuron_10K_fastqs/SC3_v3_NextGem_SI_Neuron_10K_S1_L002_R1_001.fastq.gz \
    SC3_v3_NextGem_SI_Neuron_10K_fastqs/SC3_v3_NextGem_SI_Neuron_10K_S1_L002_R2_001.fastq.gz \
    SC3_v3_NextGem_SI_Neuron_10K_fastqs/SC3_v3_NextGem_SI_Neuron_10K_S1_L003_R1_001.fastq.gz \
    SC3_v3_NextGem_SI_Neuron_10K_fastqs/SC3_v3_NextGem_SI_Neuron_10K_S1_L003_R2_001.fastq.gz \
    SC3_v3_NextGem_SI_Neuron_10K_fastqs/SC3_v3_NextGem_SI_Neuron_10K_S1_L004_R1_001.fastq.gz \
    SC3_v3_NextGem_SI_Neuron_10K_fastqs/SC3_v3_NextGem_SI_Neuron_10K_S1_L004_R2_001.fastq.gz"

/usr/bin/time -v $cmd1 16 $cmd2 2> performance_comparisons/16_nac_offlist-10kb_neuron_1.txt

nac (no offlist):

cmd1="kb count --kallisto STARsoloManuscript/exe/kallisto_0.50.1 --bustools STARsoloManuscript/exe/bustools_0.43.2 -t "
cmd2=" -x 10XV3 \
    --workflow nac -i STARsoloManuscript/genomes/index/kallisto_0.50.1/mouse/nac_1/index.idx \
    -g STARsoloManuscript/genomes/index/kallisto_0.50.1/mouse/nac_1/g \
    -c1 STARsoloManuscript/genomes/index/kallisto_0.50.1/mouse/nac_1/c1 \
    -c2 STARsoloManuscript/genomes/index/kallisto_0.50.1/mouse/nac_1/c2 \
    -o ./performance_comparisons/out/nac-10kb_neuron/ --overwrite \
    SC3_v3_NextGem_SI_Neuron_10K_fastqs/SC3_v3_NextGem_SI_Neuron_10K_S1_L002_R1_001.fastq.gz \
    SC3_v3_NextGem_SI_Neuron_10K_fastqs/SC3_v3_NextGem_SI_Neuron_10K_S1_L002_R2_001.fastq.gz \
    SC3_v3_NextGem_SI_Neuron_10K_fastqs/SC3_v3_NextGem_SI_Neuron_10K_S1_L003_R1_001.fastq.gz \
    SC3_v3_NextGem_SI_Neuron_10K_fastqs/SC3_v3_NextGem_SI_Neuron_10K_S1_L003_R2_001.fastq.gz \
    SC3_v3_NextGem_SI_Neuron_10K_fastqs/SC3_v3_NextGem_SI_Neuron_10K_S1_L004_R1_001.fastq.gz \
    SC3_v3_NextGem_SI_Neuron_10K_fastqs/SC3_v3_NextGem_SI_Neuron_10K_S1_L004_R2_001.fastq.gz"

/usr/bin/time -v $cmd1 16 $cmd2 2> performance_comparisons/16_nac-10kb_neuron_1.txt

standard + offlist:

cmd1="kb count --kallisto STARsoloManuscript/exe/kallisto_0.50.1 --bustools STARsoloManuscript/exe/bustools_0.43.2 -t "
cmd2=" -x 10XV3 \
    --workflow standard -i STARsoloManuscript/genomes/index/kallisto_0.50.1/mouse/standard_offlist_1/index.idx \
    -g STARsoloManuscript/genomes/index/kallisto_0.50.1/mouse/standard_offlist_1/g \
    -o ./performance_comparisons/out/standard_offlist-10kb_neuron/ --overwrite \
    SC3_v3_NextGem_SI_Neuron_10K_fastqs/SC3_v3_NextGem_SI_Neuron_10K_S1_L002_R1_001.fastq.gz \
    SC3_v3_NextGem_SI_Neuron_10K_fastqs/SC3_v3_NextGem_SI_Neuron_10K_S1_L002_R2_001.fastq.gz \
    SC3_v3_NextGem_SI_Neuron_10K_fastqs/SC3_v3_NextGem_SI_Neuron_10K_S1_L003_R1_001.fastq.gz \
    SC3_v3_NextGem_SI_Neuron_10K_fastqs/SC3_v3_NextGem_SI_Neuron_10K_S1_L003_R2_001.fastq.gz \
    SC3_v3_NextGem_SI_Neuron_10K_fastqs/SC3_v3_NextGem_SI_Neuron_10K_S1_L004_R1_001.fastq.gz \
    SC3_v3_NextGem_SI_Neuron_10K_fastqs/SC3_v3_NextGem_SI_Neuron_10K_S1_L004_R2_001.fastq.gz"

/usr/bin/time -v $cmd1 16 $cmd2 2> performance_comparisons/16_standard_offlist-10kb_neuron_1.txt

standard (no offlist):

cmd1="kb count --kallisto STARsoloManuscript/exe/kallisto_0.50.1 --bustools STARsoloManuscript/exe/bustools_0.43.2 -t "
cmd2=" -x 10XV3 \
    --workflow standard -i STARsoloManuscript/genomes/index/kallisto_0.50.1/mouse/standard_1/index.idx \
    -g STARsoloManuscript/genomes/index/kallisto_0.50.1/mouse/standard_1/g \
    -o ./performance_comparisons/out/standard-10kb_neuron/ --overwrite \
    SC3_v3_NextGem_SI_Neuron_10K_fastqs/SC3_v3_NextGem_SI_Neuron_10K_S1_L002_R1_001.fastq.gz \
    SC3_v3_NextGem_SI_Neuron_10K_fastqs/SC3_v3_NextGem_SI_Neuron_10K_S1_L002_R2_001.fastq.gz \
    SC3_v3_NextGem_SI_Neuron_10K_fastqs/SC3_v3_NextGem_SI_Neuron_10K_S1_L003_R1_001.fastq.gz \
    SC3_v3_NextGem_SI_Neuron_10K_fastqs/SC3_v3_NextGem_SI_Neuron_10K_S1_L003_R2_001.fastq.gz \
    SC3_v3_NextGem_SI_Neuron_10K_fastqs/SC3_v3_NextGem_SI_Neuron_10K_S1_L004_R1_001.fastq.gz \
    SC3_v3_NextGem_SI_Neuron_10K_fastqs/SC3_v3_NextGem_SI_Neuron_10K_S1_L004_R2_001.fastq.gz"

/usr/bin/time -v $cmd1 16 $cmd2 2> performance_comparisons/16_standard-10kb_neuron_1.txt

Mouse 5k lung

nac + offlist:

cmd1="kb count --kallisto STARsoloManuscript/exe/kallisto_0.50.1 --bustools STARsoloManuscript/exe/bustools_0.43.2 -t "
cmd2=" -x 10XV3 \
    --workflow nac -i STARsoloManuscript/genomes/index/kallisto_0.50.1/mouse/nac_offlist_1/index.idx \
    -g STARsoloManuscript/genomes/index/kallisto_0.50.1/mouse/nac_offlist_1/g \
    -c1 STARsoloManuscript/genomes/index/kallisto_0.50.1/mouse/nac_offlist_1/c1 \
    -c2 STARsoloManuscript/genomes/index/kallisto_0.50.1/mouse/nac_offlist_1/c2 \
    -o ./performance_comparisons/out/nac_offlist-5kb_lung/ --overwrite \
    5k_mouse_lung_CNIK_3pv3_fastqs/5k_mouse_lung_CNIK_3pv3_S4_L001_R1_001.fastq.gz \
    5k_mouse_lung_CNIK_3pv3_fastqs/5k_mouse_lung_CNIK_3pv3_S4_L001_R2_001.fastq.gz \
    5k_mouse_lung_CNIK_3pv3_fastqs/5k_mouse_lung_CNIK_3pv3_S4_L002_R1_001.fastq.gz \
    5k_mouse_lung_CNIK_3pv3_fastqs/5k_mouse_lung_CNIK_3pv3_S4_L002_R2_001.fastq.gz \
    5k_mouse_lung_CNIK_3pv3_fastqs/5k_mouse_lung_CNIK_3pv3_S4_L003_R1_001.fastq.gz \
    5k_mouse_lung_CNIK_3pv3_fastqs/5k_mouse_lung_CNIK_3pv3_S4_L003_R2_001.fastq.gz"

/usr/bin/time -v $cmd1 16 $cmd2 2> performance_comparisons/16_nac_offlist-5kb_lung_1.txt

nac (no offlist):

cmd1="kb count --kallisto STARsoloManuscript/exe/kallisto_0.50.1 --bustools STARsoloManuscript/exe/bustools_0.43.2 -t "
cmd2=" -x 10XV3 \
    --workflow nac -i STARsoloManuscript/genomes/index/kallisto_0.50.1/mouse/nac_1/index.idx \
    -g STARsoloManuscript/genomes/index/kallisto_0.50.1/mouse/nac_1/g \
    -c1 STARsoloManuscript/genomes/index/kallisto_0.50.1/mouse/nac_1/c1 \
    -c2 STARsoloManuscript/genomes/index/kallisto_0.50.1/mouse/nac_1/c2 \
    -o ./performance_comparisons/out/nac-5kb_lung/ --overwrite \
    5k_mouse_lung_CNIK_3pv3_fastqs/5k_mouse_lung_CNIK_3pv3_S4_L001_R1_001.fastq.gz \
    5k_mouse_lung_CNIK_3pv3_fastqs/5k_mouse_lung_CNIK_3pv3_S4_L001_R2_001.fastq.gz \
    5k_mouse_lung_CNIK_3pv3_fastqs/5k_mouse_lung_CNIK_3pv3_S4_L002_R1_001.fastq.gz \
    5k_mouse_lung_CNIK_3pv3_fastqs/5k_mouse_lung_CNIK_3pv3_S4_L002_R2_001.fastq.gz \
    5k_mouse_lung_CNIK_3pv3_fastqs/5k_mouse_lung_CNIK_3pv3_S4_L003_R1_001.fastq.gz \
    5k_mouse_lung_CNIK_3pv3_fastqs/5k_mouse_lung_CNIK_3pv3_S4_L003_R2_001.fastq.gz"

/usr/bin/time -v $cmd1 16 $cmd2 2> performance_comparisons/16_nac-5kb_lung_1.txt

standard + offlist:

cmd1="kb count --kallisto STARsoloManuscript/exe/kallisto_0.50.1 --bustools STARsoloManuscript/exe/bustools_0.43.2 -t "
cmd2=" -x 10XV3 \
    --workflow standard -i STARsoloManuscript/genomes/index/kallisto_0.50.1/mouse/standard_offlist_1/index.idx \
    -g STARsoloManuscript/genomes/index/kallisto_0.50.1/mouse/standard_offlist_1/g \
    -o ./performance_comparisons/out/standard_offlist-5kb_lung/ --overwrite \
    5k_mouse_lung_CNIK_3pv3_fastqs/5k_mouse_lung_CNIK_3pv3_S4_L001_R1_001.fastq.gz \
    5k_mouse_lung_CNIK_3pv3_fastqs/5k_mouse_lung_CNIK_3pv3_S4_L001_R2_001.fastq.gz \
    5k_mouse_lung_CNIK_3pv3_fastqs/5k_mouse_lung_CNIK_3pv3_S4_L002_R1_001.fastq.gz \
    5k_mouse_lung_CNIK_3pv3_fastqs/5k_mouse_lung_CNIK_3pv3_S4_L002_R2_001.fastq.gz \
    5k_mouse_lung_CNIK_3pv3_fastqs/5k_mouse_lung_CNIK_3pv3_S4_L003_R1_001.fastq.gz \
    5k_mouse_lung_CNIK_3pv3_fastqs/5k_mouse_lung_CNIK_3pv3_S4_L003_R2_001.fastq.gz"

$cmd1 16 $cmd2

standard (no offlist):

cmd1="kb count --kallisto STARsoloManuscript/exe/kallisto_0.50.1 --bustools STARsoloManuscript/exe/bustools_0.43.2 -t "
cmd2=" -x 10XV3 \
    --workflow standard -i STARsoloManuscript/genomes/index/kallisto_0.50.1/mouse/standard_1/index.idx \
    -g STARsoloManuscript/genomes/index/kallisto_0.50.1/mouse/standard_1/g \
    -o ./performance_comparisons/out/standard-5kb_lung/ --overwrite \
    5k_mouse_lung_CNIK_3pv3_fastqs/5k_mouse_lung_CNIK_3pv3_S4_L001_R1_001.fastq.gz \
    5k_mouse_lung_CNIK_3pv3_fastqs/5k_mouse_lung_CNIK_3pv3_S4_L001_R2_001.fastq.gz \
    5k_mouse_lung_CNIK_3pv3_fastqs/5k_mouse_lung_CNIK_3pv3_S4_L002_R1_001.fastq.gz \
    5k_mouse_lung_CNIK_3pv3_fastqs/5k_mouse_lung_CNIK_3pv3_S4_L002_R2_001.fastq.gz \
    5k_mouse_lung_CNIK_3pv3_fastqs/5k_mouse_lung_CNIK_3pv3_S4_L003_R1_001.fastq.gz \
    5k_mouse_lung_CNIK_3pv3_fastqs/5k_mouse_lung_CNIK_3pv3_S4_L003_R2_001.fastq.gz"

$cmd1 16 $cmd2

Spatial (mouse FFPE)

nac + offlist:

cmd1="kb count --kallisto STARsoloManuscript/exe/kallisto_0.50.1 --bustools STARsoloManuscript/exe/bustools_0.43.2 -t "
cmd2=" -x 10XV3 --strand=unstranded \
    --workflow nac -i STARsoloManuscript/genomes/index/kallisto_0.50.1/mouse/nac_offlist_1/index.idx \
    -g STARsoloManuscript/genomes/index/kallisto_0.50.1/mouse/nac_offlist_1/g \
    -c1 STARsoloManuscript/genomes/index/kallisto_0.50.1/mouse/nac_offlist_1/c1 \
    -c2 STARsoloManuscript/genomes/index/kallisto_0.50.1/mouse/nac_offlist_1/c2 \
    -o ./performance_comparisons/out/nac_offlist-mouse_ffpe/ --overwrite \
    CytAssist_11mm_FFPE_Mouse_Embryo_fastqs/CytAssist_11mm_FFPE_Mouse_Embryo_S1_L004_R1_001.fastq.gz \
    CytAssist_11mm_FFPE_Mouse_Embryo_fastqs/CytAssist_11mm_FFPE_Mouse_Embryo_S1_L004_R2_001.fastq.gz"

$cmd1 16 $cmd2

nac (no offlist):

cmd1="kb count --kallisto STARsoloManuscript/exe/kallisto_0.50.1 --bustools STARsoloManuscript/exe/bustools_0.43.2 -t "
cmd2=" -x 10XV3 --strand=unstranded \
    --workflow nac -i STARsoloManuscript/genomes/index/kallisto_0.50.1/mouse/nac_1/index.idx \
    -g STARsoloManuscript/genomes/index/kallisto_0.50.1/mouse/nac_1/g \
    -c1 STARsoloManuscript/genomes/index/kallisto_0.50.1/mouse/nac_1/c1 \
    -c2 STARsoloManuscript/genomes/index/kallisto_0.50.1/mouse/nac_1/c2 \
    -o ./performance_comparisons/out/nac-mouse_ffpe/ --overwrite \
    CytAssist_11mm_FFPE_Mouse_Embryo_fastqs/CytAssist_11mm_FFPE_Mouse_Embryo_S1_L004_R1_001.fastq.gz \
    CytAssist_11mm_FFPE_Mouse_Embryo_fastqs/CytAssist_11mm_FFPE_Mouse_Embryo_S1_L004_R2_001.fastq.gz"

$cmd1 16 $cmd2

standard + offlist:

cmd1="kb count --kallisto STARsoloManuscript/exe/kallisto_0.50.1 --bustools STARsoloManuscript/exe/bustools_0.43.2 -t "
cmd2=" -x 10XV3 --strand=unstranded \
    --workflow standard -i STARsoloManuscript/genomes/index/kallisto_0.50.1/mouse/standard_offlist_1/index.idx \
    -g STARsoloManuscript/genomes/index/kallisto_0.50.1/mouse/standard_offlist_1/g \
    -o ./performance_comparisons/out/standard_offlist-mouse_ffpe/ --overwrite \
    CytAssist_11mm_FFPE_Mouse_Embryo_fastqs/CytAssist_11mm_FFPE_Mouse_Embryo_S1_L004_R1_001.fastq.gz \
    CytAssist_11mm_FFPE_Mouse_Embryo_fastqs/CytAssist_11mm_FFPE_Mouse_Embryo_S1_L004_R2_001.fastq.gz"

$cmd1 16 $cmd2

standard (no offlist):

cmd1="kb count --kallisto STARsoloManuscript/exe/kallisto_0.50.1 --bustools STARsoloManuscript/exe/bustools_0.43.2 -t "
cmd2=" -x 10XV3 --strand=unstranded \
    --workflow standard -i STARsoloManuscript/genomes/index/kallisto_0.50.1/mouse/standard_1/index.idx \
    -g STARsoloManuscript/genomes/index/kallisto_0.50.1/mouse/standard_1/g \
    -o ./performance_comparisons/out/standard-mouse_ffpe/ --overwrite \
    CytAssist_11mm_FFPE_Mouse_Embryo_fastqs/CytAssist_11mm_FFPE_Mouse_Embryo_S1_L004_R1_001.fastq.gz \
    CytAssist_11mm_FFPE_Mouse_Embryo_fastqs/CytAssist_11mm_FFPE_Mouse_Embryo_S1_L004_R2_001.fastq.gz"

$cmd1 16 $cmd2

Reprocess PBMC data (for cluster-level analysis)

Obtain data

Get clusters 1 and 2:

wget --continue https://cf.10xgenomics.com/samples/cell-exp/6.1.0/20k_PBMC_3p_HT_nextgem_Chromium_X/20k_PBMC_3p_HT_nextgem_Chromium_X_analysis.tar.gz
tar -xzvf 20k_PBMC_3p_HT_nextgem_Chromium_X_analysis.tar.gz
cat analysis/clustering/graphclust/clusters.csv|cut -d"-" -f1|tail -n+2 > barcodes_10x_human_all.txt
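
To see what the last command writes, here is a minimal sketch run on a made-up clusters.csv. The two barcodes, the cluster labels, and the file name barcodes_toy.txt are invented for illustration; the layout (one header line, then `<barcode>-1,<cluster>`) is assumed to match the 10x graphclust output.

```shell
# Toy clusters.csv: header line, then "<barcode>-1,<cluster>" rows (made-up values)
tmpdir="$(mktemp -d)"
printf 'Barcode,Cluster\nAAACCCAAGAAACACT-1,1\nAAACCCAAGAAACCAT-1,2\n' > "$tmpdir/clusters.csv"

# Same pipeline as above: keep the text before the first "-", then skip the header line
cat "$tmpdir/clusters.csv" | cut -d"-" -f1 | tail -n+2 > "$tmpdir/barcodes_toy.txt"

# One bare barcode per line, with no "-1" suffix and no cluster column
cat "$tmpdir/barcodes_toy.txt"
```

The resulting file has the one-barcode-per-line form that is passed to kb count with -w in the commands below.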

Process using kallisto

NAC plus offlist

kb count --kallisto STARsoloManuscript/exe/kallisto_0.50.1 --bustools STARsoloManuscript/exe/bustools_0.43.2 -t 48 -x 10XV3 \
    --workflow nac --sum=total -i STARsoloManuscript/genomes/index/kallisto_0.50.1/human_CR_3.0.0/nac_offlist_1/index.idx \
    -g STARsoloManuscript/genomes/index/kallisto_0.50.1/human_CR_3.0.0/nac_offlist_1/g \
    -c1 STARsoloManuscript/genomes/index/kallisto_0.50.1/human_CR_3.0.0/nac_offlist_1/c1 \
    -c2 STARsoloManuscript/genomes/index/kallisto_0.50.1/human_CR_3.0.0/nac_offlist_1/c2 \
    -o ./reprocess_human_20k_PBMC/ --overwrite --verbose \
    -w barcodes_10x_human_all.txt \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L001_R1_001.fastq.gz \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L001_R2_001.fastq.gz \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L002_R1_001.fastq.gz \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L002_R2_001.fastq.gz \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L003_R1_001.fastq.gz \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L003_R2_001.fastq.gz \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L004_R1_001.fastq.gz \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L004_R2_001.fastq.gz

NAC no offlist

kb count --kallisto STARsoloManuscript/exe/kallisto_0.50.1 --bustools STARsoloManuscript/exe/bustools_0.43.2 -t 48 -x 10XV3 \
    --workflow nac --sum=total -i STARsoloManuscript/genomes/index/kallisto_0.50.1/human_CR_3.0.0/nac_1/index.idx \
    -g STARsoloManuscript/genomes/index/kallisto_0.50.1/human_CR_3.0.0/nac_1/g \
    -c1 STARsoloManuscript/genomes/index/kallisto_0.50.1/human_CR_3.0.0/nac_1/c1 \
    -c2 STARsoloManuscript/genomes/index/kallisto_0.50.1/human_CR_3.0.0/nac_1/c2 \
    -o ./reprocess_human_20k_PBMC_no_offlist/ --overwrite --verbose \
    -w barcodes_10x_human_all.txt \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L001_R1_001.fastq.gz \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L001_R2_001.fastq.gz \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L002_R1_001.fastq.gz \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L002_R2_001.fastq.gz \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L003_R1_001.fastq.gz \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L003_R2_001.fastq.gz \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L004_R1_001.fastq.gz \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L004_R2_001.fastq.gz

standard plus offlist

kb count --kallisto STARsoloManuscript/exe/kallisto_0.50.1 --bustools STARsoloManuscript/exe/bustools_0.43.2 -t 48 -x 10XV3 \
    -i STARsoloManuscript/genomes/index/kallisto_0.50.1/human_CR_3.0.0/standard_offlist_1/index.idx \
    -g STARsoloManuscript/genomes/index/kallisto_0.50.1/human_CR_3.0.0/standard_offlist_1/g \
    -o ./reprocess_human_20k_PBMC_standard/ --overwrite --verbose \
    -w barcodes_10x_human_all.txt \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L001_R1_001.fastq.gz \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L001_R2_001.fastq.gz \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L002_R1_001.fastq.gz \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L002_R2_001.fastq.gz \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L003_R1_001.fastq.gz \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L003_R2_001.fastq.gz \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L004_R1_001.fastq.gz \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L004_R2_001.fastq.gz

standard no offlist

kb count --kallisto STARsoloManuscript/exe/kallisto_0.50.1 --bustools STARsoloManuscript/exe/bustools_0.43.2 -t 48 -x 10XV3 \
    -i STARsoloManuscript/genomes/index/kallisto_0.50.1/human_CR_3.0.0/standard_1/index.idx \
    -g STARsoloManuscript/genomes/index/kallisto_0.50.1/human_CR_3.0.0/standard_1/g \
    -o ./reprocess_human_20k_PBMC_standard_no_offlist/ --overwrite --verbose \
    -w barcodes_10x_human_all.txt \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L001_R1_001.fastq.gz \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L001_R2_001.fastq.gz \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L002_R1_001.fastq.gz \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L002_R2_001.fastq.gz \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L003_R1_001.fastq.gz \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L003_R2_001.fastq.gz \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L004_R1_001.fastq.gz \
    20k_PBMC_3p_HT_nextgem_Chromium_X_fastqs/20k_PBMC_3p_HT_nextgem_Chromium_X_S3_L004_R2_001.fastq.gz

SPLiT-seq C2C12 for TCCs

Download sequencing reads

wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR139/065/SRR13948565/SRR13948565_1.fastq.gz
wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR139/065/SRR13948565/SRR13948565_2.fastq.gz
wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR139/066/SRR13948566/SRR13948566_1.fastq.gz
wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR139/066/SRR13948566/SRR13948566_2.fastq.gz
wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR139/071/SRR13948571/SRR13948571_1.fastq.gz
wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR139/071/SRR13948571/SRR13948571_2.fastq.gz
wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR139/068/SRR13948568/SRR13948568_1.fastq.gz
wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR139/068/SRR13948568/SRR13948568_2.fastq.gz
wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR139/069/SRR13948569/SRR13948569_1.fastq.gz
wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR139/069/SRR13948569/SRR13948569_2.fastq.gz
wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR139/070/SRR13948570/SRR13948570_1.fastq.gz
wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR139/070/SRR13948570/SRR13948570_2.fastq.gz
wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR139/067/SRR13948567/SRR13948567_1.fastq.gz
wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR139/067/SRR13948567/SRR13948567_2.fastq.gz

Download metadata

wget https://ftp.ncbi.nlm.nih.gov/geo/samples/GSM5169nnn/GSM5169184/suppl/GSM5169184%5FC2C12%5Fshort%5F1k%5Fcell%5Fmetadata.csv.gz
wget https://ftp.ncbi.nlm.nih.gov/geo/samples/GSM5169nnn/GSM5169185/suppl/GSM5169185%5FC2C12%5Fshort%5F9kA%5Fcell%5Fmetadata.csv.gz
wget https://ftp.ncbi.nlm.nih.gov/geo/samples/GSM5169nnn/GSM5169186/suppl/GSM5169186%5FC2C12%5Fshort%5F9kB%5Fcell%5Fmetadata.csv.gz
wget https://ftp.ncbi.nlm.nih.gov/geo/samples/GSM5169nnn/GSM5169187/suppl/GSM5169187%5FC2C12%5Fshort%5F9kC%5Fcell%5Fmetadata.csv.gz
wget https://ftp.ncbi.nlm.nih.gov/geo/samples/GSM5169nnn/GSM5169188/suppl/GSM5169188%5FC2C12%5Fshort%5F9kD%5Fcell%5Fmetadata.csv.gz
wget https://ftp.ncbi.nlm.nih.gov/geo/samples/GSM5169nnn/GSM5169189/suppl/GSM5169189%5FC2C12%5Fshort%5F9kE%5Fcell%5Fmetadata.csv.gz
wget https://ftp.ncbi.nlm.nih.gov/geo/samples/GSM5169nnn/GSM5169190/suppl/GSM5169190%5FC2C12%5Fshort%5F9kF%5Fcell%5Fmetadata.csv.gz
wget https://raw.githubusercontent.com/pachterlab/splitcode-tutorial/main/uploads/splitseq/r2_r3.txt
wget https://raw.githubusercontent.com/fairliereese/LR-splitpipe/859279ed3fec859248fb4fdaee17280e6103b9f9/barcodes/bc_data_v2.csv

Format metadata

cat bc_data_v2.csv|grep "A1\|A2\|A3\|A4\|A5\|A6\|A7\|A8\|A9\|A10\|A11\|A12"|grep R$|cut -d, -f2 > r1_R_Awells.txt
cat bc_data_v2.csv|grep "A1\|A2\|A3\|A4\|A5\|A6\|A7\|A8\|A9\|A10\|A11\|A12"|grep T$|cut -d, -f2 > r1_T_Awells.txt

Format sequencing reads

rm -f splitseq_batch.txt
./prep_splitseq.sh SRR13948565 GSM5169184_C2C12_short_1k
./prep_splitseq.sh SRR13948566 GSM5169185_C2C12_short_9kA
./prep_splitseq.sh SRR13948567 GSM5169186_C2C12_short_9kB
./prep_splitseq.sh SRR13948568 GSM5169187_C2C12_short_9kC
./prep_splitseq.sh SRR13948569 GSM5169188_C2C12_short_9kD
./prep_splitseq.sh SRR13948570 GSM5169189_C2C12_short_9kE
./prep_splitseq.sh SRR13948571 GSM5169190_C2C12_short_9kF

Discard rRNA reads

Requires bowtie2 (version 2.5.3), seqkit (version 2.8.0), and samtools (version 1.19.2).
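
Before running the filtering commands, it may help to confirm that the three tools are on the PATH. The following is a small sketch, not part of the original pipeline; the helper name check_tool is hypothetical.

```shell
# Hypothetical helper: report whether a command is available on the PATH
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found $1 at $(command -v "$1")"
    else
        echo "missing: $1" >&2
        return 1
    fi
}

# Check the three tools used in this section; keep going even if one is missing
for tool in bowtie2 seqkit samtools; do
    check_tool "$tool" || true
done
```

Installed versions can then be compared against the ones listed above using `bowtie2 --version`, `seqkit version`, and `samtools --version`.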

bowtie2-build "mm10_ncRNA.fa" "exclusion_index"
cat splitseq_batch.txt|cut -d' ' -f2 > splitseq_batch.r1.txt
cat splitseq_batch.txt|cut -d' ' -f3 > splitseq_batch.r2.txt

xargs -I {} sh -c 'bowtie2 -q -p 20 \
--no-unal \
--quiet \
--local \
-x "exclusion_index" \
-U "{}" | samtools view -S | cut -f1 > "{}.filter.txt"' < splitseq_batch.r1.txt
cat *.filter.txt > final.filter.txt

xargs -I {} sh -c 'seqkit \
grep -j 20 -v -n \
-f "final.filter.txt" "{}" \
-o "{}.filtered.fastq.gz"' < splitseq_batch.r1.txt

xargs -I {} sh -c 'seqkit \
grep -j 20 -v -n \
-f "final.filter.txt" "{}" \
-o "{}.filtered.fastq.gz"' < splitseq_batch.r2.txt
cat splitseq_batch.txt | sed 's/\.fastq\.gz/.fastq.gz.filtered.fastq.gz/g' > splitseq_batch_final.txt

Get TCCs with kallisto

rm -rf splitseq_out

kb count --strand=forward -w None --overwrite --keep-tmp --verbose \
--workflow=nac -i STARsoloManuscript/genomes/index/kallisto_0.50.1/mouse/nac_offlist_1/index.idx -g STARsoloManuscript/genomes/index/kallisto_0.50.1/mouse/nac_offlist_1/g -c1 STARsoloManuscript/genomes/index/kallisto_0.50.1/mouse/nac_offlist_1/c1 \
-c2 STARsoloManuscript/genomes/index/kallisto_0.50.1/mouse/nac_offlist_1/c2 -x 1,10,18,1,48,56,1,78,86:1,0,10:0,0,0 \
--sum=total -o splitseq_out --batch-barcodes splitseq_batch_final.txt

STARsoloManuscript/exe/bustools_0.43.2 count -o splitseq_out/counts_unfiltered/cells_x_tcc -g STARsoloManuscript/genomes/index/kallisto_0.50.1/mouse/nac_offlist_1/g -e splitseq_out/matrix.ec -t splitseq_out/transcripts.txt --multimapping --umi-gene splitseq_out/tmp/output.s.bus

STARsoloManuscript/exe/kallisto_0.50.1 quant-tcc -b 10 -o splitseq_out/quant_unfiltered/ -t 24 --matrix-to-files --plaintext -i STARsoloManuscript/genomes/index/kallisto_0.50.1/mouse/nac_offlist_1/index.idx -g STARsoloManuscript/genomes/index/kallisto_0.50.1/mouse/nac_offlist_1/g -e splitseq_out/counts_unfiltered/cells_x_tcc.ec.txt splitseq_out/counts_unfiltered/cells_x_tcc.mtx

For further analysis, see the splitseq_analysis.ipynb notebook.

Supplement: Get transcript-level estimates (with R->T barcode conversion)

rm -rf splitseq_out_supplement

cat splitseq_batch_final.txt|cut -c3- > splitseq_batch_final.modified.txt
awk 'NR==FNR{a[NR]=$0; next} {print a[FNR] " *" $0}' r1_R_Awells.txt r1_T_Awells.txt > replace.txt

kb count --strand=forward -w None --overwrite --keep-tmp --verbose \
--workflow=nac -i STARsoloManuscript/genomes/index/kallisto_0.50.1/mouse/nac_offlist_1/index.idx -g STARsoloManuscript/genomes/index/kallisto_0.50.1/mouse/nac_offlist_1/g -c1 STARsoloManuscript/genomes/index/kallisto_0.50.1/mouse/nac_offlist_1/c1 \
-c2 STARsoloManuscript/genomes/index/kallisto_0.50.1/mouse/nac_offlist_1/c2 -x 1,10,18,1,48,56,1,78,86:1,0,10:0,0,0 \
--sum=total -o splitseq_out_supplement -r replace.txt --batch-barcodes splitseq_batch_final.modified.txt

STARsoloManuscript/exe/bustools_0.43.2 count -o splitseq_out_supplement/counts_unfiltered_modified/cells_x_tcc -g STARsoloManuscript/genomes/index/kallisto_0.50.1/mouse/nac_offlist_1/g -e splitseq_out_supplement/matrix.ec -t splitseq_out_supplement/transcripts.txt --multimapping --umi-gene splitseq_out_supplement/output_modified.unfiltered.bus

STARsoloManuscript/exe/kallisto_0.50.1 quant-tcc -o splitseq_out_supplement/quant_unfiltered/ -t 24 --matrix-to-files --plaintext -i STARsoloManuscript/genomes/index/kallisto_0.50.1/mouse/nac_offlist_1/index.idx -g STARsoloManuscript/genomes/index/kallisto_0.50.1/mouse/nac_offlist_1/g -e splitseq_out_supplement/counts_unfiltered_modified/cells_x_tcc.ec.txt splitseq_out_supplement/counts_unfiltered_modified/cells_x_tcc.mtx

See supp_quant.ipynb for plotting.
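
The awk command used above to build replace.txt pairs line N of r1_R_Awells.txt with line N of r1_T_Awells.txt. Here is a minimal sketch of that pairing on toy input; the four-letter sequences and the `_toy` file names are invented for illustration and are not real barcodes.

```shell
# Toy stand-ins for r1_R_Awells.txt and r1_T_Awells.txt (made-up sequences)
tmpdir="$(mktemp -d)"
printf 'AAAA\nCCCC\n' > "$tmpdir/r1_R_toy.txt"
printf 'GGGG\nTTTT\n' > "$tmpdir/r1_T_toy.txt"

# Same awk program as above: store the lines of the first file, then for each
# line of the second file print "<first-file line> *<second-file line>"
awk 'NR==FNR{a[NR]=$0; next} {print a[FNR] " *" $0}' \
    "$tmpdir/r1_R_toy.txt" "$tmpdir/r1_T_toy.txt" > "$tmpdir/replace_toy.txt"

cat "$tmpdir/replace_toy.txt"
```

Each output line holds one entry from the first file, a space, an asterisk, and the entry on the same line of the second file, so both input files need to list the wells in the same order.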

Get STAR alignment

zcat splitseq_R_SRR13948565_R1.fastq.gz.filtered.fastq.gz splitseq_R_SRR13948566_R1.fastq.gz.filtered.fastq.gz splitseq_R_SRR13948567_R1.fastq.gz.filtered.fastq.gz splitseq_R_SRR13948568_R1.fastq.gz.filtered.fastq.gz splitseq_R_SRR13948569_R1.fastq.gz.filtered.fastq.gz splitseq_R_SRR13948570_R1.fastq.gz.filtered.fastq.gz splitseq_R_SRR13948571_R1.fastq.gz.filtered.fastq.gz | gzip > splitseq_R_merged.fastq.gz
zcat splitseq_T_SRR13948565_R1.fastq.gz.filtered.fastq.gz splitseq_T_SRR13948566_R1.fastq.gz.filtered.fastq.gz splitseq_T_SRR13948567_R1.fastq.gz.filtered.fastq.gz splitseq_T_SRR13948568_R1.fastq.gz.filtered.fastq.gz splitseq_T_SRR13948569_R1.fastq.gz.filtered.fastq.gz splitseq_T_SRR13948570_R1.fastq.gz.filtered.fastq.gz splitseq_T_SRR13948571_R1.fastq.gz.filtered.fastq.gz | gzip > splitseq_T_merged.fastq.gz


mkdir -p splitseq_c2c12_R
STARsoloManuscript/exe/STAR_2.7.9a \
--genomeDir "STARsoloManuscript/genomes/index/STAR_2.7.9a/mouse/fullSA" \
--runThreadN 16 \
--readFilesCommand zcat \
--outSAMtype BAM SortedByCoordinate \
--outFilterType BySJout \
--outFileNamePrefix "splitseq_c2c12_R/" \
--readFilesIn splitseq_R_merged.fastq.gz


mkdir -p splitseq_c2c12_T
STARsoloManuscript/exe/STAR_2.7.9a \
--genomeDir "STARsoloManuscript/genomes/index/STAR_2.7.9a/mouse/fullSA" \
--runThreadN 16 \
--readFilesCommand zcat \
--outSAMtype BAM SortedByCoordinate \
--outFilterType BySJout \
--outFileNamePrefix "splitseq_c2c12_T/" \
--readFilesIn splitseq_T_merged.fastq.gz

The resulting BAM files can be indexed with samtools and then viewed in IGV.
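
A minimal sketch of the indexing step follows. It assumes the BAM files carry STAR's default name for coordinate-sorted output, Aligned.sortedByCoord.out.bam, inside each output prefix directory. The helper name index_star_bams is hypothetical.

```shell
# Hypothetical helper: index the coordinate-sorted BAM in each STAR output directory
index_star_bams() {
    for d in "$@"; do
        # Assumed file name: STAR's default for --outSAMtype BAM SortedByCoordinate
        bam="$d/Aligned.sortedByCoord.out.bam"
        if [ -f "$bam" ]; then
            samtools index "$bam"   # writes "$bam.bai" next to the BAM
        else
            echo "skipping $d: $bam not found" >&2
        fi
    done
}

index_star_bams splitseq_c2c12_R splitseq_c2c12_T
```

Once the .bai index files exist, the BAM files can be loaded in IGV against the matching mouse genome.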

RSEM analysis

See kallisto_paper_analysis and follow the instructions there.

Clustering analysis

See clustering_analysis for the pipeline that generates comparisons between count matrices (PCA, UMAP, marker genes, alluvial plots, etc.).
