
NanoSwe: Analysing PromethION Sequencing Data of Swedish Genomes 🇸🇪

Introduction

NanoSwe is a preliminary analysis toolkit for experiments that involve sequencing data from ONT's PromethION device. It has also been used for other long-read SweGen data (e.g. PacBio).

Bioinformatics ToolKit

Purpose	Program
Quality Control	NanoPlot and NanoComp
Mapping to the Reference	Minimap2 2.14
Sorting, Indexing, and Calculating Statistics	Samtools 1.9
Subsampling	Sambamba 0.7.1
BAM QC Statistics	Qualimap 2.2.1
Structural Variant Calling	Sniffles 1.0.10
Data Extraction (VCF Files Only)	bcftools 1.9
Finding Intersections in Genomic Regions	SURVIVOR 1.0.7
Evaluation of SVs	SURVIVOR 1.0.7 and surpyvor 0.5.0
Removing Control DNA Sequences	NanoLyse
Trimming Short Reads	BBMap/BBTools
Homology Detection	BLAST 2.7.1+
Data Visualisation	R version 3.5.3. See the scripts directory for information on libraries/packages used.
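
As a rough illustration of how the mapping, sorting/indexing and structural variant calling rows above can be chained, the sketch below only assembles command lines as Python lists; it does not run anything. The helper name `build_commands`, the file names and the thread count are hypothetical, and the flags are commonly documented options for these tools, not necessarily the exact commands recorded in commands.md.

```python
from pathlib import Path


def build_commands(reference, fastq, out_dir, threads=4):
    """Return argument lists for mapping, sorting, indexing and SV calling.

    Hypothetical helper: all paths and the thread count are placeholders.
    """
    out = Path(out_dir)
    sam = out / "bam_files" / "sample.sam"
    bam = out / "bam_files" / "sample.sorted.bam"
    vcf = out / "vcf_files" / "sample.vcf"
    return [
        # Map nanopore reads; --MD writes the MD tag that Sniffles 1.x expects.
        ["minimap2", "-ax", "map-ont", "--MD", "-t", str(threads),
         str(reference), str(fastq), "-o", str(sam)],
        # Coordinate-sort the alignments and write a BAM file.
        ["samtools", "sort", "-@", str(threads), "-o", str(bam), str(sam)],
        # Index the sorted BAM so downstream tools can seek by region.
        ["samtools", "index", str(bam)],
        # Call structural variants from the sorted BAM into a VCF.
        ["sniffles", "-m", str(bam), "-v", str(vcf)],
    ]


if __name__ == "__main__":
    for cmd in build_commands("reference_genome.fna", "fastq_0.fastq", "sample_analysis"):
        print(" ".join(cmd))
```

Building the commands as lists keeps the sketch easy to inspect, and each list could be passed to `subprocess.run` once the paths have been checked.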

Nanopore Data: Post-Sequencing

Example tree structure of nanopore sequencing data files

├── /basecalled/<sample>/<flowcell>/
│   ├── fastq_0.fastq
│   ├── fastq_850.fastq
│   ├── sequencing_summary_0.txt
│   ├── sequencing_summary_850.txt
│   └── reads (1)
│       ├── 0 (2)
│       │   ├── file_read_1_ch_90_strand.fast5
│       │   ├── file_read_41_ch_40_strand2.fast5
│       │   └── file_read_300_ch_40_strand2.fast5
│       └── 850
│           ├── file_read_1000_ch_200_strand.fast5
│           ├── file_read_9000_ch_100_strand.fast5
│           └── file_read_95000_ch_1000_strand2.fast5
└── /bin/

(1) Each folder contains ~8000 fast5 files
(2) fast5 files are named e.g. PCT0001_YYYYMMDD_0001A20B002222C_{flowcell}_sequencing_run_{library_full_name}__read_{number}_ch_{number}_strand.fast5
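
The naming pattern in note (2) can be read programmatically. The following is a minimal sketch, assuming file names end in `_read_{number}_ch_{number}_strand.fast5` (optionally with trailing digits after `strand`, as in the `strand2` examples in the tree). The function names `parse_fast5_name` and `count_fast5_per_folder` are hypothetical and not part of this repository.

```python
import re
from collections import Counter
from pathlib import Path

# Matches names ending in _read_{number}_ch_{number}_strand.fast5; the tree
# above also shows a "strand2" variant, so trailing digits are allowed.
FAST5_NAME = re.compile(r"_read_(\d+)_ch_(\d+)_strand\d*\.fast5$")


def parse_fast5_name(filename):
    """Return (read_number, channel), or None if the name does not match."""
    match = FAST5_NAME.search(Path(filename).name)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def count_fast5_per_folder(reads_dir):
    """Count fast5 files in each numbered sub-folder of a reads directory."""
    counts = Counter()
    for path in Path(reads_dir).glob("*/*.fast5"):
        counts[path.parent.name] += 1
    return dict(counts)
```

Counting files per sub-folder gives a quick check against the ~8000 files per folder mentioned in note (1), for example `count_fast5_per_folder("reads")`.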

Nanopore Data: Post-Tidying

Example tree structure of data organisation

├── /basecalled/<sample>/<flowcell>/
│   ├── FASTQ_files
│   │   ├── fastq_0.fastq
│   │   └── fastq_850.fastq
│   ├── sequencing_summary
│   │   ├── sequencing_summary_0.txt
│   │   └── sequencing_summary_850.txt
│   ├── reads *
│   │   ├── 0 *
│   │   │   ├── file_read_1_ch_90_strand.fast5
│   │   │   ├── file_read_41_ch_40_strand2.fast5
│   │   │   └── file_read_300_ch_40_strand2.fast5
│   │   └── 850
│   │       ├── file_read_1000_ch_200_strand.fast5
│   │       ├── file_read_9000_ch_100_strand.fast5
│   │       └── file_read_95000_ch_1000_strand2.fast5
│   └── <sample>_analysis
│       ├── reference_genome.fna
│       ├── reference_genome.fna.fai
│       ├── Snakefile
│       ├── /bam_files/
│       ├── /vcf_files/
│       └── /logs/
└── /bin/
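
Moving from the post-sequencing layout to the tidied layout above could be scripted roughly as follows. This is a sketch under assumptions, not the procedure used by the authors: the function name `tidy_flowcell` is hypothetical, and only the folder and file names are taken from the two trees. The reference genome and Snakefile are not created here.

```python
import shutil
from pathlib import Path


def tidy_flowcell(flowcell_dir, sample):
    """Move FASTQ and summary files into sub-folders and create analysis dirs."""
    flowcell = Path(flowcell_dir)
    fastq_dir = flowcell / "FASTQ_files"
    summary_dir = flowcell / "sequencing_summary"
    analysis = flowcell / f"{sample}_analysis"

    # Create the target folders shown in the tidied tree.
    for folder in (fastq_dir, summary_dir, analysis / "bam_files",
                   analysis / "vcf_files", analysis / "logs"):
        folder.mkdir(parents=True, exist_ok=True)

    # sorted() collects all matches before any file is moved.
    for fastq in sorted(flowcell.glob("fastq_*.fastq")):
        shutil.move(str(fastq), str(fastq_dir / fastq.name))
    for summary in sorted(flowcell.glob("sequencing_summary_*.txt")):
        shutil.move(str(summary), str(summary_dir / summary.name))
    return analysis
```

The `reads` folder is left untouched, and running the function a second time is harmless because the glob patterns only match files still sitting at the top level of the flowcell folder.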

Sub-folder content

./scRipts - R scripts created for visualisation of long-read data.
commands.md - Tool commands used for different analyses.

Data Sources

Recommended Pipeline(s)

Citation

If you use this repository as a guide, please acknowledge it by mentioning the link https://github.com/Nazeeefa/NanoSwe. To cite our publication, use one of the formats shown below, or visit citeas.org to choose a different format. Thank you.

AMA Style

Fatima N, Petri A, Gyllensten U, Feuk L, Ameur A. Evaluation of Single-Molecule Sequencing Technologies for Structural Variant Detection in Two Swedish Human Genomes. Genes. 2020; 11(12):1444.

Chicago Style

Fatima, Nazeefa; Petri, Anna; Gyllensten, Ulf; Feuk, Lars; Ameur, Adam. 2020. "Evaluation of Single-Molecule Sequencing Technologies for Structural Variant Detection in Two Swedish Human Genomes." Genes 11, no. 12: 1444.
