
Scaffolding existing contigs with long reads

olest edited this page Apr 21, 2015 · 5 revisions

Synthetic long reads can guide the scaffolding of an existing de novo assembly. We recommend aligning the long reads to the assembled contigs with an aligner that reports split-read alignments, such as bwa (for example, bwa mem). A long read whose split alignment joins two contigs is a candidate for linking those contigs into a scaffold.
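The alignment step can be sketched as follows. This is a minimal sketch that assumes bwa is installed and on the PATH; the function names and file names (contigs.fa, long_reads.fq, long_reads_vs_contigs.sam) are hypothetical placeholders, and default bwa mem parameters are used rather than any tuned settings.

```python
import subprocess


def bwa_commands(contigs_fasta, long_reads_fastq):
    """Build the two bwa invocations: index the contigs, then align the reads."""
    index_cmd = ["bwa", "index", contigs_fasta]
    # bwa mem writes SAM to stdout; split alignments carry an SA:Z: tag.
    mem_cmd = ["bwa", "mem", contigs_fasta, long_reads_fastq]
    return index_cmd, mem_cmd


def align_long_reads(contigs_fasta, long_reads_fastq, out_sam):
    """Run both commands and write the alignments to out_sam."""
    index_cmd, mem_cmd = bwa_commands(contigs_fasta, long_reads_fastq)
    subprocess.run(index_cmd, check=True)
    with open(out_sam, "w") as sam:
        subprocess.run(mem_cmd, stdout=sam, check=True)
```

Calling align_long_reads("contigs.fa", "long_reads.fq", "long_reads_vs_contigs.sam") would produce a SAM file that can then be searched for split alignments.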

Example (SAM records, shown without the SEQ and QUAL columns):

R1 0 3733 1 60 960S1718M * 0 0 SA:Z:15717,1,-,1722S956M,60,1;

R1 2064 15717 1 60 1722H956M * 0 0 SA:Z:3733,1,+,960S1718M,60,1;

The long read R1 maps to two contigs, 3733 and 15717, and can therefore be used to link them into a scaffold.
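Candidate links like this one can be pulled out of a SAM file with a short script. The following is a minimal sketch, not part of any Illumina tool: the function name contig_links is hypothetical, it relies only on the read name, the reference name, and the SA:Z: tag, and it ignores strand, position, and mapping quality, which a real scaffolder would need to take into account.

```python
def contig_links(sam_lines):
    """Map each read name to the contigs it aligns to, keeping only reads
    that touch more than one contig (scaffold candidates)."""
    contigs_by_read = {}
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        # Splitting on any whitespace handles tab-delimited SAM as well as
        # the space-separated example records on this page.
        fields = line.split()
        read_name, contig = fields[0], fields[2]
        if contig == "*":
            continue  # unmapped read
        contigs = contigs_by_read.setdefault(read_name, set())
        contigs.add(contig)
        for field in fields:
            if field.startswith("SA:Z:"):
                # SA entries look like "rname,pos,strand,CIGAR,mapQ,NM;"
                for entry in field[len("SA:Z:"):].split(";"):
                    if entry:
                        contigs.add(entry.split(",")[0])
    return {read: sorted(names)
            for read, names in contigs_by_read.items() if len(names) > 1}


example = [
    "R1 0 3733 1 60 960S1718M * 0 0 SA:Z:15717,1,-,1722S956M,60,1;",
    "R1 2064 15717 1 60 1722H956M * 0 0 SA:Z:3733,1,+,960S1718M,60,1;",
]
links = contig_links(example)
```

For the two example records, links contains a single entry for R1 listing contigs 15717 and 3733.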

Although it was written for PacBio sequencing reads, SSPACE-LongRead has been shown to work with TruSeq synthetic long reads as well:

http://www.biomedcentral.com/1471-2105/15/211

SSPACE-LongRead iteratively merges draft contigs and long reads, using the long reads as a template.

© 2015 Illumina, Inc. All rights reserved.