Open-MSSPE-Design is a Rust-based pipeline for designing primers for Metagenomic Sequencing with Spiked Primer Enrichment (MSSPE). This approach supports viral diagnostics and genomic surveillance by enriching viral sequences during sequencing, as described in Deng et al. (2020), and openly implemented in nf-msspe by Simon Maestri. This implementation introduces significant revisions to optimize and automate the primer design process.
Key features:
- Fully automated primer design workflow.
- Fully customizable k-mer selection process.
- Uses nearest-neighbor thermodynamic models from ntthal, a standalone program in the Primer3 package.
- Enhanced filtering for:
  - nucleotide repeats and homopolymers
  - minimum and maximum melting temperature (Tm) values
  - strict Tm ranges (within 2 standard deviations of the mean)
  - hairpins (within a single primer)
  - cross-dimers (across multiple primers or the entire pool)
  - strength of secondary structures, via delta G calculations
- Rust implementation for performance and reliability.
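As an illustration of the homopolymer filter listed above, here is a minimal sketch in Rust. The function names and the `max_run` threshold are hypothetical and are not taken from the od-msspe source; the actual pipeline may detect repeats differently.

```rust
/// Length of the longest run of identical bases (a homopolymer) in `seq`.
/// Comparison is case-insensitive; an empty sequence yields 0.
fn longest_homopolymer(seq: &str) -> usize {
    let mut longest = 0;
    let mut run = 0;
    let mut prev: Option<u8> = None;
    for base in seq.bytes().map(|b| b.to_ascii_uppercase()) {
        if Some(base) == prev {
            run += 1;
        } else {
            run = 1;
            prev = Some(base);
        }
        longest = longest.max(run);
    }
    longest
}

/// Hypothetical filter: true if no homopolymer run is longer than `max_run`.
fn passes_homopolymer_filter(seq: &str, max_run: usize) -> bool {
    longest_homopolymer(seq) <= max_run
}
```

For example, `passes_homopolymer_filter("ACGTTTTTGCA", 4)` rejects the candidate because it contains a run of five T bases.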
Requirements:
- Rust (latest stable version)
- Cargo (Rust package manager)
- MAFFT for multiple sequence alignment
- Primer3 for designing PCR primers (provides ntthal)
Clone the repository and navigate to the project directory:
git clone https://github.com/opendream/open-msspe-design.git
cd open-msspe-design/od-msspe
Install dependencies (macOS, using Homebrew):
brew install rust
brew install mafft
brew install primer3
Prepare a FASTA file containing viral genome sequences. This will serve as the input for primer design.
Build the pipeline:
cargo build --release
To run the pipeline:
./target/release/od-msspe
Or build and run in one step (debug profile):
cargo run
--input
: Path to the input FASTA file containing viral genome sequences.
--output
: Directory where the designed primers will be saved.
The following arguments control various aspects of the primer design process:
--kmer-size
: Size of k-mers used in primer design (default: 13).
--window-size
: Window size for genome scanning (default: 500).
--overlap-size
: Overlap size between adjacent windows (default: 250).
--max-mismatch-segments
: Maximum number of mismatched segments allowed (default: 1).
--max-iterations
: Maximum number of iterations for primer optimization (default: 1000).
--search-windows-size
: Size of search windows for primer candidates (default: 50).
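To make the relationship between `--window-size`, `--overlap-size`, and `--kmer-size` concrete, the following sketch enumerates overlapping windows and the k-mers inside a sequence. It assumes that consecutive windows advance by `window_size - overlap_size`; the function names are hypothetical, and the real pipeline may scan the alignment differently.

```rust
/// Half-open (start, end) coordinates of overlapping windows over a
/// sequence of length `len`. Assumes each window starts
/// `window_size - overlap_size` positions after the previous one.
fn windows(len: usize, window_size: usize, overlap_size: usize) -> Vec<(usize, usize)> {
    assert!(
        window_size > 0 && overlap_size < window_size,
        "overlap must be smaller than the window"
    );
    let step = window_size - overlap_size;
    let mut out = Vec::new();
    let mut start = 0;
    while start < len {
        // The last window is truncated at the end of the sequence.
        let end = (start + window_size).min(len);
        out.push((start, end));
        if end == len {
            break;
        }
        start += step;
    }
    out
}

/// All k-mers of size `k` in an ASCII sequence, with their 0-based offsets.
fn kmers(seq: &str, k: usize) -> Vec<(usize, &str)> {
    if k == 0 || seq.len() < k {
        return Vec::new();
    }
    (0..=seq.len() - k).map(move |i| (i, &seq[i..i + k])).collect()
}
```

With the defaults (window 500, overlap 250), a 1200-base sequence yields windows starting at 0, 250, 500, and 750, the last one truncated at position 1200.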
--mv-conc
: Monovalent cation concentration in mM (default: 50.0).
--dv-conc
: Divalent cation concentration in mM (default: 3.0).
--dntp-conc
: dNTP concentration in mM (default: 0.0).
--dna-conc
: Primer concentration in nM (default: 250.0).
--annealing-temp
: Annealing temperature in °C (default: 25.0).
--min-tm
: Minimum melting temperature (Tm) allowed (default: 30.0).
--max-tm
: Maximum melting temperature (Tm) allowed (default: 60.0).
--tm-stddev
: Number of standard deviations from the mean Tm within which a primer's Tm must fall (default: 2).
--max-self-dimer-any-tm
: Maximum Tm for self-dimer at any position (default: 10°C below max-tm).
--max-self-dimer-end-tm
: Maximum Tm for self-dimer at 3' end (default: 10°C below max-tm).
--max-hairpin-tm
: Maximum Tm for hairpin structures (default: 10°C below min-tm).
--max-delta-g
: Maximum delta G value for secondary structures (default: -9).
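The interaction of `--min-tm`, `--max-tm`, and `--tm-stddev` can be sketched as a two-stage filter. This is an illustrative sketch only: the function names are hypothetical, and it assumes that the range check runs first and that the standard deviation is the population standard deviation of the remaining values. The pipeline itself may order or compute these steps differently.

```rust
/// Mean and population standard deviation of `values`; None if empty.
fn mean_stddev(values: &[f64]) -> Option<(f64, f64)> {
    if values.is_empty() {
        return None;
    }
    let n = values.len() as f64;
    let mean = values.iter().sum::<f64>() / n;
    let variance = values.iter().map(|v| (v - mean).powi(2)).sum::<f64>() / n;
    Some((mean, variance.sqrt()))
}

/// Keep Tm values inside [min_tm, max_tm], then keep only those within
/// `n_stddev` standard deviations of the mean of the surviving values.
/// (Assumed order of operations; not confirmed by the project source.)
fn filter_tm(tms: &[f64], min_tm: f64, max_tm: f64, n_stddev: f64) -> Vec<f64> {
    let in_range: Vec<f64> = tms
        .iter()
        .copied()
        .filter(|t| (min_tm..=max_tm).contains(t))
        .collect();
    let stats = mean_stddev(&in_range);
    match stats {
        Some((mean, sd)) => in_range
            .into_iter()
            .filter(|t| (t - mean).abs() <= n_stddev * sd)
            .collect(),
        None => Vec::new(),
    }
}
```

A tighter `n_stddev` gives a pool with more uniform melting temperatures at the cost of discarding more candidates.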
--keep-all
: Ignore all filtering criteria and keep all primers.
--check-cross-dimers
: Enable cross-dimer checking between all primer pairs.
--check-self-dimer
: Enable self-dimer checking for individual primers.
--check-hairpin
: Enable hairpin structure checking for individual primers.
--disable-tm-stddev
: Turn off the tm-stddev filter. Use this if you do not need closely matched Tm values across all primers.
--do-align
: Perform MAFFT multiple sequence alignment if true. Set to false if the sequences are already aligned.
cargo run -- --input data/viral_genomes.fasta --output results/msspe_primers.csv --kmer-size=15
Debugging
RUST_LOG=info cargo run -- --input data/viral_genomes.fasta --output results/msspe_primers.csv
Contributions, issues, and feature requests are welcome! Please open an issue or submit a pull request to improve the project.
This project is licensed under the MIT License.
Maintainers of this project are grateful for the support of the Skoll Foundation.