danielamiller edited this page Feb 28, 2024 · 12 revisions

This repository stores the code used to develop two long-read-based wheat genome sequences from PacBio HiFi sequencing data. For each step in the process below, the CERES file paths to the scripts are given for both soft red winter wheat assemblies (cv. 'AGS2000' and 'Hilliard').

Wheat (Triticum aestivum) Whole Genome Assembly

A high-performance computing (HPC) cluster is required; the scripts are written for SLURM batch job submission. Quality control steps are included throughout the workflow. For each step, the AGS2000 script is given as an example.

STEPS:

  1. Adaptor removal
  2. Hifiasm contig assembly
  3. gfastats summary statistics
  4. RagTag scaffold assembly
  5. gfastats summary statistics
  6. Alignment-based QC
  7. Input read mapping QC
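
Because the steps are submitted as SLURM batch jobs, one way to run them in order is to chain the submissions with job dependencies. The sketch below is illustrative only and is not taken from this repository: the script file names are hypothetical placeholders, and a strictly linear chain is a simplifying assumption (in practice, some QC steps could run in parallel once their inputs exist). It uses the `sbatch` options `--parsable` and `--dependency=afterok:<jobid>`, and by default only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch: chain the seven workflow steps as dependent SLURM jobs.
# The script names are hypothetical placeholders, not the repository's paths.
set -euo pipefail

steps=(
  01_adaptor_removal.sh
  02_hifiasm_contigs.sh
  03_gfastats_contigs.sh
  04_ragtag_scaffold.sh
  05_gfastats_scaffolds.sh
  06_alignment_qc.sh
  07_read_mapping_qc.sh
)

# DRY_RUN=1 (the default) prints each sbatch command instead of submitting it.
DRY_RUN="${DRY_RUN:-1}"

submit_chain() {
  local prev="" script jobid n=0
  for script in "${steps[@]}"; do
    n=$((n + 1))
    local cmd=(sbatch --parsable)
    # Every job after the first waits for the previous job to finish successfully.
    if [[ -n "$prev" ]]; then
      cmd+=("--dependency=afterok:${prev}")
    fi
    cmd+=("$script")
    if [[ "$DRY_RUN" == "1" ]]; then
      echo "${cmd[*]}"
      jobid="JOB${n}"            # stand-in job ID for the dry run
    else
      jobid="$("${cmd[@]}")"     # --parsable prints the job ID
      jobid="${jobid%%;*}"       # drop a trailing ";cluster" if present
    fi
    prev="$jobid"
  done
}

submit_chain
```

Running the script as-is prints the seven `sbatch` commands for review; setting `DRY_RUN=0` on a cluster where `sbatch` is available would submit them, with each job held until the previous one exits successfully.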