Kenji Fukushima edited this page Jan 23, 2025 · 7 revisions

Setting up

CDSKIT commands

  • accession2fasta: Retrieving FASTA sequences from a list of GenBank accessions

  • aggregate: Extracting the longest sequence from each group defined by a sequence name regex

  • backtrim: Back-translating a trimmed protein alignment

  • gapjust: Adjusting runs of consecutive Ns to a fixed length

  • hammer: Removing less-occupied codon columns from a gappy alignment

  • intersection: Dropping non-overlapping sequence labels between two sequence files or between a sequence file and a GFF file

  • label: Modifying sequence labels

  • mask: Masking ambiguous and/or stop codons

  • pad: Making nucleotide sequences in-frame by head and tail paddings

  • parsegb: Converting the GenBank format

  • printseq: Printing a subset of sequences selected with a regex

  • rmseq: Removing a subset of sequences by using a sequence name regex and by detecting problematic sequence characters

  • split: Splitting 1st, 2nd, and 3rd codon positions

  • stats: Printing sequence statistics
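
To make two of the descriptions above concrete, here is a minimal Python sketch of what splitting codon positions (as in split) and masking stop or ambiguous codons (as in mask) mean for an in-frame coding sequence. It is an illustration only, not CDSKIT's implementation: the function names and parameters are hypothetical, the standard genetic code is assumed for stop codons, and "ambiguous" is assumed to mean any codon containing a character other than A, C, G, or T (gap-only codons are left untouched).

```python
# Illustrative sketch only; these helpers are hypothetical and not part of CDSKIT.
STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumption: standard genetic code


def split_codon_positions(cds):
    """Return the 1st, 2nd, and 3rd codon positions as three strings."""
    if len(cds) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return cds[0::3], cds[1::3], cds[2::3]


def mask_codons(cds, mask_stop=True, mask_ambiguous=True):
    """Replace stop and/or ambiguous codons with NNN, codon by codon."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    masked = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        is_stop = codon in STOP_CODONS
        # Assumption: a codon is ambiguous if it has any non-ACGT character,
        # except an all-gap codon, which is kept as alignment padding.
        is_ambiguous = codon != "---" and any(b not in "ACGT" for b in codon)
        if (mask_stop and is_stop) or (mask_ambiguous and is_ambiguous):
            codon = "NNN"
        masked.append(codon)
    return "".join(masked)


first, second, third = split_codon_positions("ATGAAATAG")
masked = mask_codons("ATGAARTAG")
```

For the input `ATGAAATAG`, the three position strings are `AAT`, `TAA`, and `GAG`; for `ATGAARTAG`, both the ambiguous codon `AAR` and the stop codon `TAG` are replaced, giving `ATGNNNNNN`. The individual command pages describe the actual behavior and options.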
