HybPiper version 2.2.0

chrisjackson-pellicle released this 17 Jul 03:18
  • Add option --end_with to command hybpiper assemble. Allows the user to end the assembly pipeline at a chosen step (map_reads, distribute_reads, assemble_reads, exonerate_contigs).
  • Add option --exonerate_skip_hits_with_frameshifts to command hybpiper assemble. If provided, skip Exonerate hits where the SPAdes contig contains frameshifts when considering hits for assembly of an *.FNA sequence. Default behaviour in HybPiper v2.2.0 is to include these hits; previous versions allowed them automatically.
  • Add option --exonerate_skip_hits_with_internal_stop_codons to command hybpiper assemble. If provided, skip Exonerate hits where the SPAdes contig contains internal in-frame stop codon(s) when considering hits for assembly of an *.FNA sequence. A single terminal stop codon is allowed. Default behaviour in HybPiper v2.2.0 is to include these hits; previous versions allowed them automatically.
  • Add option --exonerate_skip_hits_with_terminal_stop_codons to command hybpiper assemble. If provided, skip Exonerate hits where the SPAdes sequence contains a single terminal stop codon. Only applies when option --exonerate_skip_hits_with_internal_stop_codons is also provided. Only use this flag if your target file exclusively contains protein-coding genes with no stop codons included, and you would like to prevent any in-frame stop codons in the output sequences. Default behaviour in HybPiper v2.2.0 is to include these hits; previous versions allowed them automatically.
  • Add option --chimeric_stitched_contig_check to command hybpiper assemble. If provided, HybPiper will attempt to determine whether a stitched contig is a potential chimera of contigs from multiple paralogs. Default behaviour in HybPiper v2.2.0 is to skip this check; previous versions performed the check automatically. Skipping this check significantly speeds up the final 'exonerate_contigs' step of the pipeline.
  • Add option --no_pad_stitched_contig_gaps_with_n to command hybpiper assemble. If provided, when constructing stitched contigs, do not pad any gaps between hits (with respect to the "best" protein reference) with a number of Ns corresponding to the reference gap multiplied by 3. Default behaviour in HybPiper v2.2.0 is to pad gaps with Ns; previous versions did this automatically.
  • Add option --skip_targetfile_checks to command hybpiper assemble. Skip the target file checks. Can be used if you are confident that your target file has no issues (e.g. if you have previously run hybpiper check_targetfile).
  • Add option --no_spades_eta to command hybpiper assemble. When SPAdes is run concurrently using GNU parallel, the "--eta" flag can result in many "sh: /dev/tty: Device not configured" errors written to stderr. Using this option removes the "--eta" flag to GNU parallel, silencing both ETA output and the error message.
  • Fixed a bug in exonerate_hits.py that could (rarely) result in a duplicated region in the output *.FNA sequence.
  • Fixed a bug in exonerate_hits.py that occurred when more than two Exonerate hits had identical query ranges and similarity scores; this could result in a sequence not being returned for the given gene.
  • Added a tests folder containing initial unit tests. Some tests require the Python package pyfakefs to run.
  • Refactor of the hybpiper package. New module hybpiper_main.py with entry point (moved from assemble.py), and some assemble.py functions moved to utils.py. Target file checking functionality has been consolidated.
  • HybPiper now logs to stdout rather than stderr.
  • Commands hybpiper check_targetfile and hybpiper assemble now write a report file when checking the target file (check_targetfile_report-<target file name>.txt), rather than logging details to the main sample log. Command hybpiper check_targetfile writes the report to the current working directory, whereas command hybpiper assemble writes it to the sample directory.
  • If the option --cpu is not specified for hybpiper assemble, HybPiper will now use all available CPUs minus one, rather than all available CPUs.
  • Command hybpiper assemble now checks for output from previous runs for the pipeline steps selected via --start_from and --end_with (default is to select all steps). If previous output is found, HybPiper will exit with an error unless the option --force_overwrite is provided.
  • Corrected the reading frame of sequence Artocarpus-gene660 in the test dataset target file.
  • Command hybpiper assemble now writes the file <prefix>_chimera_check_performed.txt to the sample directory. This is a text file containing 'True' or 'False' depending on whether the option --chimeric_stitched_contig_check was provided to command hybpiper assemble. The file is used by hybpiper retrieve_sequences and hybpiper paralog_retriever.
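
Three of the behaviours described in the notes above (step selection via --start_from and --end_with, the new default CPU count, and N-padding of stitched contig gaps) can be illustrated with a minimal Python sketch. This is not HybPiper's actual code: the function names select_steps, default_cpu_count and pad_gap_with_n are hypothetical, the sketch assumes --start_from accepts the same step names listed for --end_with, and the floor of one CPU is an assumption for single-core machines.

```python
import os

# Pipeline step names, in order, as listed for --end_with in the release notes.
PIPELINE_STEPS = ["map_reads", "distribute_reads", "assemble_reads", "exonerate_contigs"]


def select_steps(start_from="map_reads", end_with="exonerate_contigs"):
    """Return the contiguous run of steps from start_from to end_with, inclusive.

    Hypothetical helper: assumes --start_from uses the same step names as --end_with.
    """
    start = PIPELINE_STEPS.index(start_from)
    end = PIPELINE_STEPS.index(end_with)
    if start > end:
        raise ValueError(f"{start_from!r} comes after {end_with!r} in the pipeline")
    return PIPELINE_STEPS[start:end + 1]


def default_cpu_count():
    """All available CPUs minus one; the minimum of one is an assumption."""
    return max(1, (os.cpu_count() or 1) - 1)


def pad_gap_with_n(left_hit, right_hit, reference_gap_aa):
    """Join two nucleotide hits with 3 Ns per missing reference amino acid."""
    return left_hit + "N" * (reference_gap_aa * 3) + right_hit


print(select_steps(end_with="assemble_reads"))
print(default_cpu_count())
print(pad_gap_with_n("ATGGCT", "GGTTAA", 2))
```

With a reference gap of two amino acids, the last call inserts six Ns between the two hits. Supplying --no_pad_stitched_contig_gaps_with_n corresponds to skipping this padding step.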