
Per-read statistics and general optimization

@marcus1487 marcus1487 released this 13 Mar 00:49
· 60 commits to master since this release

This release includes new features for investigating per-read, per-base modified base detection. Studying per-read statistic distributions led to better default per-read statistic thresholds, which improved modified base detection in validation data sets. This version also extends the use of the dampened fraction of modified bases to better handle samples with variable coverage.
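As a rough illustration of the two ideas above, the sketch below classifies per-read statistics with a pair of thresholds and then computes a dampened fraction by adding pseudo-counts to the read counts. It is not Tombo's implementation. The function names, the threshold values, the direction of the thresholds (lower statistic means modified), and the pseudo-count defaults are all assumptions made for illustration.

```python
def count_calls(per_read_stats, lower_thresh, upper_thresh):
    """Count reads called modified or unmodified at one site.

    Assumption: a statistic below lower_thresh is called modified, one
    above upper_thresh is called unmodified, and anything in between is
    treated as ambiguous and left out of both counts.
    """
    mod = sum(1 for stat in per_read_stats if stat < lower_thresh)
    unmod = sum(1 for stat in per_read_stats if stat > upper_thresh)
    return mod, unmod


def dampened_fraction(mod, unmod, unmod_pseudo=2.0, mod_pseudo=0.0):
    """Fraction of modified reads with pseudo-counts added.

    The pseudo-counts (hypothetical defaults here) pull estimates at
    low-coverage sites toward the unmodified state, and have little
    effect once coverage is high.
    """
    total = mod + unmod + unmod_pseudo + mod_pseudo
    if total == 0:
        return float("nan")  # no reads and no pseudo-counts
    return (mod + mod_pseudo) / total


# Hypothetical per-read statistics at a single genomic position.
stats = [-3.0, 0.0, 3.0, -5.0]
mod, unmod = count_calls(stats, lower_thresh=-1.5, upper_thresh=2.5)
site_fraction = dampened_fraction(mod, unmod)
```

With these assumed defaults, a site covered by a single modified read gets an estimate of 1/3 rather than 1, while a site covered by 50 modified reads stays close to 1. This is the sense in which dampening helps with variable coverage.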

The release also includes some fixes for issues in the last version. The major user issues addressed are:

  • More efficient processing of large genomes, which previously resulted in very large memory usage

    • This addresses both computational and memory usage issues in the re-squiggle and test_significance commands.
  • Fixes for issues specific to RNA processing: truncation of long transcript names, and samples mapping to different sets of sequence records/transcripts

  • Better protection against read file corruption resulting from access by multiple independent, concurrent Tombo commands