Add source classes for BED and generic Interval types #665

Open
wants to merge 3 commits into
base: main
Conversation

clintval
Member

@clintval clintval commented Feb 28, 2021

I need helpers for reading generic interval data, so I added these classes!

BedSource: A source for BED input (similar to IntervalListSource).
IntervalSource: A source for interval data that wraps either BED or Interval List input.

In both classes, you can optionally provide a sequence dictionary that will be used to validate the intervals. I did this because I have a BED and a reference FASTA more often than I have an Interval List and a reference FASTA, but I would still like the safety of knowing my intervals are well-formed.
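To make the validation idea concrete, here is a minimal, self-contained sketch of checking BED records against contig lengths. It is not the code in this pull request: `SimpleInterval`, `BedValidationSketch`, and the plain `Map[String, Int]` standing in for a sequence dictionary are all hypothetical names chosen for illustration, and the only checks shown are "contig is known" and "coordinates fall within the contig".

```scala
// Hypothetical stand-in for an interval type; coordinates are 1-based and closed.
final case class SimpleInterval(contig: String, start: Int, end: Int)

object BedValidationSketch {
  /** Parses one BED line (0-based start, exclusive end) into a 1-based closed interval. */
  def parseBedLine(line: String): SimpleInterval = {
    val fields = line.split('\t')
    require(fields.length >= 3, s"BED line needs at least 3 tab-separated fields: $line")
    // BED starts are 0-based, so add one; the exclusive end equals the 1-based closed end.
    SimpleInterval(fields(0), fields(1).toInt + 1, fields(2).toInt)
  }

  /** Returns the interval unchanged when it fits the dictionary, otherwise throws.
    * `contigLengths` is an assumed simplification of a sequence dictionary. */
  def validate(interval: SimpleInterval, contigLengths: Map[String, Int]): SimpleInterval = {
    val length = contigLengths.getOrElse(
      interval.contig,
      throw new NoSuchElementException(s"Contig not in dictionary: ${interval.contig}")
    )
    require(interval.start >= 1, s"Start is before the first base: $interval")
    require(interval.start <= interval.end, s"Start is after end: $interval")
    require(interval.end <= length, s"End is past the end of ${interval.contig} ($length): $interval")
    interval
  }
}
```

With a dictionary of `Map("chr1" -> 1000)`, the BED line `chr1<TAB>0<TAB>100` passes and becomes `SimpleInterval("chr1", 1, 100)`, while a record ending at 1100 or naming `chr2` is rejected.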

I expect to use these classes as follows (so even BED data is safe!):

@clp(description="Print intervals from BED or Interval List, validate them all!")
class DoWork(
  @arg(flag='i', doc="The path to the input intervals.") val input: PathToIntervals,
  @arg(flag='r', doc="The path to the reference FASTA.") val reference: PathToFasta,
) extends Tool {
  override def execute(): Unit = {
    val fasta     = ReferenceSequenceFileFactory.getReferenceSequenceFile(reference)
    val intervals = IntervalSource(input, Some(fasta.getSequenceDictionary.fromSam))

    intervals.foreach { interval => println(s"Found interval: $interval") }

    fasta.close()
    intervals.close()
  }
}

This should be OK from a type-alias perspective too, since PathToIntervals is defined as:

/** Represents a path to an intervals file (IntervalList or BED). */
type PathToIntervals = java.nio.file.Path
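As a small illustration of why the alias is unobtrusive: a Scala type alias is transparent to the compiler, so any `java.nio.file.Path` is accepted wherever `PathToIntervals` is expected. In the sketch below, `AliasSketch`, `PathToBed`, and `describe` are hypothetical names used only for this example.

```scala
import java.nio.file.{Path, Paths}

object AliasSketch {
  /** Represents a path to an intervals file (IntervalList or BED). */
  type PathToIntervals = Path

  // Hypothetical second alias, for illustration only.
  type PathToBed = Path

  // Accepts a PathToIntervals, which the compiler treats as a plain Path.
  def describe(path: PathToIntervals): String = s"intervals at ${path.getFileName}"

  // A value typed with the other alias is accepted without any conversion.
  val bed: PathToBed = Paths.get("targets.bed")
  val described: String = describe(bed)
}
```

The flip side is that the alias documents intent but does not enforce the file format at compile time, so the format still has to be determined at runtime.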

@tfenne tfenne self-requested a review March 1, 2021 17:13
@tfenne
Member

tfenne commented Mar 1, 2021

Conceptually on board, and thanks for doing this. Might take a while to get to a review though.

@clintval
Member Author

clintval commented Mar 1, 2021

No rush! Take what you like from the submission.
