Filtering: vcf_filter.py: added possibility to include a path to custom filters python file, not only a local script file #7

Closed
wants to merge 442 commits into from

Conversation

gitanoqevaporelmundoentero

vcf_filter.py: renamed the local-script argument to custom-filters, since it can now take a path to a custom filters Python file, not only a local script file
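
For readers unfamiliar with the idea, loading filter classes from an arbitrary file path could look roughly like the sketch below. It is not the code in this PR: `load_filter_classes`, the `custom_filters` module name, and the `MinDepth` class are all hypothetical, and the sketch only relies on the standard library's `importlib`.

```python
import importlib.util
import inspect
import os
import tempfile


def load_filter_classes(path, module_name="custom_filters"):
    """Import the Python file at `path` and return the classes it defines.

    Hypothetical helper, not part of PyVCF: it collects every public class
    defined in the file itself (imported classes are skipped).
    """
    spec = importlib.util.spec_from_file_location(module_name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return {
        name: obj
        for name, obj in inspect.getmembers(module, inspect.isclass)
        if obj.__module__ == module_name and not name.startswith("_")
    }


# Usage: write a tiny filters file outside the working directory and load it.
with tempfile.TemporaryDirectory() as tmp:
    filters_path = os.path.join(tmp, "my_filters.py")
    with open(filters_path, "w") as handle:
        handle.write(
            "class MinDepth(object):\n"
            "    name = 'min-depth'\n"
            "    def __call__(self, depth):\n"
            "        return depth >= 10\n"
        )
    loaded = load_filter_classes(filters_path)
```

Loading by path rather than by module name means the filters file does not have to sit next to the script or be on `sys.path`.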

martijnvermaat and others added 27 commits May 16, 2014 18:46
Use requirements files to consolidate dependencies.
These coordinates should represent the zero-based, half-open region of
the reference sequence affected by all the events included in ALT. These
coordinates allow the user to identify precisely which bases are altered
by the events in the record.

Provides more thorough documentation on the coordinate schemes for
_Record.POS, .start, and .end.
Adds _Record.affected_start and .affected_end.
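
To illustrate the zero-based, half-open convention described in this commit message, here is a simplified sketch. It is an assumption-laden illustration, not PyVCF's implementation: `affected_region` is a hypothetical function that only strips the leading bases shared by REF and each ALT, and it ignores symbolic and breakend alleles.

```python
def affected_region(pos, ref, alts):
    """Return (start, end): zero-based, half-open span of altered reference bases.

    `pos` is the 1-based VCF POS. Simplified sketch: only the shared leading
    bases of REF and each ALT are treated as unaffected.
    """
    end = pos - 1 + len(ref)  # half-open end of the REF allele
    starts = []
    for alt in alts:
        shared = 0
        while shared < min(len(ref), len(alt)) and ref[shared] == alt[shared]:
            shared += 1
        starts.append(pos - 1 + shared)
    start = min(starts) if starts else pos - 1
    return start, end


snp = affected_region(100, "A", ["G"])          # one base replaced
deletion = affected_region(100, "ACG", ["A"])   # "CG" removed after the padding base
insertion = affected_region(100, "A", ["ATT"])  # nothing removed, empty interval
```

With this convention an insertion yields an empty interval (`start == end`), because no reference base is changed.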
Make alternate allele frequency work for non-diploid genotypes
As reported in #164, we previously crashed on flag INFO fields declared
as strings (and the number of values declared as 1). This is indeed not
according to spec, but we should probably allow it anyway.
It is not valid according to the spec, but issue #164 shows a VCF file
where the FORMAT column contains just a dot character. We have no way
of interpreting the subsequent genotype columns in that case, so this
patch ignores them.
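
The behaviour described here, skipping genotype columns when FORMAT is the missing value, can be sketched as follows. `split_record` is a hypothetical helper written for illustration and does not reflect the actual parser code.

```python
def split_record(line):
    """Split a VCF data line into (fixed fields, FORMAT keys, sample columns).

    Hypothetical sketch: when FORMAT is absent or is the missing value ".",
    the genotype columns cannot be interpreted and are dropped.
    """
    fields = line.rstrip("\n").split("\t")
    fixed, rest = fields[:8], fields[8:]
    if not rest or rest[0] == ".":
        return fixed, None, []
    return fixed, rest[0].split(":"), rest[1:]


normal = split_record("20\t14370\trs1\tG\tA\t29\tPASS\tDP=14\tGT:DP\t0|0:1\t1|0:8")
missing = split_record("20\t14370\trs1\tG\tA\t29\tPASS\tDP=14\t.\t0|0\t1|0")
```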
Allow flag INFO field to be declared as string
Don't crash when FORMAT is set to the missing value (.)
The spec actually does not allow for metadata lines without value, but we
shouldn't crash on them.

Fixes #168
Until we figure out what causes this, let's keep the test suite working by
pinning pysam to the latest working release.

Traceback:

    Traceback (most recent call last):
      File "/home/travis/build/jamescasbon/PyVCF/build/lib.linux-x86_64-3.3/vcf/test/test_vcf.py", line 1109, in testNoVariantsInRange
        fetched_variants = self.reader.fetch('20', 14370, 17329)
      File "/home/travis/build/jamescasbon/PyVCF/build/lib.linux-x86_64-3.3/vcf/parser.py", line 623, in fetch
        self.reader = self._tabix.fetch(chrom, start, end)
      File "ctabix.pyx", line 345, in pysam.ctabix.Tabixfile.fetch (pysam/ctabix.c:4241)
    TypeError: expected bytes, str found

See #175
- Add R as an INFO field count (number of alleles including reference).
- Support the optional Source and Version keys on INFO metainformation.

Thanks a lot @travc for contributing these fixes!

See #172
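
As a reminder of what the `R` count in the commit above means next to the other symbolic counts, here is a small sketch. `expected_value_count` is a hypothetical function, and the genotype count for `G` assumes unordered genotypes at the given ploidy.

```python
from math import comb


def expected_value_count(number, n_alt, ploidy=2):
    """Return how many values a field with the given Number should carry.

    Hypothetical sketch: returns None when the count is unbounded (".").
    """
    if number == ".":
        return None
    if number == "A":  # one value per alternate allele
        return n_alt
    if number == "R":  # one value per allele, including the reference
        return n_alt + 1
    if number == "G":  # one value per possible unordered genotype
        return comb(n_alt + ploidy, ploidy)
    return int(number)


r_count = expected_value_count("R", n_alt=2)
g_count = expected_value_count("G", n_alt=2)
```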
The VCF 4.0 and newer specifications say the ALT field is a comma
separated list that includes "base Strings made up of the bases
A,C,G,T,N". Notably, the last case was not handled by `Record.is_snp`,
causing it to erroneously report `False` for records with "N" as the ALT.
Bugfix: SNP records with N as ALT now noted as SNPs.
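
The check described above can be sketched in a few lines. This is an illustration only, not the `Record.is_snp` source: `is_snp` here is a hypothetical free function working on plain strings.

```python
SNP_BASES = frozenset("ACGTN")


def is_snp(ref, alts):
    """Return True if REF and every ALT are single bases from A, C, G, T, N.

    Hypothetical sketch; the point is that "N" is accepted as a base.
    """
    if len(ref) != 1 or not alts:
        return False
    return all(alt in SNP_BASES for alt in alts)


with_n = is_snp("A", ["N"])
indel = is_snp("A", ["AT"])
```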
* Remember the ploidy of uncalled genotypes so that
  the sample genotypes written by PyVCF.Writer match the
  sample genotypes read by PyVCF.Reader.
* For uncalled _Calls, gt_nums and gt_bases are None;
  gt_alleles is a list of "None" with a length of _Call.ploidity.
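
A minimal sketch of the round trip described in the two bullets above. The dictionary layout and the `parse_gt` / `format_gt` names are hypothetical, and called alleles are converted to integers purely for illustration.

```python
import re


def parse_gt(gt_string):
    """Parse a GT string, keeping its ploidy even when nothing is called."""
    alleles = [None if a == "." else int(a) for a in re.split(r"[/|]", gt_string)]
    return {
        "raw": gt_string,          # kept so the writer can emit it unchanged
        "ploidity": len(alleles),  # number of allele slots, called or not
        "called": all(a is not None for a in alleles),
        "gt_alleles": alleles,     # None for each uncalled allele
    }


def format_gt(call):
    """Write the genotype back exactly as it was read."""
    return call["raw"]


triploid = parse_gt("././.")
```

Keeping the original string means an uncalled triploid genotype is written back as `././.` rather than being collapsed to a diploid `./.`.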
Warnings about open file handles muddle the output of unit tests
and are a potentially confusing factor to those interpreting
the tests.
The sample.data.GT attribute is no longer set to None for
uncalled calls, which means that _format_sample can now
rely on obtaining the original sample genotype.
Fix double quoting issue when writing VCFs
The issue in 0.8.0 seems to be fixed in 0.8.1, so it's now safe to
just blacklist 0.8.0 specifically.

See #175
… now it is possible to include a path to custom filters python file
@gitanoqevaporelmundoentero
Author

Sorry, I was trying to open this PR against jamescasbon's PyVCF... Is it better there or better here? :S

@jdoughertyii
Owner

Better there, I've not looked at this repo in ages.
