Pagination for patient's variants #388

Open
alanwilter opened this issue Mar 2, 2022 · 1 comment

@alanwilter
Collaborator

Spin off from #342.

The query in variant.py:def _get_variants(target: str): needs pagination, since some patients, such as http://dev-live.phenopolis.org/individual/PH00008697, have tens of thousands of variants.

One thing to note is how the query sorts its results: by CHROM, POS, etc.:
-- def _get_variants(target: str):
select
    array_agg(distinct iv.zygosity order by iv.zygosity) as zigosity,
    array_agg(distinct concat(g.hgnc_symbol,'@',g.ensembl_gene_id)) as genes, -- to split
    v.chrom as "CHROM", v.pos as "POS", v."ref" as "REF", v.alt as "ALT", v.cadd_phred, v.dann,
    v.fathmm_score, v.revel, -- new added
    -- removed: v.id
    vg.most_severe_consequence, string_agg(distinct vg.hgvs_c,',' order by vg.hgvs_c) as hgvsc,
    string_agg(distinct vg.hgvs_p,',' order by vg.hgvs_p) as hgvsp, -- via variant_gene
    iv.dp as "DP", iv."fs" as "FS", iv.mq as "MQ", iv."filter" as "FILTER", -- via individual_variant
(
    select array_agg(i.phenopolis_id order by i.id)
    from phenopolis.individual i
    join phenopolis.individual_variant iv2 on iv2.individual_id = i.id and iv2.zygosity = 'HOM'
    where v.id = iv2.variant_id
) as "HOM",
(
    select array_agg(i.phenopolis_id order by i.id)
    from phenopolis.individual i
    join phenopolis.individual_variant iv2 on iv2.individual_id = i.id and iv2.zygosity = 'HET'
    where v.id = iv2.variant_id
) as "HET",
(
    select distinct on (ah.chrom,ah.pos,ah."ref",ah.alt) ah.af from kaviar.annotation_hg19 ah
    where ah.chrom = v.chrom and ah.pos = v.pos and ah."ref" = v."ref" and ah.alt = v.alt
    order by ah.chrom,ah.pos,ah."ref",ah.alt,ah.ac desc
) as af_kaviar,
av.af as af_gnomad_genomes -- gnomad # NOTE: missing strand?
-- deprecated: MLEAF, MLEAC
-- need to be added (by Daniele): af_converge, af_hgvd, af_jirdc, af_krgdb, af_tommo,
from phenopolis.variant v
join phenopolis.individual_variant iv on iv.variant_id = v.id
join phenopolis.individual i2 on i2.id = iv.individual_id
left outer join phenopolis.variant_gene vg on vg.variant_id = v.id -- variant_gene not complete?
left outer join ensembl.gene g on vg.gene_id = g.ensembl_gene_id
    and g.assembly = 'GRCh37' and g.chromosome ~ '^X|^Y|^[0-9]{1,2}'
left outer join gnomad.annotation_v3 av
    on av.chrom = v.chrom and av.pos = v.pos and av."ref" = v."ref" and av.alt = v.alt
--where v.chrom = '12' and v.pos = 7241974 and v."ref" = 'C' and v.alt = 'T' -- 2 rows
--where v.chrom = '7' and v.pos = 2303057 and v."ref" = 'G' and v.alt = 'A' -- 1 row
--where i2.phenopolis_id = 'PH00008256'
--where vg.gene_id = 'ENSG00000144285'
where i2.phenopolis_id = 'PH00008697'
group by "CHROM","POS","REF","ALT",cadd_phred,dann,fathmm_score,revel,most_severe_consequence,
    "DP","FS","MQ","FILTER", -- need for array_agg but disambiguates depending on individual_variant
    "HOM","HET",af_kaviar,af_gnomad_genomes
order by
    substring(v.chrom FROM '([0-9]+)')::int,
    v.pos, v."ref", v.alt, iv.dp desc
limit 10000 offset 0 -- attempt to paginate
;

Sorted this way, HET and HOM rows can alternate, so the individual page, which has 3 tabs (RARE HOMOZYGOTES, RARE COMPOUND HETEROZYGOTES and RARE HETEROZYGOTES), may not paginate as desired.
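
To illustrate the concern, here is a minimal sketch in plain Python. The rows are made up and the helpers page and per_tab_counts are hypothetical, not part of variant.py. It mimics limit/offset over a result sorted by position and compares it with paginating after filtering on zygosity. It does not attempt to model the compound heterozygotes tab.

```python
from collections import Counter

# Hypothetical rows, already sorted by position as the query would return them.
rows = [
    {"pos": 100, "zygosity": "HET"},
    {"pos": 200, "zygosity": "HET"},
    {"pos": 300, "zygosity": "HET"},
    {"pos": 400, "zygosity": "HOM"},
    {"pos": 500, "zygosity": "HET"},
    {"pos": 600, "zygosity": "HOM"},
]


def page(items, limit, offset):
    """Mimic `limit ... offset ...` on an already sorted result."""
    return items[offset:offset + limit]


def per_tab_counts(items):
    """Count how many rows of a page would land in each zygosity tab."""
    return Counter(r["zygosity"] for r in items)


# Paginating the mixed result: the first page holds no HOM rows at all,
# so a HOM tab fed from it would be empty.
mixed_pages = [per_tab_counts(page(rows, 3, off)) for off in (0, 3)]

# Filtering by zygosity first: every page belongs to a single tab.
het_rows = [r for r in rows if r["zygosity"] == "HET"]
het_page = page(het_rows, 3, 0)
```

With a page size of 3, the first mixed page contains only HET rows, while the filtered version fills the HET page with the first three HET variants regardless of where the HOM rows fall.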

I'm investigating a better solution. Perhaps pagination would be better done in def _individual_complete_view(...) rather than in the query above, since that is where rare_comp_hets are processed.

@alanwilter alanwilter self-assigned this Mar 2, 2022
@alanwilter alanwilter pinned this issue Mar 2, 2022
@alanwilter
Collaborator Author

The DB query is the bottleneck, so pagination really needs to happen there.
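
One possible way to paginate inside the query is keyset (seek) pagination, which avoids scanning and discarding rows the way a large offset does. The sketch below is an assumption, not the project's implementation: it uses an in-memory SQLite table as a simplified stand-in for the real result, and the table layout, the chrom_num column and the fetch_page helper are all hypothetical. PostgreSQL supports the same row-value comparison, but the sort key must be unique and sorted in one direction, so the iv.dp desc term from the query above would need separate handling.

```python
import sqlite3

# Hypothetical, simplified stand-in for the variant query result.
conn = sqlite3.connect(":memory:")
conn.execute("create table variant (chrom_num int, pos int, ref text, alt text)")
conn.executemany(
    "insert into variant values (?, ?, ?, ?)",
    [
        (1, 100, "A", "T"),
        (1, 250, "G", "C"),
        (2, 50, "C", "T"),
        (7, 2303057, "G", "A"),
        (12, 7241974, "C", "T"),
    ],
)


def fetch_page(conn, limit, after=None):
    """Return up to `limit` rows that sort strictly after the `after` key.

    `after` is the (chrom_num, pos, ref, alt) tuple of the last row of the
    previous page, or None for the first page.
    """
    sql = "select chrom_num, pos, ref, alt from variant"
    params = []
    if after is not None:
        # Row-value comparison: seek past the last row already delivered.
        sql += " where (chrom_num, pos, ref, alt) > (?, ?, ?, ?)"
        params.extend(after)
    sql += " order by chrom_num, pos, ref, alt limit ?"
    params.append(limit)
    return conn.execute(sql, params).fetchall()


first = fetch_page(conn, 2)
second = fetch_page(conn, 2, after=first[-1])
third = fetch_page(conn, 2, after=second[-1])
```

Each call passes the last row of the previous page as the cursor, so the client never needs to track an offset.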
