Green Team - NIEHS Collaboration #114
Comments
Preliminary meetings have taken place and initial scientific and technical use cases have been developed. ROBOKOP code and data sources are being explored. |
Meeting with Charles, Shepherd, and Christine scheduled for 1 pm, Wednesday, April 18. |
DUA accepted by NIEHS. HuSH+ dataset received. |
Charles Schmitt's team is planning to develop a workflow to address toxic agent-centric questions such as those listed here. Relevant NIEHS data sources will be identified for eventual exposure as a Translator ChemTox Smart API. Resources are limited, however, so this is a low-priority project for NIEHS. |
7/20/2018 - Alex, Steve, Chris, and I met with Charles' Tox group and Data Science group to provide an overview of the Translator program and Green/Gamma's role in the program. We requested their assistance with Module 3, MVP1. @cpschmitt : Please provide an update. |
08/02/18 - Several updates: On the clinical side, we're submitting a request this week to the EPR contract team to pull genotype data and linking hashcodes. The genotype data will be limited for now to TLR4-related SNPs. This will allow us to incorporate the genotype data into the UNC clinical service and validate prior findings on the association of TLR4 with asthma and distance to roadway (the paper on this from Shepherd was just published). The linking hashcodes are those that UNC has on their own paper through the work that David Borland has done with Tracs on linking UNC and EPR patients (algorithm from Ashok). Also on the clinical side, we are developing a second use case around immune-mediated diseases with Dr. Fred Miller at NIEHS. I'll work with Kara on the formulation of the use case. Presumably we would take the same approach as with asthma to augment the UNC clinical service. In addition, the EPR contract team is starting to draft documents with UNC for a more general data sharing arrangement between UNC and NIEHS, based on several prior conversations with UNC. Dave Peden has agreed to serve as the UNC PI on this. On the tox side, Resham Kulkarni is taking a first pass at the chemical-to-tox-phenotype use case that Scott Auerbach had mentioned and how it relates to Translator and Module 3. She should have that this week. I'll work on it next, then have Scott review it. Then we'll run it by the Green team. |
09/06/18 update
From Charles: Independent of this, I'll be looking at AOPs more closely for NTP's purposes and will let you know if there's a way to link those into Translator (although I'd encourage you to also look if you haven't).
From Kara: Consulted with Charles on (1) plans to generate a new ICEES cohort (i.e., new tables) using the EPR, which includes an asthma sub-cohort; and (2) a new EPR use case on immune-mediated disease. |
Update on Green/Gamma collaboration with NIEHS (Charles Schmitt), 11/2/2018:
Modification to approved Green Team IRB protocol on asthma-like patients: Goal: We seek to add additional patient-level data elements to inform the existing research study. The new data elements will include a limited set of genotype calls and responses to two survey questions (EPR Health and Exposure Survey, EPR Exposome Survey) that relate to the current Asthma use case. The data elements are available from the Environmental Polymorphism Registry (EPR), a longitudinal research study being conducted by the NIEHS. The EPR has enrolled around 19,000 subjects in North Carolina, including approximately 5000 subjects who are in the UNC CDW-H (as determined from a prior UNC-NIEHS study). We plan to add data elements from the EPR for only those subjects who are in the EPR and are in the cohort for this study. The genotype data is focused on a small set of variants (4 total) that relate to the TLR4 pathway. The survey questions include a broad range of questions related to environmental exposures and general health. In a prior EPR-based study, we found significant differences in susceptibility to Asthma for patients based on distance to roadways and based upon their genotype for these variants. Under this modification, we will extend the existing Asthma analysis to include these genotypes and attempt to confirm the prior findings. We also plan to incorporate the survey questions and genotype data into the existing study analysis in order to uncover potential relationships between Asthma outcomes, clinical data, environmental exposure measures, survey responses, and genotypes. We note that this work is only meant to extend the set of data elements in the existing study and not to pursue additional research goals. Safeguards: |
@schmittcp will work with @lstillwe and Sue Nolte to develop an API for Tox21 Enricher data. Charles, please send Sue's GitHub user name to @rayi113 so that he can add her to this repo. |
@schmittcp @szcc @lstillwe : I'm hoping that the three of you can coordinate on the Tox21 Enricher API. |
Update, 4/19/19:
|
Update on status of ongoing projects:
|
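Latest cross walk data for EPR data results in
- 213 matches with 2012 fhir data and
- 42 matches with icees table.
The difference between these two data sets include:
1. rows with no lat, lon
2. rows with age >= 90
which are excluded from icees table |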
We have some Tox21 APIs, but currently they are behind a firewall. We'll make
them publicly available ASAP.
…On Thu, Nov 28, 2019 at 1:20 PM xu-hao ***@***.***> wrote:
Latest cross walk data for EPR data results in
- 213 matches with 2012 fhir data and
- 42 matches with icees table.
The difference between these two data sets include:
1. rows with no lat, lon
2. rows with age >= 90
which are excluded from icees table
|
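The two exclusion rules quoted above (rows with no lat/lon, rows with age >= 90) can be sketched as a simple row filter. This is only an illustration: the `Row` fields and the `eligible_for_icees` helper are hypothetical names, the sample rows are made up, and the actual ICEES pipeline may apply the rules differently.

```python
from typing import List, Optional, TypedDict


class Row(TypedDict):
    # Hypothetical field names; the real FHIR-derived columns may differ.
    patient_id: str
    lat: Optional[float]
    lon: Optional[float]
    age: int


def eligible_for_icees(row: Row) -> bool:
    """Keep a row only if it has coordinates and an age below 90."""
    has_coords = row["lat"] is not None and row["lon"] is not None
    return has_coords and row["age"] < 90


# Made-up rows, for illustration only.
rows: List[Row] = [
    {"patient_id": "a", "lat": 35.9, "lon": -79.0, "age": 42},
    {"patient_id": "b", "lat": None, "lon": None, "age": 42},   # no lat, lon
    {"patient_id": "c", "lat": 35.9, "lon": -79.0, "age": 90},  # age >= 90
]

kept_ids = [r["patient_id"] for r in rows if eligible_for_icees(r)]
print(kept_ids)
```

Only rows that pass both checks would reach the ICEES table, which is one reason the ICEES match count can be lower than the FHIR match count.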
@szcc : Sounds great. Forgive my ignorance, but I don't recognize your user name. Perhaps you can remind me? |
Sue Nolte from NIEHS
|
👍 |
Emily to resolve hashing issues the week of January 13, 2020. |
@xu-hao : I checked with the EPR folks regarding the EPR "HE_COMPLETION_DATE" variable. This is the variable we should use for integration. So, we should examine exposures during the year prior to survey completion for the CMAQ airborne exposures data. For the ACS socio-economic data, we should use the 5-year block in which the survey was completed. For the roadway data, 2016 is the only option. Does this seem reasonable? Will this present any major challenges? |
@xu-hao : To clarify the above plan, we currently have ICEES structured in a somewhat random manner. Specifically, for patient tables, ages are calculated with respect to January 1 of each one-year 'study' period, and exposures and outcomes are examined over the same one-year 'study' period. For visit tables, ages are calculated with respect to visit date, exposures are examined over the 24 hours prior to visit date, and outcomes are examined with respect to the visit itself. The EPR data are structured differently. Specifically, ages are calculated with respect to the date of survey completion ("HE_COMPLETION_DATE"), and outcomes are reported at the time of survey completion (although some of the variables refer to lifetime metrics). I think it makes the most sense to examine exposures over the one-year period prior to survey completion. If this is too challenging, or will take too much time, then we can compromise, at least for the demonstration project, and examine exposure over the year in which the survey was completed. In other words, if a participant completed the survey on 6/1/2016, then we would examine exposures over the course of 2016. Does this make sense? If so, which of the above plans is the most feasible? @xu-hao : let's discuss this tomorrow (Tuesday, 1/14/20). |
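For discussion, the two candidate exposure windows described above can be sketched as date ranges. This is a rough sketch, not existing ICEES code: the function names are hypothetical, and treating the prior-year window as starting the day after the one-year anniversary (so that it is inclusive of the completion date) is an assumption.

```python
from datetime import date, timedelta
from typing import Tuple


def prior_year_window(completed: date) -> Tuple[date, date]:
    """One-year window ending on the survey completion date (inclusive)."""
    try:
        anniversary = completed.replace(year=completed.year - 1)
    except ValueError:
        # Completed on Feb 29: the previous year has no Feb 29, so fall back to Feb 28.
        anniversary = completed.replace(year=completed.year - 1, day=28)
    return anniversary + timedelta(days=1), completed


def calendar_year_window(completed: date) -> Tuple[date, date]:
    """Compromise option: the calendar year in which the survey was completed."""
    return date(completed.year, 1, 1), date(completed.year, 12, 31)


completed = date(2016, 6, 1)  # the 6/1/2016 example from the comment above
print(prior_year_window(completed))
print(calendar_year_window(completed))
```

The calendar-year option includes exposures that occur after the survey was completed, which is the trade-off to weigh against its simpler implementation.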
From Emily, new hash matching results, 1/10/2020:
UNCHCS denominator: 2,770,607 patients
NIEHS EPR denominator: 19,388 participants
Matched: 7,233 people (37% of all EPR participants) |
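As a quick check on the reported percentage, the match rate can be recomputed against each denominator. The `match_rate` helper is a hypothetical name used only for this arithmetic; the counts are the ones reported above.

```python
def match_rate(matched: int, denominator: int) -> float:
    """Matched records as a percentage of a denominator."""
    return 100.0 * matched / denominator


epr_rate = match_rate(7233, 19388)    # share of NIEHS EPR participants matched
unc_rate = match_rate(7233, 2770607)  # share of UNCHCS patients matched

print(f"{epr_rate:.1f}% of EPR participants")
print(f"{unc_rate:.2f}% of UNCHCS patients")
```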
From Emily, 1/14/2020: Crosswalk is up on Rockfish, in /opt/RENCI/output/FHIR. Filename is UNC_NIEHS_XWalk_for_Hao.csv. The UNC identifier should match the patient IDs you have in the FHIR files. The hash is what will match with NIEHS. |
Update from Kara on EPR asthma cohort data, 01.16.20:
N = 4129 total, all with SNP data
N = 2709 with HE_COMPLETION_DATE
N = 2637 with HE_COMPLETION_DATE and D28_Asthma = 0,1
Of those 2637, 928 have HE_COMPLETION_DATE and D28_Asthma = 1
N = 2593 with D28_Asthma = 0,1 and TLR4_DIST_1X |
Additional update from Kara, 01.16.20: Hao has lat/longs and addresses for all 2709 participants with HE_COMPLETION_DATE, so we can integrate the exposures data for the 2705 participants with HE_COMPLETION_DATE in 2012 or 2013, excluding the 4 participants with HE_COMPLETION_DATE in 2014. We will use the calendar year prior to HE_COMPLETION_DATE to determine airborne pollutant exposure estimates, using the same calculations for AVG and MAX exposure that we currently use for the ICEES integrated feature tables, but expanding from PM2.5 and ozone to include the eight additional airborne pollutant exposure estimates that we now have. In other words, we'll calculate one-year exposures over the year prior to survey completion. So, if someone completed the survey on 7/1/2012, then we'll calculate exposures from 7/1/2011 - 7/1/2012 (or 7/2/2011 - 7/1/2012). For the ACS data, we will use the 2012-2016 estimates. For the roadway data, we do not really have a choice, as we only have 2016 data. With respect to column headers for the EPR data, we'll create two sets: one for the UNC data and one for the EPR data. We'll integrate the UNC and EPR data over all available years to date, i.e., 2010-2016. |
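A minimal sketch of the AVG and MAX aggregation over the one-year window follows. It assumes that exposures are available as one value per day and that AVG and MAX are the plain mean and maximum of those daily values; the actual ICEES calculations may differ. The `window_avg_max` name and the sample values are hypothetical.

```python
from datetime import date
from statistics import mean
from typing import Dict, Optional, Tuple


def window_avg_max(
    daily: Dict[date, float], start: date, end: date
) -> Optional[Tuple[float, float]]:
    """Mean and maximum of the daily values that fall within [start, end]."""
    values = [v for d, v in daily.items() if start <= d <= end]
    if not values:
        return None  # no exposure estimates available for this window
    return mean(values), max(values)


# Made-up daily estimates for one pollutant, for illustration only.
daily_pm25 = {
    date(2011, 7, 1): 99.0,  # just before the window
    date(2011, 7, 2): 10.0,
    date(2012, 7, 1): 20.0,
    date(2012, 7, 2): 99.0,  # just after the window
}

# Survey completed 7/1/2012, using the 7/2/2011 - 7/1/2012 variant of the window.
result = window_avg_max(daily_pm25, date(2011, 7, 2), date(2012, 7, 1))
print(result)
```

The same function would be applied once per pollutant, so expanding from PM2.5 and ozone to the additional pollutants only changes the inputs.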
In addition to the above plans for integration of UNC and NIEHS EPR asthma cohort data, we will stand up a private ICEES API at NIEHS. @[email protected] : Please let @[email protected] and me know how we can move this effort forward as quickly as possible. Thanks! |
Clarification to comment from @xu-hao on November 28, 2019:
Original comment:
Latest cross walk data for EPR data results in
- 213 matches with 2012 fhir data and
- 42 matches with icees table.
The difference between these two data sets include:
1. rows with no lat, lon
2. rows with age >= 90
which are excluded from icees table
Correction:
The matches noted above represent a two-way join between the original cross-walk for UNC-EPR data (i.e., just the hash match) and both the UNC FHIR files for asthma cohort and the final ICEES integrated feature tables for asthma cohort, which are derived from the FHIR files. The hash codes for the original cross-walk between UNCHCS and NIEHS EPR did not align; i.e., the UNCHCS hash codes differed in format from the NIEHS EPR hash codes. |
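The join described in the correction can be illustrated with a small sketch: the crosswalk (UNC identifier to hash) is joined once against the patient IDs in the FHIR files and once against the patient IDs in the ICEES tables. All identifiers below are made up, and the `crosswalk_matches` helper is a hypothetical name, not part of the actual pipeline.

```python
from typing import Dict, Set


def crosswalk_matches(crosswalk: Dict[str, str], patient_ids: Set[str]) -> Dict[str, str]:
    """Crosswalk entries (UNC id -> hash) whose UNC id appears in patient_ids."""
    return {unc_id: h for unc_id, h in crosswalk.items() if unc_id in patient_ids}


# Toy data for illustration only.
crosswalk = {"u1": "h1", "u2": "h2", "u3": "h3"}
fhir_ids = {"u1", "u2", "u9"}  # patients present in the FHIR files
icees_ids = {"u1"}             # subset that survived the ICEES exclusions

fhir_matches = crosswalk_matches(crosswalk, fhir_ids)
icees_matches = crosswalk_matches(crosswalk, icees_ids)
print(len(fhir_matches), len(icees_matches))
```

Because the ICEES tables are derived from the FHIR files, the ICEES matches should always be a subset of the FHIR matches.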
This issue involves a new collaboration with Charles Schmitt and Shepherd Schurman of NIEHS to investigate the use of the Translator/Reasoner architecture in the context of NIEHS clinical data sources (i.e., EPR, CRU), knowledge sources (i.e., Tox21, DrugMatrix), and new toxicology use cases.