#Testing: Verify SQL Linkage against our current version of the RDB #66

Open
J-glove opened this issue Feb 3, 2025 · 3 comments

J-glove commented Feb 3, 2025

Our plan is to move the exposome data into a Neo4j knowledge graph. In the meantime, we may still want to make our data available to Mayo: we could distribute the RDB we currently have so that they can run linkage against more datasets.

@chengrong-us

Updates [2025-02-04]:

  1. db_linkage_ucr.py runs against the UCR table in the current .db (/data/exposome_db/zip9_exposomes.db) on the exposome server.
  2. I made some revisions to db_linkage_ucr.py so that it runs cleanly.
  3. The results (Figure 1) are all zero, which suggests a problem. However, there is no such problem when testing against the full UCR dataset instead of the current .db (Figure 2).
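
One quick way to narrow down the all-zero output is to measure how many of the input keys actually exist in the table being linked. The snippet below is only a sketch under assumptions: it treats the .db as a SQLite file, and the column name `ZIP_9`, the helper names, and the toy rows are hypothetical stand-ins, not the real schema or the logic in db_linkage_ucr.py.

```python
import sqlite3


def column_exists(conn, table, column):
    """True if `column` is a real column of `table` (row[1] is the column name)."""
    return any(row[1] == column for row in conn.execute(f'PRAGMA table_info("{table}")'))


def match_rate(conn, table, key_col, keys):
    """Fraction of distinct `keys` found at least once in `table`.`key_col`."""
    # SQLite can silently treat an unknown double-quoted name as a string
    # literal, which would itself yield zero matches, so check the column first.
    if not column_exists(conn, table, key_col):
        raise ValueError(f"{table} has no column {key_col}")
    unique_keys = sorted(set(keys))
    if not unique_keys:
        return 0.0
    cur = conn.cursor()
    # Identifiers cannot be bound as parameters; pass only trusted names.
    cur.execute("CREATE TEMP TABLE IF NOT EXISTS probe_keys (k TEXT PRIMARY KEY)")
    cur.execute("DELETE FROM probe_keys")
    cur.executemany("INSERT INTO probe_keys VALUES (?)", [(k,) for k in unique_keys])
    matched = cur.execute(
        f'SELECT COUNT(*) FROM probe_keys WHERE k IN (SELECT "{key_col}" FROM "{table}")'
    ).fetchone()[0]
    return matched / len(unique_keys)


# Toy stand-in for the real .db: a truncated UCR table holding two keys.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "UCR" ("ZIP_9" TEXT, "YEAR" INTEGER, "value" REAL)')
conn.executemany(
    'INSERT INTO "UCR" VALUES (?, ?, ?)',
    [("326100001", 2019, 1.5), ("326100002", 2020, 2.0)],
)

# Hypothetical cohort keys; only the first two are present in the toy table.
cohort_keys = ["326100001", "326100002", "100010001", "606010001"]
rate = match_rate(conn, "UCR", "ZIP_9", cohort_keys)
print(f"{rate:.0%} of cohort keys found in UCR")
```

If the rate against the real .db is at or near zero while the full dataset matches, the table in the .db is probably a truncated subset rather than the linkage script being at fault.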

Figure 1:
Image

Figure 2:
Image

J-glove commented Feb 10, 2025

Please verify against the other datasets uploaded to the db.

@chengrong-us

Updates [2025-02-11]:

  1. 'UCR', 'CBP', and 'USDA_FARA' were verified.
  2. The 'CBP', 'UCR', and 'USDA_FARA' tables in the current .db on the exposome server are incomplete, as listed below.
    • UCR: 10000 rows; 1989-2020.
    • CBP: 500 rows; 2005.
    • USDA_FARA: 10000 rows; 2015, 2019.
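
For reference, a per-table summary like the one above (row count plus the years covered) can be produced with a short script. This is a sketch under assumptions: it assumes the .db is SQLite and that each table has a year column, here hypothetically named `YEAR`; the toy rows are made up for illustration and are not the real data.

```python
import sqlite3


def summarize_table(conn, table, year_col="YEAR"):
    """Return (row_count, sorted distinct years) for one table."""
    # Verify the column exists; SQLite may otherwise treat an unknown
    # double-quoted name as a string literal instead of raising an error.
    columns = {row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')}
    if year_col not in columns:
        raise ValueError(f"{table} has no column {year_col}")
    row_count = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
    years = [
        row[0]
        for row in conn.execute(
            f'SELECT DISTINCT "{year_col}" FROM "{table}" ORDER BY "{year_col}"'
        )
    ]
    return row_count, years


# Toy stand-in for the real .db with two small tables.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "CBP" ("ZIP_9" TEXT, "YEAR" INTEGER)')
conn.execute('CREATE TABLE "USDA_FARA" ("ZIP_9" TEXT, "YEAR" INTEGER)')
conn.executemany('INSERT INTO "CBP" VALUES (?, ?)', [("a", 2005), ("b", 2005)])
conn.executemany(
    'INSERT INTO "USDA_FARA" VALUES (?, ?)',
    [("a", 2015), ("b", 2019), ("c", 2019)],
)

summary = {name: summarize_table(conn, name) for name in ("CBP", "USDA_FARA")}
for name, (row_count, years) in summary.items():
    print(f"{name}: {row_count} rows; years {years}")
```

Against the real file, `sqlite3.connect(":memory:")` would be replaced by a connection to the .db path, and the table and column names would need to match the actual schema.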
