feat: experiments results file ingestion #488

Merged
merged 29 commits into from
Jun 19, 2024

noctillion
Contributor

This PR introduces a new workflow that:

- Accepts an input directory of experiment result files, ensuring the filenames match those listed in the experiment results JSON.
- Reads the JSON file containing experiment metadata and results and prepares it for further processing.
- Downloads external resources referenced by URLs in the JSON to a temporary folder, ready for DRS ingestion.
- Uploads the specified files to the DRS, creating DRS objects. See https://ga4gh.github.io/data-repository-service-schemas/preview/release/drs-1.2.0/docs/
- Updates the `url` field of each experiment result in the JSON with the DRS URI of its object after ingestion, linking the experiment data directly to its DRS objects.
- Ingests the experiments from the updated JSON, completing the process.
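
The URL-rewriting step described in the list above could look roughly like the following. This is a minimal sketch, not the code from this PR: the function name `apply_drs_uris` and the `drs_uris` mapping (filename to DRS URI) are hypothetical, and the example hostnames and identifiers are placeholders. Only the keys `experiments`, `experiment_results`, `filename`, and `url` come from the PR itself.

```python
import copy


def apply_drs_uris(data, drs_uris):
    """Return a copy of the experiments JSON with matched result URLs replaced.

    `drs_uris` is a hypothetical mapping of filename -> DRS URI, assumed to be
    produced by the DRS upload step.
    """
    updated = copy.deepcopy(data)  # leave the caller's data untouched
    for experiment in updated.get("experiments", []):
        for result in experiment.get("experiment_results", []):
            filename = result.get("filename", "")
            if filename in drs_uris:
                result["url"] = drs_uris[filename]
    return updated


# Placeholder data for illustration only.
example = {
    "experiments": [
        {
            "experiment_results": [
                {"filename": "sample1.vcf.gz", "url": "https://example.org/sample1.vcf.gz"},
                {"filename": "unmatched.bam", "url": "https://example.org/unmatched.bam"},
            ]
        }
    ]
}
uris = {"sample1.vcf.gz": "drs://drs.example.org/abc123"}
updated = apply_drs_uris(example, uris)
```

Results whose filename has no entry in the mapping keep their original `url`, so a partial upload would not silently blank out links. Whether the actual workflow behaves this way is not stated in the PR.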

codecov bot commented Apr 19, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 88.35%. Comparing base (bcb6eb3) to head (eb02bb5).
Report is 7 commits behind head on develop.

Additional details and impacted files
@@           Coverage Diff            @@
##           develop     #488   +/-   ##
========================================
  Coverage    88.34%   88.35%           
========================================
  Files          128      128           
  Lines         4642     4645    +3     
  Branches       684      684           
========================================
+ Hits          4101     4104    +3     
  Misses         389      389           
  Partials       152      152           


@noctillion noctillion marked this pull request as ready for review April 29, 2024 13:55
Contributor

@v-rocheleau v-rocheleau left a comment

DRS responses should be handled as JSON when parsing
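
One way to act on this suggestion is to decode the response body with a JSON parser and read the field by key instead of slicing the raw text. The sketch below is an assumption-laden illustration, not the change made in this PR: it assumes the DRS returns a JSON object carrying a `self_uri` field (as the GA4GH DRS object model defines), and the helper name `drs_uri_from_response` and the sample body are hypothetical.

```python
import json


def drs_uri_from_response(body):
    """Parse a DRS response body as JSON and return the object's self_uri.

    Assumes the body is a JSON object with a `self_uri` key; raises ValueError
    when the body is not valid JSON or the key is missing.
    """
    try:
        record = json.loads(body)
    except json.JSONDecodeError as exc:
        raise ValueError(f"DRS response is not valid JSON: {exc}") from exc
    if not isinstance(record, dict) or "self_uri" not in record:
        raise ValueError("DRS response has no 'self_uri' field")
    return record["self_uri"]


# Placeholder response body for illustration only.
sample_body = '{"id": "abc123", "self_uri": "drs://drs.example.org/abc123"}'
uri = drs_uri_from_response(sample_body)
```

Failing loudly on a malformed body keeps a bad upload from writing an empty or truncated URI into the experiment results JSON.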

Contributor

@v-rocheleau v-rocheleau left a comment

lgtm

@davidlougheed davidlougheed changed the title from "Features/experiments results ingestion" to "feat: experiments results file ingestion" Jun 6, 2024
Member

@davidlougheed davidlougheed left a comment

minor cleanup, otherwise i think it looks good!

for experiment in data.get('experiments', []):
    for result in experiment.get('experiment_results', []):
        filename = result.get('filename', '')
        file_path = ''

this line is unused

if filename in files:
    file_found = True
    file_path = os.path.join(root, filename)
    path_list.append({'filename': filename, 'url': file_path})

why is the file path key here "url"? should this be a "path"?
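
For illustration, the lookup loop with both review comments applied might look like the sketch below: the unused `file_path = ''` initialisation is dropped and the local path is stored under `path` rather than `url`. This is a guess at the surrounding code, not the merged version. The function name `find_result_files`, the `os.walk` traversal, and the `break` after the first match are assumptions; only the loop over `experiments` and `experiment_results` and the `path_list` entries come from the quoted diff.

```python
import os
import tempfile


def find_result_files(data, input_dir):
    """Collect local paths for result files named in the experiments JSON."""
    path_list = []
    for experiment in data.get("experiments", []):
        for result in experiment.get("experiment_results", []):
            filename = result.get("filename", "")
            # Assumed traversal: search the input directory tree for the file.
            for root, _dirs, files in os.walk(input_dir):
                if filename in files:
                    path_list.append(
                        {"filename": filename, "path": os.path.join(root, filename)}
                    )
                    break  # stop at the first match for this filename
    return path_list


# Small demonstration with a throwaway directory and placeholder filenames.
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "sample1.vcf.gz"), "w").close()
    demo_data = {
        "experiments": [
            {
                "experiment_results": [
                    {"filename": "sample1.vcf.gz"},
                    {"filename": "missing.bam"},
                ]
            }
        ]
    }
    expected_path = os.path.join(tmp, "sample1.vcf.gz")
    found = find_result_files(demo_data, tmp)
```

Keeping `url` for remote locations and `path` for local ones would make it harder to confuse the two once DRS URIs are written back into the JSON.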

@noctillion noctillion merged commit eb46048 into develop Jun 19, 2024
7 checks passed
@davidlougheed davidlougheed deleted the features/experiments_results_ingestion branch July 17, 2024 19:14
3 participants