FE-350 update manifest.py to remove managed access project ids from final partition set #341


@bahill bahill commented Nov 26, 2024

Why

FE-350

This PR:

Updates the manifest.py script which is used to set up the partition sets for each of the four steps of the HCA ingest workflow.

All project ids (i.e. all datasets) should be validated, loaded into TDR, and have a snapshot created. Only public access projects should then go through the final step of having their snapshot made public. This update removes the need for the person running these steps to manually exclude managed access project ids from that final step, and so helps avoid accidentally making managed (controlled) access data public.
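For reviewers, the intended behavior can be sketched roughly as below. This is an illustration only, not the code in manifest.py: the `ManifestRow` type, the `partition_ids_by_step` helper, and every step name except `make_public` are hypothetical, and it assumes the third manifest column (Yes/No) flags managed access.

```python
from typing import Dict, List, NamedTuple


class ManifestRow(NamedTuple):
    """One manifest line (hypothetical structure, for illustration)."""
    institution: str       # e.g. "EBI"
    project_id: str
    managed_access: bool   # assumed: True when the third column is "Yes"


# Hypothetical step names; only "make_public" is named in this PR.
STEPS = ("validate", "load", "snapshot", "make_public")


def partition_ids_by_step(rows: List[ManifestRow]) -> Dict[str, List[str]]:
    """Every step gets all project ids, except make_public, which skips managed access."""
    all_ids = [row.project_id for row in rows]
    public_ids = [row.project_id for row in rows if not row.managed_access]
    return {
        step: (public_ids if step == "make_public" else all_ids)
        for step in STEPS
    }
```

The point of the sketch is that the exclusion happens once, when the partition sets are built, rather than relying on whoever runs the final step to remember it.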

The Monster Ops Playbook has been updated. @snovod @sahakiann @John-Scira, you'll want to update your readme as well.

To Test:

  • Create a Manifest like:

EBI,003d5674-9bf6-4e51-ab1b-8fed80c308b9,Yes
EBI,07e5ebc0-1386-4a33-8ce4-3007705adad8,No
EBI,14da8a72-8c00-41c2-a4fc-21a5563514e0,No

  • Run docker compose run -w /hca-ingest app bash in the terminal to load the Docker dev env. If you run into problems accessing the image, try:

gcloud auth application-default login
gcloud auth configure-docker us-east4-docker.pkg.dev

  • Authenticate with gcloud in the container

gcloud auth login
gcloud config set project mystical-slate-284720
gcloud auth application-default login

  • Install dependencies

cd orchestration/
poetry install

  • Forward ports for Dagster

cd ../ops/helmfiles/dagster
./forward_ports.sh dev &

  • Run manifest.py

cd ../../../orchestration/
python3 hca_manage/manifest.py load -e prod -c dcp<test_your_initials>_manifest.csv -r dcp<test_your_initials>

  • In a local terminal (not in the Docker image), forward ports for Dagster

cd ../ops/helmfiles/dagster
./forward_ports.sh prod

  • Verify that your release appears in the partition set lists, and that the make_public partition set does not include 003d5674-9bf6-4e51-ab1b-8fed80c308b9, the project marked Yes (managed access) in the test manifest.
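If the test manifest is longer than three rows, a small helper like the one below can list which project ids should be absent from make_public before you check Dagster by eye. It is a sketch, not part of this PR: `managed_access_ids` is a hypothetical name, and it assumes the manifest has no header row and that the third column is Yes for managed access.

```python
import csv
import io
from typing import List


def managed_access_ids(manifest_text: str) -> List[str]:
    """Return project ids whose third column is 'Yes' (assumed to mean managed access)."""
    reader = csv.reader(io.StringIO(manifest_text))
    return [
        row[1].strip()
        for row in reader
        # Skip blank or short lines rather than failing on them.
        if len(row) >= 3 and row[2].strip().lower() == "yes"
    ]


if __name__ == "__main__":
    sample = (
        "EBI,003d5674-9bf6-4e51-ab1b-8fed80c308b9,Yes\n"
        "EBI,07e5ebc0-1386-4a33-8ce4-3007705adad8,No\n"
        "EBI,14da8a72-8c00-41c2-a4fc-21a5563514e0,No\n"
    )
    for project_id in managed_access_ids(sample):
        print(project_id)
```

Any id this prints should not appear in the make_public partition set for your release.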

Checklist

  • Documentation has been updated as needed.

bahill and others added 10 commits November 20, 2024 14:53
…c (No/Yes), so that we filter out Managed Access (Yes) data from the final partition set which will make snapshots public.
…-remove-MA-proj-id-from-final-partition-set' into FE-350-update-manifest-dot-py-to-remove-MA-proj-id-from-final-partition-set
@bahill bahill marked this pull request as ready for review December 3, 2024 19:25
@bahill bahill merged commit 91f2bcc into main Dec 3, 2024
1 check passed
@bahill bahill deleted the FE-350-update-manifest-dot-py-to-remove-MA-proj-id-from-final-partition-set branch December 3, 2024 21:13
3 participants