The Glue pipeline is currently unusable because it can't read the new parameters that #90 added to the YAML file. It also runs from a separate script from the manual flagging, so the code changes from #90 would need to be ported from `manual_flagging/flagging.py` to `glue/flagging.py` before the Glue job could even use the new parameters.
If we want to run flagging jobs on Glue going forward, we need to update the Glue script (and possibly consolidate it with the manual script per #92) to get it running again. The biggest challenge is figuring out how to represent the new parameters so that they are easily configurable as Glue parameters. My thinking is that we can add a small script to the CI pipeline that parses the YAML file and serializes the parameters to a format that works for Glue, and then we pass those into Terraform as variables. The flagging script can then deserialize the parameters when running in a Glue context, or read them from the YAML file directly when running locally. (I recognize this is a lot of design info, so if we go in this direction I'll take some time to sketch this out in more detail.)
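To make the serialize/deserialize idea concrete, here is a rough sketch and not a settled design. It assumes the parsed YAML is a plain nested dict, packs it into one base64-encoded JSON string so it can travel as a single Glue job argument (and a single Terraform variable), and falls back to reading the YAML file when that argument is absent. The flag name `--flagging_params`, the default path `inputs.yaml`, and all function names are hypothetical. A real Glue job would more likely read the argument with `awsglue.utils.getResolvedOptions` than by scanning `sys.argv` by hand.

```python
import base64
import json
import sys

# Hypothetical name for the single Glue job argument carrying all parameters.
PARAM_FLAG = "--flagging_params"


def serialize_params(params: dict) -> str:
    """Pack a nested parameter dict into one URL-safe base64 string.

    Base64 avoids quoting problems when the value passes through CI,
    Terraform variables, and Glue's argument handling.
    """
    raw = json.dumps(params, sort_keys=True, separators=(",", ":"))
    return base64.urlsafe_b64encode(raw.encode("utf-8")).decode("ascii")


def deserialize_params(encoded: str) -> dict:
    """Reverse serialize_params."""
    raw = base64.urlsafe_b64decode(encoded.encode("ascii")).decode("utf-8")
    return json.loads(raw)


def load_params(argv: list, yaml_path: str = "inputs.yaml") -> dict:
    """Use the serialized argument if present (Glue), else the YAML file (local)."""
    if PARAM_FLAG in argv:
        idx = argv.index(PARAM_FLAG)
        if idx + 1 >= len(argv):
            raise ValueError(f"{PARAM_FLAG} was passed without a value")
        return deserialize_params(argv[idx + 1])
    # Local run: PyYAML is imported lazily so the Glue path does not need it.
    import yaml
    with open(yaml_path) as f:
        return yaml.safe_load(f)


if __name__ == "__main__":
    print(load_params(sys.argv))
```

In CI, the small script would call `serialize_params` on the parsed YAML and hand the resulting string to Terraform. One trade-off of this approach is that the parameters are opaque in the Glue console, which may or may not matter for us.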
But before we fix up the Glue pipeline, we need to decide if we even want to continue using Glue to run the pipeline on a schedule. Glue's approach to parameters is finicky and it doesn't work well with our GitHub CI flow. So unfortunately this issue will be blocked until we make a decision about architecture.