Goal
Add an Airflow DAG to a user-specified Airflow server for an artifact.
Current user workflow
The data scientist has developed an artifact, say an ML model called clf, in a Jupyter notebook. To create an Airflow DAG, they would have to manually write a Python script to translate the code in the Jupyter notebook into the Airflow DSL to construct a DAG. This DAG is then placed in the DAG folder of the Airflow server they are submitting the DAG to.
User workflow with Linea
Note: This is agnostic of the entry point to Linea (CLI or IPython). We will discuss the UX at the API level.
Airflow config: the user specifies the URI for AIRFLOW_HOME in a Linea config file, say in lineapy/config.yml.
The user first calls lineapy.save(clf) to get the LineaArtifact object associated with clf, named clf_artifact. The user then invokes lineapy.to_airflow(clf_artifact) to generate a dag.py file and send it to the AIRFLOW_HOME directory.
to_airflow() takes an optional dict argument for users to specify the input parameters to the DAG if they are familiar with them, such as schedule_interval and max_active_runs.
The function signature for to_airflow() allows users to pass in multiple artifacts for a single DAG.
Note: to_airflow() handles the transfer of the dag.py file to AIRFLOW_HOME as in the current implementation. This is potentially a point of further discussion.
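For a local AIRFLOW_HOME, the transfer step could be as small as writing the generated text into the DAG folder. The sketch below assumes the DAG folder is the dags subdirectory of AIRFLOW_HOME (Airflow's default dags_folder) and that AIRFLOW_HOME is a path on the local filesystem. write_dag_file is a hypothetical helper, not part of lineapy, and a temporary directory stands in for AIRFLOW_HOME.

```python
import tempfile
from pathlib import Path


def write_dag_file(dag_source: str, airflow_home: str, filename: str = "dag.py") -> Path:
    """Write generated DAG source into <airflow_home>/dags (hypothetical helper)."""
    # Assumption: the DAG folder is the "dags" subdirectory of AIRFLOW_HOME.
    dags_dir = Path(airflow_home) / "dags"
    dags_dir.mkdir(parents=True, exist_ok=True)
    dag_path = dags_dir / filename
    dag_path.write_text(dag_source)
    return dag_path


# A temporary directory stands in for AIRFLOW_HOME in this demonstration.
with tempfile.TemporaryDirectory() as fake_airflow_home:
    written = write_dag_file("# generated DAG placeholder\n", fake_airflow_home)
    written_name = written.name
    written_parent = written.parent.name
    written_text = written.read_text()
```

A remote Airflow server would need a different transfer mechanism, which is one reason the note above flags this step for further discussion.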
Desiderata
No dependency on airflow from lineapy
Proposed solution
Construct dag.py using Jinja templates.
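A minimal sketch of the template approach, under stated assumptions. It uses string.Template from the Python standard library as a stand-in for Jinja so that it runs without extra packages; a Jinja template would fill the same role. The names render_dag, DAG_TEMPLATE, DEFAULT_DAG_ARGS, clf_module and get_clf, the default argument values, and the layout of the generated file are all hypothetical, not the Linea implementation. The generated text follows the Airflow 2 style of DAG definition and is illustrative only; a real DAG file would likely need more fields.

```python
import ast
from string import Template

# Hypothetical template for the generated dag.py. The airflow imports appear
# only inside the generated text, so rendering never imports airflow itself.
DAG_TEMPLATE = Template('''\
from airflow import DAG
from airflow.operators.python import PythonOperator

import $module_name

dag = DAG(
    dag_id=$dag_id,
    schedule_interval=$schedule_interval,
    max_active_runs=$max_active_runs,
)

run_artifact = PythonOperator(
    task_id=$task_id,
    python_callable=$module_name.$function_name,
    dag=dag,
)
''')

# Assumed defaults for the optional DAG parameters named in the proposal.
DEFAULT_DAG_ARGS = {"schedule_interval": None, "max_active_runs": 1}


def render_dag(dag_id, module_name, function_name, dag_args=None):
    """Render DAG source text; keys in dag_args override the assumed defaults."""
    args = {**DEFAULT_DAG_ARGS, **(dag_args or {})}
    return DAG_TEMPLATE.substitute(
        module_name=module_name,
        function_name=function_name,
        # repr() turns Python values into valid source literals.
        dag_id=repr(dag_id),
        task_id=repr(f"{function_name}_task"),
        schedule_interval=repr(args["schedule_interval"]),
        max_active_runs=repr(args["max_active_runs"]),
    )


dag_source = render_dag(
    "clf_dag", "clf_module", "get_clf", {"schedule_interval": "@daily"}
)
# Raises SyntaxError if the rendered text is not valid Python.
ast.parse(dag_source)
```

Because the DAG is produced as plain text, the generating code has no dependency on airflow, which is what the desideratum above asks for. Only the server that loads the generated file needs airflow installed.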