Commit b8fd385

Merge pull request #29 from michaelconan/michaelconan-patch-1
2 parents: f6ce777 + 398ae6c
3 files changed: +11 -3 lines

README.md (+9 -2)
```diff
@@ -54,6 +54,15 @@ graph TB
 2. [Airflow](https://airflow.apache.org/) to orchestrate data loading scripts and additional automated workflows
 3. [DBT core](https://docs.getdbt.com/) to define data models and transformations, again orchestrated by Airflow (via CLI / bash TaskFlow)
 
+## Standards
+
+The project has been structured and designed with inspiration from [dbt project recommendations](https://docs.getdbt.com/best-practices/how-we-structure/1-guide-overview) and other sources.
+
+- DBT projects stored in a separate subdirectory from DAGs (at least for now)
+- DAGs and DBT projects organised at the top level by owner (should more get involved)
+- Further organisation by data source and / or function
+- Naming generally follows the DBT-recommended `[layer]_[source]__[entity]`, adapted for Airflow DAGs with `__[refresh-type]` and other modifications as needed.
+
 
 ## Setup
 
```

```diff
@@ -73,8 +82,6 @@ To run Airflow on a single instance, I used Honcho to run multiple processes via
 - `AIRFLOW__CORE__FERNET_KEY={generated-key}` following [this guidance](https://airflow.apache.org/docs/apache-airflow/1.10.8/howto/secure-connections.html) to encrypt connection data
 - `AIRFLOW__CORE__INTERNAL_API_SECRET_KEY={generated-secret1}` following [this guidance](https://flask.palletsprojects.com/en/stable/config/#SECRET_KEY)
 - `AIRFLOW__WEBSERVER__SECRET_KEY={generated-secret2}` following guidance above
-- `AIRFLOW__WEBSERVER__BASE_URL={deployed-url}`
-- `AIRFLOW__CLI__ENDPOINT_URL={deployed-url}`
 - `AIRFLOW__WEBSERVER__INSTANCE_NAME=MY INSTANCE!`
 4. Generate Publish Profile file and deploy application code from GitHub
 5. Set startup command to use the `startup.txt` file
```
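The generated secrets referenced above can be produced with Python's standard library, which is what the linked Flask SECRET_KEY guidance suggests (a sketch; the Fernet key itself is generated differently, via the `cryptography` package's `Fernet.generate_key()` as in the linked Airflow guidance):

```python
import secrets

# A random hex secret suitable for AIRFLOW__WEBSERVER__SECRET_KEY or
# AIRFLOW__CORE__INTERNAL_API_SECRET_KEY (16 random bytes -> 32 hex chars).
secret = secrets.token_hex(16)
print(secret)
```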

dags/michael/dbt.py (+1)
```diff
@@ -16,6 +16,7 @@
 
 
 @dag(
+    dag_id="dbt__michael",
     # Run after source datasets refreshed
     schedule=[NOTION_DAILY_HABITS_DS, NOTION_WEEKLY_HABITS_DS],
     catchup=False,
```

dags/michael/migrate.py (+1 -1)
```diff
@@ -19,7 +19,7 @@
 DATASET = os.getenv("ADMIN_DATASET", "admin")
 
 with DAG(
-    "migrate_raw_tables",
+    "bq__migrate_schema",
     schedule="@once",  # also consider "None"
     start_date=datetime(1970, 1, 1),
     params={"command": "upgrade", "revision": "head"},
```
