Commit

Update notebook paths, dependencies, and Spark configuration in sg_resale_flat_prices.py, Dockerfile, and startup.py
xuwenyihust committed May 8, 2024
1 parent 5279640 commit 61bfdc9
Showing 4 changed files with 74 additions and 58 deletions.
4 changes: 3 additions & 1 deletion .gitignore
@@ -179,4 +179,6 @@ data/*
 */.jupyter/*
 */.local/*
 */.npm/*
-*/.custom_ipython_profile/*
+*/.custom_ipython_profile/*
+
+*/output/*
5 changes: 4 additions & 1 deletion dags/sg_resale_flat_prices.py
@@ -24,7 +24,10 @@
 run_notebook = PapermillOperator(
     task_id='sg_resale_flat_prices_notebook',
     input_nb='/opt/airflow/examples/sg-resale-flat-prices/sg-resale-flat-prices.ipynb',
-    output_nb='/opt/airflow/examples/sg-resale-flat-prices/output/output-notebook-{{ execution_date }}.ipynb'
+    output_nb='/opt/airflow/examples/sg-resale-flat-prices/output/output-notebook-{{ execution_date }}.ipynb',
+    parameters={
+        'spark_master': 'spark://spark-master:7077'
+    },
 )

 run_notebook
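
The notebook diff is not rendered on this page, so how the notebook consumes the new `spark_master` parameter is not visible here. As a rough sketch only: papermill overrides variables defined in a cell tagged `parameters`, so the notebook could plausibly look like the following. The default value `local[*]`, the helper name `build_spark_session`, and the app name are assumptions for illustration, not taken from the commit.

```python
# Notebook cell tagged "parameters": papermill injects its overrides in a new cell right after it.
spark_master = "local[*]"  # assumed default for interactive runs, not taken from the commit


def build_spark_session(master: str, app_name: str = "sg-resale-flat-prices"):
    """Create (or reuse) a SparkSession bound to the given master URL."""
    # Imported lazily so the parameters cell itself has no pyspark dependency.
    from pyspark.sql import SparkSession

    return SparkSession.builder.master(master).appName(app_name).getOrCreate()


# In a later cell, once a Spark master is reachable:
# spark = build_spark_session(spark_master)
```

When the DAG runs, the value passed in `parameters` (`spark://spark-master:7077`) would replace the default, while a local interactive run would keep using the default.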
2 changes: 1 addition & 1 deletion docker/airflow/Dockerfile
@@ -7,7 +7,7 @@ RUN apt-get update
 USER airflow

 RUN pip install --upgrade pip && \
-    pip install apache-airflow-providers-papermill ipython jupyter ipykernel papermill pandas numpy matplotlib seaborn
+    pip install apache-airflow-providers-papermill ipython jupyter ipykernel papermill pandas numpy matplotlib seaborn pyspark

 # Add and install the Python 3 kernel
 RUN python3 -m ipykernel install --user --name python3 --display-name "Python 3"
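
One way to confirm that the rebuilt image actually exposes the newly added `pyspark` package is a small import check run inside the container. This is a minimal sketch and not part of the commit; the function name `missing_modules` and the list of modules checked are illustrative choices.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    # Top-level import names for a few of the packages installed in the Dockerfile.
    required = ["papermill", "pandas", "numpy", "pyspark"]
    missing = missing_modules(required)
    print("missing modules:", missing if missing else "none")
```

The check only looks modules up and does not import them, so it stays fast even for heavy packages such as `pyspark`.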
121 changes: 66 additions & 55 deletions examples/sg-resale-flat-prices/sg-resale-flat-prices.ipynb

Large diffs are not rendered by default.
