Merge pull request #13 from alan-eu/v2.5.1

V2.5.1

florian-ernst-alan authored May 19, 2023
2 parents 8291cf7 + f13a367 commit 0a1a329
Showing 21 changed files with 779 additions and 677 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -4,3 +4,4 @@
 /logs
 /db-data
 dags/*
+.DS_Store
58 changes: 37 additions & 21 deletions README.md
@@ -2,6 +2,9 @@

This repository provides a command line interface (CLI) utility that replicates an Amazon Managed Workflows for Apache Airflow (MWAA) environment locally.

*Please note: MWAA/AWS/DAG/Plugin issues should be raised through AWS Support or the Airflow Slack #airflow-aws channel. Issues here should be focused on this local-runner repository.*


## About the CLI

The CLI builds a Docker container image locally that’s similar to a MWAA production image. This allows you to run a local Apache Airflow environment to develop and test DAGs, custom plugins, and dependencies before deploying to MWAA.
@@ -10,16 +13,14 @@

```text
dags/
-  example_dag_with_custom_ssh_plugin.py
-  example_dag_with_taskflow_api.py
-  requirements.txt
-  tutorial.py
+  example_lambda.py
+  example_dag_with_taskflow_api.py
+  example_redshift_data_execute_sql.py
docker/
  config/
    airflow.cfg
    constraints.txt
+    mwaa-base-providers-requirements.txt
    requirements.txt
    webserver_config.py
    .env.localrunner
  script/
@@ -32,7 +33,9 @@ docker/
  docker-compose-sequential.yml
  Dockerfile
plugins/
-  ssh_plugin.py
  README.md
+requirements/
+  requirements.txt
.gitignore
CODE_OF_CONDUCT.md
CONTRIBUTING.md
@@ -67,8 +70,6 @@ Build the Docker container image using the following command:

### Step two: Running Apache Airflow

-Run Apache Airflow using one of the following database backends.

#### Local runner

Runs a local Apache Airflow environment whose configuration is a close representation of an MWAA environment.
@@ -97,18 +98,18 @@ The following section describes where to add your DAG code and supporting files.
#### DAGs

1. Add DAG code to the `dags/` folder.
-2. To run the sample code in this repository, see the `tutorial.py` file.
+2. To run the sample code in this repository, see the `example_dag_with_taskflow_api.py` file.

#### Requirements.txt

-1. Add Python dependencies to `dags/requirements.txt`.
+1. Add Python dependencies to `requirements/requirements.txt`.
2. To test a requirements.txt without running Apache Airflow, use the following script:

```bash
./mwaa-local-env test-requirements
```

-Let's say you add `aws-batch==0.6` to your `dags/requirements.txt` file. You should see an output similar to:
+Let's say you add `aws-batch==0.6` to your `requirements/requirements.txt` file. You should see an output similar to:

```bash
Installing requirements.txt
@@ -125,18 +126,31 @@
Installing collected packages: botocore, docutils, pyasn1, rsa, awscli, aws-batch
Successfully installed aws-batch-0.6 awscli-1.19.21 botocore-1.20.21 docutils-0.15.2 pyasn1-0.4.8 rsa-4.7.2
```
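Under the hood, checking a requirements file like this amounts to resolving and installing each pinned specifier. The following standalone sketch is not part of `mwaa-local-env`; it only illustrates, with a hypothetical helper and Python's standard library, what parsing an exact pin such as `aws-batch==0.6` looks like:

```python
import re

# Hypothetical helper (not part of this repository): split a requirements.txt
# line such as "aws-batch==0.6" into (name, version). Only exact "==" pins are
# recognized in this sketch; real pip requirement syntax is far richer.
PIN_RE = re.compile(r"^\s*([A-Za-z0-9._-]+)\s*==\s*([A-Za-z0-9.*+!_-]+)\s*$")

def parse_pin(line: str):
    match = PIN_RE.match(line)
    if not match:
        raise ValueError(f"not an exact pin: {line!r}")
    return match.group(1), match.group(2)

print(parse_pin("aws-batch==0.6"))  # ('aws-batch', '0.6')
```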

+3. To package the necessary WHL files for your requirements.txt without running Apache Airflow, use the following script:
+
+```bash
+./mwaa-local-env package-requirements
+```
+
+For example usage, see [Installing Python dependencies using PyPi.org Requirements File Format Option two: Python wheels (.whl)](https://docs.aws.amazon.com/mwaa/latest/userguide/best-practices-dependencies.html#best-practices-dependencies-python-wheels).

#### Custom plugins

-- There is a directory at the root of this repository called plugins. It contains a sample plugin `ssh_plugin.py`.
-- In this directory, create a file for your new custom plugin. For example:
-
-```bash
-ssh_plugin.py
-```
-
-- (Optional) Add any Python dependencies to `dags/requirements.txt`.
+- There is a directory at the root of this repository called plugins.
+- In this directory, create a file for your new custom plugin.
+- Add any Python dependencies to `requirements/requirements.txt`.

**Note**: this step assumes you have a DAG that corresponds to the custom plugin. For example usage, see [MWAA Code Examples](https://docs.aws.amazon.com/mwaa/latest/userguide/sample-code.html).
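As a sketch of what such a plugin file can look like, here is a minimal, hypothetical `plugins/my_custom_plugin.py`. The class name and plugin name are illustrative, not part of this repository, and the import falls back to a stub so the sketch can be run outside an Airflow installation:

```python
# Hypothetical plugins/my_custom_plugin.py -- an illustrative sketch, not a
# plugin shipped with this repository.
try:
    from airflow.plugins_manager import AirflowPlugin
except ImportError:  # Airflow not installed, e.g. outside the container
    class AirflowPlugin:  # minimal stand-in so the sketch runs anywhere
        pass

class MyCustomPlugin(AirflowPlugin):
    # Airflow discovers plugins by this attribute when the scheduler and
    # webserver start; DAGs can then use whatever the plugin exposes
    # (hooks, macros, listeners, and so on).
    name = "my_custom_plugin"
```

Inside the container, Airflow picks the file up from the `plugins/` folder at startup; no registration beyond the `name` attribute is needed for discovery.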

#### Startup script

-**Note**: this step assumes you have a DAG that corresponds to the custom plugin. For examples, see [MWAA Code Examples](https://docs.aws.amazon.com/mwaa/latest/userguide/sample-code.html).
- A sample shell script, `startup.sh`, is provided in the `startup_script` directory at the root of this repository.
- If you need to run additional setup (for example, installing system libraries or setting environment variables), modify the `startup.sh` script.
- To test a `startup.sh` without running Apache Airflow, use the following script:

```bash
./mwaa-local-env test-startup-script
```

## What's next?

@@ -156,24 +170,26 @@ To learn more, see [Amazon MWAA Execution Role](https://docs.aws.amazon.com/mwaa

### How do I add libraries to requirements.txt and test install?

-- A `requirements.txt` file is included in the `/dags` folder of your local Docker container image. We recommend adding libraries to this file, and running locally.
+- A `requirements.txt` file is included in the `/requirements` folder of your local Docker container image. We recommend adding libraries to this file, and running locally.

### What if a library is not available on PyPi.org?

-- If a library is not available in the Python Package Index (PyPi.org), add the `--index-url` flag to the package in your `dags/requirements.txt` file. To learn more, see [Managing Python dependencies in requirements.txt](https://docs.aws.amazon.com/mwaa/latest/userguide/best-practices-dependencies.html).
+- If a library is not available in the Python Package Index (PyPi.org), add the `--index-url` flag to the package in your `requirements/requirements.txt` file. To learn more, see [Managing Python dependencies in requirements.txt](https://docs.aws.amazon.com/mwaa/latest/userguide/best-practices-dependencies.html).
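For instance, a `requirements/requirements.txt` that pulls a package from a private index could look like the following; the index URL and package name are placeholders, not values used by this repository:

```text
--index-url https://example.internal/simple/
some-internal-package==1.2.3
```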

## Troubleshooting

The following section contains errors you may encounter when using the Docker container image in this repository.

-### My environment is not starting - process failed with dag_stats_table already exists
+### My environment is not starting

- If you encountered [the following error](https://issues.apache.org/jira/browse/AIRFLOW-3678): `process fails with "dag_stats_table already exists"`, you'll need to reset your database using the following command:

```bash
./mwaa-local-env reset-db
```

- If you are moving from an older version of local-runner you may need to run the above reset-db command, or delete your `./db-data` folder. Note, too, that newer Airflow versions have newer provider packages, which may require updating your DAG code.

### Fernet Key InvalidToken

A Fernet Key is generated during image build (`./mwaa-local-env build-image`) and is durable throughout all
2 changes: 1 addition & 1 deletion VERSION
@@ -1 +1 @@
-2.2.2
+2.5.1
17 changes: 12 additions & 5 deletions docker/Dockerfile
@@ -1,5 +1,5 @@
-# VERSION 1.10
-# AUTHOR: Subash Canapathy
+# VERSION 2.4
+# AUTHOR: John Jackson
# DESCRIPTION: Amazon MWAA Local Dev Environment
# BUILD: docker build --rm -t amazon/mwaa-local .

@@ -8,9 +8,9 @@ LABEL maintainer="amazon"

# Airflow
## Version specific ARGs
-ARG AIRFLOW_VERSION=2.2.2
+ARG AIRFLOW_VERSION=2.5.1
ARG WATCHTOWER_VERSION=2.0.1
-ARG PROVIDER_AMAZON_VERSION=2.4.0
+ARG PROVIDER_AMAZON_VERSION=7.1.0

## General ARGs
ARG AIRFLOW_USER_HOME=/usr/local/airflow
@@ -19,17 +19,24 @@ ARG PYTHON_DEPS=""
ARG SYSTEM_DEPS=""
ARG INDEX_URL=""
ENV AIRFLOW_HOME=${AIRFLOW_USER_HOME}
ENV PATH="$PATH:/usr/local/airflow/.local/bin:/root/.local/bin:/usr/local/airflow/.local/lib/python3.10/site-packages"
ENV PYTHON_VERSION=3.10.9

COPY script/bootstrap.sh /bootstrap.sh
COPY script/systemlibs.sh /systemlibs.sh
COPY script/generate_key.sh /generate_key.sh
COPY script/run-startup.sh /run-startup.sh
COPY script/shell-launch-script.sh /shell-launch-script.sh
COPY script/verification.sh /verification.sh
COPY config/constraints.txt /constraints.txt
COPY config/requirements.txt /requirements.txt
COPY config/mwaa-base-providers-requirements.txt /mwaa-base-providers-requirements.txt

RUN chmod u+x /systemlibs.sh && /systemlibs.sh
RUN chmod u+x /bootstrap.sh && /bootstrap.sh
RUN chmod u+x /generate_key.sh && /generate_key.sh
RUN chmod u+x /run-startup.sh
RUN chmod u+x /shell-launch-script.sh
RUN chmod u+x /verification.sh

# Post bootstrap to avoid expensive docker rebuilds
COPY script/entrypoint.sh /entrypoint.sh
6 changes: 3 additions & 3 deletions docker/config/airflow.cfg
@@ -126,7 +126,7 @@ fernet_key = $FERNET_KEY
donot_pickle = False

# How long before timing out a python file import
-dagbag_import_timeout = 30.0
+dagbag_import_timeout = 30

# Should a traceback be shown in the UI for dagbag import errors,
# instead of just the exception message
@@ -509,7 +509,7 @@ navbar_color = #fff
default_dag_run_display_number = 25

# Set secure flag on session cookie
-cookie_secure = True
+cookie_secure = False

# Set samesite policy on session cookie
cookie_samesite = Lax
@@ -994,4 +994,4 @@ shard_code_upper_limit = 10000
shards = 5

# comma separated sensor classes support in smart_sensor.
-sensors_enabled = NamedHivePartitionSensor
\ No newline at end of file
+sensors_enabled = NamedHivePartitionSensor