Flowbot

A Cookiecutter template for Flowbot repos. This is for instantiating and maintaining reporting data pipelines

Quick start: instantiating a new active crisis

Make sure you have a .geojson of your affected areas before you begin - it's helpful to have the path to it in your clipboard. To instantiate:

Clone this repo

git clone https://github.com/Flowminder/flowbot-pipeline-root.git
cd flowbot-pipeline-root

Install deps with pipenv install
Run Cookiecutter with bash run.sh . -o ..
Choose 'active crisis' for the first option
Follow the prompts onscreen, pasting the path to the geojson when asked. Some prompts are not needed for the active crisis
Copy the result to the relevant server's dags or flowbot folder; you should see the new pipeline in Airflow's interface
Copy the replay file from ~/.cookiecutter_replays/{your_project_slug} to this folder; this will support merging changes from the original root repo into your rendered repo.
Optional but strongly advised: run git init to create a git repo for tracking changes to the new pipeline

Available pipelines

A pipeline is a combination of a country and a report configiuration, used to render a new repository from the contents of {{cookiecutter.__project_slug}}. The rendering process populates the relevant *_config.py file and removes any irrelevant notebooks and dags.

Active crisis

A pipeline for monitoring an upcoming or ongoing crisis. To define, you need:

The geographic area of the crisis; this is currently a polygon in EPSG:4326
A key start date to begin monitoring
An optional end date

This pipeline produces a daily active crisis report showing total and newly displaced subscribers from affected areas to host locations. For an example report, see https://www.flowminder.org/resources/publications-reports/haiti-gang-violence-in-downtown-port-au-prince-mobility-situation-report-29-february-12-march-2024

Mobility insights

A pipeline for producing monthly reports on national mobility trends. To define this pipeline, you need:

A reference date range - this will be a period that builds up a baseline for the rest of the reports to refer to.
A start date

For more information, see https://haiti.mobility-dashboard.org/files/HTI%20platform%20release%20documentation%20Nov%202024.pdf

Populated repo contents

*_config.py

This is the only Python file populated by the Cookiecutter template (the only other file populated is the README). This is imported into the DAG - if you need to make changes to an existing pipeline, this is where you should start looking.

dags

Contains the DAGs associated with this pipeline. Although all piplines currently consist of a single DAG, this may change in the future and is not an assumption you can rely on. If you are extending or modifying a DAG, it is recommended that you add any configuration variable (dates, static file names, region lists, ect) to the pipeline's config.py.

notebooks

Contains the parameterised notebooks and associated helper libraries that form the bulk of the pipeline processing via flowpyter-task. These should be as generic as possible and take any date ranges, static data sources or similar from the config.py file via a params cell. For further information, see https://github.com/Flowminder/flowpyter-task

static

Contains artefacts that are unchanging between dagruns, but may be changed by users. The current two use cases for this are static data files that contain population estimates and the Jinja templates used for reports.

manual

Room for information that must be added manually per dagrun. The main case for this is the key observations, but it is also used to provide references to previous reports for the back matter.

data and executed_notebooks

These folders are populated by flowpyter-task tasks - they exist, but should be left empty.

Development

There is a simple approvaltest in the .circleci dir. If you make a change to the structure of the repo, run

pipenv run cookiecutter . --replay-file .circleci/test-replay.json -o .circleci/approved -f

...to update the approval test. If you make changes to the cookiecutter parameters, you'll need to generate a new replay file; run cookciecutter with your new parameters and copy the replay file to .circleci/test-replay.json as above.

Name		Name	Last commit message	Last commit date
Latest commit History 8 Commits
.circleci		.circleci
.cookiecutter.replay		.cookiecutter.replay
flowbot_dataclasses		flowbot_dataclasses
hooks		hooks
{{ cookiecutter.__project_slug }}		{{ cookiecutter.__project_slug }}
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
LICENCE.txt		LICENCE.txt
Pipfile		Pipfile
Pipfile.lock		Pipfile.lock
README.md		README.md
cookiecutter.json		cookiecutter.json
cookiecutter.yml		cookiecutter.yml
local_extensions.py		local_extensions.py
run.sh		run.sh

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Flowbot

Quick start: instantiating a new active crisis

Available pipelines

Active crisis

Mobility insights

Other contents

flowbot_dataclasses

hooks

local_extensions

Populated repo contents

*_config.py

dags

notebooks

static

manual

data and executed_notebooks

Development

About

Uh oh!

Releases

Packages

Uh oh!

Contributors 3

Uh oh!

Languages

License

Flowminder/flowbot-pipeline-root

Folders and files

Latest commit

History

Repository files navigation

Flowbot

Quick start: instantiating a new active crisis

Available pipelines

Active crisis

Mobility insights

Other contents

flowbot_dataclasses

hooks

local_extensions

Populated repo contents

*_config.py

dags

notebooks

static

manual

data and executed_notebooks

Development

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors 3

Uh oh!

Languages

Packages