This repository uses Pulumi to provision cloud resources for the Hubverse, a project that provides open tools for collaborative modeling: https://hubverse.io/en/latest/.
At this time, the Hubverse will provide hosting for hubs that opt in to cloud storage. This may change in the future, but as the Hubverse Cloud work gets started, we want to minimize onboarding friction for hub administrators.
This repository contains the Pulumi project that provisions the AWS resources for each cloud-enabled hub hosted on the Hubverse AWS account.
At this time, Amazon Web Services (AWS) is the only cloud provider supported by Hubverse hosting.
The code in this repository creates two categories of AWS resources:
- Resources that are shared across all hubs or are used for Hubverse administration.
- Resources created specifically for each hub.
These resources are created one time for the entire Hubverse:
- An S3 bucket to store shared files. The contents are not publicly accessible because they are for internal use.
- An AWS Lambda function that transforms model-output files into a standardized format.
Each cloud-enabled hub requires several dedicated AWS resources. These resources are created for each hub:
- An S3 bucket to store data (with public read access).
- An IAM role that can be assumed by GitHub Actions. This role has two associated policies:
  - A trust policy that stipulates the role can only be used by GitHub Actions that originate from the main branch of the hub's repository.
  - A permission policy that grants write access to the hub's S3 bucket.
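
To make the trust policy concrete, here is a minimal sketch of how such a policy document could be built in Python. It assumes the role trusts GitHub's OIDC identity provider (token.actions.githubusercontent.com) and restricts the token's sub claim to the main branch. The function name and the account ID are hypothetical, and the policy this repository actually creates may differ in its details.

```python
import json


def github_actions_trust_policy(account_id: str, org: str, repo: str) -> dict:
    """Build an IAM trust policy limited to GitHub Actions runs on a repo's main branch.

    Hypothetical helper for illustration; not taken from this repository's code.
    """
    oidc_host = "token.actions.githubusercontent.com"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Trust the GitHub OIDC identity provider registered in the account.
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{oidc_host}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        f"{oidc_host}:aud": "sts.amazonaws.com",
                        # Only workflow runs from the main branch of org/repo match.
                        f"{oidc_host}:sub": f"repo:{org}/{repo}:ref:refs/heads/main",
                    }
                },
            }
        ],
    }


if __name__ == "__main__":
    # 123456789012 is a placeholder account ID, not the Hubverse account.
    policy = github_actions_trust_policy("123456789012", "cdcepi", "FluSight-forecast-hub")
    print(json.dumps(policy, indent=2))
```

Because the sub condition embeds both the organization and the repository name, a workflow running from a fork or from another branch would not be able to assume the role.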
Currently, this infrastructure repository is not integrated with actual hub repositories. In other words, it doesn't pull information (e.g., the S3 bucket name) from a hub's admin.json configuration file. Thus, to host a hub in the Hubverse AWS account, we need to manually add its information to hubs.yaml.
- Add a new hub entry to hubs.yaml:
  - hub key: the name of the hub (this value will be used as the S3 bucket name, so make sure it's something unique)
  - org: the GitHub organization that hosts the hub's repository
  - repo: the name of the hub's repository

  For example:

      - hub: flusight-forecast
        org: cdcepi
        repo: FluSight-forecast-hub

- Submit the above changes as a pull request (PR) to this repository.

- Shortly after the PR is opened, Pulumi will add a comment about the AWS changes it will make once the PR is merged. For example:

        Name                                             Type                                                    Operation
      + flusight-forecast                                aws:iam/role:Role                                       create
      + flusight-forecast-write-bucket-policy            aws:iam/policy:Policy                                   create
      + flusight-forecast-allow                          aws:lambda/permission:Permission                        create
      + flusight-forecast-transform-model-output-lambda  aws:iam/rolePolicyAttachment:RolePolicyAttachment       create
      + flusight-forecast                                aws:s3/bucket:Bucket                                    create
      + flusight-forecast-read-bucket-policy             aws:s3/bucketPolicy:BucketPolicy                        create
      + flusight-forecast                                aws:iam/rolePolicyAttachment:RolePolicyAttachment       create
      + flusight-forecast-public-access-block            aws:s3/bucketPublicAccessBlock:BucketPublicAccessBlock  create
      + flusight-forecast-create-notification            aws:s3/bucketNotification:BucketNotification            create

- If the Pulumi preview looks good, the PR can be merged after a code review. Once the PR is merged, Pulumi will apply the AWS changes.

- The hub is now hosted in the Hubverse AWS account.
Important: The org and repo fields are used to create permissions that allow the hub's GitHub workflow to sync data to S3. If these values are not correct, the workflow will fail.
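
Since a malformed entry only surfaces later as a failed workflow, it can help to check entries before opening the PR. The sketch below is one possible pre-check, not something this repository is known to include: it assumes each hubs.yaml entry has already been parsed into a dict, and the function name validate_hub_entry is hypothetical. It verifies that the three keys are present and that the hub value satisfies the basic S3 bucket naming rules. It cannot confirm that org and repo point at the right GitHub repository.

```python
import re

REQUIRED_KEYS = ("hub", "org", "repo")
# General-purpose S3 bucket names: 3-63 characters made of lowercase letters,
# digits, hyphens and periods, beginning and ending with a letter or digit.
BUCKET_NAME = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")


def validate_hub_entry(entry: dict) -> list[str]:
    """Return a list of problems found in one hubs.yaml entry (empty if none)."""
    problems = []
    for key in REQUIRED_KEYS:
        value = entry.get(key)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"missing or empty '{key}'")
    hub = entry.get("hub")
    # Only check the bucket-name shape when a hub value was actually supplied.
    if isinstance(hub, str) and hub.strip() and not BUCKET_NAME.fullmatch(hub):
        problems.append(f"'{hub}' is not a valid S3 bucket name")
    return problems


if __name__ == "__main__":
    entry = {"hub": "flusight-forecast", "org": "cdcepi", "repo": "FluSight-forecast-hub"}
    print(validate_hub_entry(entry))
```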
The code here uses a simple .yaml file that lists the cloud-enabled hubs. For each hub on the list, the Pulumi entry point invokes a Python function that provisions the required AWS resources.
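
The pattern described above can be sketched as a loop over the parsed hub list. This is an illustration only: provision_hubs and create_hub are hypothetical names rather than the functions used in this repository, and the hub list is inlined instead of being read from hubs.yaml so the example runs without extra dependencies.

```python
from typing import Callable, Iterable


def provision_hubs(
    hubs: Iterable[dict], create_hub: Callable[[str, str, str], None]
) -> list[str]:
    """Call create_hub once per entry and return the hub names that were processed."""
    processed = []
    for entry in hubs:
        # Each entry mirrors one item in hubs.yaml: hub, org and repo.
        create_hub(entry["hub"], entry["org"], entry["repo"])
        processed.append(entry["hub"])
    return processed


if __name__ == "__main__":
    # In the real project this list would come from parsing hubs.yaml.
    hubs = [{"hub": "flusight-forecast", "org": "cdcepi", "repo": "FluSight-forecast-hub"}]

    def create_hub(hub: str, org: str, repo: str) -> None:
        # Stand-in for the function that declares the bucket, IAM role and policies.
        print(f"provisioning {hub} for {org}/{repo}")

    provision_hubs(hubs, create_hub)
```

Keeping the hub list in a data file and the resource definitions in one function means that onboarding a hub is a data change rather than a code change.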
This repo uses two GitHub workflows to manage Hubverse AWS resources. Each workflow assumes an IAM role with the permissions it needs (via GitHub's OIDC identity provider).
Because the IAM roles below are used by Pulumi itself, they are not managed by Pulumi. Instead, they were created manually in the Hubverse AWS account.
GitHub Workflow | Trigger | Hubverse AWS Role Assumed |
---|---|---|
pulumi_preview.yaml | PR to main & ad-hoc | hubverse-infrastructure-read-role |
pulumi_update.yaml | merge to main | hubverse-infrastructure-write-role |
If you're a Hubverse developer who wants to use Pulumi locally (using Pulumi's CLI, for example), you will need access to AWS credentials with the same permissions used by the GitHub workflows.
If you get a 403 error from the pulumi_update GitHub action (or when running pulumi up manually), it's likely the Pulumi code is trying to make a change that the AWS IAM role hubverse-infrastructure-write-role doesn't have permission to make.
What hubverse-infrastructure-write-role is allowed to do is described by the IAM policy attached to it: hubverse-infrastructure-write-policy. Thus, to grant additional permissions required for Pulumi operations, you will need to update hubverse-infrastructure-write-policy via the AWS console:
- Log in to the AWS console.
- Click on Services in the top left corner, and then click on IAM.
- From the IAM dashboard, find the Access management section in the left-hand menu and click on Policies.
- When the list of policies appears, use the search box to find hubverse-infrastructure-write-policy and click on it.
- Click the Edit button to update the policy.
Note: To make these changes, you will need to have a Hubverse AWS login with console permission and with policy update permissions.
- Make sure you have the required version of Python installed on your machine (see .python-version).

  Note: pyenv is a good tool for managing multiple versions of Python on a single machine.

- Clone this repository and navigate to the project directory.

- Make sure your machine's current Python interpreter is set to the project's required version of Python, and then create a virtual environment. You can use any third-party tool that manages Python environments (e.g., pipenv, poetry), or you can use Python's built-in venv module (make sure you're at the top of the project directory):

      python -m venv .venv

- Activate the virtual environment. If you created the environment using the venv command above, you can activate it as follows:

      source .venv/bin/activate

- Install the project's dependencies:

      pip install -r requirements/dev-requirements.txt && pip install -e .
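
If you want to confirm that the active interpreter matches .python-version before creating the environment, a small check along these lines may help. It is a sketch under the assumption that .python-version contains a plain dotted version such as 3.12 or 3.12.1 (not a named pyenv build), and matches_required_version is a hypothetical helper, not part of this project.

```python
import sys
from pathlib import Path


def matches_required_version(required: str, current: tuple = tuple(sys.version_info[:3])) -> bool:
    """Check whether an interpreter version matches a .python-version style string."""
    parts = tuple(int(p) for p in required.strip().split("."))
    # Compare only as many components as the file specifies (e.g. "3.12" or "3.12.1").
    return tuple(current[: len(parts)]) == parts


if __name__ == "__main__":
    version_file = Path(".python-version")
    if version_file.exists():
        required = version_file.read_text().strip()
        status = "matches" if matches_required_version(required) else "does not match"
        print(f"Running Python {sys.version.split()[0]}, which {status} {required}")
    else:
        print("No .python-version file found in the current directory")
```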
This project uses pip-tools to generate requirements files from pyproject.toml. To add new dependencies, you will need to install pip-tools into your virtual environment (or use pipx to make it available on your machine globally).
To add a new dependency:
- Add the dependency to the dependencies section of pyproject.toml (if it's a dev dependency, add it to the dev section of [project.optional-dependencies]).

- Regenerate the requirements.txt file (you can skip this if you've only added a dev dependency):

      pip-compile --output-file=requirements/requirements.txt pyproject.toml

- Regenerate the dev-requirements.txt file (you will need to do this every time, even if you haven't added a dev dependency):

      pip-compile --extra=dev --output-file=requirements/dev-requirements.txt pyproject.toml

- Install the updated dependencies into your virtual environment:

      pip install -r requirements/dev-requirements.txt