Cloud Governance tool provides a lightweight and flexible framework for deploying cloud management policies focusing
on cost optimize and security.
We have implemented several pruning policies.
When monitoring the resources, we found that most of the cost leakage is from available volumes, unused NAT gateways,
and unattached Public IPv4 addresses (Starting from February 2024, public IPv4 addresses are chargeable whether they are
used or not).
This tool support the following policies: policy
-
Real time Openshift Cluster cost, User cost
-
instance_idle: Monitor the idle instances based on the instance metrics for the last 7 days.
- CPU Percent < 2%
- Network < 5KiB
-
instance_run: List the running ec2 instances.
-
unattached_volume: Identify and remove the available EBS volumes.
-
zombie_cluster_resource: Identify the non-live cluster resource and delete those resources by resolving dependency. We are deleting more than 20 cluster resources.
- Ebs, Snapshots, AMI, Load Balancer
- VPC, Subnets, Route tables, DHCP, Internet Gateway, NatGateway, Network Interface, ElasticIp, Network ACL, Security Group, VPC Endpoint
- S3
- IAM User, IAM Role
-
ip_unattached: Identify the unattached public IPv4 addresses.
-
zombie_snapshots: Identify the snapshots, which are abandoned by the AMI.
-
unused_nat_gateway: Identify the unused NatGateway by monitoring the active connection count.
-
s3_inactive: Identify the empty s3 buckets, causing the resource quota issues.
-
empty_roles: Identify the empty roles that do not have any attached policies to them.
-
ebs_in_use: list in use volumes.
-
tag_resources: Update cluster and non cluster resource tags fetching from the user tags or from the mandatory tags
-
tag_non_cluster: tag ec2 resources (instance, volume, ami, snapshot) by instance name
-
tag_iam_user: update the user tags from the csv file
-
cost_explorer: Get data from cost explorer and upload to ElasticSearch
-
gitleaks: scan GitHub repository git leak (security scan)
-
cost_over_usage: send mail to aws user if over usage cost
- instance_idle: Monitor the idle instances based on the
instance metrics.
- CPU Percent < 2%
- Network < 5KiB
- unattached_volume: Identify and remove the available disks.
- ip_unattached: Identify the unattached public IPv4 addresses.
- unused_nat_gateway: Identify the unused NatGateway by monitoring the active connection count.
- tag_baremetal: Tag IBM baremetal machines
- tag_vm: Tga IBM Virtual Machines machines
** You can write your own policy using Cloud-Custodian and run it (see 'custom cloud custodian policy' in Policy workflows).
Reference:
- The cloud-governance package is placed in PyPi
- The cloud-governance container image is placed in Quay.io
- The cloud-governance readthedocs link is ReadTheDocs
Table of Contents
- Installation
- Configuration
- Run AWS Policy Using Podman
- Run IBM Policy Using Podman
- Run Policy Using Pod
- Pytest
- Post Installation
# Need to run it with root privileges
sudo podman pull quay.io/cloud-governance/cloud-governance
(mandatory)AWS_ACCESS_KEY_ID=$AWS_ACCESS_KEY_ID
(mandatory)AWS_SECRET_ACCESS_KEY=$AWS_SECRET_ACCESS_KEY
(mandatory)policy=instance_idle / instance_run / ebs_unattached / ebs_in_use / tag_cluster_resource / zombie_cluster_resource / tag_ec2_resource
(mandatory)policy_output=s3://redhat-cloud-governance/logs
(mandatory policy:tag_cluster_resource)resource_name=ocs-test
(mandatory policy:tag_cluster_resource)mandatory_tags="{'Owner': 'Name','Email': '[email protected]','Purpose': 'test'}"
(mandatory policy: gitleaks)git_access_token=$git_access_token (mandatory policy: gitleaks)git_repo=https://github.com/redhat-performance/cloud-governance (optional policy: gitleaks)several_repos=yes/no (default = no)
(optional)AWS_DEFAULT_REGION=us-east-2/all (default = us-east-2)
(optional)dry_run=yes/no (default = yes)
(optional)log_level=INFO (default = INFO)
LDAP_HOST_NAME=ldap.example.com
GOOGLE_APPLICATION_CREDENTIALS=$pwd/service_account.json
- Create user with IAM
- Create a logs bucket create_bucket.sh
- Create classic infrastructure API key
# policy=instance_idle
sudo podman run --rm --name cloud-governance -e policy="instance_idle" -e AWS_ACCESS_KEY_ID="$AWS_ACCESS_KEY_ID" -e AWS_SECRET_ACCESS_KEY="$AWS_SECRET_ACCESS_KEY" -e AWS_DEFAULT_REGION="us-east-2" -e dry_run="yes" -e policy_output="s3://bucket/logs" -e log_level="INFO" "quay.io/cloud-governance/cloud-governance"
# policy=instance_run
sudo podman run --rm --name cloud-governance -e policy="instance_run" -e AWS_ACCESS_KEY_ID="$AWS_ACCESS_KEY_ID" -e AWS_SECRET_ACCESS_KEY="$AWS_SECRET_ACCESS_KEY" -e AWS_DEFAULT_REGION="us-east-2" -e dry_run="yes" -e policy_output="s3://bucket/logs" -e log_level="INFO" "quay.io/cloud-governance/cloud-governance"
# select policy ['ec2_stop', 's3_inactive', 'empty_roles', 'ip_unattached', 'unused_nat_gateway', 'zombie_snapshots']
sudo podman run --rm --name cloud-governance -e policy="policy" -e AWS_ACCESS_KEY_ID="$AWS_ACCESS_KEY_ID" -e AWS_SECRET_ACCESS_KEY="$AWS_SECRET_ACCESS_KEY" -e AWS_DEFAULT_REGION="us-east-2" -e dry_run="yes" -e log_level="INFO" "quay.io/cloud-governance/cloud-governance"
# policy=ebs_unattached
sudo podman run --rm --name cloud-governance -e policy="ebs_unattached" -e AWS_ACCESS_KEY_ID="$AWS_ACCESS_KEY_ID" -e AWS_SECRET_ACCESS_KEY="$AWS_SECRET_ACCESS_KEY" -e AWS_DEFAULT_REGION="us-east-2" -e dry_run="yes" -e policy_output="s3://bucket/logs" -e log_level="INFO" "quay.io/cloud-governance/cloud-governance"
# policy=ebs_in_use
sudo podman run --rm --name cloud-governance -e policy="ebs_in_use" -e AWS_ACCESS_KEY_ID="$AWS_ACCESS_KEY_ID" -e AWS_SECRET_ACCESS_KEY="$AWS_SECRET_ACCESS_KEY" -e AWS_DEFAULT_REGION="us-east-2" -e dry_run="yes" -e policy_output="s3://bucket/logs" -e log_level="INFO" "quay.io/cloud-governance/cloud-governance"
# policy=zombie_cluster_resource
sudo podman run --rm --name cloud-governance -e policy="zombie_cluster_resource" -e AWS_ACCESS_KEY_ID="$AWS_ACCESS_KEY_ID" -e AWS_SECRET_ACCESS_KEY="$AWS_SECRET_ACCESS_KEY" -e AWS_DEFAULT_REGION="us-east-2" -e dry_run="yes" -e resource="zombie_cluster_elastic_ip" -e cluster_tag="kubernetes.io/cluster/test-pd9qq" -e log_level="INFO" "quay.io/cloud-governance/cloud-governance"
# policy=tag_resources
sudo podman run --rm --name cloud-governance -e policy="tag_resources" -e AWS_ACCESS_KEY_ID="$AWS_ACCESS_KEY_ID" -e AWS_SECRET_ACCESS_KEY="$AWS_SECRET_ACCESS_KEY" -e AWS_DEFAULT_REGION="us-east-2" -e tag_operation="read/update/delete" -e mandatory_tags="{'Owner': 'Name','Email': '[email protected]','Purpose': 'test'}" -e log_level="INFO" -v "/etc/localtime":"/etc/localtime" "quay.io/cloud-governance/cloud-governance"
# policy=tag_non_cluster
sudo podman run --rm --name cloud-governance -e policy="tag_non_cluster" -e AWS_ACCESS_KEY_ID="$AWS_ACCESS_KEY_ID" -e AWS_SECRET_ACCESS_KEY="$AWS_SECRET_ACCESS_KEY" -e AWS_DEFAULT_REGION="us-east-2" -e tag_operation="read/update/delete" -e mandatory_tags="{'Owner': 'Name','Email': '[email protected]','Purpose': 'test'}" -e log_level="INFO" -v "/etc/localtime":"/etc/localtime" "quay.io/cloud-governance/cloud-governance"
# policy=tag_iam_user
sudo podman run --rm --name cloud-governance -e policy="tag_iam_user" -e AWS_ACCESS_KEY_ID="$AWS_ACCESS_KEY_ID" -e AWS_SECRET_ACCESS_KEY="$AWS_SECRET_ACCESS_KEY" -e user_tag_operation="read/update/delete" -e remove_tags="['Environment', 'Test']" -e username="test_username" -e file_name="tag_user.csv" -e log_level="INFO" -v "/home/user/tag_user.csv":"/tmp/tag_user.csv" --privileged "quay.io/cloud-governance/cloud-governance"
# policy=cost_explorer
sudo podman run --rm --name cloud-governance -e policy="cost_explorer" -e AWS_ACCESS_KEY_ID="$AWS_ACCESS_KEY_ID" -e AWS_SECRET_ACCESS_KEY="$AWS_SECRET_ACCESS_KEY" -e es_host="$elasticsearch_host" -e es_port="$elasticsearch_port" -e es_index="$elasticsearch_index" -e cost_metric=UnblendedCost -e start_date="$start_date" -e end_date="$end_date" -e granularity="DAILY" -e cost_explorer_tags="['User', 'Budget', 'Project', 'Manager', 'Owner', 'LaunchTime', 'Name', 'Email']" -e log_level="INFO" "quay.io/cloud-governance/cloud-governance:latest"
sudo podman run --rm --name cloud-governance -e policy="cost_explorer" -e AWS_ACCESS_KEY_ID="$AWS_ACCESS_KEY_ID" -e AWS_SECRET_ACCESS_KEY="$AWS_SECRET_ACCESS_KEY" -e es_index="elasticsearch_index" -e cost_metric="UnblendedCost" -e start_date="$start_date" -e end_date="$end_date" -e granularity="DAILY" -e cost_explorer_tags="['User', 'Budget', 'Project', 'Manager', 'Owner', 'LaunchTime', 'Name', 'Email']" -e file_name="cost_explorer.txt" -v "/home/cost_explorer.txt":"/tmp/cost_explorer.txt" -e log_level="INFO" "quay.io/cloud-governance/cloud-governance:latest"
# policy=validate_iam_user_tags
sudo podman run --rm --name cloud-governance -e policy="validate_iam_user_tags" -e AWS_ACCESS_KEY_ID="$AWS_ACCESS_KEY_ID" -e AWS_SECRET_ACCESS_KEY="$AWS_SECRET_ACCESS_KEY" -e validate_type="spaces/tags" -e user_tags="['Budget', 'User', 'Owner', 'Manager', 'Environment', 'Project']" -e log_level="INFO" "quay.io/cloud-governance/cloud-governance:latest"
# policy=gitleaks
sudo podman run --rm --name cloud-governance -e policy="gitleaks" -e git_access_token="$git_access_token" -e git_repo="https://github.com/redhat-performance/cloud-governance" -e several_repos="no" -e log_level="INFO" "quay.io/cloud-governance/cloud-governance"
# custom cloud custodian policy (path for custom policy: -v /home/user/custodian_policy:/custodian_policy)
sudo podman run --rm --name cloud-governance -e policy="/custodian_policy/policy.yml" -e AWS_ACCESS_KEY_ID="$AWS_ACCESS_KEY_ID" -e AWS_SECRET_ACCESS_KEY="$AWS_SECRET_ACCESS_KEY" -e AWS_DEFAULT_REGION="us-east-2" -e dry_run="yes" -e policy_output="s3://bucket/logs" -e log_level="INFO" -v "/home/user/custodian_policy":"/custodian_policy" --privileged "quay.io/cloud-governance/cloud-governance"
# policy=tag_baremetal
podman run --rm --name cloud-governance -e policy="tag_baremetal" -e account="$account" -e IBM_API_USERNAME="$IBM_API_USERNAME" -e IBM_API_KEY="$IBM_API_KEY" -e SPREADSHEET_ID="$SPREADSHEET_ID" -e GOOGLE_APPLICATION_CREDENTIALS="$GOOGLE_APPLICATION_CREDENTIALS" -v $GOOGLE_APPLICATION_CREDENTIALS:$GOOGLE_APPLICATION_CREDENTIALS -e LDAP_USER_HOST="$LDAP_USER_HOST" -e tag_operation="update" -e log_level="INFO" -v "/etc/localtime":"/etc/localtime" "quay.io/cloud-governance/cloud-governance:latest"
# tag=tab_vm
podman run --rm --name cloud-governance -e policy="tag_vm" -e account="$account" -e IBM_API_USERNAME="$IBM_API_USERNAME" -e IBM_API_KEY="$IBM_API_KEY" -e SPREADSHEET_ID="$SPREADSHEET_ID" -e GOOGLE_APPLICATION_CREDENTIALS="$GOOGLE_APPLICATION_CREDENTIALS" -v $GOOGLE_APPLICATION_CREDENTIALS:$GOOGLE_APPLICATION_CREDENTIALS -e LDAP_USER_HOST="$LDAP_USER_HOST" -e tag_operation="update" -e log_level="INFO" -v "/etc/localtime":"/etc/localtime" "quay.io/cloud-governance/cloud-governance:latest"
cp example.yaml env.yaml
Added the supported environment variables. example:
policy: instance_idle
AWS_ACCESS_KEY_ID: ""
AWS_SECRET_ACCESS_KEY: ""
podman run --rm --name cloud-governance \
-v "${PWD}/env.yaml":"/tmp/env.yaml" \
"quay.io/cloud-governance/cloud-governance:latest"
Job Pod: cloud-governance.yaml
Configmaps: cloud_governance_configmap.yaml
Quay.io Secret: quayio_secret.sh
AWS Secret: cloud_governance_secret.yaml
* Need to convert secret key to base64 [run_base64.py](pod_yaml/run_base64.py)
python3 -m venv governance
source governance/bin/activate
(governance) $ python -m pip install --upgrade pip
(governance) $ pip install coverage
(governance) $ pip install pytest
(governance) $ git clone https://github.com/redhat-performance/cloud-governance
(governance) $ cd cloud-governance
(governance) $ coverage run -m pytest
(governance) $ deactivate
rm -rf *governance*
sudo podman rmi quay.io/cloud-governance/cloud-governance