This repository contains two complementary, serverless Go projects that follow AWS best practices for metrics based on the CloudWatch Embedded Metric Format (EMF).
This repository is provided as a functional example. It is not intended to be a production-ready "drop-in" solution. Before using it in any live environment, you should:
- Review and adjust IAM permissions to follow the principle of least privilege
- Review encryption at rest and in transit for all resources (SQS, Lambda, logs, etc.)
- Configure VPC, subnet, and security group settings according to your network requirements
- Implement proper monitoring, alerting, and log retention lifecycles
- Be aware of any costs associated with deploying in your account(s)
Treat this sample as a starting point and customize it based on your organization’s security, reliability, and operational requirements.
An event-driven pipeline that captures control-plane API call rates and publishes RequestPerSecond metrics to CloudWatch in ~60-second intervals.
Use Case: Helps customers prevent throttling, which reduces the risk of customer-facing production incidents. Invoker-level metrics let customers identify the source of consumption so they can redistribute resources and maintain headroom for growth in requests per second (RPS).
Below is an example of an API-level RequestPerSecond metric. This metric is valuable to alarm on since it represents the total usage of an API.
Below is an example of an invoker-level RequestPerSecond metric. Each invoker has its own dedicated metric for a given API.
This metric is useful for deeper analysis into where your consumption is coming from.
A scheduled Lambda function that computes resource utilization across your account by making various describe calls, retrieving the current quotas from Service Quotas, and publishing utilization metrics (%) to CloudWatch.
Use Case: Dynamically generates utilization metrics that help prevent resource exhaustion. This is primarily a concern for larger customers and ISVs that need proactive monitoring to avoid hitting service limits.
This project captures utilization metrics for resources that do not have native CloudWatch coverage. Below is a screenshot of what the metrics look like in your namespace.
Below are the currently supported utilization metrics (with plans to add more based on customer feedback):
- EKS clusters
- GP3 Storage
- IAM Roles
- Network Interfaces
- OIDC Providers
- Network Address Units (VPC level)
Error Count is an automatically created metric that lets you build notification workflows for any errors that occur while the solution runs.
    cmd/ # entry point location for each project
        emf-extension/ # Lambda extension
            main.go
        ratelimit/ # rate limit solution
            main.go
        resourcequota/ # resource quota solution
            main.go
    infra/ # folder for deploying via CloudFormation and Terraform
        /cloudformation
            /ratelimit
                template.yaml
            /resourcequota
                template.yaml
        /terraform
            /ratelimit
                main.tf
                variables.tf
            /resourcequota
                main.tf
                variables.tf
    internal/ # folder for internal libraries used
    lambda-layer/ # directory for Lambda layer
Please navigate to each project's README file for more details.