Lambda-Promtail (grafana#2282)
* specs out lambda-promtail

* init fn

* udpates readme/template

* lambda-promtail docs

* lambda promtail includes source timestamp and uses context

* non markdown links

* new doc structure
owen-d authored Jul 29, 2020
1 parent d93b410 commit 2a596a7
Showing 81 changed files with 4,663 additions and 1 deletion.
3 changes: 2 additions & 1 deletion .gitignore
@@ -23,4 +23,5 @@ dlv
rootfs/
dist
coverage.txt
.DS_Store
.DS_Store
.aws-sam
4 changes: 4 additions & 0 deletions CHANGELOG.md
@@ -1,3 +1,7 @@
## Unreleased (Master)

* [2282](https://github.com/grafana/loki/pull/2282) **owen-d**: introduces a [lambda-promtail](https://github.com/grafana/loki/blob/master/docs/clients/lambda-promtail/README.md) workflow for shipping Cloudwatch logs to Loki.

## 1.5.0 (2020-05-20)

It's been a busy month and a half since 1.4.0 was released, and a lot of new improvements have been added to Loki since!
7 changes: 7 additions & 0 deletions docs/sources/clients/_index.md
@@ -11,6 +11,7 @@ Loki supports the following official clients for sending logs:
- [Fluentd](fluentd/)
- [Fluent Bit](fluentbit/)
- [Logstash](logstash/)
- [Lambda Promtail](lambda-promtail/)

## Picking a client

@@ -51,6 +52,12 @@ Prometheus plugin.
If you are already using logstash and/or beats, this will be the easiest way to start.
By adding our output plugin you can quickly try Loki without doing big configuration changes.

### Lambda Promtail

This is a workflow combining the promtail push-api [scrape config](./promtail/configuration#loki_push_api_config) and the [lambda-promtail](../../tools/lambda-promtail/) AWS Lambda function which pipes logs from Cloudwatch to Loki.

This is a good choice if you're looking to try out Loki in a low-footprint way or if you wish to monitor AWS lambda logs in Loki.

# Unofficial clients

Please note that the Loki API is not stable yet, so breaking changes might occur
84 changes: 84 additions & 0 deletions docs/sources/clients/lambda-promtail/_index.md
@@ -0,0 +1,84 @@
# Lambda Promtail

Loki includes an [AWS SAM](https://aws.amazon.com/serverless/sam/) package template, located [here](../../../tools/lambda-promtail/), for shipping Cloudwatch logs to Loki via a set of promtails. It works through an intermediary [lambda function](https://aws.amazon.com/lambda/) which processes Cloudwatch events and propagates them to a promtail instance (or a set of instances behind a load balancer) via the push-api [scrape config](docs/clients/promtail/configuration#loki_push_api_config).

## Uses

### Ephemeral Jobs

This workflow is intended to be an effective approach for monitoring ephemeral jobs, such as those run on AWS Lambda, which are otherwise hard or impossible to monitor via one of the other Loki [clients](../).

Ephemeral jobs can quite easily run afoul of cardinality best practices. During high request load, an AWS lambda function might balloon in concurrency, creating many log streams in Cloudwatch. However, these may only be active for a very short while. This creates a problem for combining these short-lived log streams in Loki because timestamps may not strictly increase across multiple log streams. The other obvious route is creating labels based on log streams, which is also undesirable because it leads to cardinality problems via many low-throughput log streams.

Instead, we can pipeline Cloudwatch logs to a set of promtails, which can mitigate these problems in two ways:

1) Using promtail's push api along with the `use_incoming_timestamp: false` config, we let promtail determine the timestamp based on when it ingests the logs, not the timestamp assigned by cloudwatch. Obviously, this means that we lose the origin timestamp because promtail now assigns it, but this is a relatively small difference in a real time ingestion system like this.
2) In conjunction with (1), promtail can coalesce logs across Cloudwatch log streams because it's no longer susceptible to `out-of-order` errors when combining multiple sources (lambda invocations).
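
The two points above can be illustrated with a small, self-contained sketch. It does not use any Loki or promtail code: the `entry` type, `inOrder`, `restamp`, `sampleMerged`, and `fakeClock` are hypothetical names invented for this example, and the fake clock merely stands in for promtail's arrival time.

```go
package main

import (
	"fmt"
	"time"
)

// entry is a hypothetical stand-in for a single log line with its timestamp.
type entry struct {
	ts   time.Time
	line string
}

// inOrder reports whether timestamps never decrease from one entry to the next.
func inOrder(entries []entry) bool {
	for i := 1; i < len(entries); i++ {
		if entries[i].ts.Before(entries[i-1].ts) {
			return false
		}
	}
	return true
}

// restamp overwrites every timestamp with the value returned by now,
// imitating ingestion-time stamping (use_incoming_timestamp: false).
func restamp(entries []entry, now func() time.Time) []entry {
	out := make([]entry, len(entries))
	for i, e := range entries {
		out[i] = entry{ts: now(), line: e.line}
	}
	return out
}

// sampleMerged returns two Cloudwatch log streams merged in arrival order;
// the second stream carries an older origin timestamp than the first.
func sampleMerged() []entry {
	base := time.Date(2020, 7, 29, 12, 0, 0, 0, time.UTC)
	return []entry{
		{ts: base.Add(2 * time.Second), line: "stream A: start"},
		{ts: base.Add(3 * time.Second), line: "stream A: end"},
		{ts: base.Add(1 * time.Second), line: "stream B: start"},
	}
}

// fakeClock returns a clock that advances by one millisecond per call.
func fakeClock() func() time.Time {
	tick := time.Date(2020, 7, 29, 12, 0, 5, 0, time.UTC)
	return func() time.Time {
		tick = tick.Add(time.Millisecond)
		return tick
	}
}

func main() {
	merged := sampleMerged()
	fmt.Println("origin timestamps in order:", inOrder(merged))
	fmt.Println("ingestion timestamps in order:", inOrder(restamp(merged, fakeClock())))
}
```

With origin timestamps the merged sequence is not monotonic, which is the situation that produces `out-of-order` errors; once every entry is stamped on arrival, the same sequence is ordered by construction.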

One important aspect to keep in mind when running with a set of promtails behind a load balancer is that we're effectively moving the cardinality problems from the `number_of_log_streams` -> `number_of_promtails`. You'll need to assign a promtail specific label on each promtail so that you don't run into `out-of-order` errors when the promtails send data for the same log groups to Loki. This can easily be done via a config like `--client.external-labels=promtail=${HOSTNAME}` passed to promtail.

### Proof of concept Loki deployments

For those using Cloudwatch and wishing to test out Loki in a low-risk way, this workflow allows piping Cloudwatch logs to Loki regardless of the event source (EC2, Kubernetes, Lambda, ECS, etc) without setting up a set of promtail daemons across their infrastructure. However, running promtail as a daemon on your infrastructure is the best-practice deployment strategy in the long term for flexibility, reliability, performance, and cost.

Note: Propagating logs from Cloudwatch to Loki means you'll still need to _pay_ for Cloudwatch.

## Propagated Labels

Incoming logs will have three special labels assigned to them which can be used in [relabeling](../promtail#relabel_config) or later stages in a promtail [pipeline](../promtail/pipelines):

- `__aws_cloudwatch_log_group`: The associated Cloudwatch Log Group for this log.
- `__aws_cloudwatch_log_stream`: The associated Cloudwatch Log Stream for this log.
- `__aws_cloudwatch_owner`: The AWS ID of the owner of this event.
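
As a rough illustration of what a relabeling step does with these labels, the sketch below copies one source label into a target label and then drops everything prefixed with `__`. It assumes the Prometheus-style convention that `__`-prefixed labels are discarded after relabeling; the `relabel` function is hypothetical and is not promtail's implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// relabel keeps every label that does not start with "__" and, if the source
// label is present, stores its value under the target name.
func relabel(labels map[string]string, source, target string) map[string]string {
	out := make(map[string]string, len(labels))
	for name, value := range labels {
		if !strings.HasPrefix(name, "__") {
			out[name] = value
		}
	}
	if value, ok := labels[source]; ok {
		out[target] = value
	}
	return out
}

func main() {
	incoming := map[string]string{
		"__aws_cloudwatch_log_group":  "/aws/lambda/demo", // example value
		"__aws_cloudwatch_log_stream": "2020/07/29/[$LATEST]abc",
		"__aws_cloudwatch_owner":      "123456789012",
		"promtail":                    "lambda-promtail",
	}
	fmt.Println(relabel(incoming, "__aws_cloudwatch_log_group", "log_group"))
}
```

Only the log group survives as `log_group` here, so the high-cardinality log stream name never becomes a Loki label.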

## Limitations

### Promtail labels

As stated earlier, this workflow moves the worst case stream cardinality from `number_of_log_streams` -> `number_of_log_groups` * `number_of_promtails`. For this reason, each promtail must have a unique label attached to logs it processes (ideally via something like `--client.external-labels=promtail=${HOSTNAME}`) and it's advised to run a small number of promtails behind a load balancer according to your throughput and redundancy needs.

This trade-off is very effective when you have a large number of log streams but want to aggregate them by the log group. This is very common in AWS Lambda, where log groups are the "application" and log streams are the individual application containers which are spun up and down at a whim, possibly just for a single function invocation.
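
To make the trade-off concrete, here is a small arithmetic sketch. The numbers are made up for illustration and `worstCaseStreams` is a hypothetical helper, not part of Loki.

```go
package main

import "fmt"

// worstCaseStreams returns the upper bound on Loki streams when streams are
// keyed by log group and each promtail attaches its own unique label.
func worstCaseStreams(logGroups, promtails int) int {
	return logGroups * promtails
}

func main() {
	// Hypothetical deployment: 5 Lambda functions (log groups) that produced
	// 2000 short-lived log streams, served by 3 promtails behind a load balancer.
	logGroups, logStreams, promtails := 5, 2000, 3

	fmt.Println("streams if labelled per log stream:", logStreams)
	fmt.Println("worst case with this workflow:", worstCaseStreams(logGroups, promtails))
}
```

The bound grows with the number of promtails, which is why a small fleet behind the load balancer is advised.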

### Data Persistence

#### Availability

For availability concerns, run a set of promtails behind a load balancer.

#### Batching

Since promtail batches writes to Loki for performance, it's possible that promtail will receive a log, issue a successful `204` HTTP status code for the write, then be killed before it writes upstream to Loki. This should be rare, but it is a downside of this workflow.

### Templating

The current SAM template is rudimentary. If you need to add VPC configs, extra log groups to monitor, subnet declarations, etc., you'll need to edit the template manually. Currently this requires pulling the Loki source.

## Example Promtail Config

Note: this should be run in conjunction with a promtail-specific label attached, ideally via a flag argument like `--client.external-labels=promtail=${HOSTNAME}`. It will receive writes via the push-api on ports `3500` (http) and `3600` (grpc).

```yaml
server:
  http_listen_port: 9080
  grpc_listen_port: 0

positions:
  filename: /tmp/positions.yaml

clients:
  - url: http://ip_or_hostname_where_Loki_run:3100/loki/api/v1/push

scrape_configs:
  - job_name: push1
    loki_push_api:
      server:
        http_listen_port: 3500
        grpc_listen_port: 3600
      labels:
        # Adds a label on all streams indicating it was processed by the lambda-promtail workflow.
        promtail: 'lambda-promtail'
    relabel_configs:
      # Maps the cloudwatch log group into a label called `log_group` for use in Loki.
      - source_labels: ['__aws_cloudwatch_log_group']
        target_label: 'log_group'
```
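
For experimenting with a push target like the one configured above, it can help to see what a push request body might look like. The sketch below only builds a JSON body in the shape used by Loki's `/loki/api/v1/push` endpoint and prints it; it sends nothing. It assumes the promtail push target accepts that JSON format, which may not hold for every promtail version (the Lambda in this commit sends snappy-compressed protobuf instead). `buildPushBody`, the label values, and the `localhost` address are illustrative assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

// pushStream is one stream in the JSON push format: a label set plus
// [timestamp, line] pairs, where the timestamp is Unix nanoseconds as a string.
type pushStream struct {
	Stream map[string]string `json:"stream"`
	Values [][2]string       `json:"values"`
}

// pushRequest is the top-level JSON push body.
type pushRequest struct {
	Streams []pushStream `json:"streams"`
}

// buildPushBody encodes a single log line with the given labels and timestamp.
func buildPushBody(labels map[string]string, tsNanos int64, line string) ([]byte, error) {
	req := pushRequest{Streams: []pushStream{{
		Stream: labels,
		Values: [][2]string{{strconv.FormatInt(tsNanos, 10), line}},
	}}}
	return json.Marshal(req)
}

func main() {
	body, err := buildPushBody(map[string]string{"job": "demo"}, 1596024000000000000, "hello")
	if err != nil {
		panic(err)
	}
	// The body would be POSTed with Content-Type: application/json to an
	// address such as http://localhost:3500/loki/api/v1/push (assumed host).
	fmt.Println(string(body))
}
```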
1 change: 1 addition & 0 deletions go.mod
@@ -3,6 +3,7 @@ module github.com/grafana/loki
go 1.14

require (
github.com/aws/aws-lambda-go v1.17.0
github.com/blang/semver v3.5.1+incompatible // indirect
github.com/bmatcuk/doublestar v1.2.2
github.com/c2h5oh/datasize v0.0.0-20200112174442-28bbd4740fee
4 changes: 4 additions & 0 deletions go.sum
@@ -163,7 +163,10 @@ github.com/asaskevich/govalidator v0.0.0-20190424111038-f61b66f89f4a h1:idn718Q4
github.com/asaskevich/govalidator v0.0.0-20190424111038-f61b66f89f4a/go.mod h1:lB+ZfQJz7igIIfQNfa7Ml4HSf2uFQQRzpGGRXenZAgY=
github.com/asaskevich/govalidator v0.0.0-20200108200545-475eaeb16496 h1:zV3ejI06GQ59hwDQAvmK1qxOQGB3WuVTRoY0okPTAv0=
github.com/asaskevich/govalidator v0.0.0-20200108200545-475eaeb16496/go.mod h1:oGkLhpf+kjZl6xBf758TQhh5XrAeiJv/7FRz/2spLIg=
github.com/aws/aws-lambda-go v1.13.3 h1:SuCy7H3NLyp+1Mrfp+m80jcbi9KYWAs9/BXwppwRDzY=
github.com/aws/aws-lambda-go v1.13.3/go.mod h1:4UKl9IzQMoD+QF79YdCuzCwp8VbmG4VAQwij/eHl5CU=
github.com/aws/aws-lambda-go v1.17.0 h1:Ogihmi8BnpmCNktKAGpNwSiILNNING1MiosnKUfU8m0=
github.com/aws/aws-lambda-go v1.17.0/go.mod h1:FEwgPLE6+8wcGBTe5cJN3JWurd1Ztm9zN4jsXsjzKKw=
github.com/aws/aws-sdk-go v1.15.78/go.mod h1:E3/ieXAlvM0XWO57iftYVDLLvQ824smPP3ATZkfNZeM=
github.com/aws/aws-sdk-go v1.17.7/go.mod h1:KmX6BPdI08NWTb3/sm4ZGu5ShLoqVDhKgpiN924inxo=
github.com/aws/aws-sdk-go v1.22.4/go.mod h1:KmX6BPdI08NWTb3/sm4ZGu5ShLoqVDhKgpiN924inxo=
@@ -1227,6 +1230,7 @@ github.com/ugorji/go/codec v1.1.7 h1:2SvQaVZ1ouYrrKKwoSk2pzd4A9evlKJb9oTL+OaLUSs
github.com/ugorji/go/codec v1.1.7/go.mod h1:Ax+UKWsSmolVDwsd+7N3ZtXu+yMGCf907BLYF3GoBXY=
github.com/urfave/cli v1.20.0/go.mod h1:70zkFmudgCuE/ngEzBv17Jvp/497gISqfk5gWijbERA=
github.com/urfave/cli v1.22.1/go.mod h1:Gos4lmkARVdJ6EkW0WaNv/tZAAMe9V7XWyB60NtXRu0=
github.com/urfave/cli/v2 v2.1.1/go.mod h1:SE9GqnLQmjVa0iPEY0f1w3ygNIYcIJ0OKPMoW2caLfQ=
github.com/vektah/gqlparser v1.1.2/go.mod h1:1ycwN7Ij5njmMkPPAOaRFY4rET2Enx7IkVv3vaXspKw=
github.com/weaveworks/common v0.0.0-20200206153930-760e36ae819a/go.mod h1:6enWAqfQBFrE8X/XdJwZr8IKgh1chStuFR0mjU/UOUw=
github.com/weaveworks/common v0.0.0-20200625145055-4b1847531bc9 h1:dNVIG9aKQHR9T4uYAC4YxmkHHryOsfTwsL54WrS7u28=
15 changes: 15 additions & 0 deletions tools/lambda-promtail/Makefile
@@ -0,0 +1,15 @@
UNAME_S := $(shell uname -s)
LOCAL_PORT ?= 8080
ifeq ($(UNAME_S),Linux)
LOCAL_ENDPOINT=http://localhost:$(LOCAL_PORT)/loki/api/v1/push
else
LOCAL_ENDPOINT=http://host.docker.internal:$(LOCAL_PORT)/loki/api/v1/push
endif

.PHONY: build

build:
	sam build

dry-run:
	echo $$(sam local generate-event cloudwatch logs) | sam local invoke LambdaPromtailFunction -e - --parameter-overrides PromtailAddress=$(LOCAL_ENDPOINT)
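
The endpoint selection in this Makefile can be mirrored in Go, which may be handy when writing a local test harness. This is only a sketch: `localEndpoint` is a hypothetical helper, and the default port `8080` is taken from `LOCAL_PORT` above.

```go
package main

import (
	"fmt"
	"runtime"
)

// localEndpoint mirrors the Makefile's choice of host: localhost on Linux,
// host.docker.internal on every other operating system.
func localEndpoint(goos string, port int) string {
	host := "host.docker.internal"
	if goos == "linux" {
		host = "localhost"
	}
	return fmt.Sprintf("http://%s:%d/loki/api/v1/push", host, port)
}

func main() {
	fmt.Println(localEndpoint(runtime.GOOS, 8080))
}
```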
120 changes: 120 additions & 0 deletions tools/lambda-promtail/README.md
@@ -0,0 +1,120 @@
# lambda-promtail

This is a sample template for lambda-promtail. Below is a brief explanation of what is included:

```bash
.
├── Makefile <-- Make to automate build
├── README.md <-- This instructions file
├── lambda-promtail <-- Source code for a lambda function
│ └── main.go <-- Lambda function code
└── template.yaml
```

## Requirements

* AWS CLI already configured with Administrator permission
* [Docker installed](https://www.docker.com/community-edition)
* [Golang](https://golang.org)
* SAM CLI - [Install the SAM CLI](https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/serverless-sam-cli-install.html)

## Setup process

### Installing dependencies & building the target

In this example we use the built-in `sam build` to automatically download all the dependencies and package our build target.
Read more about [SAM Build here](https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/sam-cli-command-reference-sam-build.html)

The `sam build` command is wrapped inside the `Makefile`. To execute it, simply run:

```shell
make
```

### Local development

**Invoking function locally**

```bash
make dry-run
```

## Packaging and deployment

The AWS Lambda Golang runtime requires a flat folder containing the executable generated by the build step. SAM uses the `CodeUri` property to locate the application:

```bash
make build
```

To deploy your application for the first time, first make sure you've set the following parameters in the template:
- `LogGroup`
- `PromtailAddress`
- `ReservedConcurrency`

These can also be set via overrides by passing the following argument to `sam deploy`:
```
--parameter-overrides  Optional. A string that contains AWS CloudFormation
                       parameter overrides encoded as key=value pairs. For example,
                       'ParameterKey=KeyPairName,ParameterValue=MyKey ParameterKey=InstanceType,ParameterValue=t1.micro'
                       or KeyPairName=MyKey InstanceType=t1.micro

Also, if your deployment requires a VPC configuration, make sure to edit the `VpcConfig` field in the `template.yaml` manually.

Then run the following in your shell:

```bash
sam deploy --guided --capabilities CAPABILITY_IAM,CAPABILITY_NAMED_IAM --parameter-overrides PromtailAddress=<>,LogGroup=<>
```

The command will package and deploy your application to AWS, with a series of prompts:

* **Stack Name**: The name of the stack to deploy to CloudFormation. This should be unique to your account and region, and a good starting point would be something matching your project name.
* **AWS Region**: The AWS region you want to deploy your app to.
* **Confirm changes before deploy**: If set to yes, any change sets will be shown to you before execution for manual review. If set to no, the AWS SAM CLI will automatically deploy application changes.
* **Allow SAM CLI IAM role creation**: Many AWS SAM templates, including this example, create AWS IAM roles required for the included AWS Lambda function(s) to access AWS services. By default, these are scoped down to the minimum required permissions. To deploy an AWS CloudFormation stack which creates or modifies IAM roles, the `CAPABILITY_IAM` value for `capabilities` must be provided. If permission isn't provided through this prompt, you must explicitly pass `--capabilities CAPABILITY_IAM` to the `sam deploy` command to deploy this example.
* **Save arguments to samconfig.toml**: If set to yes, your choices will be saved to a configuration file inside the project, so that in the future you can just re-run `sam deploy` without parameters to deploy changes to your application.

# Appendix

### Golang installation

Please ensure Go 1.x (where 'x' is the latest version) is installed as per the instructions on the official golang website: https://golang.org/doc/install

A quick way to get started is to use Homebrew, Chocolatey, or your Linux package manager.

#### Homebrew (Mac)

Issue the following command from the terminal:

```shell
brew install golang
```

If it's already installed, run the following command to ensure it's the latest version:

```shell
brew update
brew upgrade golang
```

#### Chocolatey (Windows)

Issue the following command from PowerShell:

```shell
choco install golang
```

If it's already installed, run the following command to ensure it's the latest version:

```shell
choco upgrade golang
```

## Limitations
- Error handling: If promtail is unresponsive, `lambda-promtail` will drop logs after `retry_count`, which defaults to 2.
- AWS does not support passing log lines over 256 KB to lambdas.
102 changes: 102 additions & 0 deletions tools/lambda-promtail/lambda-promtail/main.go
@@ -0,0 +1,102 @@
package main

import (
	"bufio"
	"bytes"
	"context"
	"errors"
	"fmt"
	"io"
	"net/http"
	"net/url"
	"os"

	"github.com/aws/aws-lambda-go/events"
	"github.com/aws/aws-lambda-go/lambda"
	"github.com/cortexproject/cortex/pkg/util"
	"github.com/gogo/protobuf/proto"
	"github.com/golang/snappy"
	"github.com/grafana/loki/pkg/logproto"
	"github.com/prometheus/common/model"
)

const (
	// We use snappy-encoded protobufs over http by default.
	contentType = "application/x-protobuf"

	// maxErrMsgLen caps how much of an error response body is read back.
	maxErrMsgLen = 1024
)

// promtailAddress is the promtail push-api endpoint, parsed once at startup.
var promtailAddress *url.URL

func init() {
	addr := os.Getenv("PROMTAIL_ADDRESS")
	if addr == "" {
		panic(errors.New("required environmental variable PROMTAIL_ADDRESS not present"))
	}
	var err error
	promtailAddress, err = url.Parse(addr)
	if err != nil {
		panic(err)
	}
}

// handler converts a Cloudwatch logs event into a single stream and pushes it to promtail.
func handler(ctx context.Context, ev events.CloudwatchLogsEvent) error {
	data, err := ev.AWSLogs.Parse()
	if err != nil {
		return err
	}

	stream := logproto.Stream{
		Labels: model.LabelSet{
			model.LabelName("__aws_cloudwatch_log_group"):  model.LabelValue(data.LogGroup),
			model.LabelName("__aws_cloudwatch_log_stream"): model.LabelValue(data.LogStream),
			model.LabelName("__aws_cloudwatch_owner"):      model.LabelValue(data.Owner),
		}.String(),
		Entries: make([]logproto.Entry, 0, len(data.LogEvents)),
	}

	for _, entry := range data.LogEvents {
		stream.Entries = append(stream.Entries, logproto.Entry{
			Line: entry.Message,
			// The source timestamp is forwarded as-is. With `use_incoming_timestamp: false`,
			// promtail ignores it and assigns its own timestamp at ingestion time.
			Timestamp: util.TimeFromMillis(entry.Timestamp),
		})
	}

	buf, err := proto.Marshal(&logproto.PushRequest{
		Streams: []logproto.Stream{stream},
	})
	if err != nil {
		return err
	}

	// Push to promtail
	buf = snappy.Encode(nil, buf)
	req, err := http.NewRequest("POST", promtailAddress.String(), bytes.NewReader(buf))
	if err != nil {
		return err
	}
	req.Header.Set("Content-Type", contentType)

	resp, err := http.DefaultClient.Do(req.WithContext(ctx))
	if err != nil {
		return err
	}
	// Close the body so the underlying connection can be reused or released.
	defer resp.Body.Close()

	if resp.StatusCode/100 != 2 {
		// Surface the first line of the response body (bounded) in the returned error.
		scanner := bufio.NewScanner(io.LimitReader(resp.Body, maxErrMsgLen))
		line := ""
		if scanner.Scan() {
			line = scanner.Text()
		}
		err = fmt.Errorf("server returned HTTP status %s (%d): %s", resp.Status, resp.StatusCode, line)
	}

	return err
}

func main() {
	lambda.Start(handler)
}
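
For readers unfamiliar with the event format, the decoding that `ev.AWSLogs.Parse()` performs can be approximated with the standard library alone. The sketch below assumes the usual Cloudwatch Logs subscription encoding (base64 over gzip over JSON) and models only the fields the handler uses; the `logsData` and `logEvent` types and the `encode`/`decode` helpers are hypothetical stand-ins, not the `aws-lambda-go` types.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// logEvent models one Cloudwatch log event (timestamp in milliseconds).
type logEvent struct {
	ID        string `json:"id"`
	Timestamp int64  `json:"timestamp"`
	Message   string `json:"message"`
}

// logsData models the subset of the decoded payload used by the handler.
type logsData struct {
	Owner     string     `json:"owner"`
	LogGroup  string     `json:"logGroup"`
	LogStream string     `json:"logStream"`
	LogEvents []logEvent `json:"logEvents"`
}

// decode reverses the assumed encoding: base64 -> gzip -> JSON.
func decode(encoded string) (logsData, error) {
	var d logsData
	raw, err := base64.StdEncoding.DecodeString(encoded)
	if err != nil {
		return d, err
	}
	zr, err := gzip.NewReader(bytes.NewReader(raw))
	if err != nil {
		return d, err
	}
	defer zr.Close()
	err = json.NewDecoder(zr).Decode(&d)
	return d, err
}

// encode builds a payload in the same shape, useful for local tests.
func encode(d logsData) (string, error) {
	plain, err := json.Marshal(d)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(plain); err != nil {
		return "", err
	}
	if err := zw.Close(); err != nil {
		return "", err
	}
	return base64.StdEncoding.EncodeToString(buf.Bytes()), nil
}

func main() {
	payload, err := encode(logsData{
		Owner:     "123456789012", // example account ID
		LogGroup:  "/aws/lambda/demo",
		LogStream: "2020/07/29/[$LATEST]abc",
		LogEvents: []logEvent{{ID: "1", Timestamp: 1596024000000, Message: "hello"}},
	})
	if err != nil {
		panic(err)
	}
	data, err := decode(payload)
	if err != nil {
		panic(err)
	}
	fmt.Println(data.LogGroup, len(data.LogEvents), data.LogEvents[0].Message)
}
```

The `encode` helper exists only so the example is self-checking; in the Lambda itself the payload arrives already encoded in the event.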