Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Support for external observability providers - Logging #2014

Closed
2 tasks done
seshubaws opened this issue Mar 15, 2023 · 17 comments · Fixed by #2183
Closed
2 tasks done

RFC: Support for external observability providers - Logging #2014

seshubaws opened this issue Mar 15, 2023 · 17 comments · Fixed by #2183
Assignees

Comments

@seshubaws
Copy link
Contributor

seshubaws commented Mar 15, 2023

Is this related to an existing feature request or issue?

#1433

Which AWS Lambda Powertools utility does this relate to?

Logger

Summary

This RFC is the first of three that will detail how the Logger, Metrics, and Tracer utilities can be extended to support a bring-your-own-observability-provider format for customers who want to use platforms other than AWS CloudWatch.

This RFC will be specifically for the Logger. Currently, in the Logger for Python (and Typescript), customers can bring their own Logging Formatter and have the option to provide a configuration option to define the output format when they're setting up their Logger in order to allow for better integration with observability solutions other than CloudWatch.

Since this is already quite powerful, we are seeking to improve the existing mechanism by offering out of the box support for Datadog and a few other providers (TBD) for the Logger Formatter.

Metrics RFC: #2015
Tracer RFC: #2030

Use case

The primary use case for this utility would be for developers who want a simplified way to integrate the Logger Formatter with the most popular third party observability providers (ie. DataDog).

Proposal

Current Logger Formatter experience

Below is a simple snippet of how the Logger Formatter currently works:

from aws_lambda_powertools import Logger
from aws_lambda_powertools.logging.formatter import LambdaPowertoolsFormatter

class CustomFormatter(LambdaPowertoolsFormatter):
    def serialize(self, log: dict) -> str:
        """Serialize final structured log dict to JSON str"""
        log["event"] = log.pop("message")  # rename message key to event
        return self.json_serializer(log)  # use configured json serializer
        logger = Logger(service="payment", logger_formatter=CustomFormatter())logger.info("hello")

logger = Logger(service="payment", logger_formatter=CustomFormatter())
logger.info("hello")

Output

{
    "level": "INFO",
    "location": "hello_name:8",
    "event": "Request from Alice received",
    "timestamp": "2023-03-21 13:25:36,129+0100",
    "service": "service_undefined"
}

Logger Formatter proposal

For this new utility, we propose a new parameter to the existing logger that developers can use to specify which observability provider they would like to receive their logs. The below code snippet is a rudimentary look at how this utility can be used and how it will function. Out of the box, we will support DataDog and a few other providers (TBD).

from aws_lambda_powertools import Logger
from aws_lambda_powertools.logging.providers import DatadogLogFormatter
    
logger = Logger(service="payment", logger_formatter=DatadogLogFormatter())
logger.info("hello")

Output

{
     "datetime": "2021-11-22 15:32:02,145", 
     "level": "INFO", 
     "message": "Request from Alice received",
     "requestId": "9b1deb4d-3b7d-4bad"
}

Custom logger usage

If the customer would like to use another observability provider, or define their own logger functions, we will define an interface that the customer can implement and pass in to the Logger class.

Out of scope

Sending logs from Powertools to the customer's desired observability platform will not be in the scope of this project. The implementation should only support modifying the output of the Logger so that the customer can push them to their platform of choice using the Lambda Handler.

Potential challenges

We need to determine which platforms we want to support out-of-the-box (apart from Datadog).

Dependencies and Integrations

We will have to integrate with (and thus, have a dependency on) Datadog and any other platforms we decide to support out-of-the-box.

Alternative solutions

Acknowledgment

@heitorlessa
Copy link
Contributor

reading!

@heitorlessa heitorlessa removed the triage Pending triage from maintainers label Mar 21, 2023
@heitorlessa
Copy link
Contributor

Thanks a lot @seshubaws for taking the time in creating this! I'm so excited we're kicking the observability provider discussion.

As I was reading, one thing struck me: only Python and TypeScript already provide this capability. This means we might not need an actual Provider capability but more of an improvement to the ergonomics for customers to bring their own output.

In Python, we use the standard logging library which is absurdly flexible yet why it's so complex to configure it. For Powertools Logger, customers tend to migrate from standard logging library or another loggers. Since we build atop of the standard logging library, we expose the same mechanisms for customers to override and bring their company standards, for example:

from aws_lambda_powertools import Logger
from aws_lambda_powertools.logging.formatter import LambdaPowertoolsFormatter


class CustomFormatter(LambdaPowertoolsFormatter):
    def serialize(self, log: dict) -> str:
        """Serialize final structured log dict to JSON str"""
        log["event"] = log.pop("message")  # rename message key to event
        return self.json_serializer(log)  # use configured json serializer


logger = Logger(service="payment", logger_formatter=CustomFormatter())
logger.info("hello")

In this case, logger_formatter would receive an instance of LambdaPowertoolsFormatter. Upon a new log that Logger already validated it has the correct logging levels, filters, fields like correlation ID, X-Ray ID, Lambda context and custom keys resolved, etc., it'd call serialize last and return. What I like about the logger_formatter is that it's explicit enough, customers can have additional logic in it, and we do the heavy lifting without them having to worry about logging levels, logging filters, and whatnot - if they do want the trouble, they can use BaseLambdaPowertoolsFormatter.

What I think we could do better here but didn't before is to either think of a simpler way since is already powerful enough, OR improve the existing mechanism by typing the log parameter with TypedDict for example.

What do you think?


Answering some comments and questions

Currently, the Logger automatically sends logs to AWS CloudWatch

Minor correction, Logger sends logs to standard output (stdout). That's why some customers even use it locally for scripts, or in platforms like Fargate/AppRunner/Beanstalk, despite not having access to all features. That's also true for all other languages. The reason we did this is to make it easier for customers to take control over stdout and transform/do as they please, with say sidecar containers... or in case of Lambda... Lambda Extensions Logs API.

We need to determine which platforms we want to support out-of-the-box (apart from Datadog).

We can ask the community. Turns out a customer shared that Logger output is recognized by Datadog out of the box when used with Logger(use_rfc3339=True). I had asked for a docs issue so make this more visible in a FAQ but customer lacked bandwidth (so did we :/).

An addendum to this proposal would be to have just one place users can specify which platform they'd like to use so they don't have to repeatedly configure the same observability provider setting for Logger, Metrics, and Tracer.

That's an interesting idea - would you mind expanding the RFC to cover an initial take? We heard some ideas in the past of having a central configuration.

Minor edits

  • Added syntax highlighting to code snippets markdown (python)
  • Corrected log output from Current logger experience
  • Split code snippet from output to ease reading and comparing

@seshubaws
Copy link
Contributor Author

Thank you for your comments @heitorlessa, it really clarifies a lot! I have revised the original RFC to integrate most of your comments, though I should call out some points:

  1. What I think we could do better here but didn’t before is to either think of a simpler way since is already powerful enough, OR improve the existing mechanism by typing the log parameter with TypedDict for example.

In the revised RFC, I have proposed supporting Datadog and any other providers we want out of the box to make the current Logger Formatter simpler. We could add the TypedDict in addition to that, or do that as a fast follow, up to you.

  1. We can ask the community. Turns out a customer shared that Logger output is recognized by Datadog out of the box when used with Logger(use_rfc3339=True). I had asked for a docs issue so make this more visible in a FAQ but customer lacked bandwidth (so did we :/).

I guess if we support Datadog formatting for logs out of the box like

logger = Logger(service="payment", logger_formatter=DatadogLogFormatter())

we can just do that behind the scenes.

Also, to determine the other providers we want to support out of the box, is this RFC the place where we are asking? Or is there some other forum for that kind of poll specifically?

  1. "An addendum to this proposal would be to have just one place users can specify which platform they'd like to use so they don't have to repeatedly configure the same observability provider setting for Logger, Metrics, and Tracer."
    That's an interesting idea - would you mind expanding the RFC to cover an initial take? We heard some ideas in the past of having a central configuration.

Of course! Should we have that in a separate RFC though, since this one is just about the Logger and that would involve Logger, Tracer, and Metrics?

@heitorlessa
Copy link
Contributor

awesome, I somehow missed your message @seshubaws (sorry!!). I'm finishing a documentation issue and going through all RFCs today

@heitorlessa
Copy link
Contributor

Loved it! Gotta re-read the updated RFC now. As for your questions:

Also, to determine the other providers we want to support out of the box, is this RFC the place where we are asking? Or is there some other forum for that kind of poll specifically?

We can use the Poll feature in GitHub Discussions, then drive customers there from our docs, board, Discord, and social media :)

It'll also be good to find out which ones don't work outside the box (only DataDog didn't work out as far as I remember).

Of course! Should we have that in a separate RFC though, since this one is just about the Logger and that would involve Logger, Tracer, and Metrics?

Good idea! A new RFC would be better. I did learn from another customer interview last week that they tend to use one Logging provider (e.g., CloudWatch), and use another for Metrics and Traces primarily to optimize costs.

@heitorlessa
Copy link
Contributor

That's it! That's exactly the simple developer experience customers love about Powertools - you're awesome @seshubaws!

Feel free to start the implementation (or let us know if you want us to!). Given how related it is, we could also address the TypedDict to make the authoring experience even better.

I've pinged a Serverless Partner SA for guidance on how to best ensure a fair chance for AWS Partners (whether to support a few out of the box, or do it based on customer demand and Partner's contribution). For now, assume Datadog as an example, as we can revert that easily if we need to.

Thanks

@heitorlessa heitorlessa assigned seshubaws and unassigned heitorlessa Apr 3, 2023
@PierreKiwi
Copy link

Hey @heitorlessa,
Sorry I never took the time to document the setup I use for DataDog. Can you point me towards the right page and I will write something down.
We use the DataDog extension + their Python layer + Lambda Powertools layer and it is indeed working with use_rfc3339=True.

@heitorlessa
Copy link
Contributor

hey @PierreKiwi no worries ;) Feel free to share them here.

I've got one question to help us determine the implementation for DataDog.

Q: When using the new DatadogLogFormatter, should we only fix the timestamp like use_rfc3339=True? (intended approach)

Or, would you expect the logging keys to be remapped and follow DataDog initial convention[1]?

Context

If you were starting a few weeks from now, we'd recommend using logger_formatter=DatadogLogFormatter() in the docs.

Logger(service="payment", logger_formatter=DatadogLogFormatter())

What we're unsure is what the formatter should change - the timestamp value only, or logging keys to match what DataDog suggest as their default logger. I think it's the former since the latter is less rich compared to what we provide out of the box.

Thanks a lot!!


[1] DataDog log that @seshubaws found.

{
     "datetime": "2021-11-22 15:32:02,145", 
     "level": "INFO", 
     "message": "Request from Alice received",
     "requestId": "9b1deb4d-3b7d-4bad"
}

@PierreKiwi
Copy link

PierreKiwi commented Apr 20, 2023

Hello @heitorlessa,
Sorry for the delay, I needed to retrieve my work laptop so that I can have access to real log statements :).

Disclaimer: I want to say that what I going to describe below is what works for us atm and I am more than happy to be corrected and learn better practices.

A bit of context first. We do have Lambad functions using AWS Lambda Powertools and Datadog dd-trace-py (both via layers). Our main goal is to have the traces, logs, and metrics correlated and ingested in Datadog with minimum effort.

So, in the __init.py__.py of our Lambda, we do usually have something like this:

# import from Python standard library (like datetime, os, etc.)

from ddtrace import patch

# Patch specific packages before importing them
patch(
    botocore=True,  # note: this also patches boto3
    logging=True,
    requests=True,
)

# pylint: disable=wrong-import-position

import simple_salesforce
from aws_lambda_powertools import Logger
from aws_lambda_powertools.logging.utils import copy_config_to_registered_loggers
from boto3 import Session

# pylint: enable=wrong-import-position

# Extract some environment variables
ENV: str = os.environ.get("ENV", "dev")
LOG_LEVEL = os.environ.get("LOG_LEVEL", "DEBUG")
SAMPLE_RATE = float(os.environ.get("POWERTOOLS_LOGGER_SAMPLE_RATE", 0.1))
SERVICE_NAME = os.environ.get("POWERTOOLS_SERVICE_NAME", "something",)
VERSION = os.environ.get("VERSION", "0.1.0")

# Configure logger (including 3rd party ones)
LOGGER = Logger(
    **{
        "env": ENV,
        "level": LOG_LEVEL,
        "location": "%(name)s.%(lineno)d",
        "sample_rate": SAMPLE_RATE,
        "service": SERVICE_NAME,
        "use_rfc3339": True,
        "version": VERSION,
    }
)
copy_config_to_registered_loggers(
    source_logger=LOGGER,
    include={
        "simple_salesforce",
    },
)

Ok, so quite a lot to unpack here...

Let's start with the ddtrace import. This allows us to patch some specific packages for Datadog (note that it is strongly possible that they got patched by default but better to be explicit). Because ddtrace patches logging (see details here), therefore it must happens before importing/setting up the AWS Lambda Powertools logger, so that the formatter is correctly configured with the extra fields expected by Datadog (see concrete example below).

We then continue our imports and eventually configure the root AWS Lambda Powertools LOGGER with the appropriate fields. That includes use_rfc3339 else Datadog cannot interpret the timestamp correctly (see examples just below and Supported date formats).

Examples: apparently the extra T changes the world for Datadog :).

  • With use_rfc3339 = False => "2023-04-20 20:11:34,004+1200"
  • With use_rfc3339 = True => "2023-04-20T07:33:02.637+00:00"

Finally we set up our 3rd party loggers using the handy copy_config_to_registered_loggers method.

Ok, so far so good. What does a log statement with this setup look like ?

{
    "level": "DEBUG",
    "location": "my.location.goes.here",
    "message": "Calculating since and to fields based on time (2023-04-20 07:32:54+00:00) and delta (5 minutes)...",
    "timestamp": "2023-04-20T07:33:02.637+00:00",
    "service": "our-service",
    "sampling_rate": "0.1",
    "env": "prod",
    "sample_rate": 0.1,
    "version": "1.0.842665798-c1a47030",
    "name": "...",
    "cold_start": false,
    "function_name": "function-name",
    "function_memory_size": "256",
    "function_arn": "function-arn",
    "function_request_id": "my-function-request-id",
    "dd.version": "1.0.842665798-c1a47030",
    "dd.env": "prod",
    "dd.service": "something",
    "dd.trace_id": "1262616779456844194",
    "dd.span_id": "2600278220644030616",
    "xray_trace_id": "1-6440eaae-3f50e7841185b74572f911a2"
}

As you can see, patching logging with ddtrace (and thanks to DD_LOGS_INJECTION=true), the dd.* fields got populated as expected (via the env variables DD_ENV, DD_SERVICE, DD_VERSION respectively).

There is a bit of duplication (service|dd.service, env|dd.env, version|dd.version) but it does not bother us too much. And it helps for unit tests log statements as we disable ddtrace there (so no dd.* fields).

Note: we do use CDK and we leverage the Datadog CDK constructs (see here) so that we do not need to remember how to set those DD_ env variables:

const datadog = new Datadog(this, "datadog", {
      apiKeySecretArn: secretDatadog.secretArn, // DD_API_KEY_SECRET_ARN
      env: this.env.name, // DD_ENV
      enableDatadogLogs: true, // DD_SERVERLESS_LOGS_ENABLED (true by default but let's be explicit)
      enableDatadogTracing: true, // DD_TRACE_ENABLED (true by default but let's be explicit)
      enableMergeXrayTraces: true, // DD_MERGE_XRAY_TRACES
      extensionLayerVersion: config.get("extensions.datadogLayerVersion"),
      // Must align with Lambda function Python runtime
      pythonLayerVersion: config.get("extensions.datadogPythonLayerVersion"),
      service: this.props.serviceName, // DD_SERVICE
      tags: repository.getDatadogTags(), // DD_TAGS
      version: this.props.version, // DD_VERSION
    });

Small disgression, inspired by it, I have built something similar for AWS Lambda Powertools so I do not need to remember how to setup POWERTOOLS_* environment variables and to resolve the ARN of the layer). Hope to open source it eventually:

const awsLambdaPowertools = new AwsLambdaPowertools(this, "aws-lambda-powertools", {
      layerVersion: config.get("extensions.awsLambdaPowertoolsPythonLayerVersion"),
      service: this.props.serviceName, // POWERTOOLS_SERVICE_NAME
    });

Ok, back to our sheeps (I am a Kiwi after all !)...

At this stage, we got our traces and logs correlated (haven't looked much about metrics just yet) in Datadog so pretty happy about it and the setup is pretty simple. Could it simplified ? With DatadogLogFormatter ? Possibly but chances are high that we would still need the ddtrace anyway as we use it for tracing. The patching of logging shared above does quite a few important things (especially around dd.trace_id and dd.span_id).

But I would conclude saying that the bare minimum is def. to have use_rfc3339=true so that the logs can be properly ingested by Datadog without requiring a custom date log remapper.

Hope this long message with concrete setup and examples helps and please let me know your feedback/suggestions.

Regards

@heitorlessa
Copy link
Contributor

This is "hot chips" @PierreKiwi !! It unblocks us. This means we can limit our DatadogLogFormatter to the timestamp format only and keep it super simple, helping with maintenance and for discoverability (anyone searching for Datadog in the docs).

In your case, keep using what you're doing. This new formatter will have the exact same effect as setting use_rfc3339=True.

LOGGER = Logger(
    **{
        "env": ENV,
        "level": LOG_LEVEL,
        "location": "%(name)s.%(lineno)d",
        "sample_rate": SAMPLE_RATE,
        "service": SERVICE_NAME,
        "use_rfc3339": True,
        "version": VERSION,
    }
)

Thank you so much!


@seshubaws would you like me to create a PR for this, or would you like to tackle it?

Items we need to address in an upcoming PR:

  • Add the new DatadogLogFormatter which will merely only change the timestamp format
  • Reference it in the documentation so it's easier to find
  • Optionally: we might need a new section to explicitly call out Observability Providers and give customers guidance as to what's required.

@heitorlessa
Copy link
Contributor

PR is up! #2183

@github-actions github-actions bot added the pending-release Fix or implementation already in dev waiting to be released label May 1, 2023
@PierreKiwi
Copy link

Great work @heitorlessa :)

@leandrodamascena leandrodamascena linked a pull request May 1, 2023 that will close this issue
7 tasks
@heitorlessa
Copy link
Contributor

Now merged! Before closing, I'll make a last comment by EOD to summarize key decisions and current UX to ease referencing in the future

@heitorlessa
Copy link
Contributor

Final comments to close this RFC.

log_formatter over provider parameter

Logger is built on top of the extensive Python standard logging. This meant we already had support for customers wanting to ship logs and change keys via built-in Handler and Formatter. This will be similar in other languages like Java and .NET, except TS as there isn't one for the language (will follow tenet on idiomatic as always).

We agreed that it'd be confusing to have a provider parameter and a new interface altogether only to align with Tracer and Metrics (we're still discussing interfaces for these two in separate RFCs).

Instead, we documented our recommendation for customers to use Lambda Extensions, and list our built-in formatters for AWS Lambda Partner providers that don't work out of the box - Datadog in this case due to timestamp.

from aws_lambda_powertools import Logger
from aws_lambda_powertools.logging.formatters.datadog import DatadogLogFormatter

logger = Logger(service="payment", logger_formatter=DatadogLogFormatter())
logger.info("hello")

New providers

According to customer feedback, all providers worked just fine without any code changes (e.g., Loggly, NewRelic, ELK, etc.). Therefore, we will add additional built-in formatters based on feature request and if they are an AWS Lambda Partner.

OpenTelemetry

OpenTelemetry Log Data Model is now stable. It looks however overly verbose due to the tight integration with traces and metrics - based on customer and our experience this can lead to a high log ingestion bill depending on volume.

We will halt for now and await customer demand before making it a built-in formatter.

OpenTelemetry Log output sample
{
    "resourceLogs": [
        {
            "resource": {
                "attributes": [
                    {
                        "key": "resource-attr",
                        "value": {
                            "stringValue": "resource-attr-val-1"
                        }
                    }
                ]
            },
            "scopeLogs": [
                {
                    "scope": {},
                    "logRecords": [
                        {
                            "timeUnixNano": "1581452773000000789",
                            "severityNumber": 9,
                            "severityText": "Info",
                            "name": "logA",
                            "body": {
                                "stringValue": "This is a log message"
                            },
                            "attributes": [
                                {
                                    "key": "app",
                                    "value": {
                                        "stringValue": "server"
                                    }
                                },
                                {
                                    "key": "instance_num",
                                    "value": {
                                        "intValue": "1"
                                    }
                                }
                            ],
                            "droppedAttributesCount": 1,
                            "traceId": "08040201000000000000000000000000",
                            "spanId": "0102040800000000"
                        },
                        {
                            "timeUnixNano": "1581452773000000789",
                            "severityNumber": 9,
                            "severityText": "Info",
                            "name": "logB",
                            "body": {
                                "stringValue": "something happened"
                            },
                            "attributes": [
                                {
                                    "key": "customer",
                                    "value": {
                                        "stringValue": "acme"
                                    }
                                },
                                {
                                    "key": "env",
                                    "value": {
                                        "stringValue": "dev"
                                    }
                                }
                            ],
                            "droppedAttributesCount": 1,
                            "traceId": "",
                            "spanId": ""
                        }
                    ]
                }
            ]
        }
    ]
}

FYI @dreamorosi @sliedig @awslabs/aws-lambda-powertools-java for our next maintainers sync.

@heitorlessa
Copy link
Contributor

Last but not least, huge thank you @seshubaws for working on the RFC and tweaking it several times. Much appreciated!!

@github-actions
Copy link
Contributor

github-actions bot commented May 3, 2023

⚠️COMMENT VISIBILITY WARNING⚠️

This issue is now closed. Please be mindful that future comments are hard for our team to see.

If you need more assistance, please either tag a team member or open a new issue that references this one.

If you wish to keep having a conversation with other community members under this issue feel free to do so.

@github-actions
Copy link
Contributor

github-actions bot commented May 4, 2023

This is now released under 2.15.0 version!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants