Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

httpRequest processor #32

Open
petemounce opened this issue Jun 21, 2024 · 2 comments · May be fixed by #46
Open

httpRequest processor #32

petemounce opened this issue Jun 21, 2024 · 2 comments · May be fixed by #46

Comments

@petemounce
Copy link

petemounce commented Jun 21, 2024

We have a logging middleware that stamps each request with a canonical log-line. That attaches http request metadata as well as other things. That in turn causes GCP Logging to render those bits of metadata as lozenge-blobs in the UI, making them stand out a bit and also making the canonical log lines themselves stand out from the regular noise.

It was surprisingly fiddly to get this to work.

I'm not sure if you'd like to add some level of support? Here's something that resembles what I ended up doing:

class CanonicalLogLine(BaseHTTPMiddleware):
    def __init__(self, app: ASGIApp):
        super().__init__(app, dispatch=self.dispatch)
        self.logger = structlog.get_logger(__name__)

    async def dispatch(self, request: Request, call_next):
        start_time = time.perf_counter_ns()
        url = get_path_with_query_string(request.scope)
        client_host = request.client.host
        client_port = request.client.port
        http_method = request.method
        http_version = request.scope['http_version']
        response = await call_next(request)
        status_code = response.status_code
        request_duration = time.perf_counter_ns() - start_time
        self.logger.info(
            f"""{client_host}:{client_port} - "{http_method} {url} HTTP/{http_version}" {status_code}""",
            httpRequest={
                'latency': str(request_duration / 10**9) + 's',
                'protocol': f'HTTP/{http_version}',
                'remoteIp': f'{client_host}',
                'requestMethod': http_method,
                'requestUrl': str(request.url),
                'status': status_code,
                'userAgent': request.headers.get('user-agent', 'no-user-agent'),
            }
        )

        return response

... and then we have a structlog processor:

from structlog.types import EventDict, Processor, WrappedLogger


class AddGCPHTTP:
    def __init__(self, cloud_logging_key: str) -> None:
        self.cloud_logging_key = cloud_logging_key

    def setup(self) -> list[Processor]:
        return [self]

    def __call__(self, logger: WrappedLogger, method_name: str, event_dict: EventDict):
        if 'httpRequest' not in event_dict.keys():
            return event_dict

        event_dict[self.cloud_logging_key]['httpRequest'] = event_dict.pop('httpRequest')

        return event_dict

This was some time ago, and I don't remember the specifics but I do remember it being particularly papercutty. Perhaps if a field was not serialized how GCP expects (I vaguely recall things that are clearly numbers -- except status-code -- being necessary to pass as strings?) the whole thing didn't appear? Stuff like that.

@multani
Copy link
Owner

multani commented Jun 21, 2024

That would be a great addition to add, yes!

The canonical documentation for this would be:

This was some time ago, and I don't remember the specifics but I do remember it being particularly papercutty. Perhaps if a field was not serialized how GCP expects (I vaguely recall things that are clearly numbers -- except status-code -- being necessary to pass as strings?) the whole thing didn't appear?

I wouldn't be surprised that if the httpRequest fields types don't match those linked above, it won't show up correctly in Cloud Logging, yeah.

I'm totally in to add support for this. My only concern is that GCP specifies very specific fields to be logged, and as you discovered, they need to have the right types.

Maybe we could do it in 2 steps:

First, we add a processor like the AddGCPHTTP one you pasted above and say "please put exactly what is needed, how it's needed, in the httpRequest field".

Then, we can check how we could provide a better "user experience" to log this field correctly. I would need to think a bit more about it, but I would probably see a dedicated dataclass HTTPRequest:

  • containing all the fields documented in the doc, with their correct types | None
  • and providing a way to serialize this as a dict without the unset keys

For instance:

@dataclass
class HTTPRequest:
    method: str | None
    url: str | None
    size: int | None

    def format(self):
        """Format as https://cloud.google.com/logging/docs/reference/v2/rest/v2/LogEntry#httprequest"""

        # TODO: filter out all None values
        return {
            "requestMethod": self.method,
            "requestUrl": self.url,
            "requestSize": str(self.size),
        }

In your middleware, you would do something like:

self.logger.info(
            f"""{client_host}:{client_port} - "{http_method} {url} HTTP/{http_version}" {status_code}""",
            httpRequest=HTTPRequest(method=http_method, url=url))

This way:

  • The library would offer a way to log the right fields that could be understood by GCP and would enforce them
  • The typing would provide hints as to what the fields really should be to be correctly understood by GCP
  • Maybe some niceties, like formatting the latency field correctly, etc.

What do you think?

multani added a commit that referenced this issue Sep 21, 2024
This adds a new HTTP-specific processor that formats logs according to
the `HttpRequest` Google Cloud Logging log format.

See: https://cloud.google.com/logging/docs/structured-logging#structured_logging_special_fields
See: https://cloud.google.com/logging/docs/reference/v2/rest/v2/LogEntry#httprequest

Fix: #32
@multani multani linked a pull request Sep 21, 2024 that will close this issue
@multani
Copy link
Owner

multani commented Sep 21, 2024

I've started to draft a solution in #46

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants