
Performance: Prevents the zap logger from serializing the request in rewrite.go. #6541

Merged (3 commits) on Aug 30, 2024

Conversation

AlliBalliBaba
Contributor

Consider a Caddy config looking like this:

rewrite * somewhere
respond "Hello Caddy"

Looking at Caddy's performance profile, I noticed that a lot of time is spent setting the context of the logger. In the following flamegraph (captured with brendangregg/FlameGraph) you can see that, for the simple 'Hello Caddy' request above, roughly 10% of the time is spent serializing the request for the zap logger via .With(..).

[flamegraph image: torch]

Half of it is being spent on serializing the request for the error logger here:

shouldLogCredentials := s.Logs != nil && s.Logs.ShouldLogCredentials
loggableReq := zap.Object("request", LoggableHTTPRequest{
	Request:              r,
	ShouldLogCredentials: shouldLogCredentials,
})
errLog := s.errorLogger.With(loggableReq)

The other half is being spent on serializing the request for the Debug logger here:

https://github.com/caddyserver/caddy/blob/master/modules/caddyhttp/rewrite/rewrite.go#L135-L137

The work happens even if nothing ends up being logged.
This PR attempts to fix the second case, making the complete request with rewrite about 5% faster when debug logging is not enabled (in a quick local wrk benchmark, I go from ~1.350.000 requests/minute to ~1.500.000 requests/minute).

I don't know if there's a workaround for the first case since the Error logger in server.go seems to need the initial request data.

@CLAassistant

CLAassistant commented Aug 26, 2024

CLA assistant check
All committers have signed the CLA.

Member

@mholt mholt left a comment


Thanks for the PR!

The work happens even if nothing ends up being logged.

That surprises me. I thought zap was optimized to only encode and write what is actually being logged.

(review comment on modules/caddyhttp/rewrite/rewrite.go: outdated, resolved)
@AlliBalliBaba
Contributor Author

Zap does have a WithLazy, but it does not immediately capture the request; it keeps a reference to it for later, which means the request might have been mutated by the time it is actually captured.

@AlliBalliBaba
Contributor Author

Using WithLazy for the error logger in server.go actually boosts requests/second in the 'Hello World' case by another ~10%. But when actual errors are logged, the request as it exists at the time of the error gets logged instead of the original one, which is probably undesired.

This is how the flamegraph looks without any calls to .With(..):
[flamegraph image: tourch2]

Member

@mholt mholt left a comment


This LGTM now. Thanks for the improvement!

You mentioned we can get a 10% speedup in the HTTP server as well. I'm not sure it's acceptable to use the Lazy approach since we need to log the original request before it goes through handling... is there another way?

(Also, I'm kind of surprised this is so inefficient. Are we not using the zero-allocation methods?)

@AlliBalliBaba
Contributor Author

The only other option I can think of would be to clone the request via r.Clone(r.Context()) and use .WithLazy().
That seems only slightly more efficient, though, and it will slow down the case where an actual error is logged.

wrk -t10 -c 200 -d 60 http://localhost:8123

.With                         ~9.200.000 requests
clone and .WithLazy           ~9.500.000 requests    (+~3.3%)
.WithLazy                     ~9.900.000 requests    (+~7.6%)

(Take these benchmarks with a grain of salt, Docker Ubuntu WSL Intel i9)

@francislavoie
Member

(Also, I'm kind of surprised this is so inefficient. Are we not using the zero-allocation methods?)

Our loggable HTTP request allocates a bit I think.

@AlliBalliBaba
Contributor Author

According to the flame graphs half of it comes from cloning the logger and the other half from json encoding the request.

Member

@mholt mholt left a comment


Gotcha. Thanks for this improvement!

@mholt mholt merged commit ffd28be into caddyserver:master Aug 30, 2024
30 of 33 checks passed
@mholt mholt added the optimization 📉 Performance or cost improvements label Aug 30, 2024