Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enrichers are not thread-safe #14

Open
basvd opened this issue Jul 26, 2022 · 11 comments
Open

Enrichers are not thread-safe #14

basvd opened this issue Jul 26, 2022 · 11 comments
Labels
enhancement New feature or request

Comments

@basvd
Copy link

basvd commented Jul 26, 2022

The enrichers provided by ClientInfo are not thread-safe. This is a problem when concurrent tasks are trying to log inside the scope of a request. For example in the following situation:

var httpClient = _httpClientFactory.CreateClient(HttpClientName);

var task1 = Task.Run(async () =>
{
    var response = await httpClient.GetAsync("/api/call1");
    return await response.Content.ReadAsStringAsync();
});

var task2 = Task.Run(async () =>
{
    var response = await httpClient.GetAsync($"/api/call2");
    return await response.Content.ReadAsStringAsync();
});

await Task.WhenAll(task1, task2);

Because the HttpClients are logging they both use a logger that is trying to enrich the event with some client info (e.g. using ClientIpEnricher). If the property (e.g. Serilog_ClientIp) is not yet available on the HttpContext.Items then both instances will try to write it concurrently, which may cause various exceptions to be thrown depending on when and how the conflicting access occurs.

In theory this problem should only occur if the property isn't already on the HttpContext.Items during the concurrent phase. In other words, if the first logs of the request are written in a concurrent situation then it might occur. This means a workaround fix would be to write a log message before going concurrent (e.g. with _logger.Information("workaround fix") before the tasks in the snippet above). In my own tests this indeed seems to work.

I think there is an (implicit) expectation that Serilog enrichers are thread-safe (as mentioned here: serilog/serilog#1144) so that's why I report this as a bug.

@mo-esmp
Copy link
Member

mo-esmp commented Jul 26, 2022

I'm not sure if it going to happen because HttpContext in ASP.NET Core is not a singleton and for each request, a new instance of HttpContext will be populated and the value of httpContext.Items[IpAddressItemKey] should be the same for all requests (after first initialization).

@basvd
Copy link
Author

basvd commented Jul 26, 2022

It happens inside the same request context. The code I posted should be understood as being inside of an ASP.NET action (e.g. async Task<IActionResult> Get(). It's the two concurrent GetAsyncs of the HttpClient that cause the logger to be concurrently accessed and, through the enricher, the HttpContext.Items is concurrently accessed as well (because both these calls occur in the same request context). The problem occurs because this items dictionary is not thread safe.

@basvd
Copy link
Author

basvd commented Jul 26, 2022

It seems the main purpose of using HttpContext.Items is to cache the LogEventProperty. I think it might be worth looking into MemoryCache for this instead, because it serves the same purpose and also provides thread safety. As a cache key you could use the request ID plus the item key.

@mo-esmp
Copy link
Member

mo-esmp commented Jul 26, 2022

It's the two concurrent GetAsyncs of the HttpClient that cause the logger to be concurrently accessed and, through the enricher, the HttpContext.Items is concurrently accessed as well (because both these calls occur in the same request context). The problem occurs because this items dictionary is not thread safe.

I cannot understand what's the relation between HttpClient and HttpContext ? multiple concurrent requests do not lead to having a sharing HttpContext between all requests and each request has its own HttpContext. I think @Kahbazi can explain it better hence this functionality added by his PR.

@basvd
Copy link
Author

basvd commented Jul 26, 2022

Maybe the problem is easier to understand if I simplify the example a bit:

public async Task<IActionResult> Get()
{
   // Here _logger is a logger enriched with the ClientIpEnricher/ClientAgentEnricher

    var task1 = Task.Run(() =>
    {
        _logger.Information("Log message one");
    });

    var task2 = Task.Run(() =>
    {
        _logger.Information("Log message two");
    });

    await Task.WhenAll(task1, task2);
    return Ok();
}

The above code creates a situation where logger is being called in a multi-threaded way which in the ClientIpEnricher/ClientAgentEnricher can result in a race condition on the HttpContext, where the same item is added more than once. Both tasks share the same HttpContext because they are being performed as part of the same request to the ASP.NET application (i.e. the enclosing Get method).

HttpClient is just what triggered this behavior in my code, because it also uses the enriched logger. But I realize it makes it less obvious what is going on.

I've noticed at least two kinds of exceptions that can be thrown:

Exception System.InvalidOperationException: Operations that change non-concurrent collections must have exclusive access. A concurrent update was performed on this collection and corrupted its state. The collection's state is no longer correct.
   at System.Collections.Generic.Dictionary`2.FindValue(TKey key)
   at System.Collections.Generic.Dictionary`2.TryGetValue(TKey key, TValue& value)
   at Microsoft.AspNetCore.Http.ItemsDictionary.System.Collections.Generic.IDictionary<System.Object,System.Object>.get_Item(Object key)
   at Serilog.Enrichers.ClientAgentEnricher.Enrich(LogEvent logEvent, ILogEventPropertyFactory propertyFactory)
   at Serilog.Core.Enrichers.SafeAggregateEnricher.Enrich(LogEvent logEvent, ILogEventPropertyFactory propertyFactory) caught while enriching Serilog.Events.LogEvent with Serilog.Enrichers.ClientAgentEnricher.

And:

Exception System.ArgumentException: An item with the same key has already been added. Key: Serilog_ClientIp
   at System.Collections.Generic.Dictionary`2.TryInsert(TKey key, TValue value, InsertionBehavior behavior)
   at System.Collections.Generic.Dictionary`2.Add(TKey key, TValue value)
   at Serilog.Enrichers.ClientIpEnricher.Enrich(LogEvent logEvent, ILogEventPropertyFactory propertyFactory)
   at Serilog.Core.Enrichers.SafeAggregateEnricher.Enrich(LogEvent logEvent, ILogEventPropertyFactory propertyFactory) caught while enriching Serilog.Events.LogEvent with Serilog.Enrichers.ClientIpEnricher.

@Kahbazi
Copy link
Contributor

Kahbazi commented Jul 26, 2022

@basvd You are correct. I'm afraid theoretically this is a bug, when the first log of the request happens in some concurrent tasks. But I'm wondering if you have ever get this bug in a real-case scenario? Have you seen this bug in production?

The workaround that you have suggested (to have a log before concurrent tasks) would almost always happens since the ASP.Net Core itself have some logs. Unless they are disabled which I guess are rarely the case!

@mo-esmp To fix this bug I guess we could put a lock on httpContext in the Enrich method. But I don't think it would worth the effort and I'm not sure how much it would hurt the performance. I believe the worst case scenario is that the first log of the concurrent tasks won't have the client info 🤔 and Again this would only happen if there is no log before the concurrent tasks which almost never happen because the logs of the framework itself.

It seems the main purpose of using HttpContext.Items is to cache the LogEventProperty. I think it might be worth looking into MemoryCache for this instead, because it serves the same purpose and also provides thread safety. As a cache key you could use the request ID plus the item key.

Yes, the purpose of the PR was to reduce the LogEventProperty allocations, but right now I think storing it anywhere beside HttpContext bring more complexity. The HttpContext.Items would be cleared after the request is finished, but anything beside that needs to be time-dependent which I don't feel good about it. I suggest to leave this as it is.

@basvd
Copy link
Author

basvd commented Jul 27, 2022

@Kahbazi I have indeed seen this bug in real application code. The example that I describe in my original post is based on that scenario. Apparently no log is written before the controller action is called and so the running tasks will be the first to write a log, but I do agree that it is unlikely (me and my team have built an entire application with this enricher without ever running into the issue).

I understand the possible solutions are not ideal in terms of complexity. The workaround/solution in application code is quite simple: Just log something before doing anything multi-threaded. And I'm fine with that.

If I can think of a non-complex alternative to using the HttpContext I will consider making a pull request.

@mo-esmp
Copy link
Member

mo-esmp commented Jul 27, 2022

Changing code at line 53 would fix the problem, however, any suggestion to do it in a more performant way?

lock(locker)
{
    if (httpContext.Items[IpAddressItemKey] is LogEventProperty logEventProperty)
        return;
 
    var ipAddressProperty = new LogEventProperty(IpAddressPropertyName, new ScalarValue(ipAddress));
    httpContext.Items.Add(IpAddressItemKey, ipAddressProperty);
    logEvent.AddPropertyIfAbsent(ipAddressProperty);
}

@basvd
Copy link
Author

basvd commented Jul 27, 2022

I just realized that another solution (for projects using Serilog.AspNetCore) is to use its EnrichtDiagnosticContext to add the ClientIp and ClientAgent event properties. This is an alternative to using the ClientInfo enrichers entirely. The properties will be created by a middleware (which avoids the multi-thread problem) and set once on the diagnostic context, which effectively caches them.

Just posting this here for anyone who runs into the issue looking for solutions:

app.UseSerilogRequestLogging(options =>
{
    options.EnrichDiagnosticContext = (diagnosticContext, httpContext) =>
    {
        diagnosticContext.Set("ClientIp", httpContext.Connection.RemoteIpAddress);
        diagnosticContext.Set("ClientAgent", httpContext.Request.Headers.UserAgent);
    };
});

Note that if you use ForwardedHeadersMiddleware that you register this before the request logging so that the RemoteIpAddress can be set properly.

Edit: This will only enrich the request logs, so it does work differently from ClientInfo, which enriches all the logs during the request.

@mo-esmp
Copy link
Member

mo-esmp commented Jul 27, 2022

I think with the above code then installing this package is pointless.

@pergardebrink
Copy link
Contributor

Couldn't this be solved with static AsyncLocal storage instead of storing it on the HttpContext.Items?

HttpContextAccessor uses AsyncLocal internally, so they would follow each other either way.

@mo-esmp mo-esmp added the enhancement New feature or request label Mar 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

4 participants