TT-12074 Empty interfaces being passed to pump backends #768

Open
monrax opened this issue Jan 16, 2024 · 0 comments

monrax (Contributor) commented Jan 16, 2024

Timeout errors can occur when retrieving data from Redis, especially when attempting to retrieve a large number of records:

time="Jan 16 10:50:11" level=error msg="Multi command failed: read tcp [::1]:56727->[::1]:6379: i/o timeout" prefix=redis

When resources become insufficient for larger loads, records can be created faster than they are purged, so the corresponding timeout errors are to be expected.

However, some (very noisy) unexpected additional error logs immediately follow the one above when this state is reached:

time="Jan 16 10:50:11" level=error msg="Couldn't unmarshal analytics data:EOF" analytic_key=tyk-system-analytics prefix=main
time="Jan 16 10:50:11" level=error msg="Couldn't unmarshal analytics data:EOF" analytic_key=tyk-system-analytics prefix=main
time="Jan 16 10:50:11" level=error msg="Couldn't unmarshal analytics data:EOF" analytic_key=tyk-system-analytics prefix=main
(...)

Depending on which pumps are configured, this can result in (also quite noisy) error logs such as:

time="Jan 16 10:50:33" level=error msg="Error decoding analytic record" prefix=resurface-pump
time="Jan 16 10:50:33" level=error msg="Error decoding analytic record" prefix=resurface-pump
(...)

In this case, the resurfaceio backend performs the following type assertion on line 217:

decoded, ok := v.(analytics.AnalyticsRecord)
if !ok {
	rp.log.Error("Error decoding analytic record")
	continue
}

This assertion fails because the interface v does not hold an analytics.AnalyticsRecord value. Here the only consequence is noisy logs, as mentioned above, but for pumps that do not carry out a safe type assertion (decoded := v.(analytics.AnalyticsRecord) instead of decoded, ok := v.(analytics.AnalyticsRecord)), an unhandled runtime panic could be triggered.
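To make the difference between the two assertion forms concrete, here is a minimal, self-contained sketch. It is not taken from any pump: the AnalyticsRecord struct is a hypothetical stand-in for analytics.AnalyticsRecord, decodeSafe and decodeUnsafe are illustrative helper names, and the empty value is assumed to be a nil interface.

```go
package main

import "fmt"

// AnalyticsRecord is a hypothetical stand-in for analytics.AnalyticsRecord.
type AnalyticsRecord struct {
	Path string
}

// decodeSafe uses the comma-ok form: a mismatch yields ok == false, never a panic.
func decodeSafe(v interface{}) (AnalyticsRecord, bool) {
	rec, ok := v.(AnalyticsRecord)
	return rec, ok
}

// decodeUnsafe uses the single-value form. The assertion panics on a mismatch;
// the panic is recovered here only so that the demo can report it as an error.
func decodeUnsafe(v interface{}) (rec AnalyticsRecord, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("type assertion panicked: %v", r)
		}
	}()
	rec = v.(AnalyticsRecord)
	return rec, nil
}

func main() {
	var empty interface{} // nil interface: holds no concrete value

	if _, ok := decodeSafe(empty); !ok {
		fmt.Println("safe assertion: value is not an AnalyticsRecord")
	}
	if _, err := decodeUnsafe(empty); err != nil {
		fmt.Println("unsafe assertion:", err)
	}

	rec, ok := decodeSafe(AnalyticsRecord{Path: "/get"})
	fmt.Println(rec.Path, ok)
}
```

Without the recover in decodeUnsafe, the single-value assertion on the empty interface would terminate the process, which is the risk for pumps that skip the comma-ok check.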


By tracing back the origin of these logs, we can see how:

I believe that even though many empty records cause EOF errors at read time, many others do not, and they end up getting passed to the writePumps method as new interface-wrapped decoded values, which causes the type assertion errors.
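If that hypothesis holds, one possible mitigation would be to drop such values before they reach the pumps. The sketch below only illustrates the idea and is not the project's code: filterRecords is a hypothetical helper, AnalyticsRecord is again a stand-in for analytics.AnalyticsRecord, and the problematic values are assumed to be nil or otherwise not holding a record.

```go
package main

import "fmt"

// AnalyticsRecord is a hypothetical stand-in for analytics.AnalyticsRecord.
type AnalyticsRecord struct {
	Path string
}

// filterRecords keeps only the values that hold an AnalyticsRecord and
// reports how many were dropped, so the caller can log one summary line
// instead of one error per bad value.
func filterRecords(values []interface{}) (kept []interface{}, dropped int) {
	kept = make([]interface{}, 0, len(values))
	for _, v := range values {
		if _, ok := v.(AnalyticsRecord); ok {
			kept = append(kept, v)
		} else {
			dropped++
		}
	}
	return kept, dropped
}

func main() {
	batch := []interface{}{
		AnalyticsRecord{Path: "/a"},
		nil, // empty interface, as suspected in this issue
		AnalyticsRecord{Path: "/b"},
		nil,
	}
	kept, dropped := filterRecords(batch)
	fmt.Printf("kept=%d dropped=%d\n", len(kept), dropped)
}
```

Filtering once before the hand-off would protect every configured pump, including those that use the unchecked assertion form, at the cost of one extra pass over each batch.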

This issue can be reproduced following the same steps described in PR #731, as the related issue can lead to a state where the number of records builds up faster than they are purged out.

caroltyk changed the title from "Empty interfaces being passed to pump backends" to "TT-12074 Empty interfaces being passed to pump backends" May 3, 2024
caroltyk added the bug label May 17, 2024