Tags: bufbuild/httplb
Fix issue with context being cancelled prematurely (#70) There was a bug where we would cancel the health-checker's context after warming up the connection, which meant that health checkers that rely on the context would start permanently failing after the first check. To reproduce, I updated the `FakeHealthChecker` used in tests to be context-sensitive: it uses `context.AfterFunc` to permanently mark a connection as unhealthy if/when the context is cancelled. One of my local test runs then hung, revealing a potential deadlock bug. The issue was that the balancer would acquire a lock and then call `check.Close()`. For the `FakeHealthChecker`, closing then acquires the health checker's lock. But in `FakeHealthChecker.UpdateHealthState`, we acquire locks in the reverse order: acquiring the health checker's lock and then calling `tracker.UpdateHealthState` (which ends up calling into the balancer and acquiring the balancer's lock). Having the lock acquisition order differ exposes a potential deadlock. The deadlock is also fixed in this branch, by making it so that we never try to acquire both locks at the same time, much less acquire them in different orders.
Implement address family affinity and prefer IPv4 by default (#66) Implements address family policy. When the policy is set to `PreferIPv4` or `PreferIPv6`, any DNS result that has both IPv4 and IPv6 records is filtered to only the preferred records; when only IPv4 or only IPv6 addresses are present, the policy has no effect. The default is `PreferIPv4`, to lower the likelihood of problems caused by `httplb` lacking an implementation of a "happy eyeballs" algorithm for IPv6 fallback. It winds up being a little tricky to test this, but it's probably worthwhile; right now we can only test the DNS resolver in a fairly limited way (by relying on the fact that resolving an IP address is a no-op). With this approach, we can test the DNS resolver more-or-less end-to-end.
Eliminate test flakes in recently-added heap tests (#65) The update operation includes iteration over a map, so the order of items picked from the heap is not entirely deterministic. This fixes the tests to be lax about the precise order of elements when their load is equal/tied. The thing we care about is that items aren't picked when their load is higher than others.
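One way to write an assertion that tolerates ties is to check loads rather than item identities. The sketch below is an illustration only, not the test code from #65; the `pick` type and `pickedInLoadOrder` helper are hypothetical, and it assumes loads do not change while items are being picked.

```go
package main

// pick records one item taken from the heap, for test purposes.
type pick struct {
	name string
	load int
}

// pickedInLoadOrder reports whether no item was picked after an item with
// a strictly higher load. Items with equal load may appear in any order,
// so the result does not depend on map iteration order.
func pickedInLoadOrder(picks []pick) bool {
	for i := 1; i < len(picks); i++ {
		if picks[i].load < picks[i-1].load {
			return false
		}
	}
	return true
}
```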
Add support for recycling "leaf" connections (#60) Along with #58, this should be the other half of the solution to #59. This allows the leaf-level transports to be periodically re-created. That way, if by chance a single pool ends up in a state of limited diversity (too many transports with persistent connections to the same backend), re-creating the transport gives it a chance to contact a different backend, so the mix of target backends keeps changing.