Bug: polling health checker will always fail #69

Closed
humbornjo opened this issue Nov 15, 2024 · 3 comments · Fixed by #70
Comments

@humbornjo

We ran into some trouble with client connection control and tried to use the health check. However, the default health checker always fails with context deadline exceeded. I think it's because line 332 in httplb/balancer.go should use WithTimeout instead of WithCancel.

connection := conns[i]
connCtx, connCancel := context.WithCancel(b.ctx)
healthChecker := b.healthChecker.New(connCtx, connection, b)
go func() {
	defer connCancel()
	if err := connection.Prewarm(connCtx); err == nil {
		b.warmedUp(connection)
	}
}()

Here is my test code:

// server.go
package main

import (
	"github.com/gofiber/fiber/v2"
	"github.com/gofiber/fiber/v2/middleware/healthcheck"
	"github.com/gofiber/fiber/v2/middleware/recover"
)

func main() {
	app := fiber.New()

	app.Use(recover.New())
	app.Use(healthcheck.New(
		healthcheck.Config{
			LivenessEndpoint: "/healthz",
			LivenessProbe: func(c *fiber.Ctx) bool {
				c.Send([]byte("OK"))
				return true
			}},
	))
	app.Get("/foo", func(c *fiber.Ctx) error {
		return c.Send([]byte("bar"))
	})
	app.Get("/", func(c *fiber.Ctx) error {
		return c.Send([]byte("bar"))
	})
	if err := app.Listen(":3000"); err != nil {
		panic(err)
	}
}

// client.go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"time"

	"github.com/bufbuild/httplb"
	"github.com/bufbuild/httplb/health"
)

func main() {

	url := url.URL{
		Scheme: "http",
		Host:   "localhost:3000",
		Path:   "foo",
	}

	client := httplb.NewClient(
		httplb.WithHealthChecks(
			health.NewPollingChecker(
				health.PollingCheckerConfig{
					Timeout:         100 * time.Second,
					PollingInterval: 100 * time.Second,
				},
				health.NewSimpleProber("/healthz"),
			),
		),
	)

	for range 5 {
		req, err := http.NewRequest(http.MethodGet, url.String(), http.NoBody)
		if err != nil {
			fmt.Println(err)
			continue
		}

		resp, err := client.Do(req)
		if err != nil {
			fmt.Println(err)
			continue
		}
		fmt.Println(resp.Status)
		time.Sleep(1 * time.Second)
	}
}

The error log:

200 OK
Get "http://localhost:3000/foo": unavailable: no healthy connections
Get "http://localhost:3000/foo": unavailable: no healthy connections
Get "http://localhost:3000/foo": unavailable: no healthy connections
Get "http://localhost:3000/foo": unavailable: no healthy connections
@humbornjo humbornjo changed the title polling health checker will always fail Bug: polling health checker will always fail Nov 15, 2024
@humbornjo
Author

By the way, I wrote a dummy prober that makes no use of the ctx, and it works well.

@humbornjo
Author

Speaking of which, there is also some weird health behavior when the connection's dial times out: it should be reported unhealthy, but in fact the timed-out connection is still used. I am still trying to comprehend the code logic for this part; maybe we can discuss it in another thread.

@jhump
Member

jhump commented Nov 18, 2024

Speaking of which, there is also some weird health behavior when the connection's dial times out: it should be reported unhealthy, but in fact the timed-out connection is still used.

This should not be the case. If dialing times out, it will ultimately cause the HTTP request to fail and set the state to unhealthy (link).

However, what you might be seeing is that the unhealthy connection is attempted for use until the health check times out. There are currently a few internal constants and TODOs to possibly make them configurable. But the logic is that it tries to create a pool with some minimum size. And if the set of known-healthy connections/backends isn't large enough, it will also include connections/backends whose health state is still unknown.
