Treat unhealthy docker container as down #4372

vfosnar · 2024-01-15T16:23:48Z

I have read and understand the pull request rules.

Description

Fixes #4369

Type of change

Please delete any options that are not relevant.

Bug fix (non-breaking change which fixes an issue)
or
Breaking change (a fix or feature that would cause existing functionality to not work as expected)

Checklist

My code follows the style guidelines of this project
I ran ESLint and other linters for modified files
I have performed a self-review of my own code and tested it
I have commented my code, particularly in hard-to-understand areas (including JSDoc for methods)
My changes generate no new warnings
My code needed automated testing. I have added tests (this is an optional task)

nougad · 2024-01-15T18:44:43Z

Thank you @vfosnar for merging my commit. The change looks good to me. I'm surprised I completely missed that bug when I made my change. 🙈

My only suggestion would be to make DOWN the default (in the else branch):

starting -> PENDING (according to the Docker docs, starting is the third possible state: https://docs.docker.com/engine/reference/builder/#healthcheck)
healthy or empty string for podman -> UP
unhealthy and everything else -> DOWN

That way any unknown input would result in DOWN instead of showing it as UP. But that's just a suggestion and not a blocker.
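
The mapping suggested above could be sketched roughly as follows. This is only an illustration of the "DOWN as default" idea, not the code in this PR: the function name `healthToStatus` and the numeric status constants are assumptions made for the example.

```javascript
// Hypothetical status constants for illustration; the project's real values may differ.
const DOWN = 0;
const UP = 1;
const PENDING = 2;

/**
 * Map a container health status string to a monitor status.
 * Anything unrecognized falls through to DOWN.
 * @param {string} health Health status reported by the daemon ("" when none is reported, e.g. by podman)
 * @returns {number} One of DOWN, UP or PENDING
 */
function healthToStatus(health) {
    if (health === "starting") {
        return PENDING;
    }
    if (health === "healthy" || health === "") {
        return UP;
    }
    // "unhealthy" and any unknown value end up here
    return DOWN;
}
```

With this shape, a new or misspelled state from the daemon is reported as DOWN rather than silently shown as UP.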

chakflying · 2024-01-15T19:39:16Z

We may also need to increment retries in case it's stuck in starting.

vfosnar · 2024-01-15T19:50:49Z

This should (don't count on that) be okay because if you have a healthcheck defined the daemon will respond with "unhealthy"

chakflying · 2024-01-16T04:24:15Z

I haven't tested this. I'm just wondering: if you set a container to automatically restart, but it keeps failing, would it always be shown as "starting"?

vfosnar · 2024-01-16T06:51:31Z

nvm you're right, it will get stuck even with a healthcheck

vfosnar · 2024-01-16T07:06:13Z

But I'm not sure if simply incrementing retries every time is the best way to handle this. We should respect the start period, during which we wouldn't increment them. No idea how to get data on that from the daemon though.

I propose we get this PR merged and open a separate issue.

vfosnar · 2024-01-16T07:23:12Z

Or nah, I will just add the counting; better a false positive than a false negative.
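
A rough sketch of what counting could look like: each check that sees "starting" uses up one retry, and once the retries are exhausted the monitor reports DOWN. The function `nextCheckState`, its parameters, and the string statuses are hypothetical and are not taken from this PR or from the existing retry logic.

```javascript
/**
 * Decide the monitor status for one check, counting "starting" results
 * against a retry budget. Hypothetical helper for illustration only.
 * @param {string} health Health status reported for the container
 * @param {number} retries Retries already used before this check
 * @param {number} maxRetries Retry budget before reporting DOWN
 * @returns {{status: string, retries: number}} Status and updated retry count
 */
function nextCheckState(health, retries, maxRetries) {
    if (health === "healthy") {
        // A healthy result resets the counter.
        return { status: "UP", retries: 0 };
    }
    if (health === "starting" && retries < maxRetries) {
        // Still within the budget: stay PENDING and use up one retry.
        return { status: "PENDING", retries: retries + 1 };
    }
    // Budget exhausted while starting, or any other state.
    return { status: "DOWN", retries };
}
```

A container stuck in a restart loop would then show PENDING for `maxRetries` checks and DOWN afterwards, which is the false positive accepted above for containers that simply take long to start.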

CommanderStorm · 2024-01-16T10:04:14Z

Should be possible via listing containers with filters
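
For context, the Docker Engine API's container list endpoint (`/containers/json`) accepts a JSON-encoded `filters` query parameter, and `health` is one of the supported filter keys. A minimal sketch of building such a request path follows; the helper name `buildHealthFilterPath` is hypothetical, and whether this route exposes the start period itself is not established in this thread.

```javascript
/**
 * Build the request path for listing containers in a given health state.
 * Hypothetical helper; only the endpoint and the "health" filter key come
 * from the Docker Engine API.
 * @param {string} health One of "starting", "healthy", "unhealthy", "none"
 * @returns {string} Path and query string for the list request
 */
function buildHealthFilterPath(health) {
    // The API expects filters as a JSON map of string to array of strings.
    const filters = JSON.stringify({ health: [ health ] });
    return "/containers/json?filters=" + encodeURIComponent(filters);
}
```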

vfosnar · 2024-01-17T15:36:58Z

I think the best way to handle this is to treat the monitor as DOWN if the container is in the "starting" state, as it basically is down for the end user.

This PR is ready to be reviewed.

CommanderStorm

I have not tested the changes, but they seem way cleaner than the code before. 👍🏻

Could you rebase the changes onto the 1.23.X branch instead?
This would reduce the effort involved in making this part of v1.23.12.

server/model/monitor.js


louislam · 2024-01-19T20:00:32Z

Treating starting as DOWN is questionable.

I take Uptime Kuma's Docker image as an example:

uptime-kuma/docker/dockerfile

Line 85 in 36196f6

    
           HEALTHCHECK --interval=60s --timeout=30s --start-period=180s --retries=5 CMD extra/healthcheck

I set the start period to 180s to make sure the server is fully started, in order to avoid a "bootloop".

vfosnar · 2024-01-19T20:04:45Z

Yeah, but even if it's starting, it is not responding to users' requests, and that's in my opinion something we should care about and notify on.

louislam · 2024-01-23T23:18:05Z

As it is changing the behaviour of this monitor type, I think it won't be merged into 1.23.X.

And I think it needs an alternative method to deal with the issue. Treating starting as DOWN is not necessarily correct in my case, because in the running state, the service could be ready or not ready.

Maybe Uptime Kuma should also add something like --start-period to avoid the issue.
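
A monitor-side grace period along those lines might look like the sketch below. No such option is defined in this thread; the function `statusWhileStarting` and its parameters are hypothetical and only illustrate the idea of tolerating "starting" for a configured window.

```javascript
/**
 * Decide what to report while a container is in the "starting" state,
 * given a monitor-side start period. Hypothetical helper for illustration.
 * @param {number} firstSeenStartingAtMs Timestamp (ms) when "starting" was first observed
 * @param {number} nowMs Current timestamp (ms)
 * @param {number} startPeriodSeconds Grace period configured on the monitor
 * @returns {string} "PENDING" inside the grace period, "DOWN" after it
 */
function statusWhileStarting(firstSeenStartingAtMs, nowMs, startPeriodSeconds) {
    const elapsedSeconds = (nowMs - firstSeenStartingAtMs) / 1000;
    return elapsedSeconds <= startPeriodSeconds ? "PENDING" : "DOWN";
}
```

With a 180 second start period, as in the HEALTHCHECK quoted above, a container would be reported as PENDING for its first three minutes in "starting" and as DOWN if it is still "starting" after that.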

up2-date · 2024-11-05T14:01:10Z

The problem still exists in current versions (even in the 2.0 beta). Is there a plan to fix this problem?

rmatte · 2024-12-02T20:50:09Z

Still a problem in the current version. Saw this today and missed a major issue for longer than I should have because of it (only eventually noticed because it was my DNS service that was down, but would have been useful to know right away).

louislam added this to the 1.23.12 milestone Jan 15, 2024

nougad approved these changes Jan 15, 2024

nougad mentioned this pull request Jan 15, 2024

support podman by treating empty docker health check as healthy #4367

Closed

7 tasks

CommanderStorm reviewed Jan 17, 2024

CommanderStorm changed the title ~~fix #4369 - treat unhealthy docker container as down~~ Treat unhealthy docker container as down Jan 17, 2024

CommanderStorm added the area:monitor Everything related to monitors label Jan 17, 2024

vfosnar and others added 5 commits January 20, 2024 03:39

fix louislam#4369 - treat unhealthy docker container as down

d736841

fix louislam#3767 by @nougad - treat empty healthcheck reported by po…

7f5dfcc

…dman as up

make DOWN the default fallback for docker container healthcheck

0b47840

treat the starting state of a docker container as DOWN

0b0f081

Apply suggestions for docker compose healthcheck

2ba9337

Co-authored-by: Frank Elsinga

louislam force-pushed the master branch from 4c4f5b7 to 2ba9337 Compare January 19, 2024 19:39

louislam changed the base branch from master to 1.23.X January 19, 2024 19:39

louislam added the question Further information is requested label Jan 19, 2024

CommanderStorm mentioned this pull request Jan 23, 2024

ability to monitor containerd-containers #4111

Open

1 task

louislam removed this from the 1.23.12 milestone Jan 23, 2024

CommanderStorm marked this pull request as draft February 15, 2024 01:09

CommanderStorm added the pr:please address review comments this PR needs a bit more work to be mergable label May 19, 2024
