Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix container healthcheck #245

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

feelingsonice
Copy link

@feelingsonice feelingsonice commented Apr 24, 2024

Previous container creation logic had the following 2 issues afaik:

  1. Health check was too quick. Increasing this from 50 to 500 fixed the problem for me on M2 Max.
  2. Container setup logic didn't do cleanup after failure. Resulted in error messages like Conflict. The container name \"/cornucopia_postgres\" is already in use by container on subsequent startup.

Misc:

  • Fix the build script for auto_build example.

This fixes #244

@@ -5,7 +5,7 @@ use self::error::Error;
/// Starts Cornucopia's database container and wait until it reports healthy.
pub fn setup(podman: bool) -> Result<(), Error> {
spawn_container(podman)?;
healthcheck(podman, 120, 50)?;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why too quick is a trouble? And do we really want to 10x it?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This health check is pinging the Postgres container as it starts up, because the container needs sometime to start, ping it too quickly means the container won't be ready yet.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But it retries the pings, why would it be a problem that the container isn't ready yet?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the health checker is erroring out before it could do any retry:

fn healthcheck(podman: bool, max_retries: u64, ms_per_retry: u64) -> Result<(), Error> {
    let slow_threshold = 10 + max_retries / 10;
    let mut nb_retries = 0;
    while !is_postgres_healthy(podman)? { // <------ this won't execute for me
       
    }
    // ...
}

I haven't confirmed this but a quick Look suggests that is_postgres_healthy is calling a command on the cornucopia_postgres container before it's even created:

/// Checks if Cornucopia's container reports healthy
fn is_postgres_healthy(podman: bool) -> Result<bool, Error> {
    Ok(cmd(
        podman,
        &["exec", "cornucopia_postgres", "pg_isready"],
        "check container health",
    )
    .is_ok())
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But is_postgres_healthy can only return an Ok(true) or Ok(false), so it cannot actually error out?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can barely understand this file and I hate lifecycle lol... Is this a generated file?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its auto generated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Auto build don't work
3 participants