Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x-snap-config: Do not check if node is part of the cluster #688

Merged
merged 1 commit into from
Sep 23, 2024

Conversation

bschimke95
Copy link
Contributor

Follow-up fix for: #633

In the mentioned PR, we wait until k8sd is up to sync the config. We exit early if the node is not part of a cluster. However, a closer look at the implementation of NodeStatus shows that this would skip any error that we wait for:

func (c *k8sd) NodeStatus(ctx context.Context) (apiv1.NodeStatusResponse, bool, error) {
	response, err := query(ctx, c, "GET", apiv1.NodeStatusRPC, nil, &apiv1.NodeStatusResponse{})
	if err != nil {
		// Error 503 means the node is not initialized yet
		var statusErr api.StatusError
		if errors.As(err, &statusErr) {
			if statusErr.Status() == http.StatusServiceUnavailable {
				return apiv1.NodeStatusResponse{}, false, nil
			}
		}

		return apiv1.NodeStatusResponse{}, false, err <- Returns partOfCluster == false if an error happens. 
	}

	return response, true, nil
}

This PR removes the wrong check.

@bschimke95 bschimke95 requested a review from a team as a code owner September 20, 2024 16:31
_, partOfCluster, err := client.NodeStatus(cmd.Context())
if !partOfCluster {
cmd.PrintErrf("Node is not part of a cluster: %v\n", err)
env.Exit(1)
}
_, _, err := client.NodeStatus(cmd.Context())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, if NodeStatus() always returns partofCluster=false for any error why do we no longer need this exit early?

i'm not sure i have the context from #633 to properly review this. Maybe @mateoflorido

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

basically you're saying any error from NodeStatus at all exits early-- and that's wrong?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, WaitUntilReady will retry if the bool return value is false. If something in the error return value is returned, then this exits immediately.

NodeStatus second return value indicates if the node is part of a cluster or not. However, it will also return false for this argument if an error is returned (third return value). The current implementation would always exit with Node is notpart of cluster even if NodeStatus returned and error (which means the "is part of cluster" return value is not valid).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AHH! i didn't look down at lines 70-71. Thank you

Copy link
Contributor

@addyess addyess left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for the explanation!

@bschimke95 bschimke95 merged commit 6b15893 into main Sep 23, 2024
19 of 20 checks passed
@bschimke95 bschimke95 deleted the bschimke95/fix-snap-sync branch September 23, 2024 11:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants