Pod should auto-restart when encountering cometbft bug. #409

Open
2 tasks
danbryan opened this issue Mar 1, 2024 · 6 comments

@danbryan
Contributor

danbryan commented Mar 1, 2024

Pods periodically stop syncing blocks when they hit the cometbft bug. Let's find a way to detect when this has occurred and auto-restart the pod. It could be as simple as no response from the status endpoint for 2 minutes.
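
The "no response from the status endpoint for 2 minutes" idea could be sketched roughly as below. This is only an illustration: the URL (the default CometBFT RPC port on localhost), the variable names, and the function names `probe_status` and `wait_until_stuck` are assumptions, not anything that exists in this project.

```shell
#!/bin/bash
# Hypothetical watchdog sketch: report failure once the status endpoint has
# been unresponsive for MAX_SILENCE seconds.

# Assumption: the node's RPC is reachable on the default port inside the pod.
STATUS_URL="${STATUS_URL:-http://localhost:26657/status}"
MAX_SILENCE="${MAX_SILENCE:-120}"   # seconds without a response before giving up
INTERVAL="${INTERVAL:-10}"          # seconds between probes

# Probe once; succeeds only if the endpoint answers with HTTP success within 5s.
probe_status() {
  curl --silent --fail --max-time 5 --output /dev/null "$STATUS_URL"
}

# Poll until the endpoint has been silent for MAX_SILENCE seconds, then return 1.
wait_until_stuck() {
  local last_ok=$SECONDS
  while true; do
    if probe_status; then
      last_ok=$SECONDS
    elif (( SECONDS - last_ok >= MAX_SILENCE )); then
      echo "no response from ${STATUS_URL} for ${MAX_SILENCE}s" >&2
      return 1
    fi
    sleep "$INTERVAL"
  done
}
```

Something would still have to act on the failure, for example a sidecar that runs `wait_until_stuck` and then triggers the restart. Whether an unresponsive status endpoint reliably indicates this particular bug is not established in this thread.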

Tasks

@jonathanpberger jonathanpberger changed the title auto restart pod when cometbft bug encountered Pod should auto restart when cometbft bug encountered Mar 11, 2024
@danbryan
Contributor Author

@vimystic can you provide an update on this?

@vimystic
Contributor

Is there a description of the cometbft bug itself somewhere?

@danbryan
Contributor Author

@agouin are you able to describe or link to the bug?

@vimystic here is a script that identifies and restarts pods that are impacted by this bug.

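#!/bin/bash
# Restart cosmos-sentry pods whose recent logs show the stuck signature.

# Abort if the context cannot be selected, so pods are never deleted in the wrong cluster.
kubectl config use-context sentry-mainnet@sl-colo || exit 1

# Collect "namespace pod" pairs for every cosmos-sentry pod, flattened into one array.
PODS=( $(kubectl get pods -A --no-headers | awk '/cosmos-sentry/ {print $1, $2}') )

for (( i=0; i<${#PODS[@]}; i+=2 )); do
    ns="${PODS[i]}"
    pod="${PODS[i+1]}"
    # A pod is treated as stuck when the marker appears in its last 30 log lines.
    if kubectl logs -c node --tail=30 -n "$ns" "$pod" | grep -q "SignerListener: Connected"; then
      echo "ns: $ns pod: $pod is stuck"
      # Delete without waiting for termination to finish.
      kubectl delete --wait=false pod -n "$ns" "$pod"
    fi
done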
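
If the log check in the script above ends up inside an automated restart path, it might help to factor it into a function that reads log lines from stdin, so the detection can be exercised without a cluster. A minimal sketch, where the function name `looks_stuck` and the variable `STUCK_MARKER` are hypothetical and the marker string is taken from the script above:

```shell
#!/bin/bash
# Marker string copied from the script above; treating it as the stuck
# signature is that script's heuristic, not a documented guarantee.
STUCK_MARKER="SignerListener: Connected"

# Reads log lines on stdin; succeeds if the marker appears in the last N lines
# (N defaults to 30, matching --tail=30 in the script above).
looks_stuck() {
  local tail_lines="${1:-30}"
  tail -n "$tail_lines" | grep -qF -- "$STUCK_MARKER"
}
```

It would be fed from the same command the script already uses, e.g. `kubectl logs -c node -n "$ns" "$pod" | looks_stuck 30`.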

@jonathanpberger
Contributor

jonathanpberger commented Mar 25, 2024

Depends on the kubectl config secret.

@vimystic
Contributor

vimystic commented Apr 23, 2024

Blocked until https://github.com/strangelove-ventures/infra/issues/3020 is completed.

@jonathanpberger
Contributor

https://github.com/strangelove-ventures/infra/issues/3020 is complete! Unblocking.

@jonathanpberger jonathanpberger changed the title Pod should auto restart when cometbft bug encountered Pod should auto-restart when encountering cometbft bug. Jul 15, 2024