This repository has been archived by the owner on Dec 4, 2024. It is now read-only.

Unexpected end of JSON input / panic: send on closed channel #1277

Open
maciejmaz opened this issue Mar 8, 2023 · 2 comments

Description

I've deployed a polygon-edge based testnet on an EKS cluster with 2 bootnodes and 4 validator nodes. Sometimes, when the pods get killed, the consensus/metadata file is left empty, so the pod can't start again.
Right before the pod is terminated, I can see a "panic: send on closed channel" error, and after the pod is restarted it shows "Unexpected end of JSON input".
Theoretically this PR should fix the issue, but in some cases, when the shutdown is not graceful, the node panics.

Your environment

  • OS and version.
    NAME="Alpine Linux"
    ID=alpine
    VERSION_ID=3.14.8

  • Polygon Edge version: v0.6.3

  • Locally or cloud hosted (which provider): cloud hosted on AWS

  • Please confirm if the validators are running in a containerized environment (K8s, Docker, etc.): validators are running on an EKS cluster.

Steps to reproduce

  • Tell us how to reproduce this issue.
  • Where the issue is, if you know.
  • Which commands triggered the issue, if any.
  • Provide us with the content of your genesis file.
  • Provide us with commands that you used to start your validators.
  • Provide us with the peer list of each of your validators by running the following command: polygon-edge peers list --grpc-address GRPC_ADDRESS.
[PEERS LIST]
Number of peers: 5

[0] = 16Uiu2HAmEawWfrp4eg5N8H6D8XZpohTDUVu3huhUYJsa7pBT7n9u
[1] = 16Uiu2HAkvFU7BVRvfg8mP3DLeMY62HWAuAgstrcqQ9Habh4L5vut
[2] = 16Uiu2HAmNFv1UN86Dk6tWSxdo5kfmrWTW2BDueBDeeDuM2n9sDha
[3] = 16Uiu2HAmMVn5yd8g3FDTbteWZUj1CAP4HHJERXC4xmZ6Gu74bEhS
[4] = 16Uiu2HAkwHpydH19fKV6qnQda6oxfpCYrWUsVUGWEMvkgWkfXBM7
  • Is the chain producing blocks and serving customers atm?
    Yes

Expected behavior

  • Tell us what should happen.
    Validator node should be able to recover after any kind of container failure.
  • Tell us what happened instead.
    In some cases the validator node falls into a crash loop because the metadata file is missing its data.

zuiris commented May 25, 2023

Hey @maciejmazur10c, could you please verify whether the error persists in 0.9?

@maciejmaz
Author

@zuiris I was planning to migrate this blockchain to v0.8.1 and test it, but I haven't had time to do it yet. Testing it on 0.9 will be a problem, as it's not backward compatible...

FYI, v0.6.3 also stopped producing new blocks almost every day or every few days, so we had to roll back to v0.6.2, which is more stable.
