-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
aleph-node logs failed connections to a bootnode that is *not* on the BOOT_NODES list #4
Comments
Hi, anything particular happened with network from your validator perspective around 2023-04-02 13:07:13? It seems that your node was not communicating other peers (no bytes sent/received), which fixed itself 15 minutes later:
|
hmm looks like a brief network outage, let me have a look :) |
ok, good find! Looks like I can reproduce this by disconnecting the network for a few seconds: on the same node, nic disconnected at 2023-04-04 09:31:16
nic reconnected at 2023-04-04 09:32:19
still not sure why node keeps picking on poor |
this happened again on a different testnet node (QNODE-TEST-2). This node in located in a DC, and has suffered no loss of connectivity, AFAICT. 40 peers solid throughout. Also, it self-corrected in about half a minute. log attached azero-testnet-2 - 2023-04-06 112056 - odd bootnode IP-peer_id.zip |
Hi, thanks for the reports. As per original logs, it does not seem to be issue affecting the chain. As per last log provided, There is one moment number of peers is 39, then it becomes 40 again, so I think there's some network activity done in the background. It's hard to tell at this moment where the issue is as it needs local reproduction to with some log levels set etc. |
Hi, please let me know if this issue can be closed, any more reproduction so far? |
Hi, I see this happening occasionally (every couple of weeks or so) on my mainnet node. Last instance was just a couple of days ago: 2023-06-25 03:58:12 💔 The bootnode you want to connect to at I did not see any of this on any of my testnet nodes, since upgrading to ver. 0.11.x |
mainnet node logged another couple of these: 2023-07-08 03:12:34 💔 The bootnode you want to connect to at I can see no similar entries for testnet nodes. Hopefully this is affecting ver. 0.10.x only. |
today my testnet-2 node also experienced the same issue: 2023-07-18 11:24:32 💔 The bootnode you want to connect to at I guess ver. 0.11.3 is not immune from this after all. |
Did you read the documentation and guides?
Is there an existing issue?
Description of the problem
my testnet node 1 (QNODE-TEST) started logging some events yesterday (Apr-2-2023), starting at 2023-04-02 13:13:07 PDT. Events continued until the node was restarted at 2023-04-02 14:06:08 PDT. a pair of the same event was logged every 5s or so:
2023-04-02 13:13:07 💔 The bootnode you want to connect to at
/ip4/84.245.33.122/tcp/30333/p2p/12D3KooWRkGLz4YbVmrsWK75VjFTs8NvaBu42xhAmQaP4KeJpw1Lprovided a different peer ID
12D3KooWJgniH6kTFxLAZUVaP2nE7ohrMVjFMXBbbyph1uGYFDL6than the one you expect
12D3KooWRkGLz4YbVmrsWK75VjFTs8NvaBu42xhAmQaP4KeJpw1L`I posted about this on the discord testnet-validator-general channel: https://discord.com/channels/775720634848116807/1050508965647233055/1092203956324814848
Another validator (Bart | StakeSafe) confirmed that the IP address logged (84.245.33.122) belonged to his ISP (Cambrium IT Services BV). All of the AZF bootnodes seem to be hosted on Amazon, so there was a mismatch of sort
the connection attempted was using one of my "regular" peers IP address along with a peer_id belonging to an actual AZF bootnode (the first on the bootnode list): bootnode-eu-central-1-0.test.azero.dev
bootnode-eu-central-1-0.test.azero.dev resolved to the correct IP address (18.184.79.200) on the same testnet machine
the problem persisted until i restarted aleph-node. after restarting, the issue disappeared
Information on your setup.
testnet validator ✌️ version 0.10.1-9c6a46952d3 (built from source)
node name: QNODE-TEST
peer_id: 12D3KooWMW86EoNYwBar6e5Gcnmaa13N6JsnzYVPPVGsmvjUddWt
OS: ubuntu 22.04.02
hardware: VM on ESXi hypervisor, AMD AMD EPYC 7302 (8c/16t dedicated), 64GB ECC RAM, 1×100 GB SSD, 1×1.92 TB SSD NVMe, 1G connectivity, business fibre (static IP)
IP address: 206.116.52.219
Steps to reproduce
not sure how to reproduce
backup-aleph-node-testnet-2023-04-02T1406.zip
Did you attach relevant logs?
The text was updated successfully, but these errors were encountered: