Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Application panicked (crashed) when RPC nodes are unreachable #24

Open
lazam opened this issue Mar 11, 2022 · 1 comment
Open

Application panicked (crashed) when RPC nodes are unreachable #24

lazam opened this issue Mar 11, 2022 · 1 comment
Assignees
Labels

Comments

@lazam
Copy link
Contributor

lazam commented Mar 11, 2022

The application panicked (crashed) when some of the RPC nodes are unreachable.

Logs:

[2022-03-11T04:21:51.869184Z WARN  rustls::check] Received a ServerHelloDone handshake message while expecting [CertificateRequest]
[2022-03-11T04:21:54.206332Z ERROR polkadot_introspector::core::subxt] [wss://rococo-canvas-rpc.polkadot.io.io:443] Disconnected (Rpc(Transport(Failed to resolve IP addresses for this hostname: failed to lookup address information: Temporary failure in name resolution))) 
[2022-03-11T04:21:56.279044Z ERROR polkadot_introspector::core::subxt] [wss://rococo-canvas-rpc.polkadot.io.io:443] Disconnected (Rpc(Transport(Failed to resolve IP addresses for this hostname: failed to lookup address information: Temporary failure in name resolution))) 
[2022-03-11T04:22:00.395583Z ERROR polkadot_introspector::core::subxt] [wss://rococo-canvas-rpc.polkadot.io.io:443] Disconnected (Rpc(Transport(Failed to resolve IP addresses for this hostname: failed to lookup address information: Temporary failure in name resolution))) 
[2022-03-11T04:22:02.807649Z ERROR polkadot_introspector::core::subxt] [wss://rococo-canvas-rpc.polkadot.io.io:443] Disconnected (Rpc(Transport(Failed to resolve IP addresses for this hostname: failed to lookup address information: Temporary failure in name resolution))) 
[2022-03-11T04:22:02.807750Z ERROR polkadot_introspector::core::subxt] [wss://versi-tick-rpc.polkadot.io:443] Disconnected (Rpc(Transport(Connection timeout exceeded: 10s))) 
[2022-03-11T04:22:03.045872Z WARN  rustls::check] Received a ServerHelloDone handshake message while expecting [CertificateRequest]
[2022-03-11T04:22:04.470119Z ERROR polkadot_introspector::core::subxt] subxt call error: Metadata(PalletNotFound("Timestamp"))
[2022-03-11T04:22:04.794492Z ERROR polkadot_introspector::core::subxt] [wss://rococo-canvas-rpc.polkadot.io.io:443] Disconnected (Rpc(Transport(Failed to resolve IP addresses for this hostname: failed to lookup address information: Temporary failure in name resolution))) 
[2022-03-11T04:22:08.036180Z ERROR polkadot_introspector::core::subxt] [wss://rococo-canvas-rpc.polkadot.io.io:443] Disconnected (Rpc(Transport(Failed to resolve IP addresses for this hostname: failed to lookup address information: Temporary failure in name resolution))) 
[2022-03-11T04:22:09.950028Z ERROR polkadot_introspector::core::subxt] [wss://rococo-canvas-rpc.polkadot.io.io:443] Disconnected (Rpc(Transport(Failed to resolve IP addresses for this hostname: failed to lookup address information: Temporary failure in name resolution))) 
[2022-03-11T04:22:09.950232Z WARN  rustls::check] Received a ServerHelloDone handshake message while expecting [CertificateRequest]
[2022-03-11T04:22:11.994402Z ERROR polkadot_introspector::core::subxt] [wss://rococo-canvas-rpc.polkadot.io.io:443] Disconnected (Rpc(Transport(Failed to resolve IP addresses for this hostname: failed to lookup address information: Temporary failure in name resolution))) 
[2022-03-11T04:22:15.486887Z ERROR polkadot_introspector::core::subxt] [wss://rococo-canvas-rpc.polkadot.io.io:443] Disconnected (Rpc(Transport(Failed to resolve IP addresses for this hostname: failed to lookup address information: Temporary failure in name resolution))) 
[2022-03-11T04:22:15.486984Z ERROR polkadot_introspector::core::subxt] [wss://versi-tick-rpc.polkadot.io:443] Disconnected (Rpc(Transport(Connection timeout exceeded: 10s))) 
[2022-03-11T04:22:16.428480Z ERROR polkadot_introspector::core::subxt] subxt call error: Metadata(PalletNotFound("Timestamp"))
The application panicked (crashed).
Message:  Request timed out: Request { url: "wss://rococo-statemint-rpc.polkadot.io:443", request_type: GetBlockTimestamp(Some(0xa11aaa91fc27c0c9e70af3afa55861904383317cca730b7bebc3869c6b0786a4)), response_sender: Sender { inner: Some(Inner { state: State { is_complete: false, is_closed: false, is_rx_task_set: true, is_tx_task_set: false } }) } }
Location: src/core/subxt.rs:157

  ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ BACKTRACE ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
                                ⋮ 7 frames hidden ⋮                               
   8: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::h93480fe48528e9e4
      at <unknown source file>:<unknown line>
   9: tokio::runtime::task::harness::Harness<T,S>::poll::h431ca52fb4b6e639
      at <unknown source file>:<unknown line>
  10: std::thread::local::LocalKey<T>::with::h3295764c141dc3ef
      at <unknown source file>:<unknown line>
  11: tokio::runtime::thread_pool::worker::Context::run_task::hb41ccf2538030ca1
      at <unknown source file>:<unknown line>
  12: tokio::runtime::thread_pool::worker::Context::run::h3053b95b664635d3
      at <unknown source file>:<unknown line>
  13: tokio::macros::scoped_tls::ScopedKey<T>::set::h408cc0b29da9105a
      at <unknown source file>:<unknown line>
  14: tokio::runtime::thread_pool::worker::run::h1ea665436ab3a822
      at <unknown source file>:<unknown line>
  15: tokio::loom::std::unsafe_cell::UnsafeCell<T>::with_mut::h29cbf04e8f43737f
      at <unknown source file>:<unknown line>
  16: std::panicking::try::h0aa4caf77b9db292
      at <unknown source file>:<unknown line>
  17: tokio::runtime::task::harness::Harness<T,S>::poll::h8a7121f28e3d21de
      at <unknown source file>:<unknown line>
  18: tokio::runtime::blocking::pool::Inner::run::h7d0364598c1b0208
      at <unknown source file>:<unknown line>
  19: std::sys_common::backtrace::__rust_begin_short_backtrace::h5d885a40af7ba8e4
      at <unknown source file>:<unknown line>
  20: core::ops::function::FnOnce::call_once{{vtable.shim}}::haa9c6ae36f70def5
      at <unknown source file>:<unknown line>
  21: <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once::h7bd677a5dc988be6
      at /rustc/f1edd0429582dd29cccacaf50fd134b05593bd9c/library/alloc/src/boxed.rs:1691
  22: <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once::h7b1c1ba11c4db785
      at /rustc/f1edd0429582dd29cccacaf50fd134b05593bd9c/library/alloc/src/boxed.rs:1691
  23: std::sys::unix::thread::Thread::new::thread_start::h9c58c0d12d84e854
      at /rustc/f1edd0429582dd29cccacaf50fd134b05593bd9c/library/std/src/sys/unix/thread.rs:106
  24: start_thread<unknown>
      at <unknown source file>:<unknown line>
  25: clone<unknown>
      at <unknown source file>:<unknown line>

Run with COLORBT_SHOW_HIDDEN=1 environment variable to disable frame filtering.
The application panicked (crashed).
Message:  Expected GetBlockTimestampResponse, got something else.
Location: src/core/api.rs:45

  ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ BACKTRACE ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
                                ⋮ 7 frames hidden ⋮                               
   8: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::h35bffb1d27d20624
      at <unknown source file>:<unknown line>
   9: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::h31e01134c8d1a41a
      at <unknown source file>:<unknown line>
  10: tokio::runtime::task::harness::Harness<T,S>::poll::h0e3c68e149302473
      at <unknown source file>:<unknown line>
  11: std::thread::local::LocalKey<T>::with::h3295764c141dc3ef
      at <unknown source file>:<unknown line>
  12: tokio::runtime::thread_pool::worker::Context::run_task::hb41ccf2538030ca1
      at <unknown source file>:<unknown line>
  13: tokio::runtime::thread_pool::worker::Context::run::h3053b95b664635d3
      at <unknown source file>:<unknown line>
  14: tokio::macros::scoped_tls::ScopedKey<T>::set::h408cc0b29da9105a
      at <unknown source file>:<unknown line>
  15: tokio::runtime::thread_pool::worker::run::h1ea665436ab3a822
      at <unknown source file>:<unknown line>
  16: tokio::loom::std::unsafe_cell::UnsafeCell<T>::with_mut::h29cbf04e8f43737f
      at <unknown source file>:<unknown line>
  17: std::panicking::try::h0aa4caf77b9db292
      at <unknown source file>:<unknown line>
  18: tokio::runtime::task::harness::Harness<T,S>::poll::h8a7121f28e3d21de
      at <unknown source file>:<unknown line>
  19: tokio::runtime::blocking::pool::Inner::run::h7d0364598c1b0208
      at <unknown source file>:<unknown line>
  20: std::sys_common::backtrace::__rust_begin_short_backtrace::h5d885a40af7ba8e4
      at <unknown source file>:<unknown line>
  21: core::ops::function::FnOnce::call_once{{vtable.shim}}::haa9c6ae36f70def5
      at <unknown source file>:<unknown line>
  22: <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once::h7bd677a5dc988be6
      at /rustc/f1edd0429582dd29cccacaf50fd134b05593bd9c/library/alloc/src/boxed.rs:1691
  23: <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once::h7b1c1ba11c4db785
      at /rustc/f1edd0429582dd29cccacaf50fd134b05593bd9c/library/alloc/src/boxed.rs:1691
  24: std::sys::unix::thread::Thread::new::thread_start::h9c58c0d12d84e854
      at /rustc/f1edd0429582dd29cccacaf50fd134b05593bd9c/library/std/src/sys/unix/thread.rs:106
  25: start_thread<unknown>
      at <unknown source file>:<unknown line>
  26: clone<unknown>
      at <unknown source file>:<unknown line>

Run with COLORBT_SHOW_HIDDEN=1 environment variable to disable frame filtering.
[2022-03-11T04:22:16.906702Z ERROR polkadot_introspector::core::subxt] [wss://rococo-canvas-rpc.polkadot.io.io:443] Disconnected (Rpc(Transport(Failed to resolve IP addresses for this hostname: failed to lookup address information: Temporary failure in name resolution))) 
Error: 
   0: panic

Location:
   /rustc/f1edd0429582dd29cccacaf50fd134b05593bd9c/library/core/src/result.rs:1914

  ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ BACKTRACE ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
                                ⋮ 2 frames hidden ⋮                               
   3: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::h243de022c2fb1fb2
      at <unknown source file>:<unknown line>
   4: polkadot_introspector::main::{{closure}}::haa7179a574293364
      at <unknown source file>:<unknown line>
   5: tokio::park::thread::CachedParkThread::block_on::h41dec6d8c00dd82d
      at <unknown source file>:<unknown line>
   6: tokio::runtime::thread_pool::ThreadPool::block_on::hb517c83cedba7e35
      at <unknown source file>:<unknown line>
   7: tokio::runtime::Runtime::block_on::h6704d75a74637c4f
      at <unknown source file>:<unknown line>
   8: polkadot_introspector::main::hd43db547d28854a3
      at <unknown source file>:<unknown line>
   9: std::sys_common::backtrace::__rust_begin_short_backtrace::hcdf14ca10d077a1a
      at <unknown source file>:<unknown line>
  10: std::rt::lang_start::{{closure}}::hdfaf95dcab8b9283
      at <unknown source file>:<unknown line>
  11: core::ops::function::impls::<impl core::ops::function::FnOnce<A> for &F>::call_once::hc56adab7a77ec6e3
      at /rustc/f1edd0429582dd29cccacaf50fd134b05593bd9c/library/core/src/ops/function.rs:259
  12: std::panicking::try::do_call::h29f013120c5abc65
      at /rustc/f1edd0429582dd29cccacaf50fd134b05593bd9c/library/std/src/panicking.rs:403
  13: std::panicking::try::h86d5b2b66caec4cf
      at /rustc/f1edd0429582dd29cccacaf50fd134b05593bd9c/library/std/src/panicking.rs:367
  14: std::panic::catch_unwind::h7dd136d787f51397
      at /rustc/f1edd0429582dd29cccacaf50fd134b05593bd9c/library/std/src/panic.rs:133
  15: std::rt::lang_start_internal::{{closure}}::h4a199351e630a8a5
      at /rustc/f1edd0429582dd29cccacaf50fd134b05593bd9c/library/std/src/rt.rs:128
  16: std::panicking::try::do_call::h20ceb9e5dff838c6
      at /rustc/f1edd0429582dd29cccacaf50fd134b05593bd9c/library/std/src/panicking.rs:403
  17: std::panicking::try::hc2abb46a5e41bd43
      at /rustc/f1edd0429582dd29cccacaf50fd134b05593bd9c/library/std/src/panicking.rs:367
  18: std::panic::catch_unwind::h52c3eb4408ad6dfb
      at /rustc/f1edd0429582dd29cccacaf50fd134b05593bd9c/library/std/src/panic.rs:133
  19: std::rt::lang_start_internal::hd15a47be08101c28
      at /rustc/f1edd0429582dd29cccacaf50fd134b05593bd9c/library/std/src/rt.rs:128
  20: main<unknown>
      at <unknown source file>:<unknown line>
  21: __libc_start_main<unknown>
      at <unknown source file>:<unknown line>
  22: _start<unknown>
      at <unknown source file>:<unknown line>

Run with COLORBT_SHOW_HIDDEN=1 environment variable to disable frame filtering.

To reproduce:

docker run --user 1000:1000 -e RUST_BACKTRACE=full  --network host lazam/introspector:latest block-time-monitor --ws "wss://rococo-canvas-rpc.polkadot.io.io:443,wss://rococo-statemint-rpc.polkadot.io:443,wss://versi-tick-rpc.polkadot.io:443" prometheus --port 9615

Expectation:

Introspector will just log the unreachable RPC nodes and continue on scraping other RPC nodes. RPC nodes being unreachable are common. Other option is to have a metric exposed if a node is reachable or not, something like:

introspector_rpc_node_up{node="wss://kusama-rpc.polkadot.io:443"} 1
introspector_rpc_node_up{node="wss://kusama-non-existing-rpc.polkadot.io:443"} 0
@sandreim sandreim self-assigned this Mar 25, 2022
@AndreiEres
Copy link
Collaborator

Currently, we only observe many chains in the block-time monitor. The other tools operate in a single-chain mode, and it doesn't make sense to continue working without a connection to a specific node.

In this case, we can use multiple nodes for a single chain to enhance stability. We can cycle through the nodes, connecting to the next one when we get disconnected from the current one. See #355

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: In Progress
Development

No branches or pull requests

3 participants