This repository has been archived by the owner on Jun 11, 2024. It is now read-only.

RMQ Failover doesn't appear to be working #51

Open
warmfusion opened this issue Jul 20, 2017 · 8 comments

Comments

@warmfusion
Collaborator

Symptom

  1. First node of cluster is offline.
  2. Fluentd tries to connect to the first node, times out, and indicates that the second node will be used.

Problem

Log events show IO timeout messages, which suggests a problem with failover: events never get sent out, and the buffers fill with messages.
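
To rule out plain reachability problems before blaming failover, it may help to check which cluster nodes accept TCP connections from the Fluentd host. The following is a minimal sketch using only Ruby's standard library. `probe_endpoint` and `probe_cluster` are hypothetical helper names, not part of Bunny or the plugin, and a successful TCP connect does not prove that the AMQP handshake would succeed.

```ruby
require "socket"

# Hypothetical helper (not part of Bunny or the Fluentd plugin):
# returns :ok if a TCP connection to host:port opens within `timeout`
# seconds, otherwise a short description of the failure.
def probe_endpoint(host, port, timeout: 3)
  Socket.tcp(host, port, connect_timeout: timeout) { :ok }
rescue SystemCallError, SocketError, IOError => e
  "#{e.class}: #{e.message}"
end

# Probes every "host:port" string in the list and returns a hash
# mapping each endpoint to its probe result.
def probe_cluster(endpoints, timeout: 3)
  endpoints.to_h do |endpoint|
    host, port = endpoint.split(":")
    [endpoint, probe_endpoint(host, Integer(port), timeout: timeout)]
  end
end
```

Calling `probe_cluster(%w[rmq01.brk.example.tld:5672 rmq02.brk.example.tld:5672 rmq03.brk.example.tld:5672])` would then show, per node, either `:ok` or the error class and message.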

Logging


Jul 20 13:51:16 webproxyprod02 fluentd[6072]: E, [2017-07-20T13:51:16.609199 #6162] ERROR -- #<Bunny::Session:0x26df838 [email protected]:5672, vhost=fluent, addresses=[rmq01.brk.example.tld:5672,rmq02.brk.example.tld:5672,rmq03.brk.example.tld:5672]>: Got an exception when sending data: IO timeout when writing to socket (Timeout::Error)
Jul 20 13:51:16 webproxyprod02 fluentd[6072]: W, [2017-07-20T13:51:16.609299 #6162]  WARN -- #<Bunny::Session:0x26df838 [email protected]:5672, vhost=fluent, addresses=[rmq01.brk.example.tld:5672,rmq02.brk.example.tld:5672,rmq03.brk.example.tld:5672]>: Will recover from a network failure (no retry limit)...
Jul 20 13:51:16 webproxyprod02 fluentd[6072]: W, [2017-07-20T13:51:16.630369 #6162]  WARN -- #<Bunny::Session:0x26df838 [email protected]:5672, vhost=fluent, addresses=[rmq01.brk.example.tld:5672,rmq02.brk.example.tld:5672,rmq03.brk.example.tld:5672]>: Retrying connection on next host in line: rmq01.brk.example.tld:5672
Jul 20 13:51:16 webproxyprod02 fluentd[6072]: W, [2017-07-20T13:51:16.632868 #6162]  WARN -- #<Bunny::Session:0x26df838 [email protected]:5672, vhost=fluent, addresses=[rmq01.brk.example.tld:5672,rmq02.brk.example.tld:5672,rmq03.brk.example.tld:5672]>: Could not establish TCP connection to rmq01.brk.example.tld:5672: Connection refused - connect(2) for 172.20.4.4:5672
Jul 20 13:51:16 webproxyprod02 fluentd[6072]: W, [2017-07-20T13:51:16.632941 #6162]  WARN -- #<Bunny::Session:0x26df838 [email protected]:5672, vhost=fluent, addresses=[rmq01.brk.example.tld:5672,rmq02.brk.example.tld:5672,rmq03.brk.example.tld:5672]>: Will try to connect to the next endpoint in line: rmq02.brk.example.tld:5672
Jul 20 13:51:16 webproxyprod02 fluentd[6072]: 2017-07-20 13:51:16 +0000 [warn]: #0 buffer flush took longer time than slow_flush_log_threshold: elapsed_time=30.261541206855327 slow_flush_log_threshold=20.0 plugin_id="object:12c2440"
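
To make the order of connection attempts easier to read in a log like the one above, a small script can pull out just the endpoint events. This is a rough sketch: the patterns are copied from the message text of the log lines quoted here, and `endpoint_attempts` and the event names are hypothetical, chosen for illustration only.

```ruby
# Patterns based on the Bunny log messages quoted in this issue.
# Each one captures the "host:port" the message refers to.
ENDPOINT_EVENTS = {
  retrying:      /Retrying connection on next host in line: (\S+)/,
  unreachable:   /Could not establish TCP connection to ([^\s:]+:\d+)/,
  next_endpoint: /Will try to connect to the next endpoint in line: (\S+)/
}.freeze

# Hypothetical helper: returns [[event, endpoint], ...] in log order,
# ignoring lines that match none of the patterns.
def endpoint_attempts(log_text)
  log_text.each_line.flat_map do |line|
    ENDPOINT_EVENTS.filter_map do |event, pattern|
      match = pattern.match(line)
      [event, match[1]] if match
    end
  end
end
```

Run against the excerpt above, this gives a retry on rmq01, an unreachable rmq01, and then rmq02 as the next endpoint. The excerpt contains no later line confirming a connection to rmq02.
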
@maxpain

maxpain commented Nov 18, 2019

Same problem

@maxpain

maxpain commented Nov 18, 2019

@warmfusion did you solve the problem?

@warmfusion
Collaborator Author

To be honest, it's not something I've noticed recently, since upgrading to newer versions of most of the components. And we have issues with our message brokers that mean we'd expect to see this problem a lot.

I'd go with "No?", but as you're seeing the issue too, can you give me more details on the error and the versions involved?

@maxpain

maxpain commented Dec 5, 2019

@warmfusion

Fluentd version: v1.7.4

Logs from when the connection between Fluentd and RabbitMQ was down:

E, [2019-12-05T07:59:41.662072 #16] ERROR -- #<Bunny::Session:0x7f40e1bf9a40 [email protected]:5672, vhost=gameflare, addresses=[white-guppy.rmq.cloudamqp.com:5672]>: Got an exception when receiving data: Connection reset by peer (Errno::ECONNRESET)
W, [2019-12-05T07:59:41.662346 #16]  WARN -- #<Bunny::Session:0x7f40e1bf9a40 [email protected]:5672, vhost=gameflare, addresses=[white-guppy.rmq.cloudamqp.com:5672]>: Exception in the reader loop: Errno::ECONNRESET: Connection reset by peer
W, [2019-12-05T07:59:41.662381 #16]  WARN -- #<Bunny::Session:0x7f40e1bf9a40 [email protected]:5672, vhost=gameflare, addresses=[white-guppy.rmq.cloudamqp.com:5672]>: Backtrace:
W, [2019-12-05T07:59:41.662407 #16]  WARN -- #<Bunny::Session:0x7f40e1bf9a40 [email protected]:5672, vhost=gameflare, addresses=[white-guppy.rmq.cloudamqp.com:5672]>: 	/usr/local/lib/ruby/2.6.0/socket.rb:452:in `__read_nonblock'
W, [2019-12-05T07:59:41.662429 #16]  WARN -- #<Bunny::Session:0x7f40e1bf9a40 [email protected]:5672, vhost=gameflare, addresses=[white-guppy.rmq.cloudamqp.com:5672]>: 	/usr/local/lib/ruby/2.6.0/socket.rb:452:in `read_nonblock'
W, [2019-12-05T07:59:41.662452 #16]  WARN -- #<Bunny::Session:0x7f40e1bf9a40 [email protected]:5672, vhost=gameflare, addresses=[white-guppy.rmq.cloudamqp.com:5672]>: 	/usr/local/lib/ruby/gems/2.6.0/gems/bunny-2.14.3/lib/bunny/cruby/socket.rb:55:in `block in read_fully'
W, [2019-12-05T07:59:41.662477 #16]  WARN -- #<Bunny::Session:0x7f40e1bf9a40 [email protected]:5672, vhost=gameflare, addresses=[white-guppy.rmq.cloudamqp.com:5672]>: 	/usr/local/lib/ruby/gems/2.6.0/gems/bunny-2.14.3/lib/bunny/cruby/socket.rb:54:in `loop'
W, [2019-12-05T07:59:41.662501 #16]  WARN -- #<Bunny::Session:0x7f40e1bf9a40 [email protected]:5672, vhost=gameflare, addresses=[white-guppy.rmq.cloudamqp.com:5672]>: 	/usr/local/lib/ruby/gems/2.6.0/gems/bunny-2.14.3/lib/bunny/cruby/socket.rb:54:in `read_fully'
W, [2019-12-05T07:59:41.662523 #16]  WARN -- #<Bunny::Session:0x7f40e1bf9a40 [email protected]:5672, vhost=gameflare, addresses=[white-guppy.rmq.cloudamqp.com:5672]>: 	/usr/local/lib/ruby/gems/2.6.0/gems/bunny-2.14.3/lib/bunny/transport.rb:239:in `read_fully'
W, [2019-12-05T07:59:41.662544 #16]  WARN -- #<Bunny::Session:0x7f40e1bf9a40 [email protected]:5672, vhost=gameflare, addresses=[white-guppy.rmq.cloudamqp.com:5672]>: 	/usr/local/lib/ruby/gems/2.6.0/gems/bunny-2.14.3/lib/bunny/transport.rb:261:in `read_next_frame'
W, [2019-12-05T07:59:41.662573 #16]  WARN -- #<Bunny::Session:0x7f40e1bf9a40 [email protected]:5672, vhost=gameflare, addresses=[white-guppy.rmq.cloudamqp.com:5672]>: 	/usr/local/lib/ruby/gems/2.6.0/gems/bunny-2.14.3/lib/bunny/reader_loop.rb:74:in `run_once'
W, [2019-12-05T07:59:41.662596 #16]  WARN -- #<Bunny::Session:0x7f40e1bf9a40 [email protected]:5672, vhost=gameflare, addresses=[white-guppy.rmq.cloudamqp.com:5672]>: 	/usr/local/lib/ruby/gems/2.6.0/gems/bunny-2.14.3/lib/bunny/reader_loop.rb:39:in `block in run_loop'
W, [2019-12-05T07:59:41.662620 #16]  WARN -- #<Bunny::Session:0x7f40e1bf9a40 [email protected]:5672, vhost=gameflare, addresses=[white-guppy.rmq.cloudamqp.com:5672]>: 	/usr/local/lib/ruby/gems/2.6.0/gems/bunny-2.14.3/lib/bunny/reader_loop.rb:36:in `loop'
W, [2019-12-05T07:59:41.662642 #16]  WARN -- #<Bunny::Session:0x7f40e1bf9a40 [email protected]:5672, vhost=gameflare, addresses=[white-guppy.rmq.cloudamqp.com:5672]>: 	/usr/local/lib/ruby/gems/2.6.0/gems/bunny-2.14.3/lib/bunny/reader_loop.rb:36:in `run_loop'
W, [2019-12-05T07:59:41.662683 #16]  WARN -- #<Bunny::Session:0x7f40e1bf9a40 [email protected]:5672, vhost=gameflare, addresses=[white-guppy.rmq.cloudamqp.com:5672]>: Will recover from a network failure (no retry limit)...
W, [2019-12-05T07:59:51.663312 #16]  WARN -- #<Bunny::Session:0x7f40e1bf9a40 [email protected]:5672, vhost=gameflare, addresses=[white-guppy.rmq.cloudamqp.com:5672]>: Retrying connection on next host in line: white-guppy.rmq.cloudamqp.com:5672

@maxpain

maxpain commented Dec 10, 2019

@warmfusion please...

@maxpain

maxpain commented Dec 27, 2019

@warmfusion Can you please fix this bug? I can give you some money for this.

@maxpain

maxpain commented Apr 30, 2020

@warmfusion We still have problems with this.

@maxpain

maxpain commented Jan 21, 2021

Any news?
