Description
We are using elasticsearch-ruby through the chewy gem, and we import documents in a Sidekiq worker. This means each Sidekiq thread that accesses Elasticsearch has its own instance of Elasticsearch::Client, and when a (possibly unrelated) Sidekiq job fails, that worker thread is closed and a new thread is opened in the same process.
One thing we noticed, though, is that the underlying Elasticsearch connections are only closed when Ruby's garbage collector collects the dead thread's Elasticsearch::Client instance, which seems to be the cause of a file descriptor leak in our application.
We think we have found a way to close these connections by adding the following code to the error handler in a custom Sidekiq middleware:
Chewy.client.transport.transport.connections.each do |connection|
  # This bit of code is tailored for the HTTPClient Faraday adapter
  connection.connection.app.instance_variable_get(:@client)&.reset_all
end
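For context, here is a minimal sketch of how such an error handler could sit in a Sidekiq server middleware. The class name `CloseElasticsearchConnections` and the injected `closer` callable are hypothetical names chosen for illustration; the only Sidekiq convention assumed is that server middleware respond to `call(worker, job, queue)` and yield to run the job.

```ruby
# Hypothetical Sidekiq server middleware (illustrative name, not part of
# Sidekiq or chewy). `closer` is any callable that closes the current
# thread's Elasticsearch connections, e.g. a lambda wrapping the snippet above.
class CloseElasticsearchConnections
  def initialize(closer)
    @closer = closer
  end

  # Sidekiq calls server middleware with the worker instance, the job hash
  # and the queue name, and expects the middleware to yield to run the job.
  def call(_worker, _job, _queue)
    yield
  rescue StandardError => e
    begin
      @closer.call
    rescue StandardError
      # Ignore cleanup failures so the original job error is the one reported.
    end
    raise e
  end
end

# Assumed registration, shown as a comment because it needs Sidekiq loaded:
#
#   Sidekiq.configure_server do |config|
#     config.server_middleware do |chain|
#       chain.add CloseElasticsearchConnections, closer
#     end
#   end
```

Injecting the closer keeps the abstraction-breaking code in a single place, so it can be swapped out if a supported way to close connections turns up.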
However, this piece of code breaks multiple layers of abstraction, going through chewy, elasticsearch, elasticsearch-transport, faraday and faraday-httpclient, even accessing an otherwise unexposed instance variable at one point.
Is there a better way of closing connections to Elasticsearch? Are we missing something obvious about their lifecycle?
Digging into it, my understanding of the issue is that neither elasticsearch nor elasticsearch-transport provides a method to close connections.
It looks like faraday has Faraday::Connection#close, but that does not appear to be implemented in most adapters, and in particular not in the faraday-httpclient adapter that ends up being used in our app.
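To make the comparison concrete, this is roughly what a version relying only on the public `close` method might look like. It is a sketch under assumptions: `close_faraday_connections` is a hypothetical helper, and with an adapter that does not implement `close` the call may well do nothing, which is exactly the problem described here.

```ruby
# Hypothetical helper: calls #close on every object that exposes it and
# returns how many objects were asked to close. It cannot tell whether the
# adapter actually released its sockets.
def close_faraday_connections(connections)
  connections.count do |conn|
    next false unless conn.respond_to?(:close)

    conn.close
    true
  end
end

# Assumed usage with the same accessors as the snippet in the description
# (commented out because it needs chewy and elasticsearch-transport loaded):
#
#   faraday_connections = []
#   Chewy.client.transport.transport.connections.each do |connection|
#     faraday_connections << connection.connection
#   end
#   close_faraday_connections(faraday_connections)
```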
Of course, I may have missed something, and would be glad to know what if that's the case!