
Attempted to resurrect connection to dead ES instance, but got an error. #160

Open
xebia-progress opened this issue Jul 3, 2020 · 15 comments



xebia-progress commented Jul 3, 2020

Hi,
I have a problem connecting to the AWS ES service from Logstash installed on AWS EC2. I have two instances [logstash01, logstash02] with the same configuration:

  • logstash 6.8.10
  • logstash-output-amazon_es (6.4.2)
  • AWS ES version 6.2

The first instance works fine, but on the second one there are many warnings:

```
[2020-07-03T09:52:46,924][INFO ][logstash.outputs.elasticsearch] Running health check to see if an Elasticsearch connection is working {:healthcheck_url=>https://{my-aws-es-service}.{aws-region}.es.amazonaws.com:443/, :path=>"/"}
[2020-07-03T09:52:46,927][WARN ][logstash.outputs.elasticsearch] Attempted to resurrect connection to dead ES instance, but got an error. {:url=>"https://{my-aws-es-service}.{aws-region}.es.amazonaws.com:443/", :error_type=>LogStash::Outputs::ElasticSearch::HttpClient::Pool::BadResponseCodeError, :error=>"Got response code '403' contacting Elasticsearch at URL 'https://{my-aws-es-service}.{aws-region}.es.amazonaws.com:443/'"}
```

Both EC2 instances have been assigned to the same IAM Role with full access to ES, they are both in the same VPC, subnet and their security groups have the same rules. I am able to curl ES service from both EC2s.

The configuration of amazon_es output plugin is:
```
output {
  amazon_es {
    hosts  => ["{my-aws-es-service}.{aws-region}.es.amazonaws.com"]
    region => "{aws-region}"
    index  => "logstash-%{[index]}-%{+YYYY.MM.dd}"
  }
}
```

I have had this issue for a couple of days and have not been able to resolve it. Any help would be appreciated.
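One way to narrow down a 403 like this (an editorial sketch, not from the thread; the endpoint and region below are placeholders) is to confirm which IAM identity each instance actually resolves, and to make a SigV4-signed request rather than a plain curl, which only proves network reachability:

```shell
# Which role credentials is this instance actually picking up?
# If the two hosts print different ARNs, that explains the asymmetric 403.
aws sts get-caller-identity

# awscurl (a third-party tool, `pip install awscurl`) signs the request with
# SigV4 using the instance credentials; an unsigned curl does not exercise IAM.
awscurl --service es --region eu-west-1 \
  https://my-domain.eu-west-1.es.amazonaws.com/
```

If the signed request also returns 403, check the domain's resource-based access policy as well as the role's identity policy.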


riemann89 commented Jul 22, 2020

Having the same issue. Have you solved it?

@xebia-progress
Author

I created a new EC2 instance and recreated the configuration on it, but I still don't know why the problem was occurring on the old EC2.


riemann89 commented Jul 22, 2020

Specs of the EC2 instance, please?

@xebia-progress
Author

  • Both EC2 instances (t3.medium - CentOS Linux 7 x86_64 HVM EBS 1708_11.01) - created with Terraform
  • Logstash configuration - created with Ansible


riemann89 commented Jul 22, 2020

Thanks, I am using an Amazon Linux one (medium). Still need to figure out the issue.

@AdrienBigot

Same issue for me. I suspect throttling on the Elasticsearch side, because this error appears near another error about the request size being exceeded.

```
[2020-09-03T10:18:30,104][ERROR][logstash.outputs.amazonelasticsearch][main][7fe0e0a34b3fb83c50f4196e10aea37404a5be5bdef7037eb295a4700045ac92] Encountered a retryable error. Will Retry with exponential backoff {:code=>413, :url=>"https://search-elk-nonprod-es-xxxxxxxxxxxxxxxxxxxxxxxx.eu-west-1.es.amazonaws.com:443/_bulk"}
[2020-09-03T10:18:30,430][WARN ][logstash.outputs.amazonelasticsearch][main][7fe0e0a34b3fb83c50f4196e10aea37404a5be5bdef7037eb295a4700045ac92] Marking url as dead. Last error: [LogStash::Outputs::AmazonElasticSearch::HttpClient::Pool::HostUnreachableError] Elasticsearch Unreachable: [https://search-elk-nonprod-esxxxxxxxxxxxxxxxxxxxxxxxx.eu-west-1.es.amazonaws.com:443/][Manticore::ClientProtocolException] search-elk-nonprod-es-xxxxxxxxxxxxxxxxxxxxxxxx.eu-west-1.es.amazonaws.com:443 failed to respond {:url=>https://search-elk-nonprod-es-xxxxxxxxxxxxxxxxxxxxxxxx.eu-west-1.es.amazonaws.com:443/, :error_message=>"Elasticsearch Unreachable: [https://search-elk-nonprod-es-xxxxxxxxxxxxxxxxxxxxxxxx.eu-west-1.es.amazonaws.com:443/][Manticore::ClientProtocolException] search-elk-nonprod-es-xxxxxxxxxxxxxxxxxxxxxxxx.eu-west-1.es.amazonaws.com:443 failed to respond", :error_class=>"LogStash::Outputs::AmazonElasticSearch::HttpClient::Pool::HostUnreachableError"}
```

Even with max_bulk_bytes set to 100.000 I still get these errors.
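For reference, max_bulk_bytes is set inside the amazon_es output block. A minimal sketch (the endpoint, region, index, and byte value here are placeholders, not taken from this thread); AWS documents a 10 MiB maximum HTTP request payload on some of the smaller Elasticsearch instance types, so bulk requests must stay under whatever limit the domain enforces:

```
output {
  amazon_es {
    hosts  => ["search-my-domain.eu-west-1.es.amazonaws.com"]
    region => "eu-west-1"
    index  => "logstash-%{+YYYY.MM.dd}"
    # Flush a bulk request once its body reaches this many bytes,
    # keeping it under the domain's HTTP payload limit.
    max_bulk_bytes => 1000000
  }
}
```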

@NeckBeardPrince

@pgs-progress Did you ever figure this out?

@xebia-progress
Author

@NeckBeardPrince Unfortunately not. I recreated the EC2 instance, and on the new one the problem disappeared.

@NeckBeardPrince
> @NeckBeardPrince Unfortunately not. I recreated an EC2 instance and on the new one the problem disappeared.

*sigh* Not a lot of info in the error for me to troubleshoot with either. Thanks.


nicon89 commented Nov 13, 2020

Got the same issue. Any advice?


Rnxxx commented Nov 22, 2020

I got the same error and found out I had mismatched the Elasticsearch endpoint with the Kibana endpoint.
Setting the proper Elasticsearch endpoint solved the issue.

@TechieGenie

Anyone got the fix?


kriss332 commented Jun 27, 2021

I've also got the same problem.

```
[2021-06-27T15:35:30,864][WARN ][logstash.outputs.elasticsearch][main] Attempted to resurrect connection to dead ES instance, but got an error {:url=>"http://locahost:9200/", :exception=>LogStash::Outputs::ElasticSearch::HttpClient::Pool::HostUnreachableError, :message=>"Elasticsearch Unreachable: [http://locahost:9200/][Manticore::ResolutionFailure] locahost"}
```

Whereas Elasticsearch is available:

```
curl http://localhost:9200
{
  "name" : "ELK",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "i90k-uQlQjyERkduOLy6Jw",
  "version" : {
    "number" : "7.13.2",
    "build_flavor" : "default",
    "build_type" : "deb",
    "build_hash" : "4d960a0733be83dd2543ca018aa4ddc42e956800",
    "build_date" : "2021-06-10T21:01:55.251515791Z",
    "build_snapshot" : false,
    "lucene_version" : "8.8.2",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}
```

Strangely, when I try to stop the Logstash service, I get this in logstash-plain.log:

```
[2021-06-27T15:38:21,684][WARN ][org.logstash.execution.ShutdownWatcherExt] {"inflight_count"=>0, "stalling_threads_info"=>{"other"=>[{"thread_id"=>32, "name"=>"[main]>worker0", "current_call"=>"[...]/vendor/bundle/jruby/2.5.0/gems/stud-0.0.23/lib/stud/interval.rb:95:in `sleep'"}, {"thread_id"=>33, "name"=>"[main]>worker1", "current_call"=>"[...]/vendor/bundle/jruby/2.5.0/gems/stud-0.0.23/lib/stud/interval.rb:95:in `sleep'"}, {"thread_id"=>34, "name"=>"[main]>worker2", "current_call"=>"[...]/vendor/bundle/jruby/2.5.0/gems/stud-0.0.23/lib/stud/interval.rb:95:in `sleep'"}, {"thread_id"=>35, "name"=>"[main]>worker3", "current_call"=>"[...]/vendor/bundle/jruby/2.5.0/gems/stud-0.0.23/lib/stud/interval.rb:95:in `sleep'"}]}}
```

Then I have to kill the Logstash process. I am, however, able to upload a CSV using a custom config file with the command below:

```
/usr/share/logstash/bin/logstash -f paloAlto.config
```

Please let me know if more logs are required.

@kriss332

UPDATE: I had only been parsing local CSV files for triage until now, so this was the first time I was receiving a syslog file.
Uncommenting the line `http.port: 9200` in /etc/elasticsearch/elasticsearch.yml worked for me. No more logs about a dead Elasticsearch instance, and logs are flowing in.
Additionally, make sure that the line `network.host: localhost` is also uncommented.
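The relevant fragment of /etc/elasticsearch/elasticsearch.yml after uncommenting would look like this (these are the stock values from a default install; adjust the host binding for your environment):

```yaml
# Bind HTTP to loopback and serve on the default port, so a Logstash
# output pointed at http://localhost:9200 can reach the node.
network.host: localhost
http.port: 9200
```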

@tuurek

tuurek commented Sep 2, 2022

I got this issue as well; it was due to the user lacking privileges on the root path (`GET /`). After adding cluster-level monitor privileges the problem was gone. Not necessarily the same issue, but the error message was the same.
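With Elasticsearch's own security enabled (rather than AWS IAM), a role carrying the cluster-level `monitor` privilege can be created through the role API. This is an editorial sketch; the role name, index pattern, and credentials are placeholders, not taken from the thread:

```shell
# Create a role that can answer GET / (cluster monitor) and write to its
# own indices; assign it to the user Logstash authenticates as.
curl -u elastic -X PUT "http://localhost:9200/_security/role/logstash_writer" \
  -H 'Content-Type: application/json' -d '
{
  "cluster": ["monitor"],
  "indices": [
    { "names": ["logstash-*"], "privileges": ["write", "create_index"] }
  ]
}'
```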

9 participants