streaming_bulk fails to retry if max_retries is set but raise_on_error is kept on True #864

Open
teuneboon opened this issue Oct 30, 2018 · 0 comments · May be fixed by #2208

teuneboon commented Oct 30, 2018

Not sure if this is a documentation error, something I misunderstood, or just unintended behavior, but if I use helpers.bulk (which uses streaming_bulk in the background) and only set max_retries=5, it does not retry when the bulk queue is full. Excerpt from the error dictionary I get:
'status': 429, 'error': {'reason': 'rejected execution of org.elasticsearch.transport.TransportService$7@5a7c85fe on EsThreadPoolExecutor[name = es001.******/write, queue capacity = 500, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@3ac3b71[Running, pool size = 4, active threads = 4, queued tasks = 507, completed tasks = 165216131]]', 'type': 'es_rejected_execution_exception'},

This comes from an elasticsearch.helpers.BulkIndexError, and as you can see it shows status 429, which is supposed to be the status that gets retried. I believe this is because the except clause here: https://github.com/elastic/elasticsearch-py/blob/master/elasticsearch/helpers/__init__.py#L211 only catches TransportError, and a BulkIndexError is not a TransportError.
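
To illustrate the suspected mechanism, here is a minimal sketch using stand-in classes. It is not the library's actual code: the class names mirror the real exception names, but `process_chunk_stand_in` and `retry_loop_stand_in` are hypothetical, and the real handler does more (for example, it checks the status code before retrying). The point is only that an `except` clause for one exception class does not catch a sibling class.

```python
# Stand-ins only: these mimic the exception hierarchy described above,
# they are NOT imported from elasticsearch.
class ElasticsearchException(Exception):
    pass


class TransportError(ElasticsearchException):
    pass


class BulkIndexError(ElasticsearchException):
    pass


def process_chunk_stand_in():
    # Hypothetical: pretend a chunk came back with a per-document 429
    # and raise_on_error=True turned it into a BulkIndexError.
    raise BulkIndexError(
        "1 document(s) failed to index.",
        [{"index": {"status": 429}}],
    )


def retry_loop_stand_in(max_retries, attempts_log):
    # Simplified model of a retry loop that only handles TransportError.
    for attempt in range(max_retries + 1):
        attempts_log.append(attempt)
        try:
            process_chunk_stand_in()
        except TransportError:
            # Only this branch would lead to another attempt.
            continue
        return "ok"
    return "gave up"
```

Under these assumptions, calling `retry_loop_stand_in(5, [])` raises `BulkIndexError` on the very first attempt, because the error never reaches the `except TransportError` branch.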

If I set raise_on_error=False I believe it should work (judging by the code). This particular issue is annoying to reproduce, since it only happens occasionally and only in production. Maybe it would make more sense that with raise_on_error=True it keeps retrying and raises the last exception once it runs out of retries? Another option would be to change the documentation to mention that combining raise_on_error and max_retries can lead to unexpected behavior.

edit: After some more investigation, it seems the error is raised here: https://github.com/elastic/elasticsearch-py/blob/master/elasticsearch/helpers/__init__.py#L137, so I can't work around it in any meaningful way right now except by re-implementing some retry logic in my own code.
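
For anyone hitting the same thing, this is roughly what caller-side retry logic could look like. It is a sketch under assumptions, not a fix for the library: `call_with_retries` and `status_codes` are hypothetical helpers, and the code assumes the raised exception exposes an `errors` list whose items look like `{"index": {"status": 429, ...}}`, which matches the excerpt above but is not verified against every version. It retries the whole call with exponential backoff, and only when every reported failure is a 429.

```python
import time


def status_codes(errors):
    """Collect the status of each per-document error item.

    Assumes each item is shaped like {"index": {"status": 429, ...}}.
    """
    codes = []
    for item in errors:
        for details in item.values():
            codes.append(details.get("status"))
    return codes


def call_with_retries(bulk_call, error_type, max_retries=5,
                      initial_backoff=2.0, max_backoff=600.0,
                      sleep=time.sleep):
    """Call bulk_call(); retry while every reported failure is a 429.

    Re-raises the last exception once max_retries is exhausted, or
    immediately if any failure has a status other than 429.
    """
    attempt = 0
    while True:
        try:
            return bulk_call()
        except error_type as exc:
            codes = status_codes(getattr(exc, "errors", []))
            retryable = bool(codes) and all(code == 429 for code in codes)
            if not retryable or attempt >= max_retries:
                raise
            # Exponential backoff, capped at max_backoff seconds.
            sleep(min(max_backoff, initial_backoff * 2 ** attempt))
            attempt += 1
```

With the real library, `bulk_call` would presumably be something like `lambda: helpers.bulk(client, actions)` and `error_type` would be `helpers.BulkIndexError`. Note that `actions` would then need to be a list rather than a one-shot generator, and that this resends the whole batch, so documents that were already indexed get sent again unless they carry explicit IDs.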

bshakur8 pushed a commit to bshakur8/elasticsearch-py that referenced this issue Apr 27, 2023