Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

azure_blob: connect/read timeouts to the reqwest client #10147

Closed
wants to merge 2 commits into from

Conversation

arpad-m
Copy link
Member

@arpad-m arpad-m commented Dec 13, 2024

Problem

We already have timeouts at the higher levels of the stack, via select and friends, but the timeouts might not actually reach the reqwest client. That might lead to port exhaustion issus later on as the higher level timeouts don't clean up the ports. Therefore, set it up in a way that the timeouts are preconfigured.

https://github.com/neondatabase/cloud/issues/20971

Summary of changes

Pass down timeouts from the SDK to the reqwest client. Hopefully this will address the port exhaustion issue. We always use read_timeout because that one is shorter: there is only one global timeout we can configure.

Related earlier work: #9938

@arpad-m arpad-m requested review from jcsp and VladLazar December 13, 2024 17:38
@arpad-m arpad-m requested a review from a team as a code owner December 13, 2024 17:38
@arpad-m
Copy link
Member Author

arpad-m commented Dec 13, 2024

hmmm looking at the way they are implemented, I think it might actually not more useful than a select. Still, I think it's worth a try.

Copy link

7095 tests run: 6797 passed, 1 failed, 297 skipped (full report)


Failures on Postgres 17

# Run all failed tests locally:
scripts/pytest -vv -n $(nproc) -k "test_idle_checkpoints[debug-pg17]"
Flaky tests (5)

Postgres 17

Postgres 16

  • test_pgdata_import_smoke[8-1024-RelBlockSize.MULTIPLE_RELATION_SEGMENTS]: release-arm64
  • test_pgdata_import_smoke[None-1024-RelBlockSize.MULTIPLE_RELATION_SEGMENTS]: release-arm64

Postgres 15

  • test_pgdata_import_smoke[8-1024-RelBlockSize.MULTIPLE_RELATION_SEGMENTS]: release-arm64
  • test_pgdata_import_smoke[None-1024-RelBlockSize.MULTIPLE_RELATION_SEGMENTS]: release-arm64

Test coverage report is not available

The comment gets automatically updated with the latest test results
3c4221d at 2024-12-13T18:58:28.142Z :recycle:

@jcsp
Copy link
Collaborator

jcsp commented Dec 16, 2024

We're not leaking connections -- the port exhaustion comes from lots of TIME_WAIT connections, i.e. those which are closed by the client but will stick around for some time. I would expect that using a shorter timeout on the client side would actually make this problem worse, as each time we time out a request we're leaving another connection in TIME_WAIT state for 60 seconds (by default).

@arpad-m
Copy link
Member Author

arpad-m commented Dec 16, 2024

Oh good point, yeah that makes more sense.

TIME_WAIT state for 60 seconds (by default).

Googling it told me the default is 4 minutes, but shrug. We'll need different approach in any case, closing.

@arpad-m arpad-m closed this Dec 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants