Better methods for port-discovery when running ClickHouse server #130

Closed
bnaecker opened this issue Jun 21, 2021 · 4 comments
@bnaecker
Collaborator

As part of #127, we added support for running the ClickHouse database server via the omicron-dev tool. This is mostly to support CI and development, enabling programmatic control of the server as a subprocess. In many cases, though, we need to run multiple instances of ClickHouse simultaneously, which means we can't hardcode the ports the program binds. ClickHouse supports binding a port of 0, in which case the OS correctly assigns it an open port -- however, there's no reflection of the actual port bound in the program's log-files or other output.

To work around this, we introduced methods to discover the listening port by parsing the output of an external program: pfiles on illumos, and lsof elsewhere. This is done in the discover_local_listening_port functions for each platform. This approach works, but is brittle and hacky, relying on an external program and on the internals of the ClickHouse program (which ports it binds, under which conditions, etc.).

This issue tracks a more robust approach to this. The ideal solution is a PR against ClickHouse itself: when provided a port of 0, the program updates its internal storage of the port number to that actually doled out by the OS, by calling getsockname(2) or equivalent. Once this is integrated upstream, the functions linked above should be removed, and replaced with more robust parsing of the log output of ClickHouse, which does correctly report the port number on which the HTTP server is listening.
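
For illustration only, here is a minimal Rust sketch of the underlying mechanism: bind to port 0, then ask the OS which port it actually assigned. This is not ClickHouse code; the function name `bind_ephemeral` is hypothetical. On Unix platforms, `TcpListener::local_addr` is implemented with getsockname(2).

```rust
use std::net::{SocketAddr, TcpListener};

/// Bind a TCP listener to an OS-assigned port on localhost and return it
/// together with the address that was actually bound.
/// (Hypothetical helper for illustration; not part of ClickHouse or omicron.)
fn bind_ephemeral() -> std::io::Result<(TcpListener, SocketAddr)> {
    // Port 0 asks the OS to pick any free port.
    let listener = TcpListener::bind("127.0.0.1:0")?;
    // Query the socket for its real address; the port is now nonzero.
    let addr = listener.local_addr()?;
    Ok((listener, addr))
}
```

A server that records `addr.port()` after binding, rather than the configured value of 0, can then report the real port in its logs.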

@bnaecker bnaecker added the bug Something that isn't working. label Jun 21, 2021
@bnaecker bnaecker self-assigned this Jun 21, 2021
@bnaecker
Collaborator Author

The upstream fix for this has been opened here: ClickHouse/ClickHouse#25569

@bnaecker
Collaborator Author

An initial fix for this was merged in #131, but it was insufficient. There's a race in the code that parses the port out of the log file: the code searching the log may hit EOF before ClickHouse has written the searched-for line into the file. (@smklein first reported this here.)

A fix is in the works, which reads until the line is found (or a timeout), ignoring EOF. It'll be up shortly.
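
As a rough sketch of that idea (not the actual code from the fix), the loop below keeps reading a log file past EOF until a line containing a marker appears or a timeout expires. The function name `wait_for_port`, the marker string passed as `needle`, and the 10 ms poll interval are all assumptions for illustration; the real ClickHouse log format is not reproduced here, and the sketch assumes the log file already exists.

```rust
use std::fs::File;
use std::io::{BufRead, BufReader};
use std::path::Path;
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Poll `log_path` until a complete line containing `needle` followed by a
/// port number appears, or until `timeout` elapses (returns Ok(None)).
/// (Hypothetical helper; `needle` stands in for whatever the server logs.)
fn wait_for_port(
    log_path: &Path,
    needle: &str,
    timeout: Duration,
) -> std::io::Result<Option<u16>> {
    let deadline = Instant::now() + timeout;
    let mut reader = BufReader::new(File::open(log_path)?);
    let mut pending = String::new();
    loop {
        // At EOF, read_line returns Ok(0); that is not treated as an error,
        // since the writer may simply not have produced the line yet.
        reader.read_line(&mut pending)?;
        if pending.ends_with('\n') {
            // A complete line is available: look for the marker and the
            // digits that immediately follow it.
            if let Some(idx) = pending.find(needle) {
                let digits: String = pending[idx + needle.len()..]
                    .chars()
                    .take_while(|c| c.is_ascii_digit())
                    .collect();
                if let Ok(port) = digits.parse::<u16>() {
                    return Ok(Some(port));
                }
            }
            pending.clear();
        } else {
            // EOF or a partially written line: keep what we have and retry.
            sleep(Duration::from_millis(10));
        }
        if Instant::now() >= deadline {
            return Ok(None);
        }
    }
}
```

Keeping the partial line in `pending` across retries matters: the writer may flush half a line, and clearing the buffer at EOF would lose the first half.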

@bnaecker
Collaborator Author

@smklein, the fix is up in #139.

@bnaecker
Collaborator Author

Closed by #131 and #139