Fork #405

Draft
wants to merge 18 commits into
base: master

Collaborator

@mgrunbauer mgrunbauer commented Sep 9, 2024

Changes:

  • adagucserver now boots up through supervisord.
  • When run with ADAGUC_FORK set (to any value), adagucserver creates a Unix domain socket at $ADAGUC_PATH/adaguc.socket and listens for connections on it.
  • When a client (the Python webserver) connects to the socket, the adagucserver process calls fork() and handles the client request in the forked process. Python writes the QUERY_STRING to the socket, and the forked adagucserver process writes its stdout back over the socket to the Python webserver (e.g. for a GetMap request, adagucserver writes the headers plus the binary PNG data to the socket).
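
As an illustration of the client side of this exchange, here is a minimal Python sketch. The function name `fetch_from_fork_server` is hypothetical, and the framing is an assumption: the sketch marks the end of the request by half-closing the socket and reads the reply until the server closes the connection. The actual wire format used in this PR may differ.

```python
import socket


def fetch_from_fork_server(socket_path: str, query_string: str, timeout: float = 300.0) -> bytes:
    """Send a query string over a Unix domain socket and return the raw reply bytes."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
        client.settimeout(timeout)
        client.connect(socket_path)
        client.sendall(query_string.encode("utf-8"))
        # Assumption: half-closing the write side tells the server the request is complete.
        client.shutdown(socket.SHUT_WR)
        chunks = []
        while True:
            chunk = client.recv(65536)
            if not chunk:
                # The server closed the connection, so the reply is complete.
                break
            chunks.append(chunk)
    # The reply is expected to hold the headers followed by the body (e.g. PNG bytes).
    return b"".join(chunks)
```

A caller would then split the returned bytes into headers and body before building the HTTP response.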

Info:

  • Spawning a new process has overhead. I suspect the main cause is the long list of linked libraries; ldd ./bin/adagucserver | wc -l yields 103 entries. It could also be due to initialization code inside some of the .so files we link against.
  • You can measure the process overhead by enabling the MEASURETIME macro. Look for the log line Ready!!!. A GetMap request takes 150 ms according to the adagucserver C++ code, but the same request reports 200 ms in the x-process-time response header. That leaves about 50 ms of overhead (this will depend on your system/OS).
  • Forking duplicates the parent process. The child process inherits everything from the parent (file descriptors, environment) but receives its own separate memory space. The forked process is already fully initialized.
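
The last point can be demonstrated with a small Python sketch (POSIX only). The helper `run_in_fork` and the `counter` key are hypothetical names used for illustration and are not part of adagucserver. The child starts with the parent's already-initialized state, and its modification does not leak back into the parent.

```python
import os


def run_in_fork(state: dict) -> tuple:
    """Fork, bump state["counter"] in the child, and return (child_value, parent_value)."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: starts with a copy of the parent's already-initialized state.
        exit_code = 1
        try:
            os.close(read_fd)
            state["counter"] += 1  # changes only the child's copy of the memory
            os.write(write_fd, str(state["counter"]).encode("ascii"))
            exit_code = 0
        finally:
            os._exit(exit_code)  # never fall through into the parent's code path
    # Parent: read what the child reported, then reap it to avoid a zombie process.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as reader:
        child_value = int(reader.read().decode("ascii"))
    os.waitpid(pid, 0)
    return child_value, state["counter"]
```

Calling `run_in_fork({"counter": 41})` returns `(42, 41)`: the child saw the initialized value and incremented it, while the parent's copy is unchanged.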

Issues to investigate:

  • The current setup starts the fork server through supervisord. Investigate whether having the Python webserver start the fork server is preferable (it would help with setting the environment in one place).
  • Remove the changes to stdout/stderr. Create a new log method specifically for fork mode which ensures err|warn|debug goes to stderr.
  • No stdout/stderr is visible from the fork server when running in Docker.
  • The ADAGUC_FORK env var should hold the absolute path of the socket. When it is not defined, adaguc will not run in fork mode. Note: the path to the adaguc socket must be readable/writable by the adaguc process; /tmp/ does not work in Docker.
  • Store the start time in the fork mapping, periodically clean up old processes, and log a warning when doing so.
  • timeout=300 should be configurable and should be used inside the "background thread that checks/cleans old processes".
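
The last two items could be approached along these lines. This is only a sketch in Python to show the bookkeeping, not code from this PR: the class name `ForkRegistry` and its methods are hypothetical, and the real mapping may well live in the C++ fork server instead. The registry records a start time per child pid and reports the pids that have exceeded a configurable timeout, leaving it to the caller to terminate them and log a warning.

```python
import time
from typing import Callable, Dict, List


class ForkRegistry:
    """Tracks when each forked child started so that stale ones can be found."""

    def __init__(self, timeout_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._timeout = timeout_seconds
        self._clock = clock  # injectable so the logic can be tested without sleeping
        self._started: Dict[int, float] = {}

    def register(self, pid: int) -> None:
        """Record the start time of a newly forked child."""
        self._started[pid] = self._clock()

    def unregister(self, pid: int) -> None:
        """Forget a child that finished normally or was cleaned up."""
        self._started.pop(pid, None)

    def expired(self) -> List[int]:
        """Return the pids that have been running longer than the timeout."""
        now = self._clock()
        return sorted(pid for pid, start in self._started.items()
                      if now - start > self._timeout)
```

A background thread could call `expired()` periodically, terminate each returned pid, log a warning, and then call `unregister()` for it.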

@@ -49,7 +49,8 @@ void printErrorStream(const char *message) { _printErrorStreamPointer(message);

 void _printErrorStream(const char *pszMessage) { fprintf(stderr, "%s", pszMessage); }
 void _printWarningStream(const char *pszMessage) { fprintf(stderr, "%s", pszMessage); }
-void _printDebugStream(const char *pszMessage) { printf("%s", pszMessage); }
+// void _printDebugStream(const char *pszMessage) { printf("%s", pszMessage); }
+void _printDebugStream(const char *pszMessage) { fprintf(stderr, "%s", pszMessage); }
Member

As discussed, we can keep this as it was and create a forklogger function which logs to stderr. It can be set as a function pointer via setErrorFunction(serverErrorFunction), setWarningFunction(serverWarningFunction), and setDebugFunction(serverDebugFunction).

Docker/start.sh (outdated, resolved)
ADAGUC_PATH=/adaguc/adaguc-server-master/,
ADAGUC_TMP=/tmp,
ADAGUC_CONFIG=/adaguc/adaguc-server-master/python/lib/adaguc/adaguc-server-config-python-postgres.xml
command=/adaguc/adaguc-server-master/bin/adagucserver
Member

We need to discuss whether it would be a good solution to start this executable once from the Python code. The reasoning is that the Python code already has the logic to calculate environment variables such as ADAGUC_ONLINERESOURCE and others.
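
A rough sketch of what starting the executable from Python could look like, assuming the computed variables are available as a dictionary. The helper name `start_fork_server` is hypothetical and the values in the usage note are placeholders. Only `subprocess.Popen` and `os.environ` from the standard library are used.

```python
import os
import subprocess
from typing import Mapping, Sequence


def start_fork_server(command: Sequence[str], computed_env: Mapping[str, str],
                      **popen_kwargs) -> subprocess.Popen:
    """Start a long-running child process with the computed variables added to the environment."""
    env = dict(os.environ)    # start from the webserver's own environment
    env.update(computed_env)  # computed values (e.g. ADAGUC_ONLINERESOURCE) take precedence
    return subprocess.Popen(list(command), env=env, **popen_kwargs)
```

For example, `start_fork_server(["/adaguc/adaguc-server-master/bin/adagucserver"], {"ADAGUC_FORK": "1", "ADAGUC_ONLINERESOURCE": online_resource})`, where `online_resource` stands for whatever value the Python code calculates.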

@@ -23,6 +24,94 @@
sem = asyncio.Semaphore(int(ADAGUC_NUMPARALLELPROCESSES))


ADAGUC_FORK_UNIX_SOCKET = f"{os.getenv('ADAGUC_PATH')}/adaguc.socket"
Member

I think it would be good to be able to set the socket path to an absolute value, so it does not have to live in the adaguc-server-master directory.
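
One possible shape for this on the Python side, assuming ADAGUC_FORK carries the absolute socket path as proposed in the PR description. The function name `resolve_fork_socket_path` is hypothetical, and rejecting relative paths is a design choice of this sketch rather than something the PR specifies.

```python
import os
from typing import Mapping, Optional


def resolve_fork_socket_path(env: Mapping[str, str]) -> Optional[str]:
    """Return the absolute socket path from ADAGUC_FORK, or None when fork mode is off."""
    value = env.get("ADAGUC_FORK", "").strip()
    if not value:
        # Variable unset or empty: do not run in fork mode.
        return None
    if not os.path.isabs(value):
        # Assumption of this sketch: a relative path is treated as a configuration error.
        raise ValueError(f"ADAGUC_FORK must be an absolute path, got {value!r}")
    return value
```

The webserver could call `resolve_fork_socket_path(os.environ)` at startup and fall back to spawning a process per request when it returns `None`.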

2 participants