-
Notifications
You must be signed in to change notification settings - Fork 379
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: massStoreRunAsynchronous()
#4326
Open
whisperity
wants to merge
4
commits into
Ericsson:master
Choose a base branch
from
whisperity:feat/server/asynchronous-store-protocol/patch/4-mass-store-run-asynchronous
base: master
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
feat: massStoreRunAsynchronous()
#4326
whisperity
wants to merge
4
commits into
Ericsson:master
from
whisperity:feat/server/asynchronous-store-protocol/patch/4-mass-store-run-asynchronous
Commits on Sep 18, 2024
-
feat(server): Asynchronous server-side background task execution
This patch implements the whole support ecosystem for server-side background tasks, in order to help lessen the load (and blocking) of API handlers in the web-server for long-running operations. A **Task** is represented by two things in strict co-existence: a lightweight, `pickle`-able implementation in the server's code (a subclass of `AbstractTask`) and a corresponding `BackgroundTask` database entity, which resides in the "configuration" database (shared across all products). A Task is created by API request handlers and then the user is instructed to retain the `TaskToken`: the task's unique identifier. Following, the server will dispatch execution of the object into a background worker process, and keep status synchronisation via the database. Even in a service cluster deployment, load balancing will not interfere with users' ability to query a task's status. While normal users can only query the status of a single task (which is usually automatically done by client code, and not the user manually executing something); product administrators, and especially server administrators have the ability to query an arbitrary set of tasks using the potential filters, with a dedicated API function (`getTasks()`) for this purpose. Tasks can be cancelled only by `SUPERUSER`s, at which point a special binary flag is set in the status record. However, to prevent complicating inter-process communication, cancellation is supposed to be implemented by `AbstractTask` subclasses in a co-operative way. The execution of tasks in a process and a `Task`'s ability to "communicate" with its execution environment is achieved through the new `TaskManager` instance, which is created for every process of a server's deployment. Unfortunately, tasks can die gracelessly if the server is terminated (either internally, or even externally). For this reason, the `DROPPED` status will indicate that the server has terminated prior to, or during a task's execution, and it was unable to produce results. The server was refactored significantly around the handling of subprocesses in order to support various server shutdown scenarios. Servers will start `background_worker_processes` number of task handling subprocesses, which are distinct from the already existing "API handling" subprocesses. By default, if unconfigured, `background_worker_processes` is equal to `worker_processes` (the number of API processes to spawn), which is equal to `$(nproc)` (CPU count in the system). This patch includes a `TestingDummyTask` demonstrative subclass of `AbstractTask` which counts up to an input number of seconds, and each second it gracefully checks whether it is being killed. The corresponding testing API endpoint, `createDummyTask()` can specify whether the task should simulate a failing status. This endpoint can only be used from, but is used extensively, the unit testing of the project. This patch does not include "nice" or "ergonomic" facilities for admins to manage the tasks, and so far, only the server-side of the corresponding API calls are supported.
Configuration menu - View commit details
-
Copy full SHA for a75244f - Browse repository at this point
Copy the full SHA a75244fView commit details -
feat(cmd): Implemented a CLI for task management
This patch extends `CodeChecker cmd` with a new sub-command, `serverside-tasks`, which lets users and administrators deal with querying the status of running server-side tasks. By default, the CLI queries the information of the task(s) specified by their token(s) in the `--token` argument from the server using `getTaskInfo(token)`, and shows this information in either verbose "plain text" (available if precisely **one** task was specified), "table" or JSON formats. In addition to `--token`, it also supports 19 more parameters, each of which correspond to a filter option in the `TaskFilter` API type. If any filters in addition to `--token` is specified, it will exercise `getTasks(filter)` instead. This mode is only available to administrators. The resulting, more detailed information structs are printed in "table" or JSON formats. Apart from querying the current status, two additional flags are available, irrespective of which query method is used to obtain a list of "matching tasks": * `--kill` will call `cancelTask(token)` for each task. * `--await` will block execution until the specified task(s) terminate (in one way or another). `--await` is implemented by calling the new **`await_task_termination`** library function, which is implemented with the goal of being reusable by other clients later.
Configuration menu - View commit details
-
Copy full SHA for d73c1da - Browse repository at this point
Copy the full SHA d73c1daView commit details -
refactor(server):
massStoreRun()
as a background taskSeparate the previously blocking execution of `massStoreRun()`, which was done in the context of the "API handler process", into a "foreground" and a "background" part, exploiting the previously implemented background task library support. The foreground part remains executed in the context of the API handler process, and deals with receiving and unpacking the to-be-stored data, saving configuration and checking constraints that are cheap to check. The foreground part can report issues synchronously to the client. Everything else that was part of the previous `massStoreRun()` pipeline, as implemented by the `mass_store_run.MassStoreRun` class becomes a background task, such as the parsing of the uploaded reports and the storing of data to the database. This background task, implemented using the new library, executes in a separate background worker process, and can not communicate directly with the user. Errors are logged to the `comments` fields. The `massStoreRun()` API handler will continue to work as previously, and block while waiting for the background task to terminate. In case of an error, it synchronously reports a `RequestFailed` exception, passing the `comments` field (into which the background process had written the exception details) to the client. Due to the inability for most of the exceptions previously caused in `MassStoreRun` to "escape" as `RequestFailed`s, some parts of the API had been deprecated and removed. Namely, `ErrorCode.SOURCE_FILE` and `ErrorCode.REPORT_FORMAT` are no longer sent over the API. This does not break existing behaviour and does not cause an incompatibility with clients: in cases where the request exceptions were raised earlier, now a different type of exception is raised, but the error message still precisely explains the problem as it did previously.
Configuration menu - View commit details
-
Copy full SHA for d42164b - Browse repository at this point
Copy the full SHA d42164bView commit details
Commits on Sep 20, 2024
-
feat:
massStoreRunAsynchronous()
Even though commit d915473 introduced a socket-level TCP keepalive support into the server's implementation, this was observed multiple times to not be enough to **deterministically** fix the issues with the `CodeChecker store` client hanging indefinitely when the server takes a long time processing the to-be-stored data. The underlying reasons are not entirely clear and the issue only pops up sporadically, but we did observe a few similar scenarios (such as multi-million report storage from analysing LLVM and then storing between datacentres) where it almost reliably reproduces. The symptoms (even with a configure `kepalive`) generally include the server not becoming notified about the client's disconnect, while the client process is hung on a low-level system call `read(4, ...)`, trying to get the Thrift response of `massStoreRun()` from the HTTP socket. Even if the server finishes the storage processing "in time" and sent the Thrift reply, it never reaches the client, which means it never exits from the waiting, which means it keeps either the terminal or, worse, a CI script occupied, blocking execution. This is the "more proper solution" foreshadowed in commit 15af7d8. Implemented the server-side logic to spawn a `MassStoreRun` task and return its token, giving the `massStoreRunAsynchronous()` API call full force. Implemented the client-side logic to use the new `task_client` module and the same logic as `CodeChecker cmd serverside-tasks --await --token TOKEN...` to poll the server for the task's completion and status.
Configuration menu - View commit details
-
Copy full SHA for e30fcc2 - Browse repository at this point
Copy the full SHA e30fcc2View commit details
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.