Proposal to buffer service requests #41

Closed
@ablaom

The context of this proposal is this synchronisation issue.

The main problem with logging in parallelized operations is simply this: requests are
posted directly to an MLflow service without full information about the state of the
service at the time the request is ultimately acted on. I propose we resolve this as follows:

  • Instead of clients posting requests directly to an MLflow service, requests are posted
    (put!) to a first-in-first-out queue (Julia Channel). Requesting calls return
    immediately, unless the queue is full. In this way, the performance of the parallel
    workload is not impacted.

  • A single Julia Task dispatches requests (take!s) from the front of the queue, oldest
    first. Whenever a request has the possibility of altering the service state (e.g.,
    creating an experiment), the dispatcher waits for confirmation that the state change
    is complete before dispatching the next request.
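
To make the two points above concrete, here is a minimal sketch of the idea. It is not the POC mentioned below and not part of MLFlowClient.jl: the names `Request`, `mutates`, and `start_dispatcher` are hypothetical, and each `action` closure merely stands in for a real call to the MLflow service.

```julia
# Hypothetical request type (an assumption for illustration only).
# `action` stands in for a call to the MLflow service; `mutates` flags
# requests that may alter the service state.
struct Request
    action::Function
    mutates::Bool
end

# Start a single dispatcher Task that drains `queue` in FIFO order.
function start_dispatcher(queue::Channel{Request})
    return @async begin
        # Iterating a Channel calls take! repeatedly until the channel is
        # closed and empty. @sync makes the dispatcher task finish only after
        # every request it launched has completed.
        @sync for request in queue
            if request.mutates
                # Possible state change: wait for it to complete before
                # dispatching the next request.
                request.action()
            else
                # No state change expected: dispatch without waiting.
                @async request.action()
            end
        end
    end
end

# Bounded buffer: put! returns immediately unless 32 requests are pending.
queue = Channel{Request}(32)
dispatcher = start_dispatcher(queue)

events = String[]  # stands in for the service state in this sketch
put!(queue, Request(() -> push!(events, "create experiment"), true))
put!(queue, Request(() -> push!(events, "log metric"), false))

close(queue)       # no further requests; buffered ones are still dispatched
wait(dispatcher)
```

In this sketch the callers only ever touch `put!`, so a client wrapper could presumably keep its existing function signatures and enqueue a `Request` internally. How errors from the service are reported back to the caller is left open here.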

I imagine that we can insert the queue (buffer) without breaking the user-facing
interface of MLFlowClient.jl.

I have implemented a POC for this proposal and shared it with two maintainers, and I can share it with anyone else who is interested.
