Auto Review and Downstream Polling #208

mawelborn · 2025-01-14T16:51:50Z

This PR adds two classes that implement and encapsulate best-practices behavior for Auto Review and Downstream polling patterns. Combined with the results and etloutput modules, this factors out much of the scaffolding of an Auto Review or Downstream pod.

A project need only define one or both of these functions:

async def auto_review(result: Result, etl_outputs: dict[Document, EtlOutput]) -> AutoReviewed:
    """
    Apply auto review rules to predictions and determine straight through processing.
    Any IO performed (network requests, file reads/writes, etc) must be `await`ed to
    avoid blocking the asyncio loop that runs this coroutine.
    """
    predictions = result.pre_review

    # Auto review rules here.

    return AutoReviewed(
        changes=predictions.to_changes(result),
        stp=True,  # Defaults to `False` and may be omitted.
    )

async def downstream(submission: Submission) -> None:
    """
    Send a submission downstream.
    """
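    # `httpx.post` is synchronous and cannot be awaited; use the async client.
    async with httpx.AsyncClient() as client:
        await client.post(
            "https://downstream_endpoint",
            json={
                "id": submission.id,
                "status": submission.status,
                "output_url": submission.result_file,
            },
        )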

Then instantiate one or both of the polling classes in a package's __main__.py, CLI, or other entrypoint. Run either poller on its own, or both together:

asyncio.run(AutoReviewPoller(IndicoConfig(), workflow_id, auto_review).poll_forever())

asyncio.run(DownstreamPoller(IndicoConfig(), workflow_id, downstream).poll_forever())

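# `asyncio.run()` requires a coroutine, so wrap `gather()` in one.
async def main() -> None:
    await asyncio.gather(
        AutoReviewPoller(IndicoConfig(), workflow_id, auto_review).poll_forever(),
        DownstreamPoller(IndicoConfig(), workflow_id, downstream).poll_forever(),
    )

asyncio.run(main())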

AutoReviewPoller loads the result file and etl output as dataclasses, applies use-case-specific logic with the auto_review callback, and submits the changes with optional STP.

DownstreamPoller loads the submission metadata, sends it downstream with the downstream callback, and marks the submission retrieved.
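To illustrate the downstream flow described above, here is a minimal sketch of a single polling pass. It is not the DownstreamPoller implementation: `FakeSubmission`, `FakeClient`, and `poll_once` are hypothetical names invented for this example, and the in-memory client stands in for real platform calls. It assumes a submission is marked retrieved only after the callback returns without raising.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class FakeSubmission:
    """Hypothetical stand-in for a submission's metadata."""

    id: int
    retrieved: bool = False


@dataclass
class FakeClient:
    """Hypothetical in-memory client; not the Indico client API."""

    submissions: list = field(default_factory=list)

    async def list_unretrieved(self):
        return [s for s in self.submissions if not s.retrieved]

    async def mark_retrieved(self, submission):
        submission.retrieved = True


async def poll_once(client, downstream):
    """Send each unretrieved submission downstream, then mark it retrieved."""
    handled = []
    for submission in await client.list_unretrieved():
        await downstream(submission)
        # Mark retrieved only after the callback succeeds, so a failed
        # send is picked up again on the next pass.
        await client.mark_retrieved(submission)
        handled.append(submission.id)
    return handled
```

Calling `poll_once` repeatedly with a delay between passes gives the basic shape of a polling loop; the real class layers concurrency, retries, and logging on top of this.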

Implemented best-practices behavior includes:

Asyncio workers process submissions concurrently, up to a maximum number of workers configurable by the worker_count kwarg.
Continuous polling for new submissions keeps the worker queue saturated, configurable by the poll_delay kwarg.
Retry logic gives workers the best chance to recover from a network error, configurable by the retry_count kwarg and friends.
Robust error handling and logging ensure the poller stays alive and errors are logged when workers fail.
Max workers, spawn rate, etl output loading, and retry behavior have sane defaults and are all configurable by kwargs.
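The concurrency and retry behavior listed above can be sketched with plain asyncio. This is an illustration of the general technique, not the code in this PR: `retry` and `process_all` here are hypothetical helpers, the `base_delay` parameter and exponential backoff are assumptions, and only the `worker_count` and `retry_count` names are borrowed from the kwargs mentioned above.

```python
import asyncio


async def retry(coro_fn, *args, retry_count=3, base_delay=0.01):
    """Call coro_fn(*args), retrying on any exception up to retry_count times."""
    for attempt in range(retry_count + 1):
        try:
            return await coro_fn(*args)
        except Exception:
            if attempt == retry_count:
                raise
            # Exponential backoff between attempts (an assumption of this sketch).
            await asyncio.sleep(base_delay * 2**attempt)


async def process_all(items, handler, worker_count=2):
    """Run handler over items with at most worker_count running at once."""
    semaphore = asyncio.Semaphore(worker_count)
    failures = []

    async def worker(item):
        async with semaphore:
            try:
                await retry(handler, item)
            except Exception as error:
                # Record the failure instead of letting it stop the other workers.
                failures.append((item, error))

    await asyncio.gather(*(worker(item) for item in items))
    return failures
```

The semaphore caps how many handlers run at once, and catching exceptions inside each worker keeps one failed submission from taking down the rest of the batch.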

See examples/poll_auto_review.py and the class definitions for AutoReviewPoller and DownstreamPoller for more details.

mawelborn added 12 commits January 13, 2025 16:37

Add polling classes for common auto review and downstream patterns

37768b0

Bump required indico-client version to include AsyncIndicoClient

30c1990

Change auto_review callable type signature to be more flexible

ff889e7

Rename run() to poll_forever()

0337607

Fix argument names for retry()

5c8a670

Use better argument names for worker count and spawn rate

f7662e1

Add example script for AutoReviewPoller

1c1026f

Improve DownstreamPoller docstring

f26e5fa

Update auto review polling example

2b2872a

Make types hints Python 3.9 compatible

19d1a6c

Catch all exceptions raised by a client call

4cf48f3

Log when downstream completes

db6081f

mawelborn requested review from nickesparza, Scott771, andrew8bit, annaliu-indico and prafulIndico January 15, 2025 21:27

mawelborn marked this pull request as ready for review January 15, 2025 21:27

Add the ability to reject submissions with AutoReviewPoller

6243fe5

mawelborn merged commit 09524e9 into main Feb 3, 2025
9 checks passed

mawelborn deleted the mawelborn/polling branch February 3, 2025 22:02
