[FEATURE] Handle arbitrarily long lists of inputs. #44

Open
coltonbh opened this issue Sep 2, 2023 · 0 comments
Labels: enhancement (New feature or request)

coltonbh commented Sep 2, 2023

Multi-thread (or use asyncio) the submission and retrieval of results.

Use something like this:

import concurrent.futures

def compute(inp):
    # Submit a single input to the server and return its result.
    ...

with concurrent.futures.ThreadPoolExecutor() as executor:
    results = list(executor.map(compute, inputs))
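Submission and retrieval are I/O-bound HTTP calls, so threads should scale well here despite the GIL. One note: capping the pool (e.g. ThreadPoolExecutor(max_workers=8), number illustrative) would keep an arbitrarily long input list from opening an unbounded number of concurrent connections.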

Is there really any reason to even use Celery groups if I implement this? Using them can reduce the number of HTTP calls quite dramatically, but if I multi-thread the submission and collection of results, perhaps the performance gains aren't that big a deal, and then I have a single level of abstraction to worry about (multi-threaded submission and collection of results) instead of nesting groups within that multi-threading.
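For reference, the Celery-group alternative being weighed looks roughly like this (a minimal sketch; compute_task is a hypothetical task name, not this repo's actual API):

from celery import group

# Hypothetical task import; stands in for whatever task wraps one computation.
from myapp.tasks import compute_task

# One group covers the whole batch, so submission and result collection
# each cost roughly one round trip instead of one HTTP call per input.
job = group(compute_task.s(inp) for inp in inputs)
group_result = job.apply_async()
results = group_result.get()  # can time out if the group is very large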

Pros: I'd never hit timeout issues collecting results because a group ended up too large.

Cons: Collecting results will be slower, since it is faster to fetch 100 results (the current default batch size) in a single request than one at a time.

EDIT: Chatted with Ethan. He has special code to handle group-size issues when a group has too much data to download before timing out. That tells me the batch-size mechanism should be dispensed with entirely. Better to get your data with a few seconds of delay on a large batch submission than to have your results held hostage on the server because the group is too large to download, forcing you to resubmit with a smaller batch size just to get your results back.

For FutureResultGroup we could use concurrent.futures.as_completed(...) to collect results as they complete ("stream" results):

for result in future_result.as_completed():
    ...  # do something with each result as it completes

Could also have an in-order variant that yields results in submission order (see the sketch below for both methods):

for result in future_result.collect():  # or some better method name
    ...  # do something with result
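A minimal sketch of what both methods could look like, assuming FutureResultGroup wraps a list of concurrent.futures.Future objects held in submission order (the class and method names mirror this discussion, not an existing API):

import concurrent.futures
from typing import Any, Iterator

class FutureResultGroup:
    def __init__(self, futures: list[concurrent.futures.Future]):
        # Futures are stored in submission order.
        self._futures = futures

    def as_completed(self) -> Iterator[Any]:
        # Stream results in completion order; fast tasks yield first.
        for future in concurrent.futures.as_completed(self._futures):
            yield future.result()

    def collect(self) -> Iterator[Any]:
        # Yield results in submission order; blocks on each future in
        # turn, so one slow early task delays everything after it.
        for future in self._futures:
            yield future.result()

The trade-off: as_completed() minimizes time-to-first-result, while collect() preserves the input/output correspondence without the caller having to re-sort.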
@coltonbh coltonbh added the enhancement New feature or request label Sep 2, 2023
@coltonbh coltonbh self-assigned this Sep 2, 2023
@coltonbh coltonbh changed the title from "Autopartition tasks by max_batch_inputs on server" to "[FEATURE] Handle arbitrarily long lists of inputs." Jun 14, 2024