Multi-thread (or use asyncio) the submission and retrieval of results
Use something like this:
import concurrent.futures

def compute(...):
    ...

with concurrent.futures.ThreadPoolExecutor() as executor:
    results = list(executor.map(compute, inputs))
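As a concrete sketch of the pattern above (the `compute` body and `inputs` here are placeholders standing in for the real per-task submission/retrieval work):

```python
import concurrent.futures

def compute(x):
    # Placeholder for the real work, e.g. one HTTP submission + retrieval
    return x * x

inputs = range(10)

# executor.map fans the calls out across threads but still
# returns results in the same order as the inputs
with concurrent.futures.ThreadPoolExecutor() as executor:
    results = list(executor.map(compute, inputs))
```

`executor.map` blocks in input order, so a slow early task delays everything behind it; that's the motivation for the streaming `as_completed` variant discussed below.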
Is there really any reason to use Celery groups if I implement this? Groups can reduce the number of HTTP calls quite dramatically, but if I multi-thread the submission and collection of results, the performance gains may not be a big deal, and then I have a single level of abstraction to worry about (multi-threaded submission and collection) instead of nesting groups within that multi-threading.
Pros: I'd never get timeout issues when collecting results because the group ended up being so large.
Cons: Collection of results will be slower, as it IS faster to collect 100 results (current default) in a single request.
EDIT: Chatted with Ethan. He has special code to handle group size issues when a group has too much data to download before timing out. This indicates to me that the batch size mechanism should be dispensed with entirely. Better to get your data with a few seconds of delay on a large batch submission than to have your results held hostage on the server because the group is too large to download, forcing you to resubmit with a smaller batch size to get your results back.
For FutureResultGroup we could use concurrent.futures.as_completed(...) to collect results as they complete ("stream" results):
for result in future_result.as_completed():
    # do something with result
Could also have an in-order as-completed:

for result in future_result.collect():  # or some better method name
    # do something with result
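A minimal sketch of what `FutureResultGroup` could look like on top of `concurrent.futures` — the class body and the `compute` helper are assumptions for illustration; only the `as_completed()`/`collect()` method names come from the proposal above:

```python
import concurrent.futures

class FutureResultGroup:
    """Hypothetical wrapper around a list of futures from one executor."""

    def __init__(self, futures):
        self._futures = futures

    def as_completed(self):
        # Stream results as they finish, regardless of submission order
        for future in concurrent.futures.as_completed(self._futures):
            yield future.result()

    def collect(self):
        # Yield results in submission order, blocking on each in turn
        for future in self._futures:
            yield future.result()

# Usage sketch
def compute(x):
    return x * x

with concurrent.futures.ThreadPoolExecutor() as executor:
    group = FutureResultGroup([executor.submit(compute, x) for x in range(5)])
    ordered = list(group.collect())       # [0, 1, 4, 9, 16]
    streamed = sorted(group.as_completed())
```

`Future.result()` does not consume the future, so both iteration styles can be offered on the same group.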