I am trying to process a CC dump using the LocalPipelineExecutor. My setup includes 6 files in the dump and a VM with 48 CPU cores. I run the code with 6 tasks and 48 workers. I expected all 48 cores to be utilized, but only 6 cores are actively processing the tasks.
Hi, we only multiprocess at the individual file level. So if you have 1 task processing 1 file, giving it more CPUs will not speed up the processing. The way to go faster is to have more (smaller) input files so that you can have more tasks in total.
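To make the suggestion above concrete, here is a minimal sketch of splitting one large input file into several smaller shards before running the pipeline. It uses only the Python standard library and assumes the input is uncompressed, line-delimited text such as JSONL. The original post does not state the file format, so this is an assumption, and WARC or gzip inputs would need a different splitter. The names `split_jsonl` and `effective_parallelism` are hypothetical helpers for illustration and are not part of datatrove. The second helper is a simplified model of the behaviour described above, not the library's actual scheduling logic.

```python
from pathlib import Path


def split_jsonl(src: Path, out_dir: Path, num_shards: int) -> list[Path]:
    """Distribute the lines of `src` round-robin across `num_shards` smaller files.

    Hypothetical helper: assumes `src` is uncompressed, line-delimited text.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    shard_paths = [out_dir / f"{src.stem}_{i:05d}.jsonl" for i in range(num_shards)]
    shards = [p.open("w", encoding="utf-8") for p in shard_paths]
    try:
        with src.open("r", encoding="utf-8") as f:
            for line_no, line in enumerate(f):
                # Line k goes to shard k mod num_shards, so shards stay balanced.
                shards[line_no % num_shards].write(line)
    finally:
        for shard in shards:
            shard.close()
    return shard_paths


def effective_parallelism(num_files: int, tasks: int, workers: int) -> int:
    """Simplified model (an assumption): busy cores are capped by the smallest
    of the file count, the task count, and the worker count."""
    return min(num_files, tasks, workers)
```

Under this model, 6 files with 6 tasks and 48 workers keeps at most 6 cores busy. Splitting each of the 6 files into 8 shards would give 48 files, so running with 48 tasks and 48 workers could then occupy all 48 cores.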
Code:
How can I use all cores to process data?