Blocking use case: the developer writes a worker that submits new tasks to the DP and then waits for another worker to complete them: submit 1 -> submit 2 -> done 2 -> done 1.
Non-blocking use case: the developer is simply sending tasks to the DP: submit 1 -> done 1 -> submit 2 -> done 2.
When dealing with blocking operations:
Execution Order (forward):
- groupA-1 (start)
- groupA-2 (start)
- groupA-3 (start)
- groupA-3 (complete)
- groupA-2 (complete)
- groupA-1 (complete)
Execution Order (reversed):
- groupB-3 (start)
- groupB-2 (start)
- groupB-1 (start)
- groupB-1 (complete)
- groupB-2 (complete)
- groupB-3 (complete)
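The nested ordering above can be reproduced with a small sketch. It uses Python's standard ThreadPoolExecutor as a stand-in for the DP; run_blocking_chain and its event log are hypothetical names used only for illustration, not part of the DP itself.

```python
from concurrent.futures import ThreadPoolExecutor

def run_blocking_chain(names):
    """Run each task from inside the previous one and return the event log."""
    events = []
    # Every blocked task keeps holding its worker, so this sketch assumes
    # the pool has at least one worker per nesting level.
    with ThreadPoolExecutor(max_workers=len(names)) as pool:
        def task(index):
            events.append(f"{names[index]} (start)")
            if index + 1 < len(names):
                # Submit the next task from inside this worker and block on it.
                pool.submit(task, index + 1).result()
            events.append(f"{names[index]} (complete)")
        pool.submit(task, 0).result()
    return events

forward = run_blocking_chain(["groupA-1", "groupA-2", "groupA-3"])
reverse = run_blocking_chain(["groupB-3", "groupB-2", "groupB-1"])
```

With fewer workers than nesting levels, the innermost task would never get a worker and the chain would deadlock, which is the main cost of the blocking use case.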
When dealing with non-blocking operations:
Execution Order (forward):
- groupA-1 (start)
- groupA-1 (complete)
- groupA-2 (start)
- groupA-2 (complete)
- groupA-3 (start)
- groupA-3 (complete)
Execution Order (reversed):
- groupB-3 (start)
- groupB-3 (complete)
- groupB-2 (start)
- groupB-2 (complete)
- groupB-1 (start)
- groupB-1 (complete)
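For comparison, a sketch of the non-blocking ordering, again with ThreadPoolExecutor standing in for the DP and run_non_blocking as a hypothetical helper. Here the caller, not a worker, submits each task and waits for it before sending the next.

```python
from concurrent.futures import ThreadPoolExecutor

def run_non_blocking(names):
    """Submit tasks one after another from outside the pool; return the event log."""
    events = []

    def task(name):
        events.append(f"{name} (start)")
        events.append(f"{name} (complete)")

    # No task waits on another task, so a single worker is enough here.
    with ThreadPoolExecutor(max_workers=1) as pool:
        for name in names:
            # submit N -> done N, then move on to the next task.
            pool.submit(task, name).result()
    return events

forward = run_non_blocking(["groupA-1", "groupA-2", "groupA-3"])
reverse = run_non_blocking(["groupB-3", "groupB-2", "groupB-1"])
```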
The DependencyPool shouldn't care WHERE tasks come from (worker or external) - it just needs to:
- On Submit event:
  - Store task state
  - Check if task can run (based on mode and dependencies)
  - If yes -> submit to pool
  - If no -> keep in pending state
- On Task Completion event:
  - Update task state as completed
  - Check all pending tasks that depend on the completed one
  - Submit any that can now run
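The two event handlers could look roughly like the sketch below. SketchDependencyPool and all of its methods are hypothetical names, not the real DP API. The sketch ignores the "mode" check, and it assumes a failed task counts as completed.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

class SketchDependencyPool:
    """Hypothetical minimal pool: a task runs once all its dependencies completed."""

    def __init__(self, max_workers=4):
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._cond = threading.Condition()
        self._pending = {}       # task_id -> (fn, set of dependency ids)
        self._running = set()
        self._completed = set()

    def submit(self, task_id, fn, depends_on=()):
        # On Submit: store task state, then check whether it can run now.
        deps = set(depends_on)
        with self._cond:
            ready = deps <= self._completed
            if ready:
                self._running.add(task_id)
            else:
                self._pending[task_id] = (fn, deps)
        if ready:
            self._start(task_id, fn)

    def _start(self, task_id, fn):
        # Called without holding the lock, because the callback may fire at once.
        future = self._pool.submit(fn)
        future.add_done_callback(lambda _f: self._on_complete(task_id))

    def _on_complete(self, task_id):
        # On Task Completion: mark completed, then release tasks that were waiting.
        with self._cond:
            self._running.discard(task_id)
            self._completed.add(task_id)
            ready = [(tid, fn) for tid, (fn, deps) in self._pending.items()
                     if deps <= self._completed]
            for tid, _fn in ready:
                del self._pending[tid]
                self._running.add(tid)
            self._cond.notify_all()
        for tid, fn in ready:
            self._start(tid, fn)

    def join(self):
        # Block until nothing is pending or running.
        with self._cond:
            self._cond.wait_for(lambda: not self._pending and not self._running)

    def shutdown(self):
        self._pool.shutdown(wait=True)

log = []
dp = SketchDependencyPool()
dp.submit("b", lambda: log.append("b"), depends_on=["a"])  # kept pending
dp.submit("a", lambda: log.append("a"))                    # runs right away
dp.join()
dp.shutdown()
```

Neither handler looks at who called submit, so the same code path serves tasks submitted by a worker and tasks submitted externally.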
A single DP can't serve both use cases at once, so there are two:
- IndependentDependencyPool: ideal for short tasks that don't block a worker while waiting for another task, and for enforcing execution order
- BlockingDependencyPool: ideal for tasks that wait for other tasks, generally created from within a worker, blocking its execution
IndependentPool
Example:
- Used for a data processing pipeline where tasks have clear dependencies
- Tasks are all submitted upfront
- Each task runs independently after its dependencies complete
- Execution order: Load Data -> Clean Data -> Transform Data -> Save Results
- Perfect for ETL workflows, build systems, or any sequential processing
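A minimal sketch of the pipeline above, assuming Python 3.9+ for the standard graphlib module. run_pipeline and the task bodies are hypothetical placeholders, not the IndependentPool API. All tasks and dependencies are declared upfront, and each task runs once its dependencies are done.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from functools import partial
from graphlib import TopologicalSorter

def run_pipeline(tasks, dependencies, max_workers=4):
    """tasks: name -> callable; dependencies: name -> names it depends on."""
    sorter = TopologicalSorter({name: dependencies.get(name, ()) for name in tasks})
    sorter.prepare()
    in_flight = {}  # future -> task name
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            # Start every task whose dependencies have all completed.
            for name in sorter.get_ready():
                in_flight[pool.submit(tasks[name])] = name
            done, _ = wait(list(in_flight), return_when=FIRST_COMPLETED)
            for future in done:
                future.result()  # re-raise a task failure, if any
                sorter.done(in_flight.pop(future))

log = []
steps = ["Load Data", "Clean Data", "Transform Data", "Save Results"]
tasks = {name: partial(log.append, name) for name in steps}
dependencies = {
    "Clean Data": ["Load Data"],
    "Transform Data": ["Clean Data"],
    "Save Results": ["Transform Data"],
}
run_pipeline(tasks, dependencies)
```

No task ever waits inside a worker, so the worker count only limits parallelism and cannot cause a deadlock.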
BlockingPool
Example:
- Used for a dynamic workflow where tasks create subtasks
- Main task creates and waits for subtasks to complete
- Subtasks can run in parallel if there are no dependencies between them
- Shows a document processing workflow:
  - Main task creates subtasks
  - Text extraction must complete before analysis
  - Sentiment analysis and keyword extraction can run in parallel
- Great for recursive tasks, document processing, or any workflow where tasks need to spawn and wait for subtasks
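The document processing workflow could be sketched as follows, with ThreadPoolExecutor standing in for the BlockingPool. The three subtask functions are toy placeholders invented for illustration. The main task runs inside a worker, creates its subtasks there, and blocks until they finish.

```python
from concurrent.futures import ThreadPoolExecutor

def extract_text(document):
    # Toy stand-in for real text extraction.
    return document.strip().lower()

def analyze_sentiment(text):
    # Toy stand-in: "positive" if the word "great" appears.
    return "positive" if "great" in text else "neutral"

def extract_keywords(text):
    # Toy stand-in: unique words longer than four characters.
    return sorted({word for word in text.split() if len(word) > 4})

def process_document(pool, document):
    # Text extraction must complete before analysis, so block on it first.
    text = pool.submit(extract_text, document).result()
    # The two analyses do not depend on each other and can run in parallel.
    sentiment = pool.submit(analyze_sentiment, text)
    keywords = pool.submit(extract_keywords, text)
    return {"sentiment": sentiment.result(), "keywords": keywords.result()}

# One worker for the blocked main task plus two for the parallel subtasks.
with ThreadPoolExecutor(max_workers=3) as pool:
    report = pool.submit(process_document, pool, "  A GREAT library for pools  ").result()
```

With max_workers=1 the main task would hold the only worker while waiting for a subtask that can never start. A blocking pool has to account for the workers held by waiting parents.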
Key differences:
- Independent pool has all tasks known upfront
- Blocking pool allows dynamic task creation inside workers
- Independent tasks complete and move on
- Blocking tasks can wait for their children