Currently, the Airflow scheduler takes `max_tis_per_query` task instances at a time to check if they can be executed. It considers many things, like `priority_weight` and `max_active_tasks` (see the implementation in `scheduler_job.py`).

This PR focuses on improving number 5 (DAG can execute more tasks (`max_active_tasks`)). The problem is that this check is done after the task instances are retrieved (the `starved_dags` set is empty initially). Because of this, the first query sometimes returns 512 task instances of which only a few (<10) can be executed; the rest are rejected with messages like this:

The proposed change is to first query for those `starved_dags` and apply that filter in the main query. This way those DAGs are ignored, and the rest of the tasks are able to be executed.

The scheduler gets stuck because most of the 512 task instances have a high priority weight, and the same ones are taken for analysis every schedule cycle. By ignoring the ones that we already know won't be able to be executed, we open up a ton of space for those that can.
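A minimal Python sketch of the idea. This is not Airflow's actual code: the `TaskInstance` dataclass, the `running_counts` mapping, and the `schedulable_tis` helper are all illustrative stand-ins for the real query against the metadata database.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskInstance:
    # Simplified stand-in for Airflow's TaskInstance model.
    dag_id: str
    priority_weight: int


def schedulable_tis(candidates, running_counts, max_active_tasks, limit):
    """Return up to `limit` task instances, excluding DAGs that are
    already at their max_active_tasks cap ("starved" DAGs)."""
    # Step 1 (the PR's proposal): compute the starved DAGs up front,
    # instead of discovering them one rejected TI at a time after the
    # main query has already consumed its `limit` slots.
    starved_dags = {
        dag_id for dag_id, n in running_counts.items()
        if n >= max_active_tasks
    }
    # Step 2: the "main query" filters out starved DAGs, so the
    # `limit` slots are filled only with TIs that can actually run.
    eligible = [ti for ti in candidates if ti.dag_id not in starved_dags]
    eligible.sort(key=lambda ti: ti.priority_weight, reverse=True)
    return eligible[:limit]


# Without the starved-DAG filter, the five high-priority TIs of the
# saturated "busy" DAG would occupy the query slots every cycle and
# then be rejected; with the filter, the "idle" DAG's TIs get through.
tis = [TaskInstance("busy", 100)] * 5 + [TaskInstance("idle", 1)] * 3
picked = schedulable_tis(
    tis,
    running_counts=Counter({"busy": 16, "idle": 0}),
    max_active_tasks=16,
    limit=4,
)
```

In the real change the `starved_dags` set would translate into an extra `WHERE dag_id NOT IN (...)` clause on the main scheduler query, so the filtering happens in the database rather than in Python.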