Fix data movement task dependencies with clang-formatting #110

Merged: 6 commits into main on Jan 18, 2024

Conversation

nicelhc13
Contributor

@yinengy found and reported this bug:

a.print_overview()
@spawn(ts[5], placement=gpu(1), inout=[(a,0)])
def check_array_evict():
    a.print_overview()

@spawn(ts[7], dependencies=[ts[5]], placement=[gpu(1)], input=[(a, 0)])
def check_array_evict2():
    a.print_overview()

@spawn(ts[8], dependencies=[ts[7]], placement=[gpu(2)], input=[(a, 0)])
def check_array_evict3():
    a.print_overview()

In this case, ts[8]'s data movement task runs before ts[5].
This happens because a data movement task's dependencies are currently computed as the intersection of the compute task's dependencies and the PArray's referrers.
Here, ts[8]'s dependency is ts[7] while a's referrer is ts[5], so the intersection is empty and no ordering is enforced.
To fix this case, I make the PArray's referrer the last task that accessed it with OUT or INOUT permission.
Whenever such a task is spawned, the old referrers are cleared.
For correctness, tasks should be spawned in program execution order, but this constraint is not new, so I think it is OK.
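
A minimal sketch of that referrer bookkeeping (hypothetical names, not the actual Parla runtime API): each PArray remembers only the most recent OUT/INOUT task, and a newly spawned task's data movement task depends on that writer even when the compute-task dependency chains do not intersect.

class PArrayReferrerTracker:
    """Sketch: track only the last OUT/INOUT writer of a PArray."""

    def __init__(self):
        self.last_writer = None  # last task that accessed the PArray with OUT/INOUT

    def register_access(self, task, permission):
        # Dependencies for the new task's data movement task.
        deps = [self.last_writer] if self.last_writer is not None else []
        if permission in ("OUT", "INOUT"):
            # The old referrer is cleared and replaced by the new writer.
            self.last_writer = task
        return deps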

@nicelhc13 requested review from wlruys and yinengy on June 13, 2023 17:26
@wlruys
Contributor

wlruys commented Jun 13, 2023

I still have to review the bulk of the code, but I'm concerned about this comment:
For correctness, tasks should be spawned in program execution order, but this constraint is not new, so I think it is OK.

Tasks can be spawned in any order (it's buggy in old Parla: see ut-parla/Parla.py#146), and it's one of the reasons that explicit dependencies are more flexible than dependencies inferred from dataflow.
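
For example (an illustrative sketch using the same @spawn API as the bug report; the task IDs and ordering are made up), a dependent task can legally be spawned before the task it depends on, so any bookkeeping done at spawn time would observe the accesses in the wrong order:

@spawn(ts[2], dependencies=[ts[1]], placement=[gpu(0)], input=[(a, 0)])
def reader():
    a.print_overview()

# Spawned later in program text, but reader must still wait for it at run time.
@spawn(ts[1], placement=[gpu(0)], inout=[(a, 0)])
def writer():
    a.print_overview()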

Contributor

@wlruys left a comment


Question) This means each PArray's task_list will only ever have one element. Is that enough to handle all parent/child relations, slicing behavior, and dependencies? My gut says no, but I still have a hard time pinning down the very few cases in which our slices can be used concurrently, and I can't think of an example that breaks. I think it probably is okay, but I'm bringing it up in case we do find some edge case.

Comment) If this is the solution, it really can't be done at spawn time. Spawn time is not thread-safe, and tasks may be spawned out of order. It needs to happen after the first implicit topological sort, when tasks enter the mapping phase / become mappable. For example, consider spawning two subtrees with nested tasks where there are sibling dependencies between the trees.
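
A sketch of that failure mode (hypothetical TaskSpace names, same @spawn style as the bug report): a child in the second subtree depends on a child in the first, and the order in which the children are spawned depends on when their parents happen to run, so spawn order is not program order.

@spawn(tree_a[0])
def parent_a():
    @spawn(tree_a[1], placement=[gpu(0)], inout=[(a, 0)])
    def writer_child():
        a.print_overview()

@spawn(tree_b[0])
def parent_b():
    # Sibling dependency across subtrees: this child may be spawned before
    # or after tree_a[1], depending on which parent runs first.
    @spawn(tree_b[1], dependencies=[tree_a[1]], placement=[gpu(1)], input=[(a, 0)])
    def reader_child():
        a.print_overview()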

@nicelhc13
Contributor Author

I still have to review the bulk of the code, but I'm concerned about this comment: For correctness, tasks should be spawned in program execution order, but this constraint is not new, so I think it is OK.

Tasks can be spawned in any order (it's buggy in old Parla: see ut-parla/Parla.py#146), and it's one of the reasons that explicit dependencies are more flexible than dependencies inferred from dataflow.

OK, this is the reason why I was waiting for your review :-) I forgot that case. Let me think about this. In the worst case, we may need to force users to specify those dependencies explicitly, which I really don't want.

@wlruys
Contributor

wlruys commented Jun 13, 2023

All of the phase triggers/enqueues work as a topological sort, because the current precondition is that a task's dependencies must go through each phase first.

Within a phase there could potentially be reordering (there is none currently), but the order in which tasks enter and exit a phase is guaranteed to be a valid execution ordering.

If I remember correctly, that is checked here for mappable:

if (status.mappable && (task->get_state() < Task::MAPPED)) {

@wlruys
Contributor

wlruys commented Jun 13, 2023

I think the quick fix is just to delay adding tasks to PArrays (and the reverse) until tasks are ordered, and then use the same logic as in this original PR. There might still be a better way, but that's what jumps out at me.

@wlruys
Contributor

wlruys commented Jun 13, 2023

... There might be problems with the mapping policy (and it might mess up the phase triggers) depending on where/when we add these extra dependencies.

@nicelhc13
Contributor Author

I don't see any problem with your suggestion. I will rethink this problem based on it. Thank you!

@wlruys
Contributor

wlruys commented Jun 14, 2023

One edge case is going to be continuation tasks. They are only spawned once, but they hit all of the other phases more than once. There should already be a counter in the task metadata that tracks whether it's the first time through, so you can just filter them out.
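
A rough sketch of that filter, assuming a hypothetical instantiation counter on the task metadata (the actual field names in the runtime may differ):

def register_parray_accesses_once(task):
    # Continuation tasks are spawned once but re-enter the later phases many
    # times; only record their PArray accesses on the first pass.
    if task.instance_count > 0:  # hypothetical counter on the task metadata
        return
    for parray, permission in task.dataflow:  # hypothetical (PArray, permission) pairs
        parray.task_list.append((task, permission))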

@wlruys
Contributor

wlruys commented Aug 28, 2023

I forget the status of this. Does it need to be revisited?

@wlruys
Contributor

wlruys commented Jan 9, 2024

We need to fix this for NUWEST. I will try my best to think of a solution that changes the data movement dependency semantics.

@nicelhc13
Contributor Author

Working on it starting today.

@nicelhc13
Contributor Author

To sum up, here is a possible solution:
Delay adding a task to a PArray's task list from the creation (spawn) phase to right after the mapping phase completes; because of the ordering constraint that a task's dependencies have been mapped before it can be mapped, we can guarantee the task list is valid.
Then we remove the task from the list when its compute task completes.

The mapping policy actually gets more accurate information this way (that was your concern, wasn't it? Otherwise, I am missing something).

So when we create the data movement task for ts[8], ts[5] should still be on the list, and ts[5] will then be included in ts[8]'s dependencies.
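
A minimal sketch of that lifecycle, with hypothetical hook names (register after mapping, collect dependencies when the data movement task is created, unregister on completion):

class PArrayTaskList:
    """Sketch: tasks are added after mapping and removed at completion."""

    def __init__(self):
        self.active_tasks = []  # tasks that are mapped but not yet completed

    def on_task_mapped(self, task):
        # Called right after the mapping phase; the task's dependencies were
        # mapped earlier, so the list reflects a valid execution ordering.
        self.active_tasks.append(task)

    def datamove_dependencies(self):
        # A new data movement task depends on every task still on the list,
        # e.g. ts[5] is still listed when ts[8]'s data movement task is created.
        return list(self.active_tasks)

    def on_task_completed(self, task):
        self.active_tasks.remove(task)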

@nicelhc13
Contributor Author

I changed data movement task creation and its dependency creation. Basically, instead of intersecting a PArray's task list with the compute task's dependencies, each task now sets a per-PArray dependency list for its dependent tasks and propagates that list onward when a dependent task accesses the PArray with read-only permission. I am not 100% sure this is correct, but the reported bug case, Cholesky, and sorting all seem to work correctly.
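
As I understand the description, the scheme works roughly like this (a sketch with hypothetical names, not the actual implementation): each task exports, per PArray, the tasks a dependent's data movement task must wait on; a read-only access propagates the inherited list unchanged, while a writing access replaces it with the writer itself.

def parray_deps_for(task, parray, permission, dependencies):
    # Collect the per-PArray dependency lists exported by this task's
    # compute dependencies; these become the data movement task's dependencies.
    deps = []
    for dep in dependencies:
        deps.extend(dep.exported_parray_deps.get(parray, []))

    if permission == "READ":
        # Read-only access: pass the inherited list along so later
        # dependents still wait on the original writer.
        task.exported_parray_deps[parray] = deps
    else:
        # A writer replaces the list: later dependents wait on this task.
        task.exported_parray_deps[parray] = [task]
    return deps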

Contributor

@wlruys left a comment


I think I understand this after our call. It seems like a good solution while we resolve other data semantics.

@wlruys merged commit 6bd3166 into main on Jan 18, 2024