
_foreach_norm: Align with PyTorch operator semantics on allocation scheme of return tensors #709

Merged · 18 commits · Aug 19, 2024

Conversation

chunhuanMeng
Contributor

@chunhuanMeng chunhuanMeng commented Aug 7, 2024

PyTorch requires `_foreach_norm` to return separately allocated tensors. The existing XPU implementation follows an out-of-date allocation scheme that shares storage among the returned tensors; the latest PyTorch unit tests no longer allow this behavior.
Related test cases:

  • test_dispatch_meta_outplace__foreach_norm_xpu_bfloat16
  • test_dispatch_meta_outplace__foreach_norm_xpu_float
  • test_dispatch_symbolic_meta_outplace__foreach_norm_xpu_bfloat16
  • test_dispatch_symbolic_meta_outplace__foreach_norm_xpu_float
  • test_dispatch_symbolic_meta_outplace_all_strides__foreach_norm_xpu_float32
  • test_meta_outplace__foreach_norm_xpu_bfloat16
  • test_meta_outplace__foreach_norm_xpu_float
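
The required allocation semantics can be demonstrated from Python. This is a minimal sketch using the public `torch._foreach_norm` API on CPU; the PR makes the XPU implementation obey the same rule:

```python
import torch

# _foreach_norm computes a per-tensor norm and must return one
# independently allocated scalar tensor per input. Sharing one
# backing storage among the results (the old XPU scheme) fails
# the meta/outplace dispatch tests listed above.
tensors = [torch.randn(4), torch.randn(8), torch.randn(16)]
norms = torch._foreach_norm(tensors, ord=2)

# Each result tensor owns its own storage; none of them alias.
ptrs = {t.untyped_storage().data_ptr() for t in norms}
assert len(ptrs) == len(norms)

# The values match the per-tensor 2-norms.
for n, t in zip(norms, tensors):
    assert torch.allclose(n, t.norm())
```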

@fengyuan14 changed the title from "Aten::_foreach_norm: fix storage_offset not match with meta tensor" to "_foreach_norm: Align with PyTorch operator semantics on allocation scheme of return tensors" on Aug 7, 2024
@fengyuan14
Contributor

The error is exposed by -Werror in pre-CI. Please fix it (screenshot attached).

@fengyuan14 fengyuan14 mentioned this pull request Aug 13, 2024

@daisyden daisyden added this to the PT2.5 milestone Aug 15, 2024
    ret_per_tensor.push_back(at::empty({}, res_option));
    }
    sycl::queue q{sycl::property::queue::in_order()};
    void** tensor_list_addresses = sycl::malloc_shared<void*>(ntensors, q);
A reviewer (Contributor) commented on this snippet:

Do not use raw runtime malloc; use the helpers provided by the PyTorch XPU backend. The usage here should be:

  1. Allocating pinned host memory
  2. Initializing the metadata (the pointer table) in the pinned memory
  3. Allocating device memory
  4. Copying (memcpy) from the pinned host memory to the device
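
The staging pattern the reviewer describes can be sketched from Python. This is a hedged illustration of the same four steps; the actual fix is C++ and would go through the XPU backend's allocators, whose helper names are not reproduced here. The device transfer is left commented out so the sketch runs without an XPU device:

```python
import torch

# Sketch of the pinned-staging pattern:
#   1. allocate a host-side buffer (pinned when an accelerator is present)
#   2. initialize it with the per-tensor data pointers
#   3. allocate device memory
#   4. copy host -> device in one transfer
tensors = [torch.randn(4), torch.randn(8)]

# Steps 1-2: host buffer holding each tensor's address. Plain CPU
# memory is used here so the sketch runs anywhere; in the real code
# this buffer would be pinned.
addresses = torch.tensor([t.data_ptr() for t in tensors],
                         dtype=torch.int64)

# Steps 3-4: a single copy would move the address table to the
# device, replacing the raw sycl::malloc_shared usage.
# device_addresses = addresses.to("xpu", non_blocking=True)

assert addresses.numel() == len(tensors)
```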

@fengyuan14 fengyuan14 added this pull request to the merge queue Aug 19, 2024
Merged via the queue into main with commit 7eb5219 Aug 19, 2024
3 checks passed
@fengyuan14 fengyuan14 deleted the meng_foreach_norm branch August 19, 2024 01:44