Enhancement: Avoid unnecessary gathering of distributed operand #1216

samadpls · 2023-09-08T08:04:21Z

Due Diligence

General:
- base branch must be main for new features, latest release branch (e.g. release/1.3.x) for bug fixes
- title of the PR is suitable to appear in the Release Notes
Implementation:
- unit tests: all split configurations tested
- unit tests: multiple dtypes tested
- documentation updated where needed

Description

Updated __sanitize_close_input logic to resolve issue #1064 regarding unnecessary gathering of distributed operands.

Issue/s resolved: #1064

Changes proposed:

Optimized the logic in the __sanitize_close_input function to avoid unnecessary gathering of distributed operands.

Type of change

New feature (non-breaking change which adds functionality)

Memory requirements

Memory requirements for this change have not been profiled at this time.

Performance

Performance metrics for this change have not been measured at this time.

Does this change modify the behaviour of other functions? If so, which?

yes


github-actions · 2023-09-11T08:25:53Z

Thank you for the PR!

mtar

Thank you @samadpls. However, your contribution runs contrary to the intended behavior of the function and does not address the issue. Perhaps the issue wasn't written clearly and concisely enough, so let me rephrase it:

What we want for allclose/isclose: if the inputs are a distributed array and a non-distributed array, slice the non-distributed array so that each process can compare it with the corresponding local part of the distributed array.

Hope this helps


mtar

This is a small step in the right direction 🙂


ClaudiaComito

@samadpls here's what we're trying to achieve with this fix.

First of all, all Heat operations are implemented to operate on memory-distributed arrays (DNDarrays). In practice:

- operations are performed on many MPI processes, which could be many CPUs, many GPUs, or many nodes of a supercomputer;
- if a DNDarray is distributed, each MPI process stores only a slice of that DNDarray (along the DNDarray.split dimension);
- users can choose which DNDarrays to distribute, if any (DNDarray.split = None means each process has a copy of the entire DNDarray in memory).

In this PR, you are addressing the case in which logical operations all, allclose etc. compare a distributed DNDarray to a non-distributed one.

Example with small arrays for better understanding:

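import heat as ht

# x is distributed along axis 0: each process holds only some of the rows
x = ht.arange(30, split=0).reshape(-1, 2)
# y is not distributed (split=None): each process holds the entire array
y = ht.arange(30).reshape(-1, 2)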

print(ht.all(x == y))

This will return True no matter how many MPI processes you run it on.

Try printing the local arrays (DNDarray.larray) to get a better understanding of what the underlying data on each process really are (run this code on more than one process):

rank = x.comm.rank 
print(f"On rank {rank}: local x data: {x.larray}")
print(f"On rank {rank}: local y data: {y.larray}")

In the current implementation, when a distributed x is compared to non-distributed y, x gets "gathered" so it is no longer distributed. But "gathering" all the slices of a distributed array onto each process is communication- and memory-intensive and, in this case, unnecessary. Each process can simply compare the local slice of x (x.larray) to the corresponding slice of y and, at the end, only communicate whether the slices on each process match or not.
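The idea in the previous paragraph can be sketched without MPI. The snippet below is only a NumPy simulation, not Heat's implementation: the list `x_chunks` stands in for the local slices held by each process, and the helper name `local_allclose` is made up for illustration. Each simulated process slices the replicated operand to match its own part of `x`, compares locally, and only the per-process booleans are combined at the end.

```python
import numpy as np


def local_allclose(x_chunks, y_full, axis=0):
    """Simulate a per-process comparison: each 'rank' checks only its own slice.

    x_chunks: list of arrays, one per simulated process (the local parts of x).
    y_full:   the replicated operand, available in full on every process.
    """
    offset = 0
    local_results = []
    for chunk in x_chunks:
        n = chunk.shape[axis]
        # Slice the replicated operand so it lines up with this rank's part of x.
        index = [slice(None)] * y_full.ndim
        index[axis] = slice(offset, offset + n)
        y_local = y_full[tuple(index)]
        local_results.append(bool(np.allclose(chunk, y_local)))
        offset += n
    # Stand-in for the final reduction: a logical AND over one boolean per process.
    return all(local_results)


y = np.arange(30).reshape(-1, 2)
x_chunks = np.array_split(y.copy(), 4, axis=0)  # pretend there are 4 processes
result = local_allclose(x_chunks, y)
```

In this simulation no process ever needs more of `x` than its own chunk, and the only value that would have to be communicated is a single boolean per process.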

Feel free to ask if anything is unclear.


samadpls · 2023-10-09T11:04:29Z

I apologize for the delay in responding. I've been on vacation and traveling for the past 22 days, and I just landed a few hours ago. I'll definitely get back to work and address the PR comments. Thanks for your understanding!

ClaudiaComito · 2023-10-09T11:13:05Z

No problem at all @samadpls, thanks for your time!


ClaudiaComito

Thanks @samadpls, two more small changes and we're good to go.


ClaudiaComito · 2023-12-22T05:28:42Z

heat/core/logical.py

+    if x.split is not None and y.split is None and y.ndim > 0:
+        t2 = factories.array(y.larray, device=x.device, split=x.split)
+        x, t2 = sanitation.sanitize_distribution(x, t2, target=x)
+        return x, t2



I think we're missing the same check for the mirrored case, where y is distributed and x is not. Otherwise we're good to go.
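A minimal sketch of what a symmetric check could look like. This is an illustration under stated assumptions, not the code that was merged: `Operand` is a hypothetical stand-in that carries only the `split` and `ndim` metadata of a DNDarray, and `operand_to_slice` is a made-up helper that only reports which operand would need to be sliced locally.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Operand:
    """Hypothetical stand-in for a DNDarray; holds only the metadata the check needs."""

    name: str
    split: Optional[int]  # None means every process holds the full array
    ndim: int


def operand_to_slice(x: Operand, y: Operand) -> Optional[str]:
    """Return the name of the operand that should be sliced locally, if any."""
    if x.split is not None and y.split is None and y.ndim > 0:
        # Case covered by the diff above: x distributed, y replicated.
        return y.name
    if y.split is not None and x.split is None and x.ndim > 0:
        # Mirrored case: y distributed, x replicated.
        return x.name
    # Both distributed, both replicated, or the replicated operand is a scalar.
    return None
```

Handling both directions in one place keeps the behaviour independent of argument order, so comparing (x, y) and (y, x) would follow the same path.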


ClaudiaComito

We made it, thanks a lot @samadpls!

samadpls · 2023-12-22T13:59:43Z

Thank you so much @ClaudiaComito, I couldn't have done it without your assistance.

ClaudiaComito · 2023-12-22T14:12:28Z

My pleasure.

The CI is failing, but this has nothing to do with your PR. We've been working on it in #1313 and #1314. There's nothing you can do about it; #1314 needs to be merged before everything else.

We're about to start the Christmas break so things will be grinding to a halt until early January. Happy holidays! 🎄


github-actions · 2024-02-05T10:22:48Z

Backport failed for release/1.3.x, because it was unable to cherry-pick the commit(s).

Please cherry-pick the changes locally and resolve any conflicts.

git fetch origin release/1.3.x
git worktree add -d .worktree/backport-1216-to-release/1.3.x origin/release/1.3.x
cd .worktree/backport-1216-to-release/1.3.x
git switch --create backport-1216-to-release/1.3.x
git cherry-pick -x 9d093b2b1b971320b6dbfa8d32af450cf414e22b 7b7a19a16069af11e54e6eabb94ed41ec68eb1e2 f8133118af2e4898f25cd7846034acc750484301 bdf62df2ef726fa9e97c5d54846ba2aee4b0a1bb bde9d0e5b34243c75c64ed0840cf053963e01f07 0e8060188f6c8c258badb691c272671f153f3574 b5bf608e0b34167d93e924ae35cd1b97ed8e4507 12c67ecb93c76ae457ae544845a354db4d8430a7 0ebe1ced2fb22bbbe6ab14999cc7f46f44fabf52 e5dc1213e878c0a1927365be8eff30979ff1d1ff e8d45d4a01d22c90a45c1a0089bbf0cad789b4b1 ae3f969828f1b9514ddbbe82dbca8ee37f3e6e54 43e77d81d8b41c488237566f1d340aff60cf5aaa 38f4314a9b721720453b00cf3927ab1384fae8a5 aa555d5a6747f5c9e6d97fda12c0e68f2443b835 52cb055e3dab9518a5c84a91a36070b028c461a7 d0bf3f3e5899688f28cbfbb96bddbba5a2c420ef 60fb1cddc6a414a8d9382b1c54d6366758b52bb6 3d1fe15cd2539b36beb4610bc08296843f765ee7

samadpls and others added 4 commits September 8, 2023 12:54

Avoid unnecessary gathering of distributed operand

9d093b2

[pre-commit.ci] auto fixes from pre-commit.com hooks

7b7a19a


updated a comment

f813311

refactored the logic

bdf62df

ClaudiaComito requested a review from mtar September 11, 2023 08:18

ClaudiaComito added the PR talk label Sep 13, 2023

mtar reviewed Sep 13, 2023


samadpls added 3 commits September 13, 2023 09:54

Merge branch 'main' into features/1064-optimize-logical-functions

0e813f1

Merge branch 'main' into features/1064-optimize-logical-functions

d32ed27

updated the __sanitize_close_input method

bde9d0e

samadpls requested a review from mtar September 14, 2023 20:45

ClaudiaComito assigned mtar Sep 18, 2023

mtar reviewed Sep 22, 2023


mtar reviewed Sep 22, 2023


mtar reviewed Sep 22, 2023


Merge branch 'main' into features/1064-optimize-logical-functions

4752d15

ClaudiaComito self-requested a review September 25, 2023 08:11

ClaudiaComito requested changes Oct 9, 2023


ClaudiaComito reviewed Oct 9, 2023


Merge branch 'main' into features/1064-optimize-logical-functions

58a5c1a

samadpls added 2 commits October 12, 2023 22:11

Merge branch 'main' into features/1064-optimize-logical-functions

367d25f

Optimized logical operations

0e80601


samadpls requested review from mtar and ClaudiaComito October 12, 2023 17:54

pre-commit-ci bot and others added 3 commits October 12, 2023 17:55

[pre-commit.ci] auto fixes from pre-commit.com hooks

b5bf608


fixed typo

12c67ec


fixed the merge conflict

e875bc6

samadpls requested a review from ClaudiaComito December 21, 2023 09:18

ClaudiaComito requested changes Dec 22, 2023


samadpls and others added 4 commits December 22, 2023 06:16

Update heat/core/logical.py

d0bf3f3

Co-authored-by: Claudia Comito

Refactored the __sanitize_close_input method

60fb1cd


Merge branch 'features/1064-optimize-logical-functions' of https://gi…

2a4e226

…thub.com/samadpls/heat into features/1064-optimize-logical-functions

[pre-commit.ci] auto fixes from pre-commit.com hooks

3d1fe15


samadpls requested a review from ClaudiaComito December 22, 2023 06:23

ClaudiaComito approved these changes Dec 22, 2023


ClaudiaComito added communication logical memory footprint merge queue and removed communication labels Dec 22, 2023

Merge branch 'main' into features/1064-optimize-logical-functions

abbe799

ClaudiaComito approved these changes Jan 8, 2024


Merge branch 'main' into features/1064-optimize-logical-functions

1991a60

Merge branch 'main' into features/1064-optimize-logical-functions

3f1727f

Merge branch 'main' into features/1064-optimize-logical-functions

4a2b87b

ClaudiaComito approved these changes Feb 5, 2024


ClaudiaComito added the backport release/1.3.x label Feb 5, 2024

ClaudiaComito merged commit 42733ab into helmholtz-analytics:main Feb 5, 2024
50 checks passed
