feat: support indexing parray by ndarray #89
Conversation
Other than efficiency lgtm,
""" | ||
if isinstance(value, PArray): | ||
value = value.array | ||
|
||
if isinstance(slices, numpy.ndarray) or isinstance(slices, cupy.ndarray): | ||
slices = slices.tolist() |
Isn't this inefficient? Consider:
- numpy supports indexing by both a list and an np.ndarray of int32/int64/uint32/uint64 type.
- cupy supports indexing by a list, and by both cp.ndarray and np.ndarray of int32/int64/uint32/uint64 type.
Couldn't this lead to a lot of device -> host copies from the crosspy calls?
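To illustrate the point above, here is a minimal numpy-only sketch (array contents are made up for the example): fancy indexing accepts the integer ndarray directly, so the `tolist()` conversion is an extra pass over the indices that indexing itself does not need.

```python
import numpy as np

arr = np.arange(10) * 10
idx = np.array([3, 1, 7], dtype=np.int64)

# numpy fancy indexing accepts the integer ndarray as-is;
# converting to a Python list first gives the same result
# but materializes every index as a Python int.
direct = arr[idx]
via_list = arr[idx.tolist()]
assert (direct == via_list).all()
print(direct.tolist())  # [30, 10, 70]
```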
This can only be avoided by the caller, since this information has to be saved on the CPU side: it will later be converted and hashed so it can be recorded for further use within PArray. (So indexing by a cupy array is never efficient, because PArray's internal data structures still live on the CPU.)
And I think crosspy should always use a numpy array as the index; what is the benefit of using a cupy array as the index? @bozhiyou
> And I think crosspy should always use numpy array as index

No, indices can be as large as the array itself - think of the permutation example `arr[shuffle(arange(len(arr)))]` - and even larger. If you are indexing a cupy array, the index array has to be copied to the GPU as a cupy array.
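For concreteness, a numpy-only sketch of the permutation example mentioned above (the cupy case is analogous, except the permutation array would have to live on the device):

```python
import numpy as np

rng = np.random.default_rng(0)
arr = np.arange(8) * 2
perm = rng.permutation(len(arr))  # the index array is as large as arr itself

shuffled = arr[perm]
# A permutation index reorders the elements without dropping any.
assert sorted(shuffled.tolist()) == arr.tolist()
```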
I agree with Will that a list will be inefficient. Storing all the information on the CPU seems inefficient as well.
That makes sense, but currently there is no efficient way to deal with this, such as simply storing the info on the GPU or as an ndarray. (In fact, all lists are converted to hash maps in a later step, since there is no guarantee the slice mapping is sequential when dealing with local indices.) And each array access needs to query this information (e.g. comparing stored slices with the given slices). Consider the following scenarios:
- Data is initially stored on the CPU and then moved to a GPU, but the user doesn't know which device it is on. Should the user index with a cp.ndarray or an np.ndarray?
- Same as above, but now in a multi-device task (like where crosspy runs). Which slices should the user use to index into it? (And on which device should they live if cupy.ndarray is used?)
- On which device should the cupy ndarray be stored inside PArray?
  (a) If we put it on the same device as the data, then when there are multiple copies on different devices, each device holds only part of the slicing information. Every access, regardless of where the data is, has to reach all devices and compare the locally saved slice information, which will be really slow.
  (b) If we put it on the first GPU, operations on copies held by other devices have to query that GPU, which is no better than querying data on the CPU: NVLink is not guaranteed to exist between all devices, so the request may be routed via the CPU anyway. This also raises memory issues when the slices are large (you mentioned they can be as large as the array itself): GPU memory is more limited than CPU memory, and the runtime would have to track the slice size, otherwise new data moved there will OOM. It also causes imbalanced workload and memory usage across devices.
- When data is moved, should we move the slice mapping along with it? Will that increase the data-movement overhead?

Until we find a good solution to the above issues, the CPU is still the best place to store the slices. (But I could make it a numpy array instead of a Python list.)
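A minimal sketch of the "numpy array instead of a Python list" idea: the index array can be kept as an ndarray on the CPU and turned into a hashable key for the hash-map step without ever materializing a Python list. The `index_key` helper below is hypothetical, not part of PArray.

```python
import numpy as np

def index_key(idx):
    # Hypothetical helper: normalize an index array to a hashable,
    # CPU-side key without converting it to a Python list.
    a = np.ascontiguousarray(np.asarray(idx, dtype=np.int64))
    return a.tobytes()

slice_records = {}
idx = np.array([2, 0, 1])
slice_records[index_key(idx)] = "mapping for this slice"

# The same indices, arriving later as a fresh array, hit the same record.
assert slice_records[index_key(np.array([2, 0, 1]))] == "mapping for this slice"
```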
Also fixed a minor bug in parray.update()