Fix scatter_by_map with spilling enabled #18095

mroeschke · 2025-02-25T22:33:16Z

Description

Previously, the old Cython bindings of columns_split spill locked the conversion from libcudf to a cudf Python column. When I replaced these bindings, this spill locking was lost in the refactor.

I'm spot-checking whether other APIs are affected. If any are, I can open PRs for those.

Checklist

I am familiar with the Contributing Guidelines.
New or existing tests cover these changes.
The documentation is up to date with these changes.

Matt711

Thanks! One non-blocking suggestion.

Matt711 · 2025-02-25T22:47:44Z

python/cudf/cudf/core/indexed_frame.py

+        @acquire_spill_lock()
+        def split_from_pylibcudf(split: list[plc.Column]) -> list[ColumnBase]:
+            return [ColumnBase.from_pylibcudf(col) for col in split]
+


Does this work?

Suggested change

-        @acquire_spill_lock()
-        def split_from_pylibcudf(split: list[plc.Column]) -> list[ColumnBase]:
-            return [ColumnBase.from_pylibcudf(col) for col in split]
+        with acquire_spill_lock():
+            columns = [ColumnBase.from_pylibcudf(col) for col in split]

split here is the loop variable (from for split in columns_split) below.

So in order to do it like this I would have to write:

        with acquire_spill_lock():
            column = [
                [ColumnBase.from_pylibcudf(col) for col in split]
                for split in columns_split
            ]

I could put the entire operation in the with acquire_spill_lock() block, but I would prefer to tightly scope the lock to the operation that definitely needs it.

Oh thanks, I didn't see the for split in loop. Sounds good.
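
For readers following the scoping argument above, here is a minimal, self-contained sketch of the pattern. It does not use cudf: spill_lock, convert, split_from_raw, and build_frame are hypothetical stand-ins for acquire_spill_lock, ColumnBase.from_pylibcudf, split_from_pylibcudf, and self._from_columns_like_self, and the "lock" is only a counter. It assumes the lock can be applied as a decorator, as the diff in this PR does, and shows that decorating a small helper keeps the lock held during the conversion only, not while each result frame is built.

```python
from contextlib import ContextDecorator


class spill_lock(ContextDecorator):
    """Hypothetical stand-in for a spill lock: just counts active holders."""

    active = 0

    def __enter__(self):
        type(self).active += 1
        return self

    def __exit__(self, *exc):
        type(self).active -= 1
        return False  # never swallow exceptions


def convert(col):
    # Stand-in for the per-column conversion; records whether the lock was held.
    return (col, spill_lock.active > 0)


@spill_lock()
def split_from_raw(split):
    # The lock is held only while this helper runs.
    return [convert(col) for col in split]


def build_frame(columns):
    # Stand-in for building the result frame; runs after the lock is released.
    return {"columns": columns, "built_under_lock": spill_lock.active > 0}


columns_split = [["a", "b"], ["c"]]
frames = [build_frame(split_from_raw(split)) for split in columns_split]
```

Wrapping the whole list comprehension in a with block would also be correct, but it would keep the lock held while every frame is built, which is the wider scope the author wanted to avoid.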

Matt711 · 2025-02-25T22:47:46Z

python/cudf/cudf/core/indexed_frame.py

        return [
            self._from_columns_like_self(
-                [ColumnBase.from_pylibcudf(col) for col in split],
+                split_from_pylibcudf(split),


Suggested change

-                split_from_pylibcudf(split),
+                columns,

mroeschke · 2025-02-26T00:58:11Z

/merge

Fix scatter_by_map with spilling enabled

55bdc16

mroeschke added the bug, Python, and non-breaking labels Feb 25, 2025

mroeschke self-assigned this Feb 25, 2025

mroeschke requested a review from a team as a code owner February 25, 2025 22:33

mroeschke requested review from bdice and galipremsagar February 25, 2025 22:33

Matt711 approved these changes Feb 25, 2025


rapids-bot bot merged commit d8b3d80 into rapidsai:branch-25.04 Feb 26, 2025
109 of 110 checks passed

mroeschke deleted the bug/spilling/scatter_by_map branch February 26, 2025 00:58
