Make a few memory management objects public + Miscellaneous doc updates #693


Merged: 11 commits, Jun 11, 2025

Conversation

@leofang leofang commented Jun 7, 2025

Description

closes #596.

  • Buffer and MemoryResource are already public (e.g., returned by Device.memory_resource.allocate()). We now also expose them under the cuda.core namespace for documentation purposes (Doc: Document cuda.core public (but non-entry-point) objects #601)
    • I noticed that Buffer's destructor would fall back to the default stream when no stream was passed explicitly. This isn't right because it takes control away from memory resource authors. We now always let the memory resource make the call.
    • I noticed Buffer has an __init__(). This isn't right if we want to make the class public, since buffers should be returned by MR.allocate(). I moved this support to from_handle().
  • The new exposed objects DeviceMemoryResource and LegacyPinnedMemoryResource are named after their respective cccl-rt/cudax counterparts.
  • The __cuda_stream__ protocol has a protocol type now (IsStreamT)
  • Internal refactoring to consolidate support for objects that have __cuda_stream__
  • Documentation is clarified and expanded to reflect the new public exposure of these classes; in particular, many type hints are updated or fixed
  • Update the docs to mention all other merged PRs targeting this release (v0.3.0)
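To illustrate the IsStreamT idea above, here is a minimal, hypothetical sketch of a `__cuda_stream__`-style protocol. The tuple shape (version, stream-handle address) follows the convention in the cuda.core docs; the exact shipped IsStreamT type may differ, and `StreamLike`/`MyFrameworkStream` are illustrative names, not cuda.core API:

```python
from typing import Protocol, Tuple, runtime_checkable


@runtime_checkable
class StreamLike(Protocol):
    """Sketch of an IsStreamT-style protocol (hypothetical)."""

    def __cuda_stream__(self) -> Tuple[int, int]:
        """Return (protocol version, CUstream address as a Python int)."""
        ...


class MyFrameworkStream:
    """A third-party stream wrapper (illustrative only)."""

    def __init__(self, handle: int):
        self._handle = handle

    def __cuda_stream__(self) -> Tuple[int, int]:
        # Version 0 of the protocol, plus the raw stream address.
        return (0, self._handle)


s = MyFrameworkStream(0xDEAD)
assert isinstance(s, StreamLike)          # structural check via the protocol
assert s.__cuda_stream__() == (0, 0xDEAD)
```

Any object shaped like this can then be consumed by APIs that accept the protocol.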

Checklist

  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.


copy-pr-bot bot commented Jun 7, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.


@leofang leofang added this to the cuda.core beta 4 milestone Jun 7, 2025
@leofang

leofang commented Jun 7, 2025

/ok to test a6387a0

@leofang leofang self-assigned this Jun 7, 2025
@leofang leofang added P1 Medium priority - Should do feature New feature or request cuda.core Everything related to the cuda.core module labels Jun 7, 2025


@leofang leofang marked this pull request as draft June 7, 2025 04:22
@leofang

leofang commented Jun 9, 2025

/ok to test cc0f6ce

@leofang leofang added P0 High priority - Must do! and removed P1 Medium priority - Should do labels Jun 9, 2025
@leofang leofang changed the title Make a few memory management objects public Make a few memory management objects public + Miscellaneous doc updates Jun 9, 2025
@leofang leofang marked this pull request as ready for review June 9, 2025 01:48
@leofang leofang requested a review from rwgk June 9, 2025 01:49
@leofang leofang marked this pull request as draft June 9, 2025 02:22
@leofang leofang added the breaking Breaking changes are introduced label Jun 9, 2025
@leofang

leofang commented Jun 9, 2025

/ok to test 58323ac

@leofang leofang marked this pull request as ready for review June 9, 2025 03:25
@@ -34,7 +35,7 @@ def _lazy_init():
     _inited = True


-def launch(stream, config, kernel, *kernel_args):
+def launch(stream: Union[Stream, IsStreamT], config: LaunchConfig, kernel: Kernel, *kernel_args):
Collaborator commented:

It's a bit inconsistent that only this API supports the __cuda_stream__ protocol, while many other APIs only work with explicit Stream type objects. We should probably be consistent.

Maybe push this to a follow-up PR?

leofang (Member Author) replied:

We already have public examples for PyTorch and CuPy showcasing the use of this protocol. With this PR we made it slightly faster in favor of our own Stream (through try-except). Perhaps we should instead encourage users to use our native objects, and state that using the protocol could add a slight overhead?

@@ -52,7 +52,8 @@ Then such objects can be understood by ``cuda.core`` anywhere a stream-like obje
is needed.
Collaborator commented:

I think we should discuss this. This will introduce non-negligible overhead, similar to DLPack, where we might be better off having a user explicitly wrap their stream objects in a cuda.core.Stream and reuse that Stream object as needed.

leofang (Member Author) commented Jun 10, 2025:

Yes, right now only launch() and Device.create_stream() support the protocol, due to exactly this concern about overhead. Overhead aside, I think we are essentially deciding which interoperability story we want to tell:

  • We give people a convenient way for them to convert their types to ours, and only accept our types in our APIs
  • We accept everyone's types in our APIs

Both stories need the protocol, so it's not a waste; they just use it differently.

Collaborator replied:

Yes, we need the protocol regardless. My point was that we document here that an object supporting the __cuda_stream__ protocol can be used anywhere cuda.core expects a stream-like object, which isn't currently true. We should do one of the following things here:

  1. Update the docs to make it clear that a user is required to wrap their __cuda_stream__ protocol supporting object in a cuda.core.Stream before passing it to other cuda.core APIs.
  2. Update the cuda.core implementation to support IsStreamT objects everywhere Streams are an API argument.
  3. Update the docs to make it clear which APIs explicitly allow IsStreamT objects vs. which require explicit Stream objects.
    • I'm -1 on this because it is confusing and non-intuitive to users

Based on the current state of the implementation, I think we're closest to option 1 (minus the launch() API), so we should probably make the docs reflect that for the time being?
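Option 1 (wrap once, then reuse) might look like this from the user's side. `WrappedStream` is a hypothetical stand-in for cuda.core.Stream, which in the real API would be obtained via Device.create_stream; `TorchLikeStream` stands in for any third-party object that implements __cuda_stream__:

```python
class WrappedStream:
    """Hypothetical stand-in for cuda.core.Stream wrapping a foreign stream."""

    def __init__(self, obj):
        version, handle = obj.__cuda_stream__()  # convert once, up front
        if version != 0:
            raise ValueError("unsupported __cuda_stream__ version")
        self.handle = handle


class TorchLikeStream:
    """Stand-in for a framework stream exposing the protocol."""

    def __cuda_stream__(self):
        return (0, 42)


s = WrappedStream(TorchLikeStream())  # wrap once...
assert s.handle == 42                 # ...then reuse s in every later API call
```

The conversion cost is paid once at wrap time instead of on every API call.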

leofang (Member Author) replied:

I'll work on 1 now.

leofang (Member Author) commented Jun 10, 2025:

@kkraus14 could you check out commit bbc1c65? I tried to rephrase to hint that we want people to wrap their streams, instead of us accepting them as-is. We probably have to defer the change to launch() until later: Vlad's graph support (nicely) took advantage of launch() supporting __cuda_stream__. We can look into it during graph phase 2.

kkraus14
kkraus14 previously approved these changes Jun 10, 2025
@github-project-automation github-project-automation bot moved this from Todo to In Review in CCCL Jun 10, 2025
@leofang
Copy link
Member Author

leofang commented Jun 10, 2025

/ok to test bbc1c65

@leofang
Copy link
Member Author

leofang commented Jun 10, 2025

CI is green now.

@leofang leofang merged commit bf590e4 into NVIDIA:main Jun 11, 2025
101 of 103 checks passed
@github-project-automation github-project-automation bot moved this from In Review to Done in CCCL Jun 11, 2025
@leofang leofang deleted the default_mr branch June 11, 2025 01:54

Doc Preview CI
Preview removed because the pull request was closed or merged.

Labels
breaking Breaking changes are introduced cuda.core Everything related to the cuda.core module feature New feature or request P0 High priority - Must do!
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

Expose pinned memory resource as public API
2 participants