
Make the stream module a part of the public API #1775

Merged · 9 commits merged into rapidsai:branch-25.02 on Jan 16, 2025

Conversation

@Matt711 (Contributor) commented Jan 2, 2025

Description

Closes #1770.

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@Matt711 Matt711 self-assigned this Jan 2, 2025

copy-pr-bot bot commented Jan 2, 2025

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.


@github-actions (bot) added the CMake and Python (Related to RMM Python API) labels Jan 2, 2025
@Matt711 added the feature request (New feature or request), non-breaking (Non-breaking change), CMake, and Python labels and removed the CMake and Python labels Jan 2, 2025
@Matt711 Matt711 marked this pull request as ready for review January 2, 2025 17:56
@Matt711 Matt711 requested review from a team as code owners January 2, 2025 17:56
@bdice (Contributor) left a comment


Thanks for your work on this! I will need a bit of time to play with this and see what works well in the design (or not).

Do we need both cuda_stream.pyx and stream.pyx? Could/should we combine these files?

Review threads (outdated, resolved):
  • python/rmm/rmm/_cuda/stream.py (2 threads)

@leofang (Member) commented Jan 2, 2025

What's the rationale of making this a public API now that we have cuda.core...? Couldn't it stay private while we perfect cuda.core and make it production ready? I might have missed something.

@bdice (Contributor) commented Jan 2, 2025

> What's the rationale of making this a public API now that we have cuda.core...? Couldn't it stay private while we perfect cuda.core and make it production ready? I might have missed something.

We want to start adding streams to pylibcudf APIs in RAPIDS 25.02. Exposing streams publicly is important for speed-of-light performance on several workloads.

Eventually we will want to move to cuda.core but I don't think we want RAPIDS to be tied to cuda.core's timeline at this moment. I would like cuda.core to take whatever time is needed to have a stable design. Once it's stable, we will be able to deprecate this in RAPIDS and move to cuda.core within ~2-4 months (one release to deprecate the existing design, then make the change in the subsequent release). Having this in RMM/pylibcudf in the short term also gives us a way to test the cuda.core APIs with a full application by changing imports and doing minor refactoring.

@leofang (Member) commented Jan 3, 2025

Thanks for the explanation!

> > What's the rationale of making this a public API now that we have cuda.core...? Couldn't it stay private while we perfect cuda.core and make it production ready? I might have missed something.
>
> We want to start adding streams to pylibcudf APIs in RAPIDS 25.02. Exposing streams publicly is important for speed-of-light performance on several workloads.
>
> Eventually we will want to move to cuda.core but I don't think we want RAPIDS to be tied to cuda.core's timeline at this moment. I would like cuda.core to take whatever time is needed to have a stable design. Once it's stable, we will be able to deprecate this in RAPIDS and move to cuda.core within ~2-4 months

First of all, I should not have given the impression that cuda.core is not production ready. We just want the flexibility to "soft-break" the design (if possible) should that prove necessary, but we do want people to start using it today. For example, we'll start integrating it with numba-cuda very soon, so one way or another RAPIDS users will have a copy of cuda.core installed in their environment and used somewhere in their workloads.

Next, I understand pylibcudf's needs, and I am very supportive of having all APIs take the stream explicitly. But I don't think this needs to be tied to any concrete type (not even cuda.core.Stream), considering that any object implementing __cuda_stream__ could be consumed by pylibcudf. So perhaps all of this (exposure plus later deprecation/removal) is redundant work?

Finally, with regard to __cuda_stream__ please chime in this RFC: NVIDIA/cuda-python#348. Thanks!
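
For context, here is a hypothetical sketch of a `__cuda_stream__` producer, assuming the protocol settles on returning a `(version, handle)` pair; the exact shape was still being debated in the RFC above, and the class name is made up for illustration.

```python
# Hypothetical sketch only: the (version, handle) return shape is an assumption
# taken from the NVIDIA/cuda-python#348 discussion, not a finalized spec.
class ExternalStream:
    """Wraps a raw cudaStream_t address obtained from some other library."""

    def __init__(self, handle: int):
        self._handle = handle

    def __cuda_stream__(self):
        # (protocol version, stream address as a Python int)
        return (0, self._handle)


# A consumer that understands the protocol (e.g. a future pylibcudf API) could
# accept ExternalStream without depending on any concrete stream type.
```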

@bdice (Contributor) commented Jan 5, 2025

@leofang I spent some time reviewing the Stream class in cuda.core. I have some questions to resolve before we can adopt it. However, I think we should probably try to avoid making streams "more public" in the RMM API (several RAPIDS libraries already use these internals) and instead use cuda.core.

  • Will there be a cuda.core Cython API? Currently all our RMM stream-related logic is in Cython classes.
  • What API stability guarantees do we have? Will the cuda.core.experimental namespace only change with minor releases (0.2.0, 0.3.0, ...) or can it also change with point releases (0.1.1, 0.1.2, ...)? We would need to pin cuda.core tightly enough to guard against API breaks if we use it in a shipping RAPIDS product.
  • We will need to figure out how to turn __cuda_stream__ protocol objects into RMM's Cython cuda_stream_view class. I am guessing the best way to do this is to construct cuda.core.experimental.Stream.from_handle(...) and then figure out how to make a cuda_stream_view from the raw stream pointer (cast from the handle pointer int to uintptr_t to cudaStream_t, probably). (A rough sketch follows this list.)
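
A rough sketch of that last bullet, assuming the protocol hands back the raw stream address as a Python int; the helper name is made up for illustration, and the final wrapping step would live in Cython (indicated only in comments).

```python
# Hypothetical sketch, not an actual RMM API: pull the raw cudaStream_t address
# out of a __cuda_stream__ object, assuming the protocol yields (version, handle).
def stream_handle_from_protocol(obj) -> int:
    version, handle = obj.__cuda_stream__()
    return int(handle)


# On the Cython side, the handle would then be wrapped roughly as:
#   cdef cuda_stream_view view = cuda_stream_view(<cudaStream_t><uintptr_t>handle)
```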

@bdice (Contributor) commented Jan 9, 2025

@Matt711 @leofang I think I want to go ahead and pursue making these public API changes in RMM, to keep RAPIDS moving forward with stream usage. I think we will want to use cuda.core streams in the future, but not in 25.02-25.04. I do want to work more closely with CCCL and cuda.core on memory resources (and eventually streams), so that RMM can become a consumer of the CUDA platform's fundamental abstractions, rather than providing its own implementations.

@Matt711 (Contributor, author) commented Jan 9, 2025

> Do we need both cuda_stream.pyx and stream.pyx? Could/should we combine these files?

I think we should combine them. Both are small modules with closely related code.

  1. Can I address this in a follow-up?
  2. Should rmm.pylibrmm.cuda_stream go through the standard deprecation process if we move it to rmm.pylibrmm.stream? It's not used in RAPIDS outside of RMM itself, but I'm not sure whether it's used elsewhere.

Review threads (outdated, resolved):
  • python/rmm/rmm/_cuda/stream.py
  • python/rmm/rmm/pylibrmm/device_buffer.pxd
  • python/rmm/rmm/pylibrmm/stream.pyx (3 threads)
  • python/rmm/rmm/tests/test_rmm.py

@bdice (Contributor) commented Jan 9, 2025

> > Do we need both cuda_stream.pyx and stream.pyx? Could/should we combine these files?
>
> I think we should combine them. Both are small modules with closely related code.
>
>   1. Can I address this in a follow-up?
>   2. Should rmm.pylibrmm.cuda_stream go through the standard deprecation process if we move it to rmm.pylibrmm.stream? It's not used in RAPIDS outside of RMM itself, but I'm not sure whether it's used elsewhere.

  1. Yes, let's think about this in a follow-up PR. If you want to scope this in an issue, that would be helpful!
  2. Yes, we should do a standard deprecation, since rmm.pylibrmm.cuda_stream is currently part of the public API. I don't see any uses of it in the wild, but we should follow normal deprecation procedures either way. (A rough sketch of such a shim follows below.)
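
For concreteness, one common shape such a deprecation shim could take; this is only a hedged sketch, and the module contents, warning category, and re-export are assumptions rather than what the follow-up actually shipped.

```python
# Hypothetical contents of a rmm/pylibrmm/cuda_stream.py shim; the message,
# warning category, and star re-export are assumptions, not final wording.
import warnings

warnings.warn(
    "rmm.pylibrmm.cuda_stream is deprecated and will be removed in a future "
    "release; use rmm.pylibrmm.stream instead.",
    FutureWarning,
    stacklevel=2,  # attribute the warning to the importer, not this shim
)

from rmm.pylibrmm.stream import *  # noqa: E402,F401,F403
```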

@bdice (Contributor) left a comment


I pushed one small fix to the docs, and this is good to go. I'll trigger a merge.

A contributor left a comment


Technically I think this might be a break in our Cython API if we're only exposing these as Python and not Cython objects now (.py and not .pxd). However, this API is already private (rmm._cuda) and I did extensive searching of the rapidsai codebase (and GitHub generally) to ensure that this should not have any negative effects downstream.

@Matt711 (Contributor, author) replied


Yeah, I also searched RAPIDS for uses of rmm._cuda and I think they were all in RMM.

@bdice (Contributor) commented Jan 16, 2025

/merge

@rapids-bot rapids-bot bot merged commit fb5e6bc into rapidsai:branch-25.02 Jan 16, 2025
58 of 59 checks passed
@bdice (Contributor) commented Jan 16, 2025

Thanks for your work on this @Matt711! 🥳

@bdice (Contributor) commented Jan 17, 2025

This appears to have caused problems downstream:

rapidsai/cuml#6229 (comment)

https://github.com/rapidsai/cugraph/actions/runs/12820092417/job/35756703577?pr=4804#step:9:407

I don't know why yet -- still trying to track down what is triggering the deprecation warning, since I don't see any references to the deprecated code.
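
One way to track this down, assuming the warning comes from an ordinary warnings.warn call: promote it to an error so the traceback points at whichever import reaches the deprecated module. The warning category and downstream package names below are assumptions for illustration.

```python
# Hedged debugging sketch: escalate the deprecation warning to an exception so
# the resulting traceback shows exactly which import triggers it. Assumes the
# warning is a FutureWarning; swap in DeprecationWarning otherwise.
import warnings

warnings.filterwarnings("error", category=FutureWarning)

# Importing the downstream package now raises at the offending line, e.g.:
#   import cuml
# Or, without editing any code:
#   python -W error::FutureWarning -c "import cugraph"
#   pytest -W error::FutureWarning
```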

Labels
CMake · feature request (New feature or request) · non-breaking (Non-breaking change) · Python (Related to RMM Python API)
Projects
Status: Done · Status: Landed
Development
Successfully merging this pull request may close these issues:
  • [FEA] Make the rmm._cuda.stream.Stream a part of the public API (#1770)
3 participants