feat(gpu): implement fhe rand on gpu #1958
Conversation
Force-pushed from 349cfaa to 9088fb0
Hey @guillermo-oyarzun! Thanks a lot for this PR, here comes my review. My main question is about the par_generate... entry points: at the moment the CUDA calls are made on the same streams, which makes execution sequential. What we could do is use a different set of streams for each of the par_iter iterations, maybe? Wdyt?
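For concreteness, a minimal sketch of that idea; `Stream` and `launch_generate` are hypothetical stand-ins for the real CUDA backend types, not the actual API:

```rust
use rayon::prelude::*;

// Hypothetical stand-in for the backend's CUDA stream type.
struct Stream;

impl Stream {
    fn new(_gpu_index: u32) -> Self {
        Stream
    }
}

// Hypothetical kernel launch; in the real code this would be the
// generate call issued on the given stream.
fn launch_generate(_seed: u64, _stream: &Stream) {}

fn par_generate(seeds: Vec<u64>, gpu_index: u32) {
    // One stream per iteration, so launches from different iterations
    // can overlap instead of serializing on a single shared stream.
    let streams: Vec<Stream> = (0..seeds.len())
        .map(|_| Stream::new(gpu_index))
        .collect();

    seeds
        .into_par_iter()
        .zip(streams.par_iter())
        .for_each(|(seed, stream)| launch_generate(seed, stream));
}
```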
Force-pushed from 5d54d65 to 63eb9b8
Force-pushed from cd28cff to 29dcb98
.into_par_iter()
.enumerate()
.map(|(i, seed)| {
    let stream_index = i;
Here, if there are too many blocks this won't work, will it? We would need something like `let stream_index = i % streams.gpu_indexes.len()` instead.
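A tiny self-contained illustration of the suggested wrap-around; the pool size here is a made-up stand-in for `streams.gpu_indexes.len()`:

```rust
// Map a block index onto a fixed-size stream pool: the result always
// stays inside the pool, no matter how many blocks there are.
fn stream_for_block(block_index: usize, pool_size: usize) -> usize {
    block_index % pool_size
}

fn main() {
    let pool_size = 4; // hypothetical, stands in for streams.gpu_indexes.len()
    for i in 0..10 {
        // Blocks 0..3 hit streams 0..3, block 4 wraps back to stream 0, etc.
        println!("block {i} -> stream {}", stream_for_block(i, pool_size));
    }
}
```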
It will work because, beforehand, I generate a vector of streams with as many streams as there are blocks. This is executed on a single GPU, but with many streams.
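A hedged sketch of that setup, again with a hypothetical `Stream` type: one stream per block on a single GPU, so indexing by the block index is always in bounds.

```rust
// Hypothetical stand-in for the backend's CUDA stream type.
struct Stream {
    gpu_index: u32,
}

// Build one stream per block, all on the same GPU, so indexing the
// vector with the block index i can never go out of bounds.
fn streams_for_blocks(num_blocks: usize, gpu_index: u32) -> Vec<Stream> {
    (0..num_blocks).map(|_| Stream { gpu_index }).collect()
}

fn main() {
    let num_blocks = 8;
    let streams = streams_for_blocks(num_blocks, 0);
    assert_eq!(streams.len(), num_blocks); // stream_index = i is always valid
    assert!(streams.iter().all(|s| s.gpu_index == 0)); // all on one GPU
}
```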
Ah yes of course, thanks for the clarification!
Force-pushed from 29dcb98 to 17f5d05