-
Notifications
You must be signed in to change notification settings - Fork 158
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(gpu): general fixes on indexes used in multi-gpu context #1997
base: main
Are you sure you want to change the base?
Conversation
a6407ec
to
1b30c36
Compare
1b30c36
to
52a24ef
Compare
52a24ef
to
8c4d6a6
Compare
If you're new to commit signing, there are different ways to set it up: Sign commits with
|
d012f88
to
89a8733
Compare
@agnesLertá maybe we could introduce a new test to cover this scenario. Wdyt? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @pdroalves! Looks good to me thanks for the fix! Yes, we should definitely add a test for it. The thing is, for now only the long run tests are testing the use of a single GPU other than 0. What we could do is add long run tests with a set of classic PBS parameters maybe? Otherwise we could add signed & unsigned tests with a different FnExecutor, in the same fashion as I did for the long run tests. Wdyt? (see the executor here: tfhe/src/integer/gpu/server_key/radix/tests_long_run/mod.rs:34)
b12602f
to
6951ebd
Compare
- fix a bug in which the wrong GPU may be queried for the max shared memory - If multiple streams are running split through multiple GPUs, operations happening on a stream in GPU i should query GPU i about its max shared memory, - also fixes wrong indexing at rust side.
…er operations - multiple operations are issued in parallel running independently on different devices, - tests will only run when more than 1 GPU is available, - we only test ERC20-related operators: (overflow_) add/sub, cmp, and if_then_else.
6951ebd
to
8eb4d86
Compare
9c23e5d
to
65b35f0
Compare
65b35f0
to
1c68c50
Compare
1c68c50
to
d1623e7
Compare
@pdroalves I saw the multi-gpu tests passed: maybe you want to merge the cudaSetDevice encapsulation first and then rebase this one on main before merging? |
@agnesLeroy Actually that PR was based on this one. |
21a0e94
to
082ac40
Compare
closes: https://github.com/zama-ai/tfhe-rs-internal/issues/901
PR content/description
Check-list: