Tuning gpairs implementation params #341
Merged
The gpairs implementation parameters, such as work group size and private memory size, were initially tuned for integrated graphics devices. This PR updates them to values determined by running experiments on Intel Datacenter Max GPUs.
CURRENT:
================ Benchmark GPairs (gpairs) ========================
WARNING:root:numba_dpex needs at least numba 0.58.0 but no more than 0.59.0, using numba=0.59.0 may cause unexpected behavior
================ implementation numba_dpex_k ========================
implementation: numba_dpex_k
framework: numba_dpex
framework version: 0.22.0.dev2+3.g59d523892
input size: 33554752
setup time: 485.043436ms (485043436 ns)
warmup time: 376.657913892s (376657913892 ns)
teardown time: 0ns (0 ns)
max execution times: 366.524516362s (366524516362 ns)
min execution times: 366.524516362s (366524516362 ns)
median execution times: 366.524516362s (366524516362 ns)
repeats: 1
preset: L
validated: Not Validated
================ implementation sycl ========================
implementation: sycl
framework: dpcpp
framework version: IntelLLVM 2024.0.0
input size: 33554752
setup time: 132.39452ms (132394520 ns)
warmup time: 240.888781093s (240888781093 ns)
teardown time: 0ns (0 ns)
max execution times: 243.165765882s (243165765882 ns)
min execution times: 243.165765882s (243165765882 ns)
median execution times: 243.165765882s (243165765882 ns)
repeats: 1
preset: L
validated: Not Validated
AFTER THIS CHANGE:
================ Benchmark GPairs (gpairs) ========================
WARNING:root:numba_dpex needs at least numba 0.58.0 but no more than 0.59.0, using numba=0.59.0 may cause unexpected behavior
================ implementation numba_dpex_k ========================
implementation: numba_dpex_k
framework: numba_dpex
framework version: 0.22.0.dev2+3.g59d523892
input size: 33554752
setup time: 364.385996ms (364385996 ns)
warmup time: 221.568546643s (221568546643 ns)
teardown time: 0ns (0 ns)
max execution times: 220.188768653s (220188768653 ns)
min execution times: 220.188768653s (220188768653 ns)
median execution times: 220.188768653s (220188768653 ns)
repeats: 1
preset: L
validated: Not Validated
================ implementation sycl ========================
implementation: sycl
framework: dpcpp
framework version: IntelLLVM 2024.0.0
input size: 33554752
setup time: 130.867229ms (130867229 ns)
warmup time: 217.880147619s (217880147619 ns)
teardown time: 0ns (0 ns)
max execution times: 218.045546452s (218045546452 ns)
min execution times: 218.045546452s (218045546452 ns)
median execution times: 218.045546452s (218045546452 ns)
repeats: 1
preset: L
validated: Not Validated
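
To make the before/after comparison easier to read, here is a minimal sketch that derives speedup and time reduction from the median execution times in the two logs. The dictionary layout and the helper names (`speedup`, `time_reduction_pct`, `summarize`) are illustrative assumptions and not part of the benchmark suite; only the nanosecond values are taken from the output above.

```python
# Median execution times in nanoseconds, copied from the benchmark logs.
# The structure and names here are hypothetical, used only for this comparison.
MEDIAN_NS = {
    "numba_dpex_k": {"before": 366_524_516_362, "after": 220_188_768_653},
    "sycl": {"before": 243_165_765_882, "after": 218_045_546_452},
}


def speedup(before_ns: int, after_ns: int) -> float:
    """Ratio of the old median time to the new one (values > 1 mean faster)."""
    return before_ns / after_ns


def time_reduction_pct(before_ns: int, after_ns: int) -> float:
    """Percentage of the old median time that the change removed."""
    return 100.0 * (before_ns - after_ns) / before_ns


def summarize(medians: dict) -> dict:
    """Compute both metrics for every implementation in `medians`."""
    return {
        impl: {
            "speedup": speedup(t["before"], t["after"]),
            "reduction_pct": time_reduction_pct(t["before"], t["after"]),
        }
        for impl, t in medians.items()
    }


if __name__ == "__main__":
    for impl, stats in summarize(MEDIAN_NS).items():
        print(
            f"{impl}: {stats['speedup']:.2f}x faster, "
            f"{stats['reduction_pct']:.1f}% less time"
        )
```

With these numbers, the numba_dpex_k median drops by roughly 40% and the sycl median by roughly 10%. Both logs report `repeats: 1` and `validated: Not Validated`, so the figures are indicative and not statistically robust.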