### Tasks
- [x] Refactor DataLayout internals to use `parent` less
- [x] Refactor DSS to use `parent` less, and leverage `UniversalSize`
- [x] Define `ArraySize` (similar to `UniversalSize`), that includes the field dimension
- [x] Define DataLayouts type parameter utilities? (e.g., `type_params`)
- [ ] Define a new layer, beneath DataLayouts, to store tuples of arrays
- [ ] Bypass `Base.Broadcast`'s indexing to allow for linear indexing for pointwise kernels
- [ ] https://github.com/CliMA/ClimaCore.jl/pull/1948
- [ ] https://github.com/CliMA/ClimaCore.jl/pull/1946
- [ ] https://github.com/CliMA/ClimaCore.jl/pull/1944
- [ ] https://github.com/CliMA/ClimaCore.jl/pull/1943
- [ ] https://github.com/CliMA/ClimaCore.jl/pull/1920
- [ ] https://github.com/CliMA/ClimaCore.jl/pull/1898
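
To make the two open tasks more concrete, here is a minimal CPU-only sketch in plain Julia. It is not ClimaCore's implementation: `TupleOfArraysSketch`, `linear_getindex`, and `linear_copyto!` are hypothetical names chosen for illustration, and the sketch assumes every array in the broadcast expression has the same size as the destination (the pointwise case). It only relies on `Base.Broadcast.broadcasted` and the `f` / `args` fields of `Broadcasted`.

```julia
using Base.Broadcast: Broadcasted, broadcasted

# Hypothetical container: one array per field, all of the same size.
struct TupleOfArraysSketch{N,A<:AbstractArray}
    arrays::NTuple{N,A}
end

# Evaluate a lazy broadcast expression at linear index `n` by walking the
# (possibly nested) Broadcasted tree, instead of using Base's Cartesian indexing.
@inline linear_getindex(bc::Broadcasted, n) =
    bc.f(map(arg -> linear_getindex(arg, n), bc.args)...)
@inline linear_getindex(a::AbstractArray, n) = @inbounds a[n]
@inline linear_getindex(x::Number, n) = x  # scalars are passed through unchanged

# Pointwise kernel with a single linear index.
# Assumption: all array arguments in `bc` have the same size as `dest`.
function linear_copyto!(dest::AbstractArray, bc::Broadcasted)
    for n in 1:length(dest)
        @inbounds dest[n] = linear_getindex(bc, n)
    end
    return dest
end

# Usage: two "fields" stored as separate arrays.
toa = TupleOfArraysSketch((rand(3, 4), rand(3, 4)))
a, b = toa.arrays
bc = broadcasted(+, a, broadcasted(*, 2.0, b))  # lazy form of a .+ 2.0 .* b
dest = similar(a)
linear_copyto!(dest, bc)
```

A real implementation would also need to check that the expression is actually pointwise before taking this path, and would fall back to the regular broadcast machinery otherwise.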
From this very hacked branch, https://github.com/CliMA/ClimaCore.jl/tree/ck/drop_field_dimension (PR #1929), we can reach good bandwidth efficiency when combining linear indexing with dropped field dimensions on ClimaCore broadcasted objects for pointwise kernels (this is the `thermo_bench_bw.jl` benchmark script):

Main branch (Clima A100):

Branch with dropped field dimension + linear indexing (Clima A100):

One future-proofing complication of this branch is that we will need to continue to support the field dimension being present (perhaps inside `TupleOfArrays`, or whatever we decide to call this new layer's struct) so that things still work reasonably with on the order of 100 tracers.

Just to note: dropping the field dimension roughly doubled the performance, and using linear indexing accounted for the rest. As discussed with @tapios, applying only linear indexing seems to improve performance for broadcasting with single variables, but seems to degrade performance with multiple variables. So it seems that both changes are needed in tandem to improve performance.

cc @tapios
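
For readers unfamiliar with the two changes, the following plain-Julia sketch contrasts them on the CPU. It is an illustration under stated assumptions, not the code from the branch: the function names `pointwise_field_dim!` and `pointwise_linear!`, the array shapes, and the toy kernel `a + 2b` are all made up for this example, and it says nothing about GPU performance.

```julia
# Hypothetical illustration only; not ClimaCore's API.

# Layout A: one parent array with an explicit field dimension (last axis),
# indexed with Cartesian indices.
function pointwise_field_dim!(out::Array{Float64,3}, data::Array{Float64,3})
    # data[i, j, f]: f = 1 and f = 2 are two variables stored in the same array
    for j in axes(data, 2), i in axes(data, 1)
        out[i, j, 1] = data[i, j, 1] + 2 * data[i, j, 2]
    end
    return out
end

# Layout B: field dimension dropped; each variable is its own array,
# held in a tuple, and the kernel uses a single linear index.
function pointwise_linear!(out::Array{Float64,2}, vars::NTuple{2,Array{Float64,2}})
    a, b = vars
    @inbounds for n in eachindex(out, a, b)
        out[n] = a[n] + 2 * b[n]
    end
    return out
end

Ni, Nj = 4, 5
data = rand(Ni, Nj, 2)

out_a = zeros(Ni, Nj, 1)
pointwise_field_dim!(out_a, data)

vars = (data[:, :, 1], data[:, :, 2])  # copies: one array per variable
out_b = zeros(Ni, Nj)
pointwise_linear!(out_b, vars)
```

Both functions compute the same values; only the storage layout and the indexing scheme differ, which are the two factors separated in the note above.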