Skip to content

add dims kwarg #247

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 5 commits into
base: main
Choose a base branch
from
Draft

add dims kwarg #247

wants to merge 5 commits into from

Conversation

bjarthur
Copy link
Member

not battle tested yet. and might as well add an init kwarg too

src/mapreduce.jl Outdated
Comment on lines 80 to 86
out_dims = [size(a)...]
out_dims[_dims] .= 1
out = fill($init(eltype(a)), out_dims...)
for c in eachchunk(a)
out_c = [c...]
out_c[_dims] .= Ref(1:1)
out[out_c...] .= $acum.(out[out_c...], Base.$_fname(f, a[c...], dims))
Copy link
Collaborator

@rafaqz rafaqz Mar 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments for each line would help for understanding the logic.

Also splatting a vector isn't type stable, maybe better to convert back to tuple and pass it through a function barrier before the loop? We actually know the length too so you could use ntuple to make it.

src/mapreduce.jl Outdated
@eval function Base.$_fname(f::Function, a::AbstractDiskArray, dims)
_dims = typeof(dims)<:Tuple ? [dims...] : typeof(dims)<:Number ? [dims] : dims
out_dims = [size(a)...]
out_dims[_dims] .= 1
Copy link
Collaborator

@rafaqz rafaqz Mar 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

out_dims = ntuple(Val{N}()) do i
    i in _dims ? 1 : size(a)[i]
end 

Something like this should be the same but type stable for splats. N needs a where N on the array ndims type parameter.

(actually the one below is more important as its in the loop, but theyre kind of the same)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added a commit with this code. the out array is still type unstable. not sure how to fix that given it's eltype is calculated.

@bjarthur
Copy link
Member Author

also need to search for something other than "DiskGenerator" in the output of @trace as that is not present on main either.

src/mapreduce.jl Outdated
out_dims = ntuple(Val(N)) do i
i in _dims ? 1 : size(a)[i]
end
out = fill($init(Base.return_types(f, (T,))[1]), out_dims...)
Copy link
Collaborator

@rafaqz rafaqz Mar 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
out = fill($init(Base.return_types(f, (T,))[1]), out_dims...)
T1 = Base.promote_op(f, T)
out = fill($init(T1), out_dims...)

I think this is more likely to be stable? (untested)

(also seems you have 8 space indenting turned on somewhere instead of 4)

@meggart
Copy link
Collaborator

meggart commented Mar 25, 2025

Why exactly do we need these special implementations? Has something in the Julia mapreduce internals changed so that all of these don't hit the mapreducedim! implementation anymore. See also this PR for a similar discussion.

@bjarthur
Copy link
Member Author

Why exactly do we need these special implementations? Has something in the Julia mapreduce internals changed so that all of these don't hit the mapreducedim! implementation anymore. See also this PR for a similar discussion.

which PR? and no, nothing falls back to mapreduce now as the internals have changed. #246

@rafaqz
Copy link
Collaborator

rafaqz commented Mar 25, 2025

Right yes we have this code already:

https://github.com/JuliaIO/DiskArrays.jl/blob/main/src/mapreduce.jl#L13-L26

Probably we need to hook this up again? It could also do with some comments and cleanup so its clear what its all for

@bjarthur
Copy link
Member Author

oops, i take that back. maximum is indeed falling back to https://github.com/JuliaIO/DiskArrays.jl/blob/main/src/mapreduce.jl#L13, it's count that isn't falling back to https://github.com/JuliaIO/DiskArrays.jl/blob/main/src/mapreduce.jl#L28. quick benchmarks show that this PR is no faster. shoot.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants