Many functions (e.g. mean and variance) could be made parallelizable, faster, shorter, and more general by accepting Transducers.jl as a dependency, which would also substantially simplify the implementation of some features. I often find myself reaching for it, but end up using clumsier iterator or broadcasting approaches instead.
Luckily, Transducers.jl is now being maintained by Mason Protter and the rest of the people working on the JuliaFolds ecosystem.
The package and its dependencies have been pared down substantially over time and should not be a major contributor to StatsBase.jl's loading time. Transducers is now lightweight, with only about 80ms load time for all dependencies (including indirect dependencies) on v1.10.
julia> @time_imports using Transducers
0.2 ms Adapt
6.1 ms MacroTools
0.5 ms StaticArraysCore
0.3 ms ConstructionBase
6.4 ms Setfield
0.3 ms ArgCheck
0.1 ms Compat
0.1 ms Compat → CompatLinearAlgebraExt
6.4 ms InitialValues
┌ 0.0 ms Requires.__init__()
32.5 ms Requires 98.74% compilation time
┌ 0.0 ms BangBang.__init__()
4.7 ms BangBang
9.5 ms Baselet
0.2 ms CompositionsBase
0.2 ms DefineSingletons
2.9 ms MicroCollections
30.2 ms Test
4.5 ms SplittablesBase
14.2 ms Transducers
The primary advantages would be simpler implementations of many features, in-place algorithms that can be substantially faster and more memory-efficient, and a more generic interface than the iterator interface (transducers can operate on collections that are not themselves iterators).
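To make this concrete, here is a minimal sketch of a mean written as a single fold. The names `fold_mean` and `add_pairs` are hypothetical and not part of StatsBase.jl or Transducers.jl; the only Transducers.jl calls used are `Map`, `foldxl`, and `foldxt`. Because the reducing function is associative and `init` is its identity, the same definition should work for both the sequential and the threaded fold.

```julia
using Transducers

# Hypothetical helper: combine two (sum, count) pairs.
# This is associative, so it is safe to use with the threaded fold.
add_pairs(a, b) = (a[1] + b[1], a[2] + b[2])

# Hypothetical helper: mean of f(x) over xs as one pass, with no temporary array.
function fold_mean(f, xs; parallel = false)
    xf = Map(x -> (float(f(x)), 1))      # each element becomes (value, 1)
    fold = parallel ? foldxt : foldxl    # threaded or sequential fold
    total, count = fold(add_pairs, xf, xs; init = (0.0, 0))
    return total / count
end

fold_mean(abs2, 1:4)                     # mean of squares
fold_mean(abs2, 1:4; parallel = true)    # same computation, threaded
```

This is not a proposal for how StatsBase.jl should implement `mean`; it only illustrates the shape such code could take.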
BTW, @devmotion, the reason I'm interested in Transducers.jl is that I'm working on a PR that fixes all of the loops and uses of @inbounds in StatsBase.jl; Transducers.jl can replace most of these loops with constructions that are faster and less bug-prone. I think finally killing off @inbounds with no performance penalty (and in most cases a speedup) would be worth it.
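As a rough illustration of the kind of rewrite meant here, the sketch below compares an indexed `@inbounds` loop with a fold over `zip`. Both functions are hypothetical examples and are not taken from StatsBase.jl or from the PR; no performance claim is made for this particular pair.

```julia
using Transducers

# Illustrative indexed loop: with @inbounds, a `ws` shorter than `xs`
# leads to an out-of-bounds read instead of an error.
function wsum_loop(xs, ws)
    s = 0.0
    @inbounds for i in eachindex(xs)
        s += xs[i] * ws[i]
    end
    return s
end

# Fold-based version: no manual indexing, so no @inbounds is needed.
wsum_fold(xs, ws) = foldxl(+, Map(((x, w),) -> x * w), zip(xs, ws); init = 0.0)

wsum_fold([1.0, 2.0, 3.0], [0.5, 0.5, 1.0])
```

Note that `zip` stops at the shorter input, so an explicit length check is still needed if mismatched lengths should raise an error.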
Any remaining @inbounds issues could be fixed without switching to Transducers, so for me that's not a compelling argument for adopting such a large dependency (and I guess it's completely impossible for code that will be moved to the Statistics stdlib?). Even if StatsBase would use Transducers at some point, I think it would be good to keep bugfixes separate from a transition to/adoption of Transducers.