Skip to content

RFC/WIP: preserve the type of the source upon collect(Enumerable) #3

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 3 commits into from

Conversation

alyst
Copy link

@alyst alyst commented Oct 11, 2017

No user-code workaround for queryverse/Query.jl#157 came to my mind, so I decided to propose a fix.

The first commit just adds T param to the Enumerable, which allows reducing eltype() redundancy. I think it's a good change overall.

The 2nd commit introduces SimpleSourceEnumerable{T,S}, which is a type of enumerable that have a single source and doesn't transform elements, e.g. "where" and "orderby".
The 3rd commits actually allows SimpleSourceEnumerable to preserve the type of the AbstractArray source upon collect(), which covers the CategoricalArray case without requiring CategoricalArrays dependency.

Actually, it would make sense to try to preserve the source type for the other enumerables if the sink is of the same/similar type as the source(s).
Even if the elements are transformed, but the result is CategoricalValue, it would make sense to collect them into CategoricalArray.
But that would require more general design or package dependency decisions.

alyst added 2 commits October 12, 2017 01:23
remove redundant eltype() methods
abstract of the enumerable that doesn't transform the elements of the
sources but just rearranges/filters/duplicates them
@codecov-io
Copy link

codecov-io commented Oct 12, 2017

Codecov Report

Merging #3 into master will increase coverage by 2.46%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master       #3      +/-   ##
==========================================
+ Coverage   25.36%   27.82%   +2.46%     
==========================================
  Files          14       14              
  Lines         347      327      -20     
==========================================
+ Hits           88       91       +3     
+ Misses        259      236      -23
Impacted Files Coverage Δ
src/source_iterable.jl 100% <ø> (+21.42%) ⬆️
src/enumerable/enumerable_defaultifempty.jl 0% <ø> (ø) ⬆️
src/enumerable/enumerable_groupby.jl 56.25% <ø> (+2.4%) ⬆️
src/enumerable/enumerable_where.jl 92.85% <ø> (+2.85%) ⬆️
src/enumerable/enumerable_selectmany.jl 0% <ø> (ø) ⬆️
src/enumerable/enumerable_join.jl 0% <ø> (ø) ⬆️
src/enumerable/enumerable_groupjoin.jl 0% <ø> (ø) ⬆️
src/enumerable/enumerable_select.jl 100% <ø> (+15%) ⬆️
src/enumerable/enumerable_orderby.jl 0% <ø> (ø) ⬆️
src/sink_array.jl 83.33% <100%> (ø) ⬆️
... and 1 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1eded27...05d1c2a. Read the comment docs.

@alyst alyst changed the title RFC: preserve the type of the source upon collect(Enumerable) RFC/WIP: preserve the type of the source upon collect(Enumerable) Oct 12, 2017
@alyst
Copy link
Author

alyst commented Oct 12, 2017

Ah, ok, this actually doesn't fix the problem with dataframes, because dataframes are handled by IterableTables, so I would need to patch the column type selection in _DataFrames() method.

add init_sink(Enumerable) method that defaults to Vector{eltype},
but keeps the source type for SimpleSourceEnumerable{T,<:AbstractArray}
@davidanthoff
Copy link
Member

I just outlined how the original issue could be addressed in the issue you had opened. I don't think any of the other stuff here would be needed, right?

Enumerable actually was Enumerable{T} originally, but in the end I changed that because I didn't want to embed so much info in the abstract base type.

@davidanthoff
Copy link
Member

Closing this for now, I don't think we want to merge this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants