Avoid calling vcat(dfs...) in combine() #1261

nalimilan · 2017-10-15T17:27:31Z

Splatting the data frames into vcat(dfs...) makes Julia compile a
specialized method for each number of arguments, and that number can be
very large when using groupby(). The code does not take any advantage of
that information anyway. Before the refactoring, the code avoided the
splat by exporting vcat(::Vector{AbstractDataFrame}). Use an internal
method instead, since that signature is non-standard.

Incidentally, this fixes a small bug introduced in the refactoring: when
working with an empty data frame, vcat() was called with no arguments,
and it returns an empty Any[] vector, which surprisingly resulted in a
bogus x1 column being added to the result. The test was actually correct
before the refactoring.
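
As a rough illustration of both points, here is a minimal sketch using plain vectors rather than data frames. The helper name concat_chunks is hypothetical and is not the internal method added by this PR; it only shows the general idea of taking the whole collection as a single argument instead of splatting it.

```julia
# Hypothetical helper (not the DataFrames implementation): concatenate a
# vector of vectors without splatting it into a varargs call.
function concat_chunks(chunks::AbstractVector{<:AbstractVector{T}}) where {T}
    out = Vector{T}()
    for chunk in chunks
        append!(out, chunk)
    end
    return out
end

chunks = [collect(i:i+1) for i in 1:2:5]   # [[1, 2], [3, 4], [5, 6]]

# Splatting passes each chunk as a separate argument, so a call with a new
# number of chunks may need a new method specialization.
splatted = vcat(chunks...)

# The helper receives one argument, whatever the number of chunks.
looped = concat_chunks(chunks)
@assert splatted == looped == [1, 2, 3, 4, 5, 6]

# Splatting an empty collection ends up calling vcat() with no arguments,
# which has no element type to work with and returns an empty Vector{Any}.
no_chunks = Vector{Vector{Int}}()
untyped = vcat(no_chunks...)
typed = concat_chunks(no_chunks)
@assert isempty(untyped) && eltype(untyped) == Any
@assert isempty(typed) && eltype(typed) == Int
```

The empty case is where the two approaches differ: the splatted call loses all type information, while a method that receives the collection itself can still return a correctly typed empty result.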

Refactoring was done at JuliaData/DataTables.jl#45. See also report at https://discourse.julialang.org/t/stack-overflow-in-dataframes-group-by/6357.

Cc: @ExpandingMan

coveralls · 2017-10-15T17:58:05Z

Coverage increased (+0.02%) to 72.568% when pulling c939369 on nl/vcat into 8cf2be9 on master.

cjprybol

I had no idea vcat(args...) recompiled each time it received a new number of arguments. Thanks for the explanation (here and on Discourse) and the fix!

nalimilan requested a review from cjprybol October 15, 2017 17:27

cjprybol approved these changes Oct 16, 2017

cjprybol merged commit 86c6145 into master Oct 17, 2017

cjprybol deleted the nl/vcat branch October 17, 2017 19:18
