[python-package] Allow to pass Arrow array as groups #6166
Merged
72 commits
ab2d5e2 Add Arrow support to Python API (borchero)
570ca64 Merge branch 'master' into arrow-support (borchero)
c21fab4 Fix lint (borchero)
2cd4302 Fix isort (borchero)
71957f6 [python-package] Allow to pass Arrow table as training data (borchero)
175fb13 Merge branch 'master' into arrow-support-training-data (borchero)
32dfb11 Remove change (borchero)
b5f0676 Implement JL comments (borchero)
cca3b37 Fix isort (borchero)
001139a Remove testcase (borchero)
5861ca6 Adjust pyarrow version (borchero)
54d171c Revert gitignore (borchero)
a87a15b Fix lint (borchero)
8cda7cd Merge branch 'master' into arrow-support-training-data (borchero)
6b4245a Increase timeout for bdist_wheel build (borchero)
14a9326 Fix layout (borchero)
854f306 Add newline (borchero)
269582c Fix typo (borchero)
9164040 Merge branch 'master' into arrow-support-training-data (borchero)
9a0a18d Merge branch 'master' into arrow-support-training-data (borchero)
e5540cd Remove arrow.py (borchero)
98997bf Merge branch 'master' into arrow-support-training-data (jameslamb)
4a66cba Merge branch 'master' into arrow-support-training-data (borchero)
f44421e Fix cpp tests (borchero)
80b0aa3 Fix tests (borchero)
1869cfb Fix omp parallel (borchero)
ba62bcc Add missing <cmath> header (borchero)
db449e1 Fix cpplint (borchero)
3dab653 Disable arrow tests (borchero)
840cba9 Try fixing memory issue in tests (borchero)
19b210b Try chunking in test (borchero)
059419d Fix lint (borchero)
36e7bf4 Merge branch 'master' into arrow-support-training-data (borchero)
143a247 Implement review comments (borchero)
bb97817 Merge branch 'master' into arrow-support-training-data (jameslamb)
62431f2 Uninstall optional dependencies correctly (borchero)
34ee108 [python-package] Allow to pass Arrow array as labels (borchero)
90a2c1f Fix lint (borchero)
6b65bcf Fix lint (borchero)
ec33f75 WIP: [python-package] Allow to pass Arrow array as weights (borchero)
20a23b8 Fix lint (borchero)
ccdb0ba Push (borchero)
7dbce53 Remove test (borchero)
ce69120 Merge branch 'arrow-support-weights' into arrow-support-groups (borchero)
e1593c2 Groups (borchero)
0af7a7c [python-package] Allow to pass Arrow table as training data (borchero)
45a67a6 Merge branch 'arrow-support-training-data' into arrow-support-labels (borchero)
80c12c0 Merge branch 'arrow-support-labels' into arrow-support-weights (borchero)
221cba4 Merge branch 'arrow-support-weights' into arrow-support-groups (borchero)
15c8637 Fix isort (borchero)
06bdce2 Merge branch 'master' into arrow-support-labels (borchero)
75a980e Merge branch 'arrow-support-labels' into arrow-support-weights (borchero)
3d3ffb1 Merge branch 'arrow-support-weights' into arrow-support-groups (borchero)
f7c67e7 Implement guolinke's review (borchero)
91fade9 Merge branch 'master' into arrow-support-labels (jameslamb)
09ad33b Merge branch 'arrow-support-labels' into arrow-support-weights (borchero)
33f3e44 Merge branch 'master' into arrow-support-labels (borchero)
cd556da Merge branch 'arrow-support-labels' into arrow-support-weights (borchero)
678ae7d Use np_assert_array_equal (borchero)
5331202 Implement jameslamb's review comments (borchero)
74910d4 Merge branch 'master' into arrow-support-weights (jameslamb)
5041282 Merge branch 'master' into arrow-support-weights (jameslamb)
04f0f21 Merge branch 'arrow-support-weights' into arrow-support-groups (borchero)
5e2baa1 Fix (borchero)
ff5c9f8 Merge branch 'master' into arrow-support-groups (borchero)
0f56ea0 Fix and implement review comments (borchero)
797cc3a Fix (borchero)
8714625 Fix test (borchero)
acd916e Fix (borchero)
c00b841 Merge branch 'master' into arrow-support-groups (borchero)
9b07160 Add tests for empty chunks (borchero)
79d050b Fix lint (borchero)
Could you please add test cases covering the case where a chunked array with some empty chunks is given?
At least these two:
And then could you:
Dataset components you've already added Arrow support for

Sorry I didn't think of this earlier. It does seem to me that it's possible to have empty chunks in a chunked array, e.g.
And I can imagine situations where such arrays could end up being passed into LightGBM. I'm thinking use cases similar to how Dask Arrays are commonly created by concatenating the results of multiple separate function calls, some of which may be empty, e.g.:
snowflake-connector-python
)

Ha! I didn't know that this was possible, will add tests!