bench: add more {boolean, string, int} benchmarks for concat kernel #7376

Merged
merged 2 commits into apache:main from add-concat-boolean-benchmark
Apr 2, 2025

Conversation

rluvaton
Contributor

@rluvaton rluvaton commented Apr 2, 2025

Which issue does this PR close?

N/A

Rationale for this change

Requested by @alamb in:

What changes are included in this PR?

  1. Remove a duplicate concat benchmark
  2. Extract the benchmark for concatenating many arrays that was added in "Improve concat performance, and add append_array for some array builder implementations" (#7309)
  3. Add concat benchmarks for boolean arrays
  4. Add a case to the string benchmark that concatenates many arrays
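
To illustrate what a "concatenate many arrays" benchmark case exercises, here is a minimal sketch using only the Rust standard library. It is not the benchmark code from this PR and does not use the arrow or criterion crates: `NullableBools`, `make_bool_array`, `concat_all`, and `time_concat` are hypothetical names, a `Vec<Option<bool>>` stands in for a nullable boolean array, and a plain `Instant` timer stands in for a real benchmark harness.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Stand-in for a nullable boolean array: `None` marks a null slot.
/// (Assumption: a real Arrow array stores values and validity as bitmaps.)
type NullableBools = Vec<Option<bool>>;

/// Builds a deterministic pseudo-random array in which roughly
/// `null_density` of the slots are null.
fn make_bool_array(len: usize, null_density: f64, seed: u64) -> NullableBools {
    // xorshift64 generator; `| 1` keeps the state non-zero.
    let mut state = seed | 1;
    let mut next = move || {
        state ^= state << 13;
        state ^= state >> 7;
        state ^= state << 17;
        state
    };
    (0..len)
        .map(|_| {
            let r = next();
            // Map the top 53 bits to a float in [0, 1).
            let unit = (r >> 11) as f64 / (1u64 << 53) as f64;
            if unit < null_density {
                None
            } else {
                Some(r & 1 == 1)
            }
        })
        .collect()
}

/// Concatenates all inputs into one array, allocating the output once.
fn concat_all(arrays: &[NullableBools]) -> NullableBools {
    let total: usize = arrays.iter().map(Vec::len).sum();
    let mut out = Vec::with_capacity(total);
    for array in arrays {
        out.extend_from_slice(array);
    }
    out
}

/// Returns the mean wall-clock time of one `concat_all` call.
/// `black_box` keeps the optimizer from discarding the work.
fn time_concat(arrays: &[NullableBools], iters: u32) -> Duration {
    let start = Instant::now();
    for _ in 0..iters {
        black_box(concat_all(black_box(arrays)));
    }
    start.elapsed() / iters.max(1)
}
```

A case like this could be driven with, say, 100 inputs of 64 elements each at a null density of 0.5. Many small inputs stress per-array overhead, while a two-array case mostly measures raw copy speed.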

Are there any user-facing changes?

Nope

@github-actions github-actions bot added the arrow Changes to the arrow crate label Apr 2, 2025
Comment on lines -82 to -87
let v1 = create_string_array::<i32>(1024, 0.5);
let v2 = create_string_array::<i32>(1024, 0.5);
c.bench_function("concat str nulls 1024", |b| {
    b.iter(|| bench_concat(&v1, &v2))
});

Contributor Author

This was a duplicate benchmark

rluvaton added a commit to rluvaton/arrow-rs that referenced this pull request Apr 2, 2025
@rluvaton rluvaton changed the title bench: add benchmarks for concat boolean and update string bench bench: add more {boolean, string, int} benchmarks for concat kernel Apr 2, 2025
Contributor

@alamb alamb left a comment

Looks good to me -- thank you @rluvaton

@alamb alamb merged commit a35bdf0 into apache:main Apr 2, 2025
25 checks passed
@rluvaton rluvaton deleted the add-concat-boolean-benchmark branch April 2, 2025 14:42
alamb pushed a commit that referenced this pull request Apr 6, 2025
…uilder implementations (#7309)

* feat: add `append_buffer` for `NullBufferBuilder`

* feat: add `append_array` for `PrimitiveBuilder`

* feat: add `append_array` for `BooleanBuilder`

* test: add test that the underlying null values are added as is

* wip

* format and lint

* add special implementation for concat primitives and booleans improving perf by 50%

* add more tests for generic bytes builder

* add special implementation for bytes in concat

* manually concat primitives

* add large array impl

* wip

* remove unsafe API and use primitive builder in concat

* lint and format

* fix concat primitives to use the input array data type

* format

* add back the capacity for binary because dictionary call concat_fallback

* add tests and update comment

* extract benchmark changes to different PR #7376
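
The commit list above describes adding `append_array` to builders and using a primitive builder inside concat. The sketch below models that idea with the standard library only. It is a simplified assumption, not the arrow-rs implementation: `PrimitiveBuilderSketch`, `ArraySketch`, and `concat_sketch` are hypothetical names, and validity is kept as a `Vec<bool>` rather than a packed null buffer.

```rust
/// Hypothetical, simplified model of a primitive builder:
/// one values buffer plus one validity flag per slot.
#[derive(Debug, Default)]
struct PrimitiveBuilderSketch {
    values: Vec<i32>,
    validity: Vec<bool>,
}

/// Borrowed view of an input array. `validity: None` means "no nulls".
struct ArraySketch<'a> {
    values: &'a [i32],
    validity: Option<&'a [bool]>,
}

impl PrimitiveBuilderSketch {
    /// Reserves room for `capacity` slots up front.
    fn with_capacity(capacity: usize) -> Self {
        Self {
            values: Vec::with_capacity(capacity),
            validity: Vec::with_capacity(capacity),
        }
    }

    /// Appends a whole array with bulk copies instead of a per-element loop.
    /// Values sitting under null slots are copied unchanged.
    fn append_array(&mut self, array: &ArraySketch<'_>) {
        self.values.extend_from_slice(array.values);
        match array.validity {
            Some(bits) => {
                assert_eq!(bits.len(), array.values.len());
                self.validity.extend_from_slice(bits);
            }
            // No validity buffer: every appended slot is valid.
            None => {
                let new_len = self.validity.len() + array.values.len();
                self.validity.resize(new_len, true);
            }
        }
    }
}

/// Concatenates the inputs by sizing the builder once, then appending each array.
fn concat_sketch(arrays: &[ArraySketch<'_>]) -> PrimitiveBuilderSketch {
    let total: usize = arrays.iter().map(|a| a.values.len()).sum();
    let mut builder = PrimitiveBuilderSketch::with_capacity(total);
    for array in arrays {
        builder.append_array(array);
    }
    builder
}
```

Sizing the builder from the summed input lengths avoids reallocation while appending, which is one plausible reason a builder-based concat can be fast. The actual speedup reported in the commit message comes from the real implementation, not from this model.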