perf: Improve Bitmap construction performance #15570
Merged
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,9 +1,10 @@ | ||
+ use std::borrow::Borrow; | ||
use std::ops::{BitAnd, BitOr}; | ||
|
||
use polars_error::{polars_ensure, PolarsResult}; | ||
|
||
use crate::array::Array; | ||
- use crate::bitmap::{and_not, ternary, Bitmap}; | ||
+ use crate::bitmap::{and_not, push_bitchunk, ternary, Bitmap}; | ||
|
||
pub fn combine_validities_and3( | ||
opt1: Option<&Bitmap>, | ||
|
@@ -49,6 +50,63 @@ pub fn combine_validities_and_not( | |
} | ||
} | ||
|
||
+ pub fn combine_validities_and_many<B: Borrow<Bitmap>>(bitmaps: &[Option<B>]) -> Option<Bitmap> { | ||
+ let mut bitmaps = bitmaps | ||
+ .iter() | ||
+ .flatten() | ||
+ .map(|b| b.borrow()) | ||
+ .collect::<Vec<_>>(); | ||
+ | ||
+ match bitmaps.len() { | ||
+ 0 => None, | ||
+ 1 => bitmaps.pop().cloned(), | ||
+ 2 => combine_validities_and(bitmaps.pop(), bitmaps.pop()), | ||
+ 3 => combine_validities_and3(bitmaps.pop(), bitmaps.pop(), bitmaps.pop()), | ||
+ _ => { | ||
+ let mut iterators = bitmaps | ||
Review comment: Single pass/allocation to combine bitmaps. |
+ .iter() | ||
+ .map(|v| v.fast_iter_u64()) | ||
+ .collect::<Vec<_>>(); | ||
+ let mut buffer = Vec::with_capacity(iterators.first().unwrap().size_hint().0 + 2); | ||
+ | ||
+ 'rows: loop { | ||
+ // All ones so as identity for & operation | ||
+ let mut out = u64::MAX; | ||
+ for iter in iterators.iter_mut() { | ||
+ if let Some(v) = iter.next() { | ||
+ out &= v | ||
+ } else { | ||
+ break 'rows; | ||
+ } | ||
+ } | ||
+ push_bitchunk(&mut buffer, out); | ||
+ } | ||
+ | ||
+ // All ones so as identity for & operation | ||
+ let mut out = [u64::MAX, u64::MAX]; | ||
+ let mut len = 0; | ||
+ for iter in iterators.into_iter() { | ||
+ let (rem, rem_len) = iter.remainder(); | ||
+ len = rem_len; | ||
+ | ||
+ for (out, rem) in out.iter_mut().zip(rem) { | ||
+ *out &= rem; | ||
+ } | ||
+ } | ||
+ push_bitchunk(&mut buffer, out[0]); | ||
+ if len > 64 { | ||
+ push_bitchunk(&mut buffer, out[1]); | ||
+ } | ||
+ let bitmap = Bitmap::from_u8_vec(buffer, bitmaps[0].len()); | ||
+ if bitmap.unset_bits() == bitmap.len() { | ||
+ None | ||
+ } else { | ||
+ Some(bitmap) | ||
+ } | ||
+ }, | ||
+ } | ||
+ } | ||
+ | ||
// Errors iff the two arrays have a different length. | ||
#[inline] | ||
pub fn check_same_len(lhs: &dyn Array, rhs: &dyn Array) -> PolarsResult<()> { | ||
|
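The fallback branch of `combine_validities_and_many` above takes one 64-bit chunk from every bitmap per iteration, ANDs them starting from `u64::MAX`, and pushes the result into a single output buffer. The following is a minimal sketch of that idea on plain `u64` word slices, not the Polars implementation. `and_many_words` is a hypothetical helper, and it assumes every input holds the same number of whole words, so the remainder handling from the diff is left out.

```rust
/// Hypothetical helper: AND any number of equally long word slices
/// in a single pass, allocating the output buffer exactly once.
fn and_many_words(masks: &[&[u64]]) -> Vec<u64> {
    if masks.is_empty() {
        return Vec::new();
    }
    let n = masks[0].len();
    assert!(
        masks.iter().all(|m| m.len() == n),
        "all masks must have the same number of words"
    );

    // One allocation, sized up front.
    let mut out = Vec::with_capacity(n);
    for i in 0..n {
        // All ones is the identity element for `&`.
        let mut word = u64::MAX;
        for m in masks {
            word &= m[i];
        }
        out.push(word);
    }
    out
}
```

Starting each accumulator at `u64::MAX` means the inner loop needs no special case for the first input.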
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,5 +1,5 @@ | ||
use arrow::array::PrimitiveArray; | ||
- use arrow::compute::utils::combine_validities_and; | ||
+ use arrow::compute::utils::combine_validities_and3; | ||
use polars_core::prelude::*; | ||
use polars_core::utils::align_chunks_ternary; | ||
use polars_core::with_match_physical_numeric_polars_type; | ||
|
@@ -11,10 +11,7 @@ fn fma_arr<T: NumericNative>( | |
c: &PrimitiveArray<T>, | ||
) -> PrimitiveArray<T> { | ||
assert_eq!(a.len(), b.len()); | ||
- let validity = combine_validities_and( | ||
- combine_validities_and(a.validity(), b.validity()).as_ref(), | ||
- c.validity(), | ||
- ); | ||
Review comment (on the removed lines): This could trigger two allocations/passes. |
+ let validity = combine_validities_and3(a.validity(), b.validity(), c.validity()); | ||
let a = a.values().as_slice(); | ||
let b = b.values().as_slice(); | ||
let c = c.values().as_slice(); | ||
|
@@ -65,10 +62,7 @@ fn fsm_arr<T: NumericNative>( | |
c: &PrimitiveArray<T>, | ||
) -> PrimitiveArray<T> { | ||
assert_eq!(a.len(), b.len()); | ||
- let validity = combine_validities_and( | ||
- combine_validities_and(a.validity(), b.validity()).as_ref(), | ||
- c.validity(), | ||
- ); | ||
+ let validity = combine_validities_and3(a.validity(), b.validity(), c.validity()); | ||
let a = a.values().as_slice(); | ||
let b = b.values().as_slice(); | ||
let c = c.values().as_slice(); | ||
|
@@ -118,10 +112,7 @@ fn fms_arr<T: NumericNative>( | |
c: &PrimitiveArray<T>, | ||
) -> PrimitiveArray<T> { | ||
assert_eq!(a.len(), b.len()); | ||
- let validity = combine_validities_and( | ||
- combine_validities_and(a.validity(), b.validity()).as_ref(), | ||
- c.validity(), | ||
- ); | ||
+ let validity = combine_validities_and3(a.validity(), b.validity(), c.validity()); | ||
let a = a.values().as_slice(); | ||
let b = b.values().as_slice(); | ||
let c = c.values().as_slice(); | ||
|
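The change in `fma_arr`, `fsm_arr` and `fms_arr` swaps a nested pairwise AND for a single three-way call. A rough illustration of why the nested form can cost two passes and two allocations, using plain `u64` word slices rather than the real `Bitmap` type: `and2`, `and3` and `nested_and` are hypothetical names for this sketch, and the `Option` handling of the actual functions is omitted.

```rust
/// Hypothetical pairwise AND: one pass over the inputs, one allocation.
fn and2(a: &[u64], b: &[u64]) -> Vec<u64> {
    assert_eq!(a.len(), b.len());
    a.iter().zip(b).map(|(x, y)| x & y).collect()
}

/// Nested form: the intermediate `ab` is allocated, written, and read
/// back before the final result is produced (two passes, two buffers).
fn nested_and(a: &[u64], b: &[u64], c: &[u64]) -> Vec<u64> {
    let ab = and2(a, b);
    and2(&ab, c)
}

/// Three-way form: every word of the result is computed in one pass
/// and written into a single output buffer.
fn and3(a: &[u64], b: &[u64], c: &[u64]) -> Vec<u64> {
    assert_eq!(a.len(), b.len());
    assert_eq!(a.len(), c.len());
    a.iter()
        .zip(b)
        .zip(c)
        .map(|((x, y), z)| x & y & z)
        .collect()
}
```

Both forms give the same result; the difference is only in how much memory traffic it takes to get there.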
Remove chains, as they require extra branches on iteration.
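To illustrate the remark about chains: an iterator built with `Iterator::chain` has to track which of its two halves is still active, whereas handling the bulk words and the remainder in separate steps keeps each loop body uniform. The sketch below uses hypothetical functions (`count_ones_chained`, `count_ones_split`) that count set bits over bulk words plus one remainder word. It is not code from this PR, and whether the chained form is measurably slower depends on how the optimizer treats the particular loop.

```rust
/// Chained form: a single iterator covers the bulk words and then the
/// remainder, so it carries state about which half it is in.
fn count_ones_chained(bulk: &[u64], remainder: u64) -> u32 {
    bulk.iter()
        .copied()
        .chain(std::iter::once(remainder))
        .map(|w| w.count_ones())
        .sum::<u32>()
}

/// Split form: a plain loop over the bulk words, then the remainder
/// handled once afterwards.
fn count_ones_split(bulk: &[u64], remainder: u64) -> u32 {
    let mut total = 0u32;
    for w in bulk {
        total += w.count_ones();
    }
    total + remainder.count_ones()
}
```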