-
Notifications
You must be signed in to change notification settings - Fork 924
Add Zero-Copy Conversion between Vec and MutableBuffer #3920
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
I plan to run the benchmarks to confirm no performance regression |
@@ -810,7 +800,8 @@ mod tests { | |||
b.into_mutable().unwrap(); | |||
|
|||
let b = Buffer::from_vec(vec![1_u32, 3, 5]); | |||
let b = b.into_mutable().unwrap_err(); // Invalid layout | |||
let b = b.into_mutable().unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is now possible 🎉
/// Create a [`MutableBuffer`] from the provided [`Vec`] without copying | ||
#[inline] | ||
pub fn from_vec<T: ArrowNativeType>(vec: Vec<T>) -> Self { | ||
// Safety |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the logic from Buffer::from_vec
pub fn with_capacity(capacity: usize) -> Self { | ||
let capacity = bit_util::round_upto_multiple_of_64(capacity); | ||
let ptr = alloc::allocate_aligned(capacity); | ||
let layout = Layout::from_size_align(capacity, ALIGNMENT).unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The previous allocation methods used from_size_align_unchecked
, this was technically incorrect as it didn't check capacity doesn't overflow isize
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we still need round_upto_multiple_of_64
? from_size_align
seems also rounding capacity
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from_size_align seems also rounding capacity
It doesn't round, it just verifies that the allocation isn't large enough to overflow isize and cause UB
// SAFETY: ALIGNMENT is a non-zero usize which is then casted | ||
// to a *mut T. Therefore, `ptr` is not null and the conditions for | ||
// calling new_unchecked() are respected. | ||
unsafe { NonNull::new_unchecked(ALIGNMENT as *mut u8) } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is moved from the alloc module
I think @viirya should take a close look at this and make sure it is compatible with their usecase. cc @avantgardnerio as I think you had a similar usecase with modifying data a while ago |
I'll review this today or tomorrow. |
Thank you ❤️ To be clear this is a strictly additive change, it allows |
Deallocation::Standard(layout) if layout.align() == ALIGNMENT => { | ||
layout.size() | ||
} | ||
let layout = match bytes.deallocation() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The alignment of bytes
doesn't matter now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct, we can handle any alignment now
Which issue does this PR close?
Closes #3516
Rationale for this change
#3756 only added zero-copy conversion between
Vec
andBuffer
in order to keep the scope of the change down. Unfortunately as discovered in #3917 this causes issues for the APIs that allow converting an Array back into a builder, as the builders use MutableBuffer and therefore are restricted by the alignment expectations ofMutableBuffer
.What changes are included in this PR?
Adds zero-copy conversion to
MutableBuffer
by storing theLayout
inline.Are there any user-facing changes?
This changes what can and can't be converted to a MutableBuffer, and removes some low-level deprecated APIs. Downstream impact is likely to be extremely marginal.