Fix UB in RawStorageMut::swap_unchecked_linear
#1317
LGTM!
src/base/storage.rs
Outdated
// we can't just use the pointers returned from `get_address_unchecked_linear_mut` because calling a
// method taking self mutably invalidates any existing (mutable) pointers. since `get_address_unchecked_linear_mut` can
// also be overridden by a custom implementation, we can't just use `wrapping_add` assuming that's what the method does.
// instead, we use `offset_from` to re-calculate the pointers from the base pointer.
// this is safe as long as this trait is implemented safely
// (and it's the caller's responsibility to ensure the indices are in-bounds).
let base = self.ptr_mut();
let offset1 = self.get_address_unchecked_linear_mut(i1).offset_from(base);
let offset2 = self.get_address_unchecked_linear_mut(i2).offset_from(base);

let base = self.ptr_mut();
let a = base.offset(offset1);
let b = base.offset(offset2);
Technically, I don't think this can be guaranteed to work for any possible custom get_address_unchecked_linear_mut implementation, either. Imagine a storage that rearranges its elements in memory every time ptr_mut is invoked -- the offsets computed in offset1 and offset2 are no longer valid by the time base is computed the second time, and a and b are now pointing to the wrong elements. This would be a ridiculous way to implement your storage, but it means that this is still not a sound way to implement this method by default for any arbitrary storage; it's just one that MIRI doesn't know to complain about.
But I think this is still fine; my reasoning is, if the previous implementation didn't contain valid Rust semantics before, then we can't really have a "breaking change" on the internals by making it work for some-but-not-all cases. It's a "breaking change" at the API level that we might now have to document the safety preconditions for this to be a valid implementation of the method for a custom user-defined type -- but that should have been there before as well (it turns out the correct set of preconditions was previously ⊥.)
Now since the set of implementations for which this code will generate machine code that happens to work for an arbitrary use-case is not a subset of those which previously happened to generate working machine code, we may still wish to include this in a version bump, rather than a bugfix. But I think a minor version bump would probably be fine? Just realistically I suspect you'd have to work really hard to produce an implementation of RawStorageMut that worked with the previous default implementation but not the current one...
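To make the "rearranging storage" scenario concrete, here is a toy sketch; it is not nalgebra code. ShufflingStorage, its rot field, and the logical helper are hypothetical names invented for illustration; only the method names mirror the trait's. The swap body has the same shape as the hunk under review, and it ends up exchanging the wrong logical elements because the second ptr_mut call makes the offsets stale:

```rust
/// Hypothetical storage that physically rotates its buffer on every `ptr_mut` call.
/// Logical element `i` lives at physical index `(i + rot) % len`.
struct ShufflingStorage {
    data: Vec<i32>,
    rot: usize,
}

impl ShufflingStorage {
    /// `logical` must be non-empty.
    fn new(logical: Vec<i32>) -> Self {
        assert!(!logical.is_empty());
        Self { data: logical, rot: 0 }
    }

    /// Rearranges the buffer, then hands out its base pointer.
    fn ptr_mut(&mut self) -> *mut i32 {
        self.data.rotate_right(1);
        self.rot = (self.rot + 1) % self.data.len();
        self.data.as_mut_ptr()
    }

    /// Custom override that does not go through `ptr_mut`.
    /// Safety: `i` must be in bounds.
    unsafe fn get_address_unchecked_linear_mut(&mut self, i: usize) -> *mut i32 {
        let physical = (i + self.rot) % self.data.len();
        unsafe { self.data.as_mut_ptr().add(physical) }
    }

    /// Same shape as the default implementation in the hunk above.
    /// Safety: `i1` and `i2` must be in bounds.
    unsafe fn swap_unchecked_linear(&mut self, i1: usize, i2: usize) {
        unsafe {
            let base = self.ptr_mut();
            let offset1 = self.get_address_unchecked_linear_mut(i1).offset_from(base);
            let offset2 = self.get_address_unchecked_linear_mut(i2).offset_from(base);

            // This call rotates the buffer again, so both offsets are now stale.
            let base = self.ptr_mut();
            let a = base.offset(offset1);
            let b = base.offset(offset2);
            std::ptr::swap(a, b);
        }
    }

    /// Reads the elements back in logical order.
    fn logical(&self) -> Vec<i32> {
        let n = self.data.len();
        (0..n).map(|i| self.data[(i + self.rot) % n]).collect()
    }
}

/// Asks for logical elements 0 and 1 to be swapped and returns the logical view.
fn swap_first_two_on_shuffling_storage() -> Vec<i32> {
    let mut s = ShufflingStorage::new(vec![10, 20, 30, 40]);
    unsafe { s.swap_unchecked_linear(0, 1) };
    s.logical()
}
```

Starting from [10, 20, 30, 40], swap_first_two_on_shuffling_storage() yields [40, 20, 30, 10] rather than [20, 10, 30, 40]: every access stays in bounds, but logical elements 0 and 3 were exchanged instead of 0 and 1.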
Yeah, I completely agree with your reasoning.
The only way this would break is if someone really went out of their way to implement a cursed custom storage, and I don't think many people are implementing custom storages at all.
Committed your suggestion below.
Thanks for the review and sorry for the delay in getting to it - must've missed the GH notification 😅.
Co-authored-by: tpdickso <[email protected]>
let base = self.ptr_mut();
let offset1 = self.get_address_unchecked_linear_mut(i1).offset_from(base);
let offset2 = self.get_address_unchecked_linear_mut(i2).offset_from(base);
Come to think of it -- is this technically violating stacked borrows, as well? ptr_mut and get_address_unchecked_linear_mut both accept &mut self and so should also invalidate the previous references, right? I suspect the "correct" way to do this would be to have immutable versions of the trait methods that can be used to compute this validly.
But barring that, we'd need to make sure the borrows don't overlap. Since we need both the base pointer and the element pointer to be alive at the same time, that's a nonstarter. So perhaps integer arithmetic is the escape hatch:
- let base = self.ptr_mut();
- let offset1 = self.get_address_unchecked_linear_mut(i1).offset_from(base);
- let offset2 = self.get_address_unchecked_linear_mut(i2).offset_from(base);
+ let base = self.ptr_mut() as isize;
+ let offset1 = self.get_address_unchecked_linear_mut(i1) as isize - base;
+ let offset2 = self.get_address_unchecked_linear_mut(i2) as isize - base;
I'm not sure if this is equally-invalid though. It might just be tricking MIRI into not realizing we're doing something we're not meant to be doing in the first place. I'm not familiar enough with MIRI to know, myself. Does the current commit satisfy MIRI's checker?
But maybe I'm missing the point, and doing all this with mutable pointers and mutable borrows is fine as long as they're never dereferenced in an invalid state.
You said it yourself - calling methods with a &mut self receiver invalidates any previous references. Pointers are not references, and are in themselves an escape hatch from the stacked borrows model, since pointers don't constitute a borrow. 😊 My code causes the swap tests to pass under MIRI so I'm pretty confident they're okay.
OTOH, ptr->int casts are almost always a bad idea, since that causes the loss of provenance. (it might not matter here - but I'm not much of an expert, I mostly rely on MIRI to tell me if I'm misbehaving)
The original code also operated on pointers, so I suspect MIRI is smart enough to extend its analysis to pointers if it was nevertheless managing to raise an error there. But the original error message says "read access" which leads me to believe that probably the issue was that the pointers were dereferenced after being invalidated, rather than just that they were used, so it's reasonable that MIRI's okay with this code.
Right you are, my last comment is just completely wrong, was not thinking straight. 😅 Pointers are an escape hatch from the borrow checker, but their usage should still uphold (stacked) borrowing rules.
The reason my PR appeases MIRI is because we get a "fresh" base from self.ptr_mut() just below this comment, and we don't use any of the already-invalidated pointers from before that.
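As a rough illustration of the "fresh base" pattern, the sketch below applies it to a toy type. ToyStorage and swapped are hypothetical names, not nalgebra's RawStorageMut, and the final std::ptr::swap call is an assumption about how the recomputed pointers are used. This toy has not been run under Miri; it only shows that the offsets are computed first and that every pointer that gets dereferenced is derived from the second ptr_mut call:

```rust
/// Hypothetical stand-in for a storage type; not nalgebra's `RawStorageMut`.
struct ToyStorage {
    data: Vec<i32>,
}

impl ToyStorage {
    fn ptr_mut(&mut self) -> *mut i32 {
        self.data.as_mut_ptr()
    }

    /// Safety: `i` must be in bounds.
    unsafe fn get_address_unchecked_linear_mut(&mut self, i: usize) -> *mut i32 {
        unsafe { self.ptr_mut().add(i) }
    }

    /// Safety: `i1` and `i2` must be in bounds.
    unsafe fn swap_unchecked_linear(&mut self, i1: usize, i2: usize) {
        unsafe {
            // Step 1: only record element offsets; nothing is dereferenced here.
            let base = self.ptr_mut();
            let offset1 = self.get_address_unchecked_linear_mut(i1).offset_from(base);
            let offset2 = self.get_address_unchecked_linear_mut(i2).offset_from(base);

            // Step 2: take a fresh base pointer and derive both element
            // pointers from it, with no further `&mut self` calls in between.
            let base = self.ptr_mut();
            let a = base.offset(offset1);
            let b = base.offset(offset2);
            std::ptr::swap(a, b);
        }
    }
}

/// Builds a small storage, swaps two in-bounds indices, and returns the data.
fn swapped(i1: usize, i2: usize) -> Vec<i32> {
    let mut s = ToyStorage { data: vec![1, 2, 3, 4] };
    assert!(i1 < s.data.len() && i2 < s.data.len());
    unsafe { s.swap_unchecked_linear(i1, i2) };
    s.data
}
```

For example, swapped(0, 3) returns [4, 2, 3, 1], and swapped(1, 1) leaves the data unchanged.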
FWIW, your version with the ptrtoint casts also appeases MIRI, after a minor correction. It should look like this:
let base = self.ptr_mut() as isize;
let offset1 = (self.get_address_unchecked_linear_mut(i1) as isize - base) / (size_of::<T>() as isize);
let offset2 = (self.get_address_unchecked_linear_mut(i2) as isize - base) / (size_of::<T>() as isize);
(notice the div by size_of::<T>() that you were missing)
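The reason for the division is that offset_from counts in units of T, while subtracting addresses counts in bytes. A small standalone sketch follows; the function name and the u32 array are made up for illustration:

```rust
use std::mem::size_of;

/// Returns (element offset via `offset_from`, byte difference via integer
/// casts, byte difference divided by the element size).
fn offsets_of_third_element() -> (isize, isize, isize) {
    let mut data = [10u32, 20, 30, 40];
    let base: *mut u32 = data.as_mut_ptr();
    // In-bounds pointer to index 3, derived from the same allocation.
    let elem: *mut u32 = unsafe { base.add(3) };

    // `offset_from` reports the distance in units of `u32`.
    let by_offset_from = unsafe { elem.offset_from(base) };
    // Subtracting addresses reports the distance in bytes.
    let in_bytes = elem as isize - base as isize;
    // Dividing by the element size converts bytes back to elements.
    let in_elements = in_bytes / size_of::<u32>() as isize;

    (by_offset_from, in_bytes, in_elements)
}
```

Here offsets_of_third_element() returns (3, 12, 3): feeding the raw byte difference of 12 into base.offset(...) would land well past the intended element.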
Right; or rather, we do "use" an invalidated pointer here:
self.get_address_unchecked_linear_mut(i1).offset_from(base);
This uses base after invalidating it with a call to get_address_unchecked_linear_mut, but it doesn't dereference it, which I suspect is why MIRI is fine with it. It just uses it as basically an integer.
Co-authored-by: tpdickso <[email protected]>
According to Miri, RawStorageMut::swap_unchecked_linear introduces a stacked borrows violation:

My fix is possibly over-conservative, but I couldn't think of anything else that wouldn't be a breaking change. Granted, for a user to be affected by the change would require that they implemented the unsafe RawStorageMut trait on their own type, and overrode the default get_address_unchecked_linear_mut in a weird way. Still, I think it's better to err on the cautious side.

(BTW, a lot of the tests in edition.rs are currently crashing under miri due to UB, I'm investigating how to fix the rest of the issues but they seem less straight-forward than this one)