Skip to content

Split up miri into the librustc_mir and bin parts #268

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 19 commits into from
Aug 2, 2017

Conversation

oli-obk
Copy link
Contributor

@oli-obk oli-obk commented Jul 21, 2017

cc @RalfJung this will break your rebasing

I adjusted all the imports so they appear like they would inside librustc_mir. For now there's a miri crate that only reexports the librustc_mir::interpret contents. This is so that I can write a Machine trait which will allow us to rip out everything non CTFE from the librustc_mir and place it back into miri.

Next step: break all lines longer than 100 chars.

fixes #275

@RalfJung
Copy link
Member

How do you feel about accepting some of my validation stuff even though it is currently dead code because rustc does not yet have the validation commands? Would you be fine with me experimenting in-tree (the design of the validation statements is certainly not yet final)? Also see rust-lang/rust#43403.
Rebasing is getting tiresome.^^

@oli-obk
Copy link
Contributor Author

oli-obk commented Jul 22, 2017

I am absolutely fine with it, but how can it be done? The matches on Statement will fail, or do you mean just the relevant functions?

Alternatively I can offer running your branch in CI so any miri change has to build a stage 1 rebase your miri over the PR and run the tests. That would keep it out of tree, but ensure that it's the PR author's (my) job to create a new branch with your code rebased over the PR.

Depends on your projected timeline on getting your rustc PR merged. If it's a week or two, let's just get your dead code in here, if it's a few months, the other solution might be better( but for convenience it would also take all the code that compiles into miri)

You might even be able to get all your code merged under a cargo feature. Not sure if cfgs can create match arms

@RalfJung
Copy link
Member

but how can it be done? The matches on Statement will fail, or do you mean just the relevant functions?

We could merge only the relevant new bits in memory.rs, or we could also merge the new file validation.rs which contains most of the rest of the new stuff. (The git history involves lots of changes in lvalue.rs, but almost all of that is later moved to validation.rs. I am not sure if you prefer a clean or a true history. ;) Once rust-lang/rust#43403 lands the small patch to hook this all up with the statements could be added. Notice that none of this changes existing behavior; rustc only keeps the new statements around when -Z mir-emit-validate is set.

Depends on your projected timeline on getting your rustc PR merged. If it's a week or two, let's just get your dead code in here, if it's a few months, the other solution might be better( but for convenience it would also take all the code that compiles into miri)

That mostly depends on the compiler team. ;)

@oli-obk
Copy link
Contributor Author

oli-obk commented Jul 24, 2017

I am not sure if you prefer a clean or a true history

true is fine by me.

So I suggest you open a PR with your branch, and we merge that before messing further with the structure of miri.

@oli-obk
Copy link
Contributor Author

oli-obk commented Jul 24, 2017

@eddyb This PR now has a minimal Machine trait + its miri and const eval impls.

More methods will follow.

@@ -76,3 +78,21 @@ pub fn eval_body_as_integer<'a, 'tcx>(
_ => return Err(EvalError::NeedsRfc("evaluating anything other than isize/usize during typeck".to_string())),
})
}

struct Evaluator;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd put Const in its name. Or CompileTime.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. Otherwise this is along the lines you had in mind?

Copy link
Member

@eddyb eddyb Jul 24, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me!

miri/src/lib.rs Outdated
}

//struct Memory<'b, 'a: 'b, 'tcx: 'a>(&'b rustc_miri::interpret::Memory<'a, 'tcx, Evaluator>);
struct MemoryMut<'b, 'a: 'b, 'tcx: 'a>(&'b mut rustc_miri::interpret::Memory<'a, 'tcx, Evaluator>);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the purpose of this newtype?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(And same for the other newtype above.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It allows writing methods for EvalContext. If we wrote an extension trait, we'd have to duplicate every single method on the trait and the impl.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, makes sense. I find it a little funny that this ends up passing around &mut &mut ..., but I guess that's needed to get automatic reborrowing.

For consistency, shouldn't this one be called Memory (just like the other one is called EvalContext)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok... I gave up on the newtype idea. It gets messy fast. We're back to extension traits. While those create some duplication of the function headers, they are so much more convenient to use.

@RalfJung
Copy link
Member

I think I like the overall design, but I am not sure because I cannot see any structure in miri/src/lib.rs. ;) That file is rather... messy. I haven't even found the place yet where it says that this particular Machine impl is supposed to be used.

@RalfJung
Copy link
Member

So the folder structure is such that we have src/librustc_mir and miri/src? That looks a little funny.
It'd be great if there could be some folder that has a Cargo.toml and that contains everything. For example, what about putting miri's Cargo.toml into the root?

@RalfJung
Copy link
Member

RalfJung commented Jul 25, 2017

I was thinking of something like this: RalfJung@c019e50

@oli-obk
Copy link
Contributor Author

oli-obk commented Jul 26, 2017

The librustc_Mir folder structure is mirroring the rustc repository in order to simplify the merger into rustc. I'd rather not change the folder structure, but I can place a Cargo.toml into the root instead of in the miri folder

@RalfJung
Copy link
Member

The librustc_Mir folder structure is mirroring the rustc repository in order to simplify the merger into rustc.

Oh, I see. So you plan to use git-filter or so to then extract just this subfolder? That would still kill most of the history though.

@RalfJung
Copy link
Member

So what about RalfJung@fa9a15f? The idea is also to have the folder structure as much as possible already like how we'd like to have it after the merge -- the only exception being that the miri source is in miri, not src. It could be in src, but that'd be rather confusing with librustc_mir also being there.

@oli-obk
Copy link
Contributor Author

oli-obk commented Jul 26, 2017

Sgtm. I already have this rebased over master locally. I'll apply the move from your commit tomorrow and push it here.

but I am not sure because I cannot see any structure in miri/src/lib.rs. ;) That file is rather... messy. I haven't even found the place yet where it says that this particular Machine impl is supposed to be used.

I have split the file into clearer components. Will also include it in this PR

@RalfJung
Copy link
Member

Sgtm. I already have this rebased over master locally. I'll apply the move from your commit tomorrow and push it here.

Alternatively, push this somewhere and I can do the move again.

@oli-obk oli-obk force-pushed the upstream branch 3 times, most recently from 3e7218b to 93a7aeb Compare July 28, 2017 11:09
@oli-obk
Copy link
Contributor Author

oli-obk commented Jul 28, 2017

Down to 7k lines of code in interpret. Slowly moving towards a manageable size for the rustc merger.

@oli-obk
Copy link
Contributor Author

oli-obk commented Jul 28, 2017

Ok. I think we have reached everything for the merger.

Copy link
Member

@RalfJung RalfJung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than that, this looks good! It's a large patch though, so there's probably things I missed.

miri/lib.rs Outdated
// FIXME: replace loop by some structure that works with stepping
while let Some((instance, ptr, key)) = dtor {
trace!("Running TLS dtor {:?} on {:?}", instance, ptr);
// TODO: Potentially, this has to support all the other possible instances? See eval_fn_call in terminator/mod.rs
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file name reference needs updating, I presume?

miri/lib.rs Outdated
fn load_tls(&mut self, key: TlsKey) -> EvalResult<'tcx, Pointer>;
fn store_tls(&mut self, key: TlsKey, new_data: Pointer) -> EvalResult<'tcx>;
fn fetch_tls_dtor(&mut self, key: Option<TlsKey>) -> EvalResult<'tcx, Option<(ty::Instance<'tcx>, Pointer, TlsKey)>>;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO the TLS stuff should get its own file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it has its own file now

miri/lib.rs Outdated
let size = ecx.type_size(ty)?.expect("box only works with sized types");
let align = ecx.type_align(ty)?;
if size == 0 {
Ok(PrimVal::Bytes(align.into()))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happened with the FIXME to call the lang item instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

got lost, readded


match &path[..] {
// Allocators are magic. They have no MIR, even when the rest of libstd does.
"alloc::heap::::__rust_alloc" => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an exact duplicate of some code further up in call_fn.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

whoops, copy paste


/// The virtual memory system.
pub(crate) memory: Memory<'a, 'tcx>,
pub memory: Memory<'a, 'tcx, M>,

#[allow(dead_code)]
// FIXME(@RalfJung): validation branch
/// Lvalues that were suspended by the validation subsystem, and will be recovered later
pub(crate) suspended: HashMap<DynamicLifetime, Vec<ValidationQuery<'tcx>>>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the plan is to keep the validation stuff for CTFE, or to factor it out later?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd leave it in, I don't think we want to allow UB in CTFE ;)

@@ -266,6 +203,9 @@ impl<'a, 'tcx> EvalContext<'a, 'tcx> {
(Div, F64) => f64_arithmetic!(/, l, r),
(Rem, F64) => f64_arithmetic!(%, l, r),

(Eq, _) => PrimVal::from_bool(l == r),
(Ne, _) => PrimVal::from_bool(l != r),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this needed now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Related to your other comment, ptr_op doesn't handle integer ops anymore.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. ptr_op still pretty much has to support them (well, at least in its type), but I guess for the purpose of CTFE it makes sense to split this.

_left_ty: Ty<'tcx>,
_right: PrimVal,
_right_ty: Ty<'tcx>,
) -> EvalResult<'tcx, Option<(PrimVal, bool)>> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I read the code in operator.rs correctly, this function is always called, not just on pointers (and after all, integers could be pointers). So isn't this failing too much?

I think what I'd expect is a test whether _left or _right is a Ptr value. If both are Bytes, we should return Ok(None) rather than Err.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I decided to change the API that either left or right must be a Ptr.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think my reasoning was that in the future you'll want integers to go through the ptr_op API to properly handle the intptr model

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh... we can't do this. libstd calls e.g. Offset on two Bytes for zsts. There's more ops that would fail.... And now I'm just realizing I misunderstood your original meaning. Fixing properly

// this includes any method that might access the stack
// basically don't call anything other than `load_mir`, `alloc_ptr`, `push_stack_frame`
// The reason for this is, that `push_stack_frame` modifies the stack out of obvious reasons
struct ConstantExtractor<'a, 'b: 'a, 'tcx: 'b, M: Machine<'tcx> + 'a> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I never really understood this comment. Why does the fact that push_stack_frame push a stack frame entail that all methods on this type (?!?) should not access the stack? What exactly goes wrong if they do?

(This doesn't have to be solved by this PR.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well... the issue is with functions like type_size, which access frame().instance.substs. If you keep expecting such a function to do something useful, you're gonna get suprised.

Cargo.toml Outdated
@@ -35,4 +35,4 @@ compiletest_rs = "0.2.6"

[workspace]
members = ["src/librustc_mir"]
exclude = ["xargo"]
exclude = ["xargo", "cargo-miri-test"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The members line is probably redundant; everything below this folder is in the workspace automatically.

@oli-obk oli-obk force-pushed the upstream branch 2 times, most recently from fb80f81 to e4212a6 Compare July 31, 2017 13:51
Copy link
Member

@RalfJung RalfJung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like how miri/lib.rs is getting short and only concerned with plumbing. I think that's how it should be. :)

miri/lib.rs Outdated
dtor @ Some(_) => dtor,
None => self.memory.fetch_tls_dtor(None)?,
};
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This loop is TLS code, should also move to the TLS file.

}
_ => {}
if let Some(handled) = M::ptr_op(self, bin_op, left, left_ty, right, right_ty)? {
return Ok(handled);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name doesn't really fit right now, does it? This is called for all int and/or ptr ops after all.

I went through the same trouble when I extended miri to handle more ptr ops while trying to keep the code clean. Separating ptr from int really doesn't make too much sense, except for CTFE.

Maybe a better alternative would be to call it only if either operand is a pointer or the operation is Offset (this being the only primitive operation that takes arguments of pointer type)? The function should then probably not return an Option, but for intptrcast it would need to have a chance to mutate the values (i.e., turning them into integers). Not sure though. (Note that, as far as I can tell, the current API would require code duplication for intptrcast -- the code below handling integer operations could not be reused.)

Alternatively, the comment documenting ptr_op in trait Machine should say that this is always called, not just when pointers are involved.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that, as far as I can tell, the current API would require code duplication for intptrcast -- the code below handling integer operations could not be reused.

I'd assume that PrimVal::to_bytes would take a Memory argument and do the logic in there.

I renamed the method to try_ptr_op and changed the documentation to reflect the changes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd assume that PrimVal::to_bytes would take a Memory argument and do the logic in there.

Hm. Thinking about it, to model intptrcast faithfully we probably want to eagerly do the allocation when the cast happens, or when the transmute happens -- arithmetic operations should never have to do that, as that would mean they have side-effects and optimizations become horrible.

Anyway, we can discuss that in the intptrcast PR, whenever that happens.

Copy link
Member

@RalfJung RalfJung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@oli-obk oli-obk merged commit 7700bd0 into rust-lang:master Aug 2, 2017
@oli-obk oli-obk deleted the upstream branch August 2, 2017 07:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

The Machine ™
3 participants