Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Leakage traces #431

Open
wants to merge 91 commits into
base: master
Choose a base branch
from
Open

Leakage traces #431

wants to merge 91 commits into from

Conversation

OwenConoly
Copy link

This PR will add leakage traces to the source and intermediate semantics, augment the compiler-correctness statement and proof to talk about leakage traces, and add some examples involving leakage traces.

@OwenConoly
Copy link
Author

OwenConoly commented Aug 25, 2024

For now, this branch only has some edits to Semantics.v. Before I add anything else, I wanted to ask whether I should be making these edits (i.e., adding leakage traces) in Semantics.v, MetricSemantics.v, or both? Also, why are Semantics.v and MetricSemantics.v separate, and are there plans to merge them eventually?

Copy link
Contributor

@andres-erbsen andres-erbsen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Semantics.v, MetricSemantics.v, or both?

IMO the best option is to create additional semantics files. We want to have only one compiler proof, so that would use LeakageMetricSemantics. Some program proofs may be done with that one as well, but the examples for our current manuscript probably shouldn't be, so there we'd probably want want LeakageSemantics as well. In an unverified software project, this kind of duplication would be asking for inconsistency bugs, but I hope it won't be too hard to state and prove the appropriate relationships between these variants.

bedrock2/src/bedrock2/Semantics.v Outdated Show resolved Hide resolved
bedrock2/src/bedrock2/Semantics.v Outdated Show resolved Hide resolved
@OwenConoly
Copy link
Author

I hope it won't be too hard to state and prove the appropriate relationships between these variants.

Proving the appropriate relationship between Semantics.v and LeakageSemantics.v will only be trivial if we modify Semantics.v so that exec takes the pick_sp parameter. If LeakageSemantics exec takes pick_sp as a parameter but Semantics exec stays as is, then proving any sort of equivalence between them is equally as nontrivial as that semantics-equivalence statement we put in the paper, the one that I wrote a ~3000-line proof of.

So, should I modify Semantics and MetricSemantics execs to take pick_sp as a parameter, and then prove the appropriate relationships between Semantics vs. MetricLeakageSemantics, MetricSemantics vs. MetricLeakageSemantics, and LeakageSemantics vs MetricLeakageSemantics?

@OwenConoly
Copy link
Author

However, the Semantics-exec-implies-LeakageSemantics-exec direction is trivial even if LeakageSemantics takes pick_sp as a parameter and Semantics doesn't. So if that direction of the equivalence is all we care about, then potentially I could just leave Semantics and MetricSemantics alone.

@andres-erbsen
Copy link
Contributor

andres-erbsen commented Aug 26, 2024

Ah, indeed. I was only thinking of the easy direction, Semantics implies LeakageSemantics. I don't think we should add pick_sp to current semantics. @samuelgruetter do you think it would be acceptable to carry both copies in the repo, but only the half of the equivalence proof that allows program proofs without leakage to be used with a compiler proof that preserves everything they say about leakage?

However, the other direction of this relationship might be a good opportunity to drill down on the difficulty in the equivalence proof and perhaps simplify it. We don't have to do this, but I'm thinking that if you could state the desired equivalence with a minimized version of semantics that is just complicated enough to illustrate the challenge (without abstract traces, predictors, or even function calls and etc), then I would give it a try. "How hard can pushing a quantifier into an inductive be" 🚀😂🙄

@samuelgruetter
Copy link
Contributor

Ah, indeed. I was only thinking of the easy direction, Semantics implies LeakageSemantics. I don't think we should add pick_sp to current semantics. @samuelgruetter do you think it would be acceptable to carry both copies in the repo, but only the half of the equivalence proof that allows program proofs without leakage to be used with a compiler proof that preserves everything they say about leakage?

Sounds good!

However, the other direction of this relationship might be a good opportunity to drill down on the difficulty in the equivalence proof and perhaps simplify it. We don't have to do this, but I'm thinking that if you could state the desired equivalence with a minimized version of semantics that is just complicated enough to illustrate the challenge (without abstract traces, predictors, or even function calls and etc), then I would give it a try "How hard can pushing a quantifier into an inductive be" 🚀😂🙄

This kind of proof would be interesting as a minimal standalone file that might accompany a paper as "supplemental material", where the focus is really to make it as simple as possible, but without simplifying away the core difficulty. (And if at some point, we actually need this direction of the equivalence for real, we can again scale it up to the whole language).

@OwenConoly
Copy link
Author

"How hard can pushing a quantifier into an inductive be"

This could be nice, yeah! To make it as academic as possible (and remove the coqutil/bedrock2 dependency and get a standalone file) I'd probably make the following simplifications to the equivalence proof I currently have:

  • no memory, only locals
  • no leakage traces or metrics either
  • locals are nats instead of words (in addition to removing a dependency, this makes me happy because proofs seem less trivial when they're written about Turing-complete languages)
  • no ifs, since you can write them with whiles anyway
  • instead of stack allocation, the quantifier is just an RNG construct
  • no function calls

I'd probably keep IO calls though, since the nondeterminism there seems to add an interesting dimension.

@OwenConoly
Copy link
Author

OwenConoly commented Aug 26, 2024

Oh, bedrock2 is Turing-complete! I forgot that you can store and retrieve as much information as you want in the IO trace (with an appropriate choice of ext_spec). That's kinda cool.

I had thought that it was just a finite-state machine, so in principle, you could prove some absolute upper bound (depending only on word size and source program) on the size of an exec proof tree (and then this sort of equivalence theorem becomes trivial).

@andres-erbsen
Copy link
Contributor

👍 You can store a turing-machine tape in a nat, so I think all options here are turing-complete. And I'd even remove I/O at first, we can add it back later if we want.

@OwenConoly
Copy link
Author

The existence of (Metric)LeakageSemantics.v implies the existence of:

(Metric)LeakageLoops.v, (Metric)LeakageProgramLogic.v, (Metric)LeakageWeakestPreconditionProperties.v, and (Metric)LeakageWeakestPrecondition.v,

right?

@andres-erbsen
Copy link
Contributor

Yes. At some point I may try to share more code, but for new these are just copied I think.

@samuelgruetter
Copy link
Contributor

The existence of (Metric)LeakageSemantics.v implies the existence of:

(Metric)LeakageLoops.v, (Metric)LeakageProgramLogic.v, (Metric)LeakageWeakestPreconditionProperties.v, and (Metric)LeakageWeakestPrecondition.v,

right?

Will we ever need a program logic that does both metrics and leakage at the same time? If not, the needed files should probably just be

  • MetricLeakageSemantics.v (because in the compiler, we want both at the same time)
  • (non-metric) LeakageLoops.v, LeakageProgramLogic.v, LeakageWeakestPreconditionProperties.v, and LeakageWeakestPrecondition.v

@andres-erbsen
Copy link
Contributor

I don't have specific code planned where this would be needed, but I imagine basic library functions like memset could end up being used by callers that need either kind of spec. I'm fine with procrastinating on building support for this, though.

@OwenConoly
Copy link
Author

OwenConoly commented Aug 27, 2024

Ok. I can just do the MetricLeakage files while I'm at it, if they'll be needed eventually anyway. (Probably I'll end up making some errors in MetricLeakageProgramLogic, since there won't be any test cases for it.)

Unrelatedly, since I am adding the separate files for Leakage stuff instead of changing what was already there, my changes to the source language shouldn't break the compiler proof. So it seems natural to split this PR into two: in this PR I can just change the source language and add some examples, and then I can make a separate PR for compiler proof things. Does that sound good?

@andres-erbsen
Copy link
Contributor

Separate PRs sounds good to me. (There is some possibility that we will want to review them together, though.)

@andres-erbsen
Copy link
Contributor

andres-erbsen commented Dec 16, 2024

As discussed in person, this branch is in overall good shape and it seems desirable to merge it soon rather than to wait for perfection.

@samuelgruetter do you have any concerns?

@@ -1,6 +1,6 @@
[submodule "deps/riscv-coq"]
path = deps/riscv-coq
url = https://github.com/mit-plv/riscv-coq.git
url = https://github.com/OwenConoly/riscv-coq.git
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This git-config change should not be merged, instead the relevant riscv-coq changes should be merged to its upstream.

bedrock2/src/bedrock2/BasicC32Semantics.v Outdated Show resolved Hide resolved
bedrock2/src/bedrock2/FE310CSemantics.v Show resolved Hide resolved
r0 closed binder, rn closed binder,
tr name, tr' name, mem name, mem' name,
pre at level 200,
post at level 200).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@OwenConoly There are some leakage-fnspec notations here; should I interpret you talking about the lack of such notations as a signal that the ones here don't work? If so, probably best to remove.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I believe they did not work. But now I think I have fixed them.

There are now two notations in LeakageWeakestPrecondition. There is the fnspec! notation, which is just like the fnspec! in WeakestPrecondition.

There is also the fnspec_ex! notation, which is just like fnspec! except that an existential quantifier has been inserted in the appropriate place for stating that a program is constant-time. Do you like this notation? There are now many examples of it being used in bedrock2Examples/ct.v.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, they sure seem better than no notation. How about fnspec! exists f ... g, "getline" x y z ~> v, { ...? I believe "exists" is already a keyword, and perhaps this way a user familiar with fnspec! will guess the meaning of the new notation correctly more often.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems nice and intuitive. I changed the notations to look like that instead.

@@ -0,0 +1,452 @@
Require Import Coq.ZArith.ZArith coqutil.Z.div_mod_to_equations.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

bedrock2/src/bedrock2Examples/ct.v Outdated Show resolved Hide resolved
compiler/src/compiler/Pipeline.v Outdated Show resolved Hide resolved
Call(pick_sp: list Leakage -> word)(p: Program)(s: Settings)(funcname: string)
(k: list Leakage)(t: trace)(m: mem)(argvals: list word)(mc: MetricLog)
(post: list Leakage -> trace -> mem -> list word -> MetricLog -> Prop): Prop;
WeakenCall: forall pick_sp p s funcname k t m mc argvals (post1: _ -> _ -> _ -> _ -> _ -> Prop),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@samuelgruetter do properties of Call belong here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@samuelgruetter do properties of Call belong here?

They don't. This is a record that's supposed to contain only "data", no proofs. But what happens if we just delete this WeakenCall field here? Then compose_phases doesn't work any more, because it needs L3.(WeakenCall). And why does compose_phases, which did not need weakening, now suddenly need weakening? My guess is that it's because you introduced a language-dependent argument to post (namely Leakage, which can be instantiated differently for each language), which is against the design that I had in mind for the phase composition (see chapter 3.5 of my thesis): The idea is that before composing the phases, you identify a common state data type over which to write the postconditions, and it's the responsibility of each Call definition to transform from the language-specific format into the common state datatype. So I'd expect that if you remove WeakenCall and Leakage from the record, and replace Leakage by some CommonLeakage, it should all work.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason I had used WeakenCall where it was not needed before is that I had adopted a default behaviour of axiom-avoidance. There is a choice between carrying around WeakenCall and using functional/propositional extensionality. I've just noticed that the previous version of compose_phases (before I touched it) used functional/propositional extensionality, so I assume that's the way to go here, and I've updated the compose_phases proof to use axioms instead of WeakenCall.

Keeping the Leakage parameter around works fine. I could still replace this by some CommonLeakage, but it would be kind of ugly: CommonLeakage would just have to be the union/sum of the two different types of Leakage.

rewrite app_nil_r. reflexivity. }
Qed.*)

Check (@compiler_correct_wp _ _ Words32Naive.word mem _ ext_spec _ _ _ ext_spec_ok _ _ _ _ _ word_ok _ _ RV32I _ compile_ext_call leak_ext_call compile_ext_call_correct compile_ext_call_length fs_swap instrs_swap finfo_swap req_stack_size_swap fname_swap _ _ _ _).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either remove this or turn it into a lemma.

etc/test_itauto.sh Show resolved Hide resolved
Copy link
Contributor

@samuelgruetter samuelgruetter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM (modulo Andres' and my review comments)

Comment on lines 261 to 263
Check @MetricLeakageSemantics.exec. Print mem.
Check locals_based_call_spec_weaken. Print env.
Print string_keyed_map.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(nitpick: please clean up all such output before asking to merge)

compiler/src/compiler/Pipeline.v Outdated Show resolved Hide resolved
Call(pick_sp: list Leakage -> word)(p: Program)(s: Settings)(funcname: string)
(k: list Leakage)(t: trace)(m: mem)(argvals: list word)(mc: MetricLog)
(post: list Leakage -> trace -> mem -> list word -> MetricLog -> Prop): Prop;
WeakenCall: forall pick_sp p s funcname k t m mc argvals (post1: _ -> _ -> _ -> _ -> _ -> Prop),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@samuelgruetter do properties of Call belong here?

They don't. This is a record that's supposed to contain only "data", no proofs. But what happens if we just delete this WeakenCall field here? Then compose_phases doesn't work any more, because it needs L3.(WeakenCall). And why does compose_phases, which did not need weakening, now suddenly need weakening? My guess is that it's because you introduced a language-dependent argument to post (namely Leakage, which can be instantiated differently for each language), which is against the design that I had in mind for the phase composition (see chapter 3.5 of my thesis): The idea is that before composing the phases, you identify a common state data type over which to write the postconditions, and it's the responsibility of each Call definition to transform from the language-specific format into the common state datatype. So I'd expect that if you remove WeakenCall and Leakage from the record, and replace Leakage by some CommonLeakage, it should all work.

etc/test_itauto.sh Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants