Merge pull request #24 from h0nzZik/nicer-benchmarks-3

Nicer benchmarks 3
h0nzZik · Jun 29, 2024 · eff132b · eff132b
2 parents 977cd4f + dbd2b4c
commit eff132b
Show file tree

Hide file tree

Showing 21 changed files with 127 additions and 367 deletions.
diff --git a/.github/workflows/build-and-profile.yml b/.github/workflows/build-and-profile.yml
@@ -45,12 +45,12 @@ jobs:
     - name: 'Run hybrid benchmarks'
       run: nix develop -L '.#bench-hybrid' --command ./bench-hybrid/test.sh
 
+    - name: 'Build Coq language examples'
+      run: nix build -L '.#languages-in-coq'
+
     - name: 'Run Coq benchmarks'
       run: nix develop -L '.#bench-coq' --command ./bench-coq/build-and-profile.sh
 
-    - name: 'Build old Coq examples'
-      run: nix build -L '.#old-examples-coq'
-
     - name: 'Build Docker image'
       run: nix build -L '.#minuska-docker' --out-link ./result-minuska-docker
 

diff --git a/README.md b/README.md
@@ -3,7 +3,7 @@
 Minuska is a framework for defining operational semantics ("language definitions") of programming languages and deriving tools from them.
 Currently, the project is a research prototype, and the only tools derivable from language definitions are interpreters.
 Users looking for a mature programming language framework are advised to check out [K framework](https://kframework.org/) or [PLT-Redex](https://redex.racket-lang.org/).
-(For a more detailed comparison to K framework, look [here](./doc/comparison-to-k-framework.md))
+(For a more detailed comparison to K framework, look [here](./doc/comparison-to-k-framework.md).)
 
 Minuska is built on top of the [Coq proof assistant](https://coq.inria.fr/). At its core is a simple language MinusLang for expressing programming language semantics
 in an exact, unambiguous way: MinusLang has a simple formal semantics (mechanized in [spec.v](https://h0nzzik.github.io/minuska/Minuska.spec.html)),
@@ -52,9 +52,11 @@ In principle, many features could be implemented in Minuska that would make the
 These include support for concrete syntax of programming languages, formalization of the strictness declarations, symbolic execution, and deductive verification.
 See the [ideas document](./doc/ideas.md)
 
-
-
 # Troubleshooting
 
-If your system does not support FUSE, or its configuration is broken, try `export APPIMAGE_EXTRACT_AND_RUN=1` before running any generated interpreters. 
+If your system does not support FUSE, or its configuration is broken, try `export APPIMAGE_EXTRACT_AND_RUN=1` before running any generated interpreters.
+This needs to be done e.g., in Docker, or when running on Github-hosted runners.
+
+# Papers
 
+An extended version of the preliminary 'Minuska: Towards a Formally Verified Programming Language Framework' paper is available [here](h0nzzik.github.io/papers/minuska-extended.pdf).
diff --git a/bench-coq/README.md b/bench-coq/README.md
@@ -0,0 +1,83 @@
+# Benchmarks for languages specified directly in Coq
+
+To get some statistics about the languages defined in [../languages-in-coq](../languages-in-coq), run:
+```sh
+nix develop '.#bench-coq' --command ./build-and-profile.sh
+```
+The first column is the name of the benchmark, second column is the time of the `Compute` inside Coq (measured by Coq's `Time` command), and the third columns is the total time of `coqc` as measured by the GNU `time` utility.
+
+The last two columns of Figure 3 of [this paper](h0nzzik.github.io/papers/minuska-extended.pdf) can be obtained by filtering the output of the above command.
+For an illustrative example, a recent run on my "Intel(R) Core(TM) i7-10510U CPU @ 1.80GHz" laptop yield, among other, the following rows:
+```
+two-counters(10), 0.015, 0.73
+two-counters(20), 0.01, 0.74
+two-counters(50), 0.014, 0.73
+two-counters(100), 0.008, 0.78
+unary-fib(5), 0.133, 0.81
+unary-fact(3), 0.285, 0.98
+```
+
+Similarly, the right table of Figure 2 of that paper corresponds to the following part of the output:
+```
+two-counters(10), 0.015, 0.73
+two-counters(100), 0.008, 0.78
+two-counters(1000), 0.011, 0.75
+two-counters(10000), 0.026, 0.74
+```
+
+The middle two columns of Figure 4 of that paper are covered as well:
+```
+imp-count-to(1), 0.642, 1.43
+imp-count-to(2), 1.078, 1.79
+imp-count-to(3), 1.453, 2.15
+imp-count-to(4), 1.911, 2.63
+imp-count-to(5), 2.322, 3.00
+imp-count-to(6), 2.569, 3.25
+imp-count-to(7), 3.004, 3.70
+```
+(For the last column of that figure, see [../bench-standalone](../bench-standalone)).
+
+Figure 5 of that paper is also contained in the output:
+```
+native-fib(1), 0.01, 0.75
+native-fib(6), 0.015, 0.73
+native-fib(11), 0.017, 0.73
+native-fib(16), 0.021, 0.73
+native-fib(21), 0.034, 0.74
+native-fib(26), 0.038, 0.84
+native-fib(31), 0.039, 0.89
+native-fib(36), 0.038, 0.77
+native-fib(41), 0.041, 0.74
+native-fib(46), 0.044, 0.76
+native-fib(51), 0.05, 0.76
+native-fib(56), 0.047, 0.76
+native-fib(61), 0.06, 0.79
+native-fib(66), 0.059, 0.75
+native-fib(71), 0.058, 0.74
+native-fib(76), 0.072, 0.80
+native-fib(81), 0.071, 0.79
+native-fib(86), 0.102, 0.79
+native-fib(91), 0.105, 0.80
+native-fib(96), 0.095, 0.79
+native-fib(101), 0.107, 0.79
+native-fib(106), 0.115, 0.80
+
+unary-fib(1), 0.074, 0.77
+unary-fib(2), 0.079, 0.78
+unary-fib(3), 0.093, 0.77
+unary-fib(4), 0.103, 0.78
+unary-fib(5), 0.137, 0.82
+unary-fib(6), 0.2, 0.90
+unary-fib(7), 0.311, 1.00
+unary-fib(8), 0.459, 1.20
+unary-fib(9), 0.882, 1.59
+unary-fib(10), 1.42, 2.12
+unary-fib(11), 2.471, 3.15
+
+unary-fact(1), 0.108, 0.78
+unary-fact(2), 0.167, 0.85
+unary-fact(3), 0.285, 0.96
+unary-fact(4), 0.575, 1.35
+unary-fact(5), 2.102, 2.78
+unary-fact(6), 10.318, 10.97
+```
diff --git a/bench-hybrid/README.md b/bench-hybrid/README.md
@@ -0,0 +1,18 @@
+## Hybrid Coq/`minuska` benchmarks
+
+To get some statistics about the languages defined in [../languages-in-coq](../languages-in-coq) converted into Coq and executed inside Coq, run:
+```sh
+nix develop '.#bench-hybrid' --command ./test.sh
+```
+The output contains lines like
+```
+coqc testCount7.v
+Finished transaction in 1.814 secs (1.806u,0.002s) (successful)
+2.45
+coqc testTC100.v
+Finished transaction in 0.094 secs (0.093u,0.s) (successful)
+0.68
+```
+where each three lines correspond to one benchmark: the first one the three contains the name of the benchmark, the second line the time required for Coq to execute the corresponding `Compute` command, and the third line is the execution time as measured by the GNU `time` utility.
+This suite contains only two kind of benchmarks. The ones named `testCount${N}` run the `sum-to-${N}` program in the IMP language defined in [../languages/imp](../languages/imp), but do so inside Coq. The ones named `testTC${N}` run the `two-counters` language ([../languages/two-counters](../languages/two-counters)) on input `${N}`.
+
diff --git a/flake.nix b/flake.nix
@@ -101,12 +101,12 @@
       in {
         packages.minuska = minuskaFun { coqPackages = pkgs.coqPackages_8_19; } ;
 
-        packages.old-examples-coq
+        packages.languages-in-coq
         = pkgs.coqPackages_8_19.callPackage 
         ( { coq, stdenv }:
         stdenv.mkDerivation {
-          name = "old-examples-coq";
-          src = ./old-examples-coq;
+          name = "languages-in-coq";
+          src = ./languages-in-coq;
 
           propagatedBuildInputs = [
             self.outputs.packages.${system}.minuska
@@ -163,7 +163,7 @@
           propagatedBuildInputs = [
             pkgs.time
             pkgs.python312
-            self.outputs.packages.${system}.old-examples-coq
+            self.outputs.packages.${system}.languages-in-coq
             #self.outputs.packages.${system}.examples-coq.coqPackages.coq
           ];
           enableParallelBuilding = true;

diff --git a/old-examples-coq/Makefile → languages-in-coq/Makefile b/old-examples-coq/Makefile → languages-in-coq/Makefile
diff --git a/languages-in-coq/README.md b/languages-in-coq/README.md
@@ -0,0 +1,13 @@
+## Language examples in Coq
+
+This directory contains a few language definitions encoded directly in Coq, as well as derived interpreters for those languages. Specifically:
+- `examples.two_counters.interp` in [theories/examples.v](theories/examples.v) is an interpreter for the "two counters" language over unary-encoded natural numbers;
+- `examples.two_counters_Z.interp` is the same but over built-in integers (that is, the `Z` type from Coq's standard library);
+- `examples.arith.interp` is an interpreter for a simple language of arithmetic expressions;
+- `examples.fib_native.interp` calculates a Fibonacci sequence (over built-in integers), in an imperative style
+- `examples.imp.interp` interpretes a simple imperative language with arithmetic expressions, assignment, conditions, and loops;
+- `examples_unary_nat.unary_nat.interp_fib` implements Fibonacci sequence in the same style, but over unary-encoded natural numbers;
+- `examples_unary_nat.unary_nat.interp_fact` implements factorial in the imperative style, over unary-encoded natural numbers.
+
+No performance benchmarks for these are provided in this directory, but see [../bench-coq](../bench-coq) which uses this directory as a dependency.
+
diff --git a/old-examples-coq/_CoqProject → languages-in-coq/_CoqProject b/old-examples-coq/_CoqProject → languages-in-coq/_CoqProject
@@ -1,7 +1,6 @@
 -R theories MinuskaExamples
 
 theories/examples.v
-theories/extraction.v
 theories/examples_unary_nat.v
 
 
diff --git a/old-examples-coq/theories/dep.v → languages-in-coq/theories/dep.v b/old-examples-coq/theories/dep.v → languages-in-coq/theories/dep.v
diff --git a/old-examples-coq/theories/examples.v → languages-in-coq/theories/examples.v b/old-examples-coq/theories/examples.v → languages-in-coq/theories/examples.v
diff --git a/...xamples-coq/theories/examples_unary_nat.v → ...ages-in-coq/theories/examples_unary_nat.v b/...xamples-coq/theories/examples_unary_nat.v → ...ages-in-coq/theories/examples_unary_nat.v
diff --git a/old-examples-coq/src/example_lang_driver.ml b/old-examples-coq/src/example_lang_driver.ml
diff --git a/old-examples-coq/src/fib_native_driver.ml b/old-examples-coq/src/fib_native_driver.ml
diff --git a/old-examples-coq/src/generic_driver.ml b/old-examples-coq/src/generic_driver.ml
diff --git a/old-examples-coq/src/imp.ml b/old-examples-coq/src/imp.ml
diff --git a/old-examples-coq/src/imp_lexer.mll b/old-examples-coq/src/imp_lexer.mll