Commit 58b8e0c

Add Blambda IR for bytecode compilation (#3590)
Adds a new IR called "Blambda" for compiling complicated things to bytecode slightly more easily than via Lambda. Blambda is designed to be a lambda-like expression language where every primitive is also a bytecode primitive. Control flow is still the same as in Lambda, except that switch statements have already been partially elaborated. This allows us to separate the bytecode backend into two stages:

First, Lambda -> Blambda: Preserves the expression structure, but compiles all complex primitives down to ones with corresponding bytecode instructions. This will become more important as we continue to add more primitives to Lambda which have no corresponding bytecode instruction. The translation of non-native operations and switches that was previously done in bytecomp/bytegen.ml is now done in a separate pass in bytecomp/blambda_of_lambda.ml.

Second, Blambda -> Instructions: Only has to deal with linearizing the Lambda-like control flow. The comparatively fragile stack size maintenance and stack index computations can remain in their own module, which doesn't need to be modified every time we change Lambda.

As with any IR, there is also code to print Blambda in bytecomp/printblambda.ml, as well as a new command-line flag for ocamlc to dump the Blambda code.
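
To illustrate the two-stage split described above, here is a minimal, self-contained sketch on a toy language. It is not the code from this commit: the types `lam`, `blam` and `instr`, the primitives `Pincr` and `Pdouble`, and the functions `blam_of_lam`, `instrs_of_blam` and `run` are all hypothetical names invented for the example. The first pass keeps the expression shape and only expands primitives that have no matching instruction; the second pass only linearizes.

```ocaml
(* Toy stand-in for Lambda: some primitives have no matching instruction.
   All names in this block are hypothetical and exist only for illustration. *)
type lprim = Padd | Pmul | Pincr | Pdouble
type lam = Lconst of int | Lprim of lprim * lam list

(* Toy stand-in for Blambda: every primitive maps to exactly one instruction. *)
type bprim = Addint | Mulint | Offsetint of int
type blam = Const of int | Prim of bprim * blam list

(* Toy stack-machine instructions. *)
type instr = Kconst of int | Kadd | Kmul | Koffset of int

(* Stage 1: preserve the expression structure, expand complex primitives. *)
let rec blam_of_lam = function
  | Lconst n -> Const n
  | Lprim (p, args) ->
    let args = List.map blam_of_lam args in
    (match p, args with
     | Padd, _ -> Prim (Addint, args)
     | Pmul, _ -> Prim (Mulint, args)
     | Pincr, _ -> Prim (Offsetint 1, args)
     | Pdouble, [ x ] -> Prim (Mulint, [ x; Const 2 ])
     | Pdouble, _ -> invalid_arg "Pdouble expects one argument")

(* Stage 2: linearize; each primitive becomes one instruction after its
   arguments have been pushed. *)
let rec instrs_of_blam = function
  | Const n -> [ Kconst n ]
  | Prim (p, args) ->
    let op =
      match p with
      | Addint -> Kadd
      | Mulint -> Kmul
      | Offsetint n -> Koffset n
    in
    List.concat_map instrs_of_blam args @ [ op ]

(* Tiny interpreter, used only to check the generated code. *)
let run instrs =
  let step stack i =
    match i, stack with
    | Kconst n, _ -> n :: stack
    | Kadd, b :: a :: rest -> (a + b) :: rest
    | Kmul, b :: a :: rest -> (a * b) :: rest
    | Koffset n, a :: rest -> (a + n) :: rest
    | _ -> failwith "stack underflow"
  in
  match List.fold_left step [] instrs with
  | [ v ] -> v
  | _ -> failwith "malformed program"

let compile lam = instrs_of_blam (blam_of_lam lam)
```

In this sketch, adding a new source primitive such as `Pdouble` only touches `blam_of_lam`; the linearizer and the interpreter stay unchanged, which is the maintenance benefit the message argues for.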
1 parent be6e477 commit 58b8e0c

21 files changed, +1422 -836 lines changed

bytecomp/.ocamlformat-enable

+6
@@ -1,2 +1,8 @@
+blambda.ml
+blambda.mli
+blambda_of_lambda.ml
+blambda_of_lambda.mli
 bytegen.ml
 bytegen.mli
+printblambda.ml
+printblambda.mli

bytecomp/blambda.ml

+216
@@ -0,0 +1,216 @@
+(**************************************************************************)
+(*                                                                        *)
+(*                                 OCaml                                  *)
+(*                                                                        *)
+(*                 Jacob Van Buren, Jane Street, New York                 *)
+(*                                                                        *)
+(*   Copyright 2024 Jane Street Group LLC                                 *)
+(*                                                                        *)
+(*   All rights reserved.  This file is distributed under the terms of    *)
+(*   the GNU Lesser General Public License version 2.1, with the          *)
+(*   special exception on linking described in the file LICENSE.          *)
+(*                                                                        *)
+(**************************************************************************)
+
+(** [blambda] is designed to be a lambda-like expression language where every primitive is
+    also a bytecode primitive. This allows us to separate the bytecode backend into two
+    stages:
+
+    First, Lambda -> Blambda: Preserves the expression structure, but compiles all complex
+    primitives down to ones with corresponding bytecode instructions. This will become
+    more important as we continue to add more primitives to Lambda which have no
+    corresponding bytecode instruction.
+
+    Second, Blambda -> Instructions: Only has to deal with linearizing the Lambda-like
+    control flow. The comparatively fragile stack size maintenance and stack index
+    computations can remain in their own module which doesn't need to be modified every
+    time we change Lambda.
+*)
+
+type constant = Lambda.constant
+
+(** [structured_constant] needs to match the cmo file format *)
+type structured_constant = Lambda.structured_constant =
+  | Const_base of constant
+  | Const_block of int * structured_constant list
+  | Const_mixed_block of
+      int * Lambda.mixed_block_shape * structured_constant list
+  | Const_float_array of string list
+  | Const_immstring of string
+  | Const_float_block of string list
+  | Const_null
+
+type direction_flag = Asttypes.direction_flag =
+  | Upto
+  | Downto
+
+type raise_kind = Lambda.raise_kind =
+  | Raise_regular
+  | Raise_reraise
+  | Raise_notrace
+
+type static_label = Lambda.static_label
+
+type event = Lambda.lambda_event
+
+type context_switch =
+  | Perform
+  | Reperform
+  | Runstack
+  | Resume
+
+type comparison = Instruct.comparison =
+  | Eq
+  | Neq
+  | Ltint
+  | Gtint
+  | Leint
+  | Geint
+  | Ultint
+  | Ugeint
+
+type method_kind =
+  | Self
+  | Public
+
+(** primitives that correspond to bytecode instructions that don't affect control flow *)
+type primitive =
+  | Getglobal of Compilation_unit.t
+  | Getpredef of Ident.t
+  | Boolnot
+  | Isint
+  | Vectlength
+  | Setglobal of Compilation_unit.t
+  | Getfield of int
+  | Getfloatfield of int
+  | Raise of raise_kind
+  | Offsetint of int
+  | Offsetref of int
+  | Negint
+  | Addint
+  | Subint
+  | Mulint
+  | Divint
+  | Modint
+  | Andint
+  | Orint
+  | Xorint
+  | Lslint
+  | Lsrint
+  | Asrint
+  | Intcomp of comparison
+  | Getbyteschar
+  | Getvectitem
+  | Setfield of int
+  | Setfloatfield of int
+  | Setvectitem
+  | Setbyteschar
+  | Ccall of string
+  | Makeblock of { tag : int }
+  | Makefloatblock
+  | Make_faux_mixedblock of
+      { total_len : int;
+        tag : int
+      }
+  | Check_signals
+
+and rec_binding =
+  { id : Ident.t;
+    def : bfunction
+  }
+
+and bfunction =
+  { params : Ident.t list;
+    body : blambda;
+    free_variables : Ident.Set.t
+        (** if we ever intended to do optimizations/transformations on blambda, this would
+            be better as a function than a field *)
+  }
+
+and blambda =
+  | Var of Ident.t
+  | Const of structured_constant
+  | Apply of
+      { func : blambda;
+        args : blambda list;
+        nontail : bool
+      }
+  | Function of bfunction
+  | Let of
+      { id : Ident.t;
+        arg : blambda;
+        body : blambda
+      }
+  | Letrec of
+      { decls : rec_binding list;
+        free_variables_of_decls : Ident.Set.t;
+            (** if we ever intended to do optimizations/transformations on blambda, this
+                would be better as a function than a field *)
+        body : blambda
+      }
+  | Prim of primitive * blambda list
+  | Switch of
+      { arg : blambda;
+        const_cases : int array;
+            (** indexes into {!cases}, indexed by the value of the immediate *)
+        block_cases : int array;
+            (** indexes into {!cases}, indexed by the block tag *)
+        cases : blambda array
+      }
+  | Staticraise of static_label * blambda list
+  | Staticcatch of
+      { id : static_label;
+        body : blambda;
+        args : Ident.t list;
+        handler : blambda
+      }
+  | Trywith of
+      { body : blambda;
+        param : Ident.t;
+        handler : blambda
+      }
+  | Sequence of blambda * blambda
+  | Assign of Ident.t * blambda
+  | Send of
+      { method_kind : method_kind;
+        met : blambda;
+        obj : blambda;
+        args : blambda list;
+        nontail : bool
+      }
+  | Context_switch of context_switch * blambda list
+  | Ifthenelse of
+      { cond : blambda;
+        ifso : blambda;
+        ifnot : blambda
+      }
+  | While of
+      { cond : blambda;
+        body : blambda
+      }
+  | For of
+      { id : Ident.t;
+        from : blambda;
+        to_ : blambda;
+        dir : direction_flag;
+        body : blambda
+      }
+  | Sequand of blambda * blambda
+  | Sequor of blambda * blambda
+  | Event of blambda * Lambda.lambda_event
+  | Pseudo_event of blambda * Debuginfo.Scoped_location.t
+      (** Pseudo events are ignored by the debugger. They are only used for generating
+          backtraces.
+
+          We prefer adding this event here rather than in lambda generation because:
+          + There are many different situations where a Pmakeblock can be generated.
+          + We prefer inserting a pseudo event rather than an event after to prevent the
+            debugger from stopping at every single allocation.
+
+          Having [Event] and/or [Pseudo_event] makes effective pattern-matching on blambda
+          hard. However, blambda is only meant to go immediately before the code
+          generator, so it shouldn't really be matched on anyway.
+
+          In the future, we could simplify things a bit and use a new [Lev_pseudo_after]
+          event kind in the [Event] constructor instead of [Pseudo_event], to generate
+          during lambda to blambda conversion if [!Clflags.debug] is [true]. *)
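
The doc comments on the `Switch` constructor describe two index tables (`const_cases` and `block_cases`) that both point into a shared `cases` array. The following standalone sketch shows one way such an encoding can be read. It does not use the compiler's types: `value`, the polymorphic `switch` record, `select` and `example` are hypothetical names that mirror only the three array fields, and how the real code generator handles out-of-range scrutinees is not shown in this diff and is not modelled here.

```ocaml
(* Hypothetical runtime value: an immediate integer or a tagged block. *)
type value =
  | Imm of int
  | Block of int * value list

(* Mirrors only the three array fields of [Switch]; ['a] stands in for
   [blambda] so the example can use plain strings as case bodies. *)
type 'a switch =
  { const_cases : int array; (* indexed by the value of the immediate *)
    block_cases : int array; (* indexed by the block tag *)
    cases : 'a array
  }

(* Pick the case body for a scrutinee: choose the table by the kind of
   value, then follow the stored index into the shared [cases] array. *)
let select sw v =
  let index =
    match v with
    | Imm n -> sw.const_cases.(n)
    | Block (tag, _) -> sw.block_cases.(tag)
  in
  sw.cases.(index)

(* Assumed source program, for illustration only:
     type t = A | B | C of int | D of string
     match x with A | B -> "a or b" | C _ -> "c" | D _ -> "d"
   The constant constructors A and B share one entry in [cases]. *)
let example =
  { const_cases = [| 0; 0 |];
    block_cases = [| 1; 2 |];
    cases = [| "a or b"; "c"; "d" |]
  }
```

Because both tables store indexes rather than case bodies, arms shared by several constructors appear only once in `cases`, which in this sketch avoids duplicating the body for `A` and `B`.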
