Provides a (somewhat) optimizing compiler for the Smpl programming language
as used in the CS 241 Advanced Compiler Design class by Dr. Michael Franz at
the University of California, Irvine. See the test_progs
directory for
example programs of the Smpl programming language.
The compiler includes a backend for the DLX instruction set architecture, and an emulator for this ISA.
For help running the compiler, see
./main.py --help
The path from input file to compiled machine code goes as follows:
-
input file
-
lexer (provides stream of tokens) and parser (creates tree of AST nodes), see
parser.py
-
AST, see
ast.py
-
Interpreter: each AST node has a
run()
method that implements that node's semantics in Python and executes it -
Compiler: We produce a single static assignment intermediate representation. For this, each AST node has a
compile()
method that appends (emits) the appropriate SSA instructions to a givenCompilationContext
.IfStatement
s andWhileStatement
s create new basic blocks in the context. Note that both of these statements end in one join block; the current block in the context is updated to point to this block. Hence it is safe to keep emitting the instructions following if/while statement in the current basic block.
-
-
Common subexpression elimination, dead code elimination and constant elimination, see
ssa.py
Common subexpression elimination is performed as statements are compiled: The
emit()
method of the compilation context checks whether an equivalent dominating instruction has been previously emitted. If so, a pointer to that instruction is returned and no instruction is emitted.The other two optimizations are performed during multiple passes over the IR, removing instructions until a fixed point is reached.
-
DLX machine code or assembly generation, see
backend.py
(abstract base class),dlx.py
andallocator.py
. The machine code is generated by iterating through the SSA IR basic blocks in a depth-first manner; the backend indlx.py
calls back to its given allocator (allocator.py
) to map the SSA operands to actual machine registers and memory locations (for spills).