Skip to content

Technical details

zaddach edited this page Aug 11, 2016 · 6 revisions

How code is executed

S2E brings together Qemu, a fast CPU and platform emulator, and KLEE, a symbolic interpreter for LLVM Intermediate Representation (IR) code.

Qemu executes machine code for different architectures by cross-compiling it on the spot to its native host architecture. Whenever an untranslated Translation Block (TB) is about to be executed, Qemu will translate the machine code to TCG (Tiny Code Generator), its own intermediate language (e.g., here for ARM machine code). The TCG code is then translated to native host code and executed (here for an x86 host).
CPU registers are modelled as a structure, called CPUArchState. This structure is specific for each guest architecture (i.e., the structure for an ARM processor is called CPUARMState). A TB, which is almost but not exactly the equivalent of a basic block (a TB might contain several basic blocks, because entry points are not known at runtime unless they are executed, and there may exist a not-yet-executed program path jumping into the middle of a TB), is transformed into a function modifying this structure.
As some machine instructions are quite complicated (e.g., vectored instructions), they are implemented as native helper functions. The translator will simply insert a call to this native helper into the TCG code, which will be transformed to a native function call in the compiled native TB function.
Further, some instructions need to request the emulator to exit the emulation loop, which is realized via a setjmp/longjmp combination. Typically, a helper function will call cpu_loop_exit, which wraps the longjmp.

KLEE is a symbolic execution engine for LLVM IR code. In its original design, it executes a normal user-space program starting from its main function. Symbolic values can be provided as input (e.g., as argument strings, environment variables, or return values from system calls such as the read() system call). A special version of uclibc, which provides symbolic summaries for most functions, is included in KLEE's distribution.

Clone this wiki locally