Skip to content
forked from s3team/uroboros

Infrastructure for Reassembleable Disassembling and Transformation

Notifications You must be signed in to change notification settings

piax93/uroboros

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

63 Commits
 
 
 
 
 
 

Repository files navigation

Uroboros

Infrastructure for Reassembleable Disassembling and Transformation

This fork was created as part of thesis work and it is now not maintained

Fork motivation

This fork is made with the idea of extending this technique to ARM Thumb executables. In such process, the OCaml core has been completely rewritten in Python.

To this date the rewritten tool has been tested to work on the following executables: bzip, gzip, BLAKE2, Himeno benchmark, dcraw (with statically linked libjpeg and liblcms, ARM requires assumption 3), FLAC encoder (with statically linked libFLAC), dolfyn, OPUS encoder (with statically linked libopus, ARM requires assumption 3).

Installation

Uroboros uses the following utilities (version numbers are in line with what was used during development, older releases may work as well):

Tool Version
python 2.7
objdump ≥2.22
readelf ≥2.22
awk ≥3.18
libcapstone 3.0.5-rc3

and the following python packages (available through pip repositories):

Package Version
capstone ≥3.0.4
termcolor ≥1.1.0
pyelftools ≥0.24

Build

Uroboros is now completely written in Python on the allpy branch. You don't need to build anything. However, you may want to modify some values in config.py to match your system configuration. Also, the parser, though recognising a large number of operators, is not complete; in case invalid operator exceptions are raised, these can be added to the right set in Types.py.

Usage

Uroboros supports 64-bit and 32-bit ELF x86 executables and, experimentally, also Thumb2 ARM binaries. To use Uroboros for disassembling:

 $> python uroboros.py path_to_bin

The disassembled output can be found in the workdir directory, named final.s. Uroboros will also assemble it back into an executable, a.out.

The startup Python script provides the following options:

  • -o output

    This option allows to specify an output path for the reassembled binary.

  • -g

    Apply instrumentations. New instrumentations can be implemented by creating subpackages in the instrumentation package. These must contain at least two modules (see the example package):

    • a module having the same name of the package with a function named perform, accepting a list of instructions and a list of function objects and returning the instrumented list of instructions, and a function named aftercompile. The first is invoked just after the symbol reconstruction phase is completed, while the latter allows further modifications after the code has already been adjusted for compilation;
    • a module named plaincode which must contain three string variables name beforemain, aftercode and instrdata. These are respectively inserted at the beginning of the main function, at the end of the .text section and at the end of the source file.

    Instrumentations are applied in alphabetical order, the task of preventing interference among different instrumentations is left to the user. If multiple instrumentations have been implemented but only a subset has to be used, adding their package names as strings in the instrumentors list of the config.py file will allow only these to be loaded and executed (in this case the order is the one specified by the user).

    Instrumentation against ROP attacks using an adaptation of the technique described in [2] is already available in this repository.

  • -gcc "parameters"

    String of additional arguments a user may want to pass to the compiler.

  • -ex exclusions_file

    Allows to specify a file containing on each line either a hexadecimal value to exclude from symbol search inside the code or an address ranges, in the format hexaddress-hexaddress, of the data sections which will be skipped when searching for pointers.

  • -fex function_exclusion_file

    In case a non-stripped binary is being analysed, allows to specify a file containing a list of symbol which should not be considered functions.

  • -a assumption_number

    This option configures the three symbolization assumptions proposed in the original Uroboros paper [1]. Note that in the current version, the first assumption (n-byte alignment) are set by default. The other two assumptions can be set by users.

    Assumption two reqires to put data sections (.data, .rodata and .bss) to its original starting addresses. Linker scripts can be used during reassembling (gcc -T ld_script.sty final.s). Users may write their own linker script, some examples are given at ld_script folder.

    Assumption three requires to know the function starting addresses. To obtain this information, Uroboros can take unstripped binaries as input. The function starting address information is obtained from the input, which is then stripped before disassembling.

    These assumptions can also be used at the same time (python uroboros.py path_to_bin -a 3 -a 2)

Stuff it would be nice to do

  • More testing on real applications and, after that, even more testing.
  • Change the way data flow is managed: now, to ease the debugging process, most of the data is passed along via file, which implies a lot of unnecessary IO operations.

[1] Reassembleable Disassembling, by Shuai Wang, Pei Wang, and Dinghao Wu. In Proceedings of the 24th USENIX Security Symposium, Washington, D.C., August 12-14. 2015.

[2] G-Free: defeating return-oriented programming through gadget-less binaries, by Onarlioglu Kaan, Leyla Bilge, Andrea Lanzi, Davide Balzarotti, and Engin Kirda. In Proceedings of the 26th Annual Computer Security Applications Conference, pp. 49-58. ACM, 2010."

About

Infrastructure for Reassembleable Disassembling and Transformation

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Python 68.1%
  • TeX 31.9%