Skip to content

Commit

Permalink
chore: update README.md
Browse files Browse the repository at this point in the history
  • Loading branch information
harveyghq committed Jul 5, 2024
1 parent 11f21c6 commit 91a6e1d
Showing 1 changed file with 18 additions and 61 deletions.
79 changes: 18 additions & 61 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -20,8 +20,6 @@ pip install --upgrade pip wheel
pip install --force-reinstall leb128==1.0.4
```

If you need to visualize the graph (`--visualize`), make sure you have installed `graphviz` on your system. You can use `sudo apt install graphviz` to install.

To verify everything is set up correctly, run the following command:

```shell
Expand Down Expand Up @@ -74,41 +72,27 @@ Analyze:
We will detail these options according to their functionalities.

### Input Arguments
According to values given to input arguments, SeeWasm can deassemble the target binary and construct valid inputs.
SeeWasm can deassemble the target binary and construct valid inputs based on the values of the input arguments.

Specifically, `-f` is not an optional option, following which the path of to be analyzed Wasm binary should be given.
`--stdin STRING` and `--sym_stdin N` can pass concrete or symbolic bytes through the stdin stream.
The difference between them is that a concrete string has to be passed through `--stdin`, and a string consisting of `N` symbolic characters need to be passed through `--sym_stdin`.
For example, `--sym_stdin 5` will input 5 symbolic bytes if some functions need to read input from stdin.
Specifically, `-f` option is mandatory, and it must be followed by the path of the Wasm binary to be analyzed. The `--stdin STRING` and `--sym_stdin N` options allow users to pass concrete and symbolic bytes through the stdin stream, respectively. A concrete string must be passed using `--stdin`, while a string consisting of `N` symbolic characters must be passed using `--sym_stdin`. For example, `--sym_stdin 5` inputs 5 symbolic bytes for functions that read from stdin.

Similarly, `--args STRING1, STRING2, ...` and `--sym_args N1, N2, ...` pass concrete and symbolic arguments to the Wasm binary.
For instance, if `main` requires three arguments where each of them should be two bytes, `--sym_args 2 2 2` is enough.
Similarly, `--args STRING1, STRING2, ...` and `--sym_args N1, N2, ...` options pass concrete and symbolic arguments to the Wasm binary. For instance, if `main` requires three arguments, each two bytes long, `--sym_args 2 2 2` is enough.

Some programs will interact with files and conduct reading and writing.
SeeWasm can also simulate this by a *symbolic file system*.
Users have to apply `--sym_files N M` to create `N` symbolic files, where each of them has (or can hold) `M` bytes at most.
Some programs interact with files. SeeWasm simulates this using a *symbolic file system*. Users can create `N` symbolic files, each with up to `M` bytes, using the `--sym_files N M` option.

Finally, as several high-level programming languages can be compiled to Wasm binaries. We have achieved some specific optimizations, but users have to indicate the source language by `--source_types`.
As multiple high-level programming languages can be compiled to Wasm binaries, we have implemented specific optimizations. To take advantage of these optimizations, users must indicate the source language using the `--source_type` option.

### Features
`--entry` can tell SeeWasm which function is the entry, from which the symbolic execution performs.
Note that the `__original_main` is the default entry for all Wasm binaries following WASI standard.
The toolchain we mentioned in the [previous section](README.md#prerequisites) can generate Wasm binaries following WASI standard.
`--entry` specifies the entry function from which symbolic execution begins. By default, the entry function is `__original_main`. Users must specify a proper entry function to ensure the symbolic execution is performed correctly.

As we mentioned in our paper, the given Wasm will be parsed into ICFG.
Sometimes visualizing the ICFG is necessary for debugging.
Thus `--visualize` can achieve this goal.
The input Wasm is parsed into an Interprocedural Control Flow Graph (ICFG), which can be visualized for debugging purposes using the `--visualize` option (requires `graphviz`, installable via `sudo apt install graphviz` on Ubuntu).

During symbolic execution, constraints solving is a bottleneck for the performance.
We have implemented a set of optimizations on the solving process.
The `--incremental` refers to *incremental solving*, which may not always introduce positive optimizations during the analysis. Therefore, we set a flag to allow users to decidie if enable the incremental solving.
The constraint solving process is a bottleneck for symbolic execution performance; however, we have implemented some optimizations to mitigate this issue. The `--incremental` flag enables *incremental solving*. Note that it may not always yield positive results during analysis, and is therefore optional.

The `-v` is an optional option.
Accoding to different values, different levels of logging can be generated, which may help the debugging.
The `-v` option controls the logging level, allowing users to adjust the verbosity of logging output to aid in debugging.

### Analyze
The `-s` is a mandatory option.
It will enable the symbolic execution analysis on the given Wasm binary.
The `-s` is a mandatory option. It enables symbolic execution analysis on the given Wasm binary.

## Output
The output of SeeWasm, including logs and results, is stored in the `log` folder, with each file named according to the pattern `NAME_TIMESTAMP`.
Expand All @@ -130,8 +114,8 @@ The result is a JSON file containing feasible paths with their solutions, format

```json
{
"Status": xxx,
"Solution": {xxx},
"Status": "xxx",
"Solution": {"xxx"},
"Output": [
{
"name": "stdout",
Expand All @@ -146,13 +130,13 @@ The result is a JSON file containing feasible paths with their solutions, format
```

## Example
If we want to execute a program which does not requrie any extra arguments and input, the command should be:
To execute a program that takes no extra arguments or input, use the following command:

```shell
python3 launcher.py -f PATH_TO_WASM_BINARY -s
```

If compilicated arguments are required, for example, a `base64` program whose `main` is like:
If compilicated arguments are required, for example, a `base64` program with a `main` function like:

```c
// main of base64
Expand All @@ -170,41 +154,14 @@ int main(int argc, char **argv)
// encode or decode
}
```
We can see that the `base64` not only requires a two bytes arguments, but also needs a string of input to encode or decode. Also, the encoded or decoded results will go to a file.
Thus, the command to analyze the `base64` is like:
```shell
python3 launcher.py -f PATH_TO_BASE64 -s --sym_args 2 --sym_stdin 5 --sym_files 1 10 -v info
```

## Citations
If you use our tool or dataset in your research for publication, please kindly cite the following paper:

```
@inproceedings{he2023eunomia,
author = {He, Ningyu and Zhao, Zhehao and Wang, Jikai and Hu, Yubin and Guo, Shengjian and Wang, Haoyu and Liang, Guangtai and Li, Ding and Chen, Xiangqun and Guo, Yao},
title = {Eunomia: Enabling User-Specified Fine-Grained Search in Symbolically Executing WebAssembly Binaries},
year = {2023},
isbn = {9798400702211},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3597926.3598064},
doi = {10.1145/3597926.3598064},
booktitle = {Proceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis},
pages = {385–397},
numpages = {13},
keywords = {WebAssembly, Symbolic Execution, Domain Specific Language, Path Explosion},
location = {Seattle, WA, USA},
series = {ISSTA 2023}
}
The `base64` program expects two-byte arguments and a string input to encode or decode, producing output that is written to a file.
Thus, the command to analyze `base64` is like:
```shell
python3 launcher.py -f PATH_TO_BASE64 -s --sym_args 2 --sym_stdin 5 --sym_files 1 10
```

## Feedback

If you have any questions or need further clarification, please post on the [Issues](https://github.com/HNYuuu/SeeWasm/issues) page.

## Acknowledgements

We would like to thank the anonymous reviewers for their valuable feedback and suggestions.

0 comments on commit 91a6e1d

Please sign in to comment.