# Usage ## Fuzzy-SAT ### Command-line interface The program `fuzzy-solver` can be used to solve SMT queries using the Fuzzy-SAT algorithm: ``` $ fuzzy-solver --help Usage: ./build/bin/fuzzy-solver [OPTIONS] -h, --help print this help and exit -q, --query SMT2 query filename (required) -s, --seed binary seed file (required) -o, --out output directory --dsat dump sat queries --dproofs dump sat proofs --notui no text UI ``` It takes as mandatory command-line arguments an smt2 query file with `--query` and a seed file with `--seed`. Optionally, the user can specify an output folder (`--out`) where it dumps the sat queries (`--dsat`) and the assignments for sat queries (`--dproofs`). `fuzzy-solver` will try to solve every _assert_ contained in the _smt2_ file. Note that the symbols in the smt2 file MUST be declared as 8-bit bitvectors, and they must be named `k!`, where `` is the index of the i-th byte in the seed that represents the initial assignment for that symbol. This is an example of smt2 query and seed file: _query.smt2_ ``` smt2 (declare-const k!0 (_ BitVec 8)) (declare-const k!1 (_ BitVec 8)) (declare-const k!2 (_ BitVec 8)) (declare-const k!3 (_ BitVec 8)) (assert (= #xdeadbeef (bvadd (concat k!0 k!1 k!2 k!3) #xabadcafe)) ) ``` _seed.bin_ ``` 0000 ``` If you are using the `fuzzolic-runner` container, you can find the executable in `~/solver/fuzzy-solver-cli/build/bin/fuzzy-solver`. #### Text UI By default, `fuzzy-solver` uses a text UI where it prints useful statistics about the solving process: ``` query 992/992 o-------------------------------------------------------------o | num eval: 400528 sat: 147 (1) [727 opt] | | | | its: 91 its ext: 14 | | sm: 8 bf: 11 | | rbf: 4 rbf opt: 6 | | gd: 0 havoc: 0 | | bit flips: 8, 0, 0 byte flips: 0, 0, 0, 0 | | arithms: 0 ints: 0 | | multigoal: 5 fallbacks: 164 (1120)si (115)nt | | | | # confl: 1203 # univ def: 0 | | confl cache size: 198 ast info cache size: 5378 | | num timeouts: 0 ast info cache hits: 6387 | | | | avg eval time: 1.293 usec | o-------------------------------------------------------------o ``` ### C/C++ Bindings We designed FuzzySAT as a library to be integrated into a concolic executor. Before detailing the various APIs, let us clarify some details on the usage of the library: - The APIs that take an AST as input expect as symbols only 8-bit bitvectors whose symbol name has been created using `Z3_mk_int_symbol`. This allows the library to easily map a symbol to its assignment. - The context of Z3 can be created either with `Z3_mk_context` or `Z3_mk_context_rc`, allowing the lib to be used with both the C and the C++ APIs of Z3. - At the moment, the library is NOT thread-safe. #### z3fuzz_init ``` void z3fuzz_init(fuzzy_ctx_t* ctx, Z3_context z3_ctx, char* seed_filename, void* /* NULL */, void* /* NULL */, unsigned timeout) ``` It initializes the context of the solver: - `ctx`: The context of the solver to be initialized. - `z3_ctx`: The z3 context (already initialized). - `seed_filename`: The filename of the seed. - `timeout`: Timeout for trying to solve a query in ms. If zero, no timeout. #### z3fuzz_free ``` void z3fuzz_free(fuzzy_ctx_t* ctx); ``` It releases all the memory allocated by `z3fuzz_init`. #### z3fuzz_evaluate_expression ``` unsigned long z3fuzz_evaluate_expression(fuzzy_ctx_t* ctx, Z3_ast expr, unsigned char* values); ``` It evaluates the expression `expr` using as assignments the values in the array `values`. #### z3fuzz_query_check_light ``` int z3fuzz_query_check_light(fuzzy_ctx_t* ctx, Z3_ast pi, Z3_ast branch_condition, unsigned char const** proof, unsigned long* proof_size); ``` It tries to solve the query `branch_condition ^ pi` using the FuzzySAT algorithm. - `ctx`: The context of the solver. - `pi`: The path constraint. - `branch_condition`: The branch condition. - `proof`: Output buffer that contains the resulting assignment if the function returns `1`. - `proof_size`: Output value that contains the resulting length of assignment if the function returns `1`. The function succeeds if it returns `1`. #### z3fuzz_get_optimistic_sol ``` int z3fuzz_get_optimistic_sol(fuzzy_ctx_t* ctx, unsigned char const** proof, unsigned long* proof_size); ``` If the last call to `z3fuzz_query_check_light` failed, this function tries to find an assignment for the `branch_condition` of the last query, ignoring `pi`. - `ctx`: The context of the solver. - `proof`: Output buffer that contains the resulting assignment if the function returns `1`. - `proof_size`: Output value that contains the resulting length of assignment if the function returns `1`. The function succeeds if it returns `1`. #### z3fuzz_maximize and z3fuzz_minimize ``` unsigned long z3fuzz_maximize(fuzzy_ctx_t* ctx, Z3_ast pi, Z3_ast to_maximize, unsigned char const** out_values, unsigned long* out_len); unsigned long z3fuzz_minimize(fuzzy_ctx_t* ctx, Z3_ast pi, Z3_ast to_minimize, unsigned char const** out_values, unsigned long* out_len); ``` They try to minimize/maximize the expression given the constraints in `pi`. The functions assume that `pi` evaluates as True in the seed. - `ctx`: The context of the solver. - `to_maximize` / `to_minimize`: The expression to maximize/minimize. It must be a bitvector. - `out_values`: Output buffer that contains the assignments that maximize/minimize the expression. - `out_len`: Output variable that contains the size of the `out_values` buffer. The function always succeeds (but it is not guaranteed that it finds a global minimum/maximum) and returns the value of the maximized/minimized expression. #### z3fuzz_find_all_values and z3fuzz_find_all_values_gd ``` typedef enum fuzzy_findall_res_t { Z3FUZZ_GIVE_NEXT, Z3FUZZ_STOP } fuzzy_findall_res_t; void z3fuzz_find_all_values( fuzzy_ctx_t* ctx, Z3_ast expr, Z3_ast pi, fuzzy_findall_res_t (*callback)(unsigned char const* out_bytes, unsigned long out_bytes_len, unsigned long val)); void z3fuzz_find_all_values_gd( fuzzy_ctx_t* ctx, Z3_ast expr, Z3_ast pi, int to_min, fuzzy_findall_res_t (*callback)(unsigned char const* out_bytes, unsigned long out_bytes_len, unsigned long val)); ``` They generate valid assignments for the expression `expr` given the path constraint `pi`. The user must define a `callback` that is called every time the function generates a new assignment. The callback can return `Z3FUZZ_GIVE_NEXT` to ask for another value or `Z3FUZZ_STOP` to stop. The only difference between `z3fuzz_find_all_values_gd` and `z3fuzz_find_all_values` is that the former uses gradient descent to generate the values, and has an additional parameter (`to_min`) that specify the direction (towards the maximum or the minimum). #### z3fuzz_notify_constraint ``` void z3fuzz_notify_constraint(fuzzy_ctx_t* ctx, Z3_ast constraint); ``` It notifies the solver that a new constraint has been added to `pi`. #### z3fuzz_dump_proof ``` void z3fuzz_dump_proof(fuzzy_ctx_t* ctx, const char* filename, unsigned char const* proof, unsigned long proof_size); ``` It dumps a proof returned by `z3fuzz_query_check_light` or `z3fuzz_get_optimistic_sol` on the file specified in `filename`. ## Concolic execution (standalone mode) To run fuzzolic in standalone mode, you need to execute the script `./fuzzolic/fuzzolic.py`. For instance: ``` $ ./fuzzolic/fuzzolic.py -o ./workdir -i ./seeds -- ./program [args] @@ ``` will run concolic execution on `./program`, passing (optional) arguments `args`, using initial inputs available in the directory `seeds`, generating the results in the directory `./workdir`. Similarly to AFL, since `@@` is specified for the program, then fuzzolic will assume that the program is getting the input from a file stored on the filesystem (fuzzolic will replace `@@` with the correct path at runtime). When `@@` is not used, fuzzolic will assume that the input is obtained by reading from the standard input. The exploration will follow multiple paths, halting when no new interesting inputs can be generated anymore by fuzzolic. By default, fuzzolic will use the SMT solver Z3. To select Fuzzy-SAT as the solver backend, add the option `-f` (or `--fuzzy`): ``` $ ./fuzzolic/fuzzolic.py -f -o ./workdir -i ./seeds -- ./program [args] @@ ``` Several other options can be set to enable additional features: * `-a, --afl AFL_WORKDIR`: this enables the AFL mode, see [next section](#hybrid-fuzzing-afl-mode); * `-t, --timeout TIMEOUT`: maximum cumulative solving time (ms) within one path exploration; * `-p, --optimistic-solving`: this enables optimistic solving; * `-k, --single-path`: fuzzolic will perform a single-path exploration (first input from the seed directory) * `--keep-run-dirs`: intermediate run directories (`workdir/fuzzolic-XXXXX`), containing tracer/solver logs and generated testcases (before discarding uninteresting ones), will not be deleted when this option is set; * `-r, --address-reasoning`: fuzzolic will try to generate testcases that cover different memory addresses when detecting a symbolic pointer during the exploration; * `-l, --symbolic-models`: fuzzolic will use models to reason over known libc functions (e.g., `memcpy`). The full list of fuzzolic options can be seen using `./fuzzolic/fuzzolic.py --help`. The options used for the ICSE paper are: * when using FUZZY-SAT: `--fuzzy`, `--address-reasoning`, `--optimistic-solving`, `--timeout 90000` * when using Z3: `--address-reasoning`, `--optimistic-solving`, `--timeout 90000` After (and during) an exploration, the workdir will typically contain the following files: * `{afl-bitmap, afl-crash-bitmap, branch_bitmap, context_bitmap}`: bitmaps used by fuzzolic to evaluate whether a branch query is interesting; * `fuzzolic-XXXXX/` (kept only when `--keep-run-dirs` is used): e.g., `{fuzzolic-00000, fuzzolic-00001, ...}` * `solver.log`: log of the solver backend * `tracer.log`: log of the QEMU tracer * `test_case_ZZZ_YYY.dat`: e.g., `test_case_0_0.dat`, a test case generated by fuzzolic for path id=`XXXXXX` when processing query with id=`ZZZ` * `queue/`: * when in standalone mode: test cases (from the seed directory or from previous `fuzzolic-XXXXX/` directories) that still need to be processed by fuzzolic during the current exploration * when in AFL mode: interesting test cases generated by fuzzolic * `tests/`: * when in standalone mode: interesting test cases generated by fuzzolic * when in AFL mode: not used Hence, when looking for interesting test cases generated by fuzzolic, check the directory `tests` when using fuzzolic in standalone mode. On the other hand, when using fuzzolic in AFL mode check the directory `queue`. ### Example Let us consider the program [`tests/example/example.c`](https://github.com/season-lab/fuzzolic/blob/master/tests/example/example.c#L21): ``` #include #include int magic_check(int p){ if (p == 0xDEADBEEF) return 1; else return 0; } int get_input(char* fname) { FILE* fp = fopen(fname, "r"); if (fp == NULL) exit(EXIT_FAILURE); int data; int r = fread(&data, 1, sizeof(data), fp); if (r != sizeof(data)) exit(EXIT_FAILURE); fclose(fp); return data; } int main(int argc, char* argv[]) { if (argc != 2) exit(EXIT_FAILURE); int input = get_input(argv[1]); // read four bytes from the input file if (magic_check(input)) { printf("Correct value [%x] :)\n", input); } else { printf("Wrong value [%x] :(\n", input); } return 0; } ``` Our goal is to automatically find the magic value `0xDEADBEEF` that is expected by the function `magic_check`. Since we do not know the magic value beforehand, we consider as an initial seed a file ([`tests/example/inputs/seed.dat`](https://github.com/season-lab/fuzzolic/blob/master/tests/example/inputs/seed.dat)) containing just the `AAAA\n` characters. Moreover, since fuzzolic is a binary concolic executor, we need to compile the example into a working binary: ``` $ gcc -o tests/example/example tests/example/example.c ``` If we run the program over the initial seed: ``` $ ./tests/example/example tests/example/inputs/seed.dat ``` we should get the following output: ``` Wrong value [41414141] :( ``` Now, if we instead start the concolic exploration with fuzzolic: ``` $ ./fuzzolic/fuzzolic.py --address-reasoning --optimistic-solving --timeout 90000 -o ./workdir/ -i tests/example/inputs/ -- ./tests/example/example @@ ``` The output should be similar to: ``` Configuration file for /home/ubuntu/fuzzolic/tests/example/example is missing. Using default configuration. Running directory: /home/ubuntu/fuzzolic/workdir/fuzzolic-00000 Using SMT solver [-] Discarding test_case_11_0.dat [-] Discarding test_case_3_0.dat [+] Keeping test_case_1_0.dat [-] Discarding test_case_7_0.dat [-] Discarding test_case_5_0.dat Run took 0.3 secs Running directory: /home/ubuntu/fuzzolic/workdir/fuzzolic-00001 Using SMT solver [-] Discarding test_case_1_0.dat Run took 0.2 secs [FUZZOLIC] no more testcase. Finishing. ``` We can see that fuzzolic has generated 6 testcases (`test_case_{1, 3, 5, 7, 11}_0.dat`) during run with id=`00000`, where the first one (`test_case_1_0.dat`) is due to the internal check from the `magic_check` function, while the other ones are due to the `printf` function inside the `main` function (which performs some checks). Moreover, we can see that fuzzolic is keeping only the first test case (see `[+]` in the output) and is discarding the other ones (see `[-]` in the output): only the first test case bring more code coverage in the program. Test case `test_case_1_0.dat` is copied (renaming it as `test_case_000_000.dat`) into both the `queue` and `tests` directories. In the second run (id=`00001`), fuzzolic performs another exploration using the new input (picking it from the queue) and generates only a single test case (`test_case_1_0.dat`), which is however dubbed uninteresting and thus discarded. If we check the content of the generated test case using `xxd`: ``` $ xxd workdir/tests/test_case_000_000.dat 00000000: efbe adde 0a ``` we can see that it contains five bytes, where the first four bytes are the little endian representation of `0xdeadbeef`. To test that fuzzolic has indeed found the correct input, we can run again the example program on the generated test case: ``` $ ./tests/example/example workdir/tests/test_case_000_000.dat Correct value [deadbeef] :) ``` and see that the input is accepted by the program. ## Hybrid fuzzing (AFL mode) To run fuzzolic in parallel with AFL++, you can use the script `fuzzolic/run_afl_fuzzolic.py` which supports most features from fuzzolic. For instance, to fuzz a program `lodepng` which takes as argument the input file to process, then: ``` $ ./fuzzolic/run_afl_fuzzolic.py --address-reasoning --optimistic-solving --timeout 90000 --fuzzy -o workdir/ -i ../benchmarks/lodepng/seeds/ -- ../benchmarks/lodepng/lodepng @ ``` The output should be similar to: ``` Running: /home/ubuntu/fuzzolic/utils/AFLplusplus/afl-fuzz -c 0 -M afl-master -o workdir/ -i ../benchmarks/lodepng/seeds/ -Q -- ../benchmarks/lodepng/lodepng @@ Waiting 30 seconds before starting slave. Running: /home/ubuntu/fuzzolic/utils/AFLplusplus/afl-fuzz -S afl-slave -o workdir/ -i ../benchmarks/lodepng/seeds/ -Q ../benchmarks/lodepng/lodepng @@ Waiting 30 seconds before starting fuzzolic. Running: /home/ubuntu/fuzzolic/fuzzolic/fuzzolic.py -f -p -r -a workdir//afl-slave/ -i workdir//afl-slave/queue/ -o workdir//fuzzolic -- ../benchmarks/lodepng/lodepng @@ Configuration file for /home/ubuntu/benchmarks/lodepng/lodepng_decode_cg_nocksm is missing. Using default configuration. Running directory: /home/ubuntu/fuzzolic/workdir/fuzzolic/fuzzolic-00000 Using Fuzzy-SAT solver Run took 0.2 secs Running directory: /home/ubuntu/fuzzolic/workdir/fuzzolic/fuzzolic-00001 Using Fuzzy-SAT solver [+] Keeping test_case_6_0.dat [+] Keeping test_case_11_0.dat [-] Discarding test_case_13_0.dat [+] Keeping test_case_3_0.dat [+] Keeping test_case_12_0.dat [+] Keeping test_case_14_0.dat [+] Keeping test_case_15_666.dat [+] Keeping test_case_1_0.dat [-] Discarding test_case_9_0.dat [+] Keeping test_case_10_0.dat [+] Keeping test_case_8_0.dat [+] Keeping test_case_4_0.dat [+] Keeping test_case_7_0.dat [+] Keeping test_case_2_0.dat [+] Keeping test_case_5_0.dat Run took 0.2 secs [...] ``` The generated inputs can be found in `/*/queue/`, i.e.: * `/afl-master/queue/` * `/afl-slave/queue/` * `/fuzzolic/queue/` while crashing inputs under `/*/crashes/`. Inputs generated by fuzzolic will have the expected format for parallel mode: `id:XXXXXX,src:id:YYYYYY`, where `XXXXXX` is an incremental id assigned by fuzzolic to its own inputs and `YYYYYYYY` is the id of the input that was picked by fuzzolic from the queue of the slave and used during the exploration that has generated the new input `XXXXXX`. To run AFL++ with additional flags, you can use `-g, --afl-args ""`. E.g.: ``` ./fuzzolic/run_afl_fuzzolic.py -o workdir/ -i ../benchmarks/lodepng/seeds/ -g "-m 1G" -- ../benchmarks/lodepng/lodepng @@ ``` will set the additional arguments `-m 1G` for AFL++: ``` Running: /home/ubuntu/fuzzolic/utils/AFLplusplus/afl-fuzz -c 0 -M afl-master -o workdir/ -i ../benchmarks/lodepng/seeds/ -m 1G -Q -- ../benchmarks/lodepng/lodepng @@ Waiting 30 seconds before starting slave. Running: /home/ubuntu/fuzzolic/utils/AFLplusplus/afl-fuzz -S afl-slave -o workdir/ -i ../benchmarks/lodepng/seeds/ -m 1G -Q ../benchmarks/lodepng/lodepng @@ Waiting 30 seconds before starting fuzzolic. [...] ```