Usage

You can run SymFusion using:

  • a Python wrapper: quick way of running the tool without setting variables, creating directories, etc.

  • a Rust wrapper: an improved version of the wrapper proposed by SymCC. It is faster than the Python wrapper but requires a few preliminary steps. We show how to use it to run SymFusion in a hybrid fuzzing setup with AFL++.

Concolic execution (Python wrapper)

To run SymFusion in standalone mode, you need to execute the script ./runner/symfusion.py. For instance:

$ ./runner/symfusion.py -o ./workdir -i ./seeds -- ./program [args] @@

will run concolic execution on ./program, passing the (optional) arguments args, using the initial inputs available in the directory ./seeds, and writing the results to the directory ./workdir. As with AFL, when @@ appears in the command line, SymFusion assumes that the program reads its input from a file stored on the filesystem and replaces @@ with the path of that file at run time. When @@ is not used, SymFusion assumes that the program reads its input from standard input. The exploration follows multiple paths and halts when SymFusion can no longer generate new interesting inputs.
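The @@ convention can be pictured with a short sketch. The helper build_command below is hypothetical and is not part of SymFusion; it only mirrors the behavior described above, assuming that every occurrence of @@ is replaced by the path of the current input.

```python
from typing import List, Optional, Tuple


def build_command(argv: List[str], input_path: str) -> Tuple[List[str], Optional[str]]:
    """Return (command, stdin_path) for one run of the target.

    Hypothetical helper for illustration only, not SymFusion code.
    """
    if any("@@" in arg for arg in argv):
        # File mode: substitute the input path for each @@ placeholder.
        return [arg.replace("@@", input_path) for arg in argv], None
    # Stdin mode: the command is unchanged and the input is fed on stdin.
    return list(argv), input_path


file_cmd, file_stdin = build_command(["./program", "-x", "@@"], "/tmp/input")
stdin_cmd, stdin_path = build_command(["./program", "-x"], "/tmp/input")
```

In the first call the placeholder is substituted and nothing is sent on standard input; in the second call the command line is left untouched and the input file would be redirected to standard input.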

Several other options can be set to enable additional features:

  • -a, --afl AFL_WORKDIR: this enables the AFL++ mode;

  • -t, --timeout TIMEOUT: maximum running time for each input (secs);

  • -f, --fork-server: run SymFusion with the fork server;

  • -q, --queue-mode {symfusion, qsym, hash}: how to assess whether a generated input is interesting and how to pick inputs from the queue. symfusion uses a custom edge tracer. qsym uses AFL++ afl-showmap (as done by QSYM) and requires an uninstrumented binary <program>.symqemu. hash keeps in the queue the inputs that have a different hash.

  • --keep-run-dirs: when this option is set, the intermediate run directories (workdir/symfusion-XXXXX), which contain tracer/solver logs and generated test cases (before uninteresting ones are discarded), are not deleted;

  • -d, --debug {output, gdb}: run SymFusion only on the first seed from the input directory. output will show you the full output of the run. gdb will execute SymFusion under GDB.

The full list of SymFusion options can be seen using ./runner/symfusion.py --help.
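As an illustration of the hash queue mode listed above, the sketch below keeps only inputs whose content hash has not been seen before. It is a minimal sketch under stated assumptions: the function keep_unique is hypothetical, and SHA-256 is an arbitrary choice, since the source does not say which hash function SymFusion uses.

```python
import hashlib
from typing import Iterable, List, Set


def keep_unique(inputs: Iterable[bytes]) -> List[bytes]:
    """Keep the first occurrence of each distinct input (hypothetical helper)."""
    seen: Set[str] = set()
    kept: List[bytes] = []
    for data in inputs:
        # Assumption: SHA-256 stands in for whatever hash the tool uses.
        digest = hashlib.sha256(data).hexdigest()
        if digest not in seen:
            seen.add(digest)
            kept.append(data)
    return kept


queue = keep_unique([b"AAAA", b"BBBB", b"AAAA"])
```

Unlike the symfusion and qsym modes, a filter of this kind looks only at the bytes of the input and not at the coverage it produces.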

After (and during) an exploration, the workdir will typically contain the following files:

  • symfusion-XXXXX/ (kept only when --keep-run-dirs is used): e.g., {symfusion-00000, symfusion-00001, ...}

    • output.log: standard output and standard error of the run

    • id:XXXXXX: e.g., id:000000,src:seed, this is the seed used for the run

    • YYYYYY: e.g., 000000, a generated input

  • queue/: interesting test cases generated by SymFusion

Hence, when looking for interesting test cases generated by SymFusion, check the directory queue.
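A small script can collect the test cases from the queue directory. This is a minimal sketch based only on the layout described above; the function name list_queue is hypothetical.

```python
from pathlib import Path
from typing import List


def list_queue(workdir: str) -> List[Path]:
    """Return the files in <workdir>/queue, sorted by name (hypothetical helper)."""
    queue = Path(workdir) / "queue"
    if not queue.is_dir():
        # The queue may not exist yet if the exploration has not started.
        return []
    return sorted(p for p in queue.iterdir() if p.is_file())


if __name__ == "__main__":
    for testcase in list_queue("./workdir"):
        print(testcase)
```

Sorting by name lists the inputs in order of their id: prefix.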

Example

Let us consider the program tests/example/example.c:

#include <stdio.h>
#include <stdlib.h>

int magic_check(int p){
    if (p == 0xDEADBEEF)
        return 1;
    else
        return 0;
}

int get_input(char* fname) {
    FILE* fp = fopen(fname, "r");
    if (fp == NULL) exit(EXIT_FAILURE);
    int data;
    int r = fread(&data, 1, sizeof(data), fp);
    if (r != sizeof(data)) exit(EXIT_FAILURE);
    fclose(fp);
    return data;
}

int main(int argc, char* argv[]) {

    if (argc != 2) exit(EXIT_FAILURE);
    int input = get_input(argv[1]); // read four bytes from the input file
    if (magic_check(input)) {
        printf("Correct value [%x] :)\n", input);
    } else {
        printf("Wrong value [%x] :(\n", input);
    }

    return 0;
}

Our goal is to automatically find the magic value 0xDEADBEEF that is expected by the function magic_check. Since we do not know the magic value beforehand, we consider as an initial seed a file (tests/example/inputs/seed.dat) containing just the AAAA\n characters.

We build the program in two versions (non-instrumented and instrumented):

$ cd tests/example
$ clang-10 -o example example.c                                 # non-instrumented version
$ ../../symcc-hybrid/build/symcc -o example.symfusion example.c # instrumented version

If we run the (non-instrumented) binary over the initial seed:

$ ./example ./inputs/seed.dat

we should get the following output:

Wrong value [41414141] :(
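The printed value follows from how fread copies the first bytes of the file into an int. The sketch below reproduces the arithmetic, assuming a 4-byte int and a little-endian machine such as x86_64; it also shows which bytes an input file would need for the check to pass.

```python
import struct

# Content of the seed file: the characters AAAA followed by a newline.
seed = b"AAAA\n"

# get_input() reads sizeof(int) bytes; assume 4 bytes, little-endian.
(value,) = struct.unpack("<I", seed[:4])
print(f"{value:x}")  # 41414141

# Bytes that decode to the magic value under the same assumptions.
magic = struct.pack("<I", 0xDEADBEEF)
print(magic.hex())  # efbeadde
```

Each 'A' is the byte 0x41, which is why the program prints 41414141 for the seed.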

Now, if we instead start the concolic exploration with SymFusion using the instrumented binary:

$ ../../runner/symfusion.py -o ./out/ -i ./inputs/ -- ./example.symfusion @@

the output should be similar to:

Done generating config.

Evaluating seed inputs:
Testcase seed.dat [1f1964918bb5b9a1, 27ad6cb4038e12a]: new_edges=[app=10, all=4181], unique_path=[app=True, all=True]
Summary of testcases: duplicated=0 unique=1 tracer_time=0.06

Picking input from queue Edges (application code): id:000000,src:seed [score=10, count=1, waiting=0]
Working directory: workdir => symfusion-00000000
Running...
Completed (time=0.052)
Testcase 000000 [3efe4a0512d16b03, 8eb61eab981997fd]: new_edges=[app=4, all=265], unique_path=[app=True, all=True]
Summary of testcases: duplicated=0 unique=1 tracer_time=0.00
Run took 0.06 secs [runner=0.05, tracer=0.00, tracer_run=0.00, tracer_comp=0.00, total_tracer=0.00, total_runner=0.05, avg_total_runner=0.052, all_time=0.06, pick_time=0.00]

Picking input from queue Edges (application code): id:000001,src:id:000000 [score=4, count=2, waiting=0]
total_run=0.06, total_pick=0.00, all_time=0.06
Working directory: workdir => symfusion-00000001
Running...
Completed (time=0.055)
Summary of testcases: duplicated=1 unique=0 tracer_time=0.00
Run took 0.07 secs [runner=0.06, tracer=0.00, tracer_run=0.00, tracer_comp=0.00, total_tracer=0.01, total_runner=0.11, avg_total_runner=0.054, all_time=0.13, pick_time=0.00]
total_run=0.13, total_pick=0.00, all_time=0.13

[SymFusion] no more testcases. Finishing.

and we can see that the generated input is indeed accepted by the non-instrumented program:

$ ./example out/queue/id\:000001\,src\:id\:000000 
Correct value [deadbeef] :)

Hybrid fuzzing setup with AFL++ (Rust wrapper)

We consider again the simple example from ./tests/example/. Several steps are needed; we review each of them below, and they are all collected in the script ./tests/example/start-rust-aflpp.sh.

To start, we build two versions of the program:

$ cd ./tests/example/
$ ../../symcc-hybrid/build/symcc -o example.symfusion example.c # build for SymFusion
$ /afl/afl-clang-fast example.c -o example.afl                  # build for AFL++

Let us create an output directory:

$ export OUTPUT=`pwd`/out
$ mkdir ${OUTPUT}

And set some environment variables:

$ export HYBRID_CONF_FILE=$OUTPUT/hybrid.conf
$ export LD_BIND_NOW=1
$ export SYMFUSION_HYBRID=1
$ export WRAPPER=`pwd`/../../symqemu-hybrid/x86_64-linux-user/symqemu-x86_64
$ export SYMCC_ENABLE_LINEARIZATION=1
$ export SEEDS=`pwd`/inputs                     # initial set of inputs for AFL++

We now build the hybrid configuration:

$ ../../runner/symfusion.py -g ${HYBRID_CONF_FILE} -i ${SEEDS} -o ${OUTPUT}/concolic -- ./example.symfusion @@
$ rm -rf ${OUTPUT}/concolic

You can start AFL++ in the background:

$ /afl/afl-fuzz -M afl-master -t 5000 -m 100M -i ${SEEDS} -o ${OUTPUT} -- ./example.afl @@ >/dev/null 2>&1 &

And then wait until AFL++ creates:

  • ${OUTPUT}/afl-master/fuzzer_stats

  • ${OUTPUT}/afl-master/fuzz_bitmap
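The wait can be automated with a small polling loop. This is a minimal sketch: the function wait_for_afl, its default timeout, and its polling interval are assumptions for illustration; only the two file names come from the list above.

```python
import time
from pathlib import Path


def wait_for_afl(output_dir: str, fuzzer: str = "afl-master",
                 timeout: float = 60.0, poll: float = 0.5) -> bool:
    """Return True once both AFL++ files exist, False on timeout.

    Hypothetical helper; timeout and poll are given in seconds.
    """
    base = Path(output_dir) / fuzzer
    required = [base / "fuzzer_stats", base / "fuzz_bitmap"]
    deadline = time.monotonic() + timeout
    while True:
        if all(p.exists() for p in required):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)
```

A caller could pass the value of the OUTPUT environment variable as output_dir and start the concolic executor only when the function returns True.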

You can now start SymFusion with:

$ ../../symcc-hybrid/build/symcc_fuzzing_helper -a afl-master -o ${OUTPUT} -n concolic -- ${WRAPPER} ./example.symfusion @@

Use Ctrl+C to stop the concolic executor and AFL++.