Symbolic execution with SymCC: Don't interpret, compile!
Proceedings of the 29th USENIX Security Symposium (USENIX Security 20), Boston, MA, USA
A major impediment to practical symbolic execution is speed, especially when compared to near-native speed solutions like fuzz testing. We propose a compilation-based approach to symbolic execution that performs better than state-of-the-art implementations by orders of magnitude. We present SymCC, an LLVM-based C and C++ compiler that builds concolic execution right into the binary. It can be used by software developers as a drop-in replacement for clang and clang++, and we show how to add support for other languages with little effort. In comparison with KLEE, SymCC is faster by up to three orders of magnitude and an average factor of 12. It also outperforms QSYM, a system that recently showed great performance improvements over other implementations, by up to two orders of magnitude and an average factor of 10. Using it on real-world software, we found that our approach consistently achieves higher coverage, and we discovered two vulnerabilities in the heavily tested OpenJPEG project, which have been confirmed by the project maintainers and assigned CVE identifiers.
SymCC is a fast compiler-based symbolic execution engine. On this page, we provide SymCC's source code, the raw results of the experiments described in the paper, and instructions on how to replicate those experiments yourself.
SymCC is available on GitHub.
In the paper, we describe two sets of experiments: we first benchmark SymCC on the CGC programs, then we run it on real-world software. This section describes how to replicate our experiments, and provides links to our results.
CGC experiments (our results for pure execution time, concolic execution time, and coverage)
We used the Linux port of the CGC programs by Trail of Bits. SymCC needs to be built with support for 32-bit compilation (see docs/32-bit.txt; this is not part of the Dockerfile because it would double the build time of the container while providing value to just a few users). Then you can simply export CC=/path/to/symcc, CXX=/path/to/sym++ and SYMCC_NO_SYMBOLIC_INPUT=1, and build the CGC programs as usual (i.e., by invoking their build.sh script).
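For illustration, the whole build could look roughly like this (paths are placeholders, and the repository URL refers to the Trail of Bits port of the CGC programs, cb-multios):

  export CC=/path/to/symcc
  export CXX=/path/to/sym++
  export SYMCC_NO_SYMBOLIC_INPUT=1   # build (and later run) without symbolic input handling
  git clone https://github.com/trailofbits/cb-multios.git
  cd cb-multios
  ./build.sh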
Run the programs on the raw PoV inputs with SYMCC_NO_SYMBOLIC_INPUT=1 to measure pure execution time, and unset the environment variable for symbolic execution. To assess coverage, we ran afl-showmap with the AFL-instrumented CGC programs on each generated input and accumulated the resulting coverage maps, yielding a set of covered map entries per CGC program. The sizes of those sets can then be fed to the scoring formula presented in the paper.
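As a sketch (the binary and input names are made up, and the AFL-instrumented build is assumed to be named CROMU_00001_afl), the timing measurements and the coverage accumulation could be done along these lines:

  # pure execution time: the input is treated concretely
  export SYMCC_NO_SYMBOLIC_INPUT=1
  time ./CROMU_00001 < pov_1.pov
  # concolic execution time: the input is treated symbolically
  unset SYMCC_NO_SYMBOLIC_INPUT
  time ./CROMU_00001 < pov_1.pov

  # coverage: accumulate the map entries that afl-showmap reports for each generated input
  for f in generated_inputs/*; do
      afl-showmap -q -o /tmp/map -- ./CROMU_00001_afl < "$f"
      cut -d: -f1 /tmp/map >> entries.txt
  done
  sort -u entries.txt | wc -l   # size of the set of covered map entries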
For KLEE and QSYM, we used the setup described here but with the regular 32-bit binaries built by cb-multios.
Real-world software
The analysis of real-world software always follows the same procedure. Assuming you have exported CC=symcc, CXX=sym++ and SYMCC_NO_SYMBOLIC_INPUT=1, first download the code, then build it with its own build system, and finally unset SYMCC_NO_SYMBOLIC_INPUT and analyze the program in concert with AFL (which requires building the target a second time with AFL instrumentation; see docs/Fuzzing.txt). We used AFL 2.56b and built the targets with AFL_USE_ASAN=1. Note that the fuzzing helper is already installed in the Docker container.
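For a hypothetical autoconf-based target, preparing the two builds might look like this (the SymCC build first, then the AFL build required by the fuzzing helper; names and build commands are placeholders):

  # SymCC-instrumented build
  export CC=symcc CXX=sym++ SYMCC_NO_SYMBOLIC_INPUT=1
  ./configure && make
  cp ./target ./target_symcc

  # AFL-instrumented build with ASAN, as in our experiments
  make clean
  export CC=afl-gcc CXX=afl-g++ AFL_USE_ASAN=1
  ./configure && make
  cp ./target ./target_afl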
All experiments used one AFL master process, one secondary AFL process, and one SymCC process. We let them run for 24 hours and repeated each experiment 30 times to create the graphs in the paper; AFL map density was extracted from the secondary AFL process's plot_data file, column map_size.
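Concretely, using the two builds from above, the three processes could be started roughly as follows (seed and output directory names are placeholders, the exact helper invocation is described in docs/Fuzzing.txt, and -m none is needed because of the ASAN build):

  afl-fuzz -M afl-master    -i seeds -o afl_out -m none -- ./target_afl @@
  afl-fuzz -S afl-secondary -i seeds -o afl_out -m none -- ./target_afl @@
  symcc_fuzzing_helper -o afl_out -a afl-secondary -n symcc -- ./target_symcc @@

The map density over time can then be read from afl_out/afl-secondary/plot_data.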
The QSYM experiments used an analogous setup, replacing SymCC with QSYM and running it with AFL according to the QSYM authors' instructions.
Response: In the paper, we report the median speedup factor on the set of programs that both SymCC and the system we compare to (i.e., either KLEE or QSYM) support. The figure, however, shows the execution times for _all_ programs; this makes it look as if KLEE were faster than QSYM, when in reality it simply failed to execute some long-running programs that QSYM handled successfully.
Response: While we believe that considering only supported programs is the right way to accurately assess speed differences, explaining the calculation would have taken too much time in the recorded video for USENIX. We therefore opted for a simpler strategy that underestimates SymCC's performance (and is thus not unfair to KLEE and QSYM): we simply computed the median execution times across all programs and derived the speedup factor from those.
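As a purely illustrative example with made-up numbers: if the median execution time across all programs were 30 seconds under KLEE and 2 seconds under SymCC, the video would report a speedup factor of 15, independently of which programs each system supports.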
QSYM takes great care to make branch IDs unique. In SymCC, we use the values of pointers to LLVM structures as IDs. Those pointer values aren't stable across compiler runs; however, they are only used at compile time, and we compile just once. The IDs are embedded into the instrumented binaries as constants, so they're stable across executions of the target program; you can see them in the disassembly.
This work was supported by the DAPCODS/IOTics ANR 2016 project (ANR-16-CE25-0015).
Feel free to reach out to us if anything is unclear or if you need more information.