Results 1 - 10
of
232
KLEE: Unassisted and Automatic Generation of High-Coverage Tests for Complex Systems Programs
"... We present a new symbolic execution tool, KLEE, capable of automatically generating tests that achieve high coverage on a diverse set of complex and environmentally-intensive programs. We used KLEE to thoroughly check all 89 stand-alone programs in the GNU COREUTILS utility suite, which form the cor ..."
Abstract
-
Cited by 557 (15 self)
- Add to MetaCart
(Show Context)
We present a new symbolic execution tool, KLEE, capable of automatically generating tests that achieve high coverage on a diverse set of complex and environmentally-intensive programs. We used KLEE to thoroughly check all 89 stand-alone programs in the GNU COREUTILS utility suite, which form the core user-level environment installed on millions of Unix systems, and arguably are the single most heavily tested set of open-source programs in existence. KLEE-generated tests achieve high line coverage — on average over 90% per tool (median: over 94%) — and significantly beat the coverage of the developers’ own hand-written test suite. When we did the same for 75 equivalent tools in the BUSYBOX embedded system suite, results were even better, including 100 % coverage on 31 of them. We also used KLEE as a bug finding tool, applying it to 452 applications (over 430K total lines of code), where it found 56 serious bugs, including three in COREUTILS that had been missed for over 15 years. Finally, we used KLEE to crosscheck purportedly identical BUSYBOX and COREUTILS utilities, finding functional correctness errors and a myriad of inconsistencies.
CUTE: A Concolic Unit Testing Engine for C
- IN ESEC/FSE-13: PROCEEDINGS OF THE 10TH EUROPEAN
, 2005
"... In unit testing, a program is decomposed into units which are collections of functions. A part of unit can be tested by generating inputs for a single entry function. The entry function may contain pointer arguments, in which case the inputs to the unit are memory graphs. The paper addresses the pro ..."
Abstract
-
Cited by 480 (22 self)
- Add to MetaCart
(Show Context)
In unit testing, a program is decomposed into units which are collections of functions. A part of unit can be tested by generating inputs for a single entry function. The entry function may contain pointer arguments, in which case the inputs to the unit are memory graphs. The paper addresses the problem of automating unit testing with memory graphs as inputs. The approach used builds on previous work combining symbolic and concrete execution, and more specifically, using such a combination to generate test inputs to explore all feasible execution paths. The current work develops a method to represent and track constraints that capture the behavior of a symbolic execution of a unit with memory graphs as inputs. Moreover, an efficient constraint solver is proposed to facilitate incremental generation of such test inputs. Finally, CUTE, a tool implementing the method is described together with the results of applying CUTE to real-world examples of C code.
EXE: Automatically generating inputs of death
- In Proceedings of the 13th ACM Conference on Computer and Communications Security (CCS
, 2006
"... This article presents EXE, an effective bug-finding tool that automatically generates inputs that crash real code. Instead of running code on manually or randomly constructed input, EXE runs it on symbolic input initially allowed to be anything. As checked code runs, EXE tracks the constraints on ea ..."
Abstract
-
Cited by 349 (21 self)
- Add to MetaCart
This article presents EXE, an effective bug-finding tool that automatically generates inputs that crash real code. Instead of running code on manually or randomly constructed input, EXE runs it on symbolic input initially allowed to be anything. As checked code runs, EXE tracks the constraints on each symbolic (i.e., input-derived) memory location. If a statement uses a symbolic value, EXE does not run it, but instead adds it as an input-constraint; all other statements run as usual. If code conditionally checks a symbolic expression, EXE forks execution, constraining the expression to be true on the true branch and false on the other. Because EXE reasons about all possible values on a path, it has much more power than a traditional runtime tool: (1) it can force execution down any feasible program path and (2) at dangerous operations (e.g., a pointer dereference), it detects if the current path constraints allow any value that causes a bug. When a path terminates or hits a bug, EXE automatically generates a test case by solving the current path constraints to find concrete values using its own co-designed constraint solver, STP. Because EXE’s constraints have no approximations, feeding this concrete input to an uninstrumented version of the checked code will cause it to follow the same path and hit the same bug (assuming deterministic code).
Test Input Generation with Java PathFinder
"... We show how model checking and symbolic execution can be used to generate test inputs to achieve structural coverage of code that manipulates complex data structures. We focus on obtaining branch-coverage during unit testing of some of the core methods of the red-black tree implementation in the Jav ..."
Abstract
-
Cited by 185 (7 self)
- Add to MetaCart
We show how model checking and symbolic execution can be used to generate test inputs to achieve structural coverage of code that manipulates complex data structures. We focus on obtaining branch-coverage during unit testing of some of the core methods of the red-black tree implementation in the Java TreeMap library, using the Java PathFinder model checker. Three di#erent test generation techniques will be introduced and compared, namely, straight model checking of the code, model checking used in a black-box fashion to generate all inputs up to a fixed size, and lastly, model checking used during white-box test input generation. The main contribution of this work is to show how e#cient white-box test input generation can be done for code manipulating complex data, taking into account complex method preconditions.
Symstra: A framework for generating object-oriented unit tests using symbolic execution
- In TACAS
, 2005
"... Abstract. Object-oriented unit tests consist of sequences of method invocations. Behavior of an invocation depends on the method’s arguments and the state of the receiver at the beginning of the invocation. Correspondingly, generating unit tests involves two tasks: generating method sequences that b ..."
Abstract
-
Cited by 140 (18 self)
- Add to MetaCart
(Show Context)
Abstract. Object-oriented unit tests consist of sequences of method invocations. Behavior of an invocation depends on the method’s arguments and the state of the receiver at the beginning of the invocation. Correspondingly, generating unit tests involves two tasks: generating method sequences that build relevant receiverobject states and generating relevant method arguments. This paper proposes Symstra, a framework that achieves both test generation tasks using symbolic execution of method sequences with symbolic arguments. The paper defines symbolic states of object-oriented programs and novel comparisons of states. Given a set of methods from the class under test and a bound on the length of sequences, Symstra systematically explores the object-state space of the class and prunes this exploration based on the state comparisons. Experimental results show that Symstra generates unit tests that achieve higher branch coverage faster than the existing test-generation techniques based on concrete method arguments. 1
Hybrid concolic testing
"... We present hybrid concolic testing, an algorithm that interleaves random testing with concolic execution to obtain both a deep and a wide exploration of program state space. Our algorithm generates test inputs automatically by interleaving random testing until saturation with bounded exhaustive symb ..."
Abstract
-
Cited by 137 (7 self)
- Add to MetaCart
We present hybrid concolic testing, an algorithm that interleaves random testing with concolic execution to obtain both a deep and a wide exploration of program state space. Our algorithm generates test inputs automatically by interleaving random testing until saturation with bounded exhaustive symbolic exploration of program points. It thus combines the ability of random search to reach deep program states quickly together with the ability of concolic testing to explore states in a neighborhood exhaustively. We have implemented our algorithm on top of CUTE and applied it to obtain better branch coverage for an editor implementation (VIM 5.7, 150K lines of code) as well as a data structure implementation in C. Our experiments suggest that hybrid concolic testing can handle large programs and provide, for the same testing budget, almost 4 × the branch coverage than random testing and almost 2 × that of concolic testing.
Scalable error detection using Boolean satisfiability
- In Proc. 32Ç È POPL. ACM
, 2005
"... We describe a software error-detection tool that exploits recent advances in boolean satisfiability (SAT) solvers. Our analysis is path sensitive, precise down to the bit level, and models pointers and heap data. Our approach is also highly scalable, which we achieve using two techniques. First, for ..."
Abstract
-
Cited by 118 (10 self)
- Add to MetaCart
(Show Context)
We describe a software error-detection tool that exploits recent advances in boolean satisfiability (SAT) solvers. Our analysis is path sensitive, precise down to the bit level, and models pointers and heap data. Our approach is also highly scalable, which we achieve using two techniques. First, for each program function, several optimizations compress the size of the boolean formulas that model the control- and data-flow and the heap locations accessed by a function. Second, summaries in the spirit of type signatures are computed for each function, allowing inter-procedural analysis without a dramatic increase in the size of the boolean constraints to be solved. We demonstrate the effectiveness of our approach by constructing a lock interface inference and checking tool. In an interprocedural analysis of more than 23,000 lock related functions in the latest Linux kernel, the checker generated 300 warnings, of which 179 were unique locking errors, a false positive rate of only 40%.
DSD-Crasher: A hybrid analysis tool for bug finding
- In ISSTA
, 2006
"... DSD-Crasher is a bug finding tool that follows a three-step approach to program analysis: D. Capture the program’s intended execution behavior with dynamic invariant detection. The derived invariants exclude many unwanted values from the program’s input domain. S. Statically analyze the program with ..."
Abstract
-
Cited by 90 (5 self)
- Add to MetaCart
DSD-Crasher is a bug finding tool that follows a three-step approach to program analysis: D. Capture the program’s intended execution behavior with dynamic invariant detection. The derived invariants exclude many unwanted values from the program’s input domain. S. Statically analyze the program within the restricted input domain to explore many paths. D. Automatically generate test cases that focus on reproducing the predictions of the static analysis. Thereby confirmed results are feasible. This three-step approach yields benefits compared to past two-step combinations in the literature. In our evaluation with third-party applications, we demonstrate higher precision over tools that lack a dynamic step and higher efficiency over tools that lack a static step.
Differential symbolic execution
- IN SIGSOFT ’08/FSE-16: PROCEEDINGS OF THE 16TH ACM SIGSOFT INTERNATIONAL SYMPOSIUM ON FOUNDATIONS OF SOFTWARE ENGINEERING
, 2009
"... Successful software systems tend to be long lived and evolve over time as requirements change and faults are detected. The number of times a system is updated and re-deployed may be in the hundreds, or even thousands. Re-validation of an updated system, before it is released, is a critical component ..."
Abstract
-
Cited by 88 (4 self)
- Add to MetaCart
(Show Context)
Successful software systems tend to be long lived and evolve over time as requirements change and faults are detected. The number of times a system is updated and re-deployed may be in the hundreds, or even thousands. Re-validation of an updated system, before it is released, is a critical component of the software evolution process. This step ensures that the changes made to the software have their intended effects, and that no unintended behaviors were introduced. Given the size and complexity of modern software systems, re-validation is generally costly and time consuming. Characterizing the differences between software versions can help focus re-validation tasks, potentially reducing the cost and effort necessary to re-deploy the software. Change characterizations are also useful for other software evolution tasks, e.g., assessing the impact of the changes on other parts of the system. Existing change characterization techniques infer differences in program behaviors based on changes to the source code. This approach is imprecise, and therefore, can lead to unnecessary effort and cause delays in deployment. In this dissertation, we present a novel extension and application of symbolic
Generating Tests from Counterexamples
- In Proc. of the 26th ICSE
, 2004
"... We have extended the software model checker BLAST to automatically generate test suites that guarantee full coverage with respect to a given predicate. More precisely, given a C program and a target predicate p, BLAST determines the set L of program locations which program execution can reach with p ..."
Abstract
-
Cited by 86 (12 self)
- Add to MetaCart
(Show Context)
We have extended the software model checker BLAST to automatically generate test suites that guarantee full coverage with respect to a given predicate. More precisely, given a C program and a target predicate p, BLAST determines the set L of program locations which program execution can reach with p true, and automatically generates a set of test vectors that exhibit the truth of p at all locations in L. We have used BLAST to generate test suites and to detect dead code in C programs with up to 30 K lines of code. The analysis and test-vector generation is fully automatic (no user intervention) and exact (no false positives).