Results 1 - 10 of 30
Symbolic Execution for Software Testing in Practice – Preliminary Assessment
"... We present results for the “Impact Project Focus Area ” on the topic of symbolic execution as used in software testing. Symbolic execution is a program analysis technique introduced in the 70s that has received renewed interest in recent years, due to algorithmic advances and increased availability ..."
Abstract
-
Cited by 39 (6 self)
- Add to MetaCart
(Show Context)
We present results for the “Impact Project Focus Area” on the topic of symbolic execution as used in software testing. Symbolic execution is a program analysis technique introduced in the 1970s that has received renewed interest in recent years, due to algorithmic advances and the increased availability of computational power and constraint-solving technology. We review classical symbolic execution and some modern extensions such as generalized symbolic execution and dynamic test generation. We also give a preliminary assessment of its use in academia, research labs, and industry.
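To make the classical technique concrete, the following is a minimal sketch (not taken from the paper) of symbolic execution over a toy two-branch function: each feasible path contributes a path condition over the symbolic inputs, and a concrete test input is then found for every condition. The toy program, the hard-coded path list, and the brute-force search standing in for a constraint solver are illustrative assumptions; real engines hand path conditions to an SMT solver.

```python
# Illustrative sketch only: symbolically "execute" the toy program
#   f(x, y): if x > y: (if x > 2*y: A else: B) else: C
# by enumerating its path conditions and finding one concrete input per path.
from itertools import product

def explore(bound=10):
    # Each path condition is a conjunction of constraints over inputs x, y.
    path_conditions = [
        [lambda x, y: x > y,  lambda x, y: x > 2 * y],   # path reaching A
        [lambda x, y: x > y,  lambda x, y: x <= 2 * y],  # path reaching B
        [lambda x, y: x <= y],                           # path reaching C
    ]
    tests = []
    for pc in path_conditions:
        # Brute-force stand-in for a constraint solver: search a small domain
        # for an input satisfying every constraint on the path.
        witness = next(((x, y)
                        for x, y in product(range(-bound, bound), repeat=2)
                        if all(c(x, y) for c in pc)), None)
        tests.append(witness)
    return tests

if __name__ == "__main__":
    print(explore())  # one concrete (x, y) per feasible program path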
Directed Test Suite Augmentation
"... Abstract—As software evolves, engineers use regression testing to evaluate its fitness for release. Such testing typically begins with existing test cases, and many techniques have been proposed for reusing these cost-effectively. After reusing test cases, however, it is also important to consider c ..."
Abstract
-
Cited by 19 (6 self)
- Add to MetaCart
(Show Context)
As software evolves, engineers use regression testing to evaluate its fitness for release. Such testing typically begins with existing test cases, and many techniques have been proposed for reusing these cost-effectively. After reusing test cases, however, it is also important to consider code or behavior that has not been exercised by the existing test cases and to generate new test cases that validate it. This process is known as test suite augmentation. In this paper we present a directed test suite augmentation technique that combines the results of reusing existing test cases with an incremental concolic testing algorithm to augment test suites so that they are coverage-adequate for a modified program. We present results of an empirical study examining the effectiveness of our approach. Keywords: regression testing, test suite augmentation, concolic testing
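The overall loop this describes (reuse first, then direct test generation at whatever remains uncovered) can be sketched as below. The toy program, the branch identifiers, and the brute-force input search standing in for the incremental concolic step are assumptions made here for illustration, not the paper's algorithm.

```python
# Illustration-only sketch of the augmentation loop: reuse the existing tests,
# find which branches they leave uncovered, then search for inputs reaching
# them. The toy program, branch ids, and brute-force input search stand in for
# the incremental concolic step; they are not the paper's algorithm.

def program(x, y):
    """Toy program under test; returns the set of branch ids it executes."""
    hit = set()
    if x > 0:
        hit.add("b1")
        if y % 2 == 0:
            hit.add("b2")
        else:
            hit.add("b3")
    else:
        hit.add("b4")
    return hit

ALL_BRANCHES = {"b1", "b2", "b3", "b4"}

def augment(existing_tests):
    covered = set()
    for t in existing_tests:                  # step 1: reuse existing test cases
        covered |= program(*t)
    suite = list(existing_tests)
    for branch in sorted(ALL_BRANCHES - covered):
        # Stand-in for directed (concolic) test generation: find any input
        # whose execution reaches the still-uncovered branch.
        witness = next(((x, y) for x in range(-5, 6) for y in range(-5, 6)
                        if branch in program(x, y)), None)
        if witness is not None:
            suite.append(witness)
            covered |= program(*witness)
    return suite, covered

if __name__ == "__main__":
    suite, covered = augment([(1, 2)])        # existing suite only reaches b1, b2
    print(suite, covered)                     # augmented suite also reaches b3, b4
```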
Verifying Systems Rules Using Rule-Directed Symbolic Execution
"... Systems code must obey many rules, such as “opened files must be closed. ” One approach to verifying rules is static analysis, but this technique cannot infer precise runtime effects of code, often emitting many false positives. An alternative is symbolic execution, a technique that verifies program ..."
Abstract
-
Cited by 12 (2 self)
- Add to MetaCart
(Show Context)
Systems code must obey many rules, such as “opened files must be closed.” One approach to verifying rules is static analysis, but this technique cannot infer precise runtime effects of code and often emits many false positives. An alternative is symbolic execution, a technique that verifies program paths over all inputs up to a bounded size. However, when applied to verify rules, existing symbolic execution systems often blindly explore many redundant program paths while missing relevant ones that may contain bugs. Our key insight is that only a small portion of paths are relevant to rules, and the rest, the majority, are irrelevant and do not need to be verified. Based on this insight, we create WOODPECKER, a new symbolic execution system for effectively checking rules on systems programs. It provides a set of built-in checkers for common rules, and an interface for users to easily check new rules. It directs symbolic execution toward the program paths relevant to a checked rule and soundly prunes redundant paths, exponentially speeding up symbolic execution. It is designed to be heuristic-agnostic, enabling users to leverage existing powerful search heuristics. Evaluation on 136 systems programs totaling 545K lines of code, including some of the most widely used programs, shows that, with a time limit of typically just one hour for each verification run, WOODPECKER effectively verifies 28.7% of the program and rule combinations over bounded input, whereas an existing symbolic execution system, KLEE, verifies only 8.5%. For the remaining combinations, WOODPECKER verifies 4.6 times as many relevant paths as KLEE. With a longer time limit, WOODPECKER verifies many more paths than KLEE, e.g., 17 times as many with a four-hour limit. WOODPECKER detects 113 rule violations, including 10 serious data loss errors, with the 2 most serious ones already confirmed by the corresponding developers.
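The pruning idea (two paths that look identical to the checked rule need not both be verified) can be illustrated with the sketch below. The event traces, the open/close rule, and the projection-based equivalence are simplifying assumptions made here for illustration; WOODPECKER's actual relevance analysis and soundness argument are more involved.

```python
# Illustration-only sketch: paths whose projections onto rule-relevant events
# coincide form one equivalence class, and only one representative per class
# is actually checked. The traces and the open/close rule are toy examples.

RELEVANT = {"open", "close"}   # events the "opened files must be closed" rule observes

def file_rule_ok(trace):
    """Checker: every 'open' on the path is matched by a later 'close'."""
    depth = 0
    for event in trace:
        if event == "open":
            depth += 1
        elif event == "close":
            depth -= 1
    return depth == 0

def check_with_pruning(paths):
    seen, violations, pruned = set(), [], 0
    for path in paths:
        key = tuple(e for e in path if e in RELEVANT)   # rule-relevant slice of the path
        if key in seen:
            pruned += 1              # redundant with respect to the rule: skip it
            continue
        seen.add(key)
        if not file_rule_ok(path):
            violations.append(path)
    return violations, pruned

if __name__ == "__main__":
    # Four toy paths that differ only in rule-irrelevant "log" events.
    paths = [("open", "log", "close"), ("open", "close", "log"),
             ("log", "open"), ("open", "log")]
    print(check_with_pruning(paths))  # one violating class found, two paths pruned
```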
Memoized Symbolic Execution
"... This paper introduces memoized symbolic execution (Memoise), a new approach for more efficient application of forward symbolic execution, which is a well-studied technique for systematic exploration of program behaviors based on bounded execution paths. Our key insight is that application of symboli ..."
Abstract
-
Cited by 12 (5 self)
- Add to MetaCart
(Show Context)
This paper introduces memoized symbolic execution (Memoise), a new approach for more efficient application of forward symbolic execution, which is a well-studied technique for systematic exploration of program behaviors based on bounded execution paths. Our key insight is that application of symbolic execution often requires several successive runs of the technique on largely similar underlying problems, e.g., running it once to check a program to find a bug, fixing the bug, and running it again to check the modified program. Memoise introduces a trie-based data structure that stores the key elements of a run of symbolic execution. Maintenance of the trie during successive runs allows re-use of previously computed results of symbolic execution without the need for recomputing them as is traditionally done. Experiments using our prototype implementation of Memoise show the benefits it holds in various standard scenarios of using symbolic execution, e.g., with iterative deepening of exploration depth, to perform regression analysis, or to enhance coverage using heuristics.
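The trie-based bookkeeping this describes can be sketched as below: each node records a branch decision from a previous run, nodes touched by the program change are flagged, and only the decision prefixes leading into flagged subtrees need to be re-explored. The node fields, the staleness rule, and the branch identifiers are assumptions made here for illustration, not Memoise's actual data structure.

```python
# Illustration-only sketch of a trie over branch decisions: store explored
# paths, mark the subtrees affected by a change, and report only the prefixes
# that a later symbolic execution run must revisit.

class TrieNode:
    def __init__(self, branch_id=None, outcome=None):
        self.branch_id = branch_id      # source location of the branch
        self.outcome = outcome          # which side of the branch was taken
        self.children = {}              # (branch_id, outcome) -> TrieNode
        self.stale = False              # set when the change affects this branch

    def record(self, decisions):
        """Store one explored path, given as a list of (branch_id, outcome) pairs."""
        node = self
        for key in decisions:
            node = node.children.setdefault(key, TrieNode(*key))
        return node

    def mark_stale(self, changed_branches):
        """Flag nodes whose branch condition was modified by the edit."""
        self.stale = self.branch_id in changed_branches
        for child in self.children.values():
            child.mark_stale(changed_branches)

    def paths_to_rerun(self, prefix=()):
        """Yield only the decision prefixes that must be re-explored."""
        if self.stale:
            yield prefix               # everything below gets re-explored anyway
            return
        for key, child in self.children.items():
            yield from child.paths_to_rerun(prefix + (key,))

if __name__ == "__main__":
    root = TrieNode()
    root.record([("f.c:10", True), ("f.c:20", True)])
    root.record([("f.c:10", True), ("f.c:20", False)])
    root.record([("f.c:10", False)])
    root.mark_stale({"f.c:20"})          # pretend the patch touched line 20
    print(list(root.paths_to_rerun()))   # only the prefixes reaching f.c:20
```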
SymDrive: Testing Drivers without Devices
- In Proceedings of the 10th USENIX Conference on Operating Systems Design and Implementation. USENIX Association, 2012
"... Device-driver development and testing is a complex and error-prone undertaking. For example, testing errorhandling code requires simulating faulty inputs from the device. A single driver may support dozens of devices, and a developer may not have access to any of them. Consequently, many Linux drive ..."
Abstract
-
Cited by 12 (0 self)
- Add to MetaCart
(Show Context)
Device-driver development and testing is a complex and error-prone undertaking. For example, testing error-handling code requires simulating faulty inputs from the device. A single driver may support dozens of devices, and a developer may not have access to any of them. Consequently, many Linux driver patches include the comment “compile tested only.” SymDrive is a system for testing Linux and FreeBSD drivers without their devices present. The system uses symbolic execution to remove the need for hardware, and extends past tools with three new features. First, SymDrive uses static analysis and source-to-source transformation to greatly reduce the effort of testing a new driver. Second, SymDrive checkers are ordinary C code and execute in the kernel, where they have full access to kernel and driver state. Finally, SymDrive provides an execution-tracing tool to identify how a patch changes I/O to the device and to compare device-driver implementations. In applying SymDrive to 21 Linux drivers and 5 FreeBSD drivers, we found 39 bugs.
Path Exploration Based on Symbolic Output
- In FSE’11
"... Efficient program path exploration is important for many software engineering activities such as testing, debugging and verification. However, enumerating all paths of a program is prohibitively expensive. In this paper, we develop a partitioning of program paths based on the program output. Two pro ..."
Abstract
-
Cited by 10 (1 self)
- Add to MetaCart
Efficient program path exploration is important for many software engineering activities such as testing, debugging and verification. However, enumerating all paths of a program is prohibitively expensive. In this paper, we develop a partitioning of program paths based on the program output. Two program paths are placed in the same partition if they derive the output similarly, that is, if the symbolic expression connecting the output with the inputs is the same along both paths. Our grouping of paths is created gradually by a smart path exploration. Our experiments show the benefits of the proposed path exploration in test-suite construction. Our path partitioning produces a semantic signature of a program: a description of all the different symbolic expressions that the output can assume along different program paths. To reason about changes between program versions, we can therefore analyze their semantic signatures. In particular, we demonstrate the applications of our path partitioning in debugging of software regressions.
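A minimal sketch of the grouping criterion follows, under assumptions made here for illustration: the paths and their symbolic output expressions for a toy program are listed by hand as strings, rather than computed by the paper's exploration algorithm.

```python
# Illustration-only sketch: the paths and their symbolic outputs for a toy
# program are written out by hand; a real tool would compute them with a
# symbolic executor and a canonical form for expressions.

def symbolic_paths():
    """Paths of the toy program
         t = x + y if x > 0 else y + x
         out = t if flag else 0
       as (path condition, symbolic output expression) pairs."""
    return [
        ("x > 0 and flag",      "x + y"),
        ("x <= 0 and flag",     "x + y"),   # y + x canonicalises to the same output
        ("x > 0 and not flag",  "0"),
        ("x <= 0 and not flag", "0"),
    ]

def partition_by_output(paths):
    groups = {}
    for condition, output in paths:
        groups.setdefault(output, []).append(condition)   # same output expr, same partition
    return groups

if __name__ == "__main__":
    for output, conditions in partition_by_output(symbolic_paths()).items():
        print(output, "<==", " OR ".join(conditions))
    # Four paths collapse into two output partitions, giving a compact
    # semantic signature that can be compared across program versions.
```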
KATCH: High-Coverage Testing of Software Patches
"... One of the distinguishing characteristics of software systems is that they evolve: new patches are committed to software repositories and new versions are released to users on a continuous basis. Unfortunately, many of these changes bring unexpected bugs that break the stability of the system or aff ..."
Abstract
-
Cited by 9 (4 self)
- Add to MetaCart
(Show Context)
One of the distinguishing characteristics of software systems is that they evolve: new patches are committed to software repositories and new versions are released to users on a continuous basis. Unfortunately, many of these changes bring unexpected bugs that break the stability of the system or affect its security. In this paper, we address this problem using a technique for automatically testing code patches. Our technique combines symbolic execution with several novel heuristics based on static and dynamic program analysis which allow it to quickly reach the code of the patch. We have implemented our approach in a tool called KATCH, which we have applied to all the patches written over a combined period of approximately six years for nineteen mature programs from the popular GNU diffutils, GNU binutils and GNU findutils utility suites, which are shipped with virtually all UNIX-based distributions. Our results show that KATCH can automatically synthesise inputs that significantly increase the patch coverage achieved by the existing manual test suites, and can find bugs at the moment they are introduced.
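One common ingredient of such "reach the patch" guidance is a control-flow distance to the patched code that decides which pending symbolic state to extend next. The sketch below shows only that generic idea; the toy CFG, the (location, path-condition) state tuples, and the greedy selection are assumptions made here for illustration and not KATCH's actual heuristics, which also exploit static and dynamic information from the existing test suite.

```python
# Illustration-only sketch of a distance-guided search: among pending symbolic
# states, always extend the one closest (in control-flow-graph edges) to the
# patched code.
from collections import deque

def cfg_distance(cfg, target):
    """Shortest number of CFG edges from every node to `target` (reverse BFS)."""
    reverse = {}
    for src, dsts in cfg.items():
        for dst in dsts:
            reverse.setdefault(dst, []).append(src)
    dist, frontier = {target: 0}, deque([target])
    while frontier:
        node = frontier.popleft()
        for pred in reverse.get(node, []):
            if pred not in dist:
                dist[pred] = dist[node] + 1
                frontier.append(pred)
    return dist

def pick_next_state(pending, dist):
    """Greedy heuristic: extend the state whose location is nearest the patch."""
    return min(pending, key=lambda state: dist.get(state[0], float("inf")))

if __name__ == "__main__":
    cfg = {"entry": ["a", "b"], "a": ["c"], "b": ["d"], "c": ["patch"], "d": []}
    dist = cfg_distance(cfg, "patch")
    pending = [("b", "pc1"), ("a", "pc2"), ("d", "pc3")]   # (location, path condition)
    print(pick_next_state(pending, dist))                  # ('a', 'pc2'): two edges away
```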
Regression Mutation Testing
- In: ISSTA
, 2012
"... Mutation testing is one of the most powerful approaches for evaluating quality of test suites. However, mutation testing is also one of the most expensive testing approaches. This paper presents Regression Mutation Testing (ReMT), a new technique to speed up mutation testing for evolving systems. Th ..."
Abstract
-
Cited by 9 (7 self)
- Add to MetaCart
(Show Context)
Mutation testing is one of the most powerful approaches for evaluating the quality of test suites. However, mutation testing is also one of the most expensive testing approaches. This paper presents Regression Mutation Testing (ReMT), a new technique to speed up mutation testing for evolving systems. The key novelty of ReMT is to incrementally calculate mutation testing results for the new program version based on the results from the old program version; ReMT uses a static analysis to check which results can be safely reused. ReMT also employs a mutation-specific test prioritization to further speed up mutation testing. We present an empirical study on six evolving systems, whose sizes range from 3.9 KLoC to 88.8 KLoC. The empirical results show that ReMT can substantially reduce mutation testing costs, indicating a promising future for applying mutation testing to evolving software systems.
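The reuse decision at the heart of such an incremental scheme can be sketched as below. The whole-test disjointness check, the coverage maps, and the statement identifiers are simplifying assumptions made here; ReMT's actual static analysis reasons about the change at a finer granularity.

```python
# Illustration-only sketch of the reuse rule: a (test, mutant) verdict from the
# old version is kept when the statements that test executed are untouched by
# the edit; otherwise the pair is re-run.

def plan_mutation_run(old_results, test_coverage, changed_stmts):
    """Split (test, mutant) pairs into reusable verdicts and pairs to re-execute."""
    reused, rerun = {}, []
    for (test, mutant), verdict in old_results.items():
        if test_coverage[test].isdisjoint(changed_stmts):
            reused[(test, mutant)] = verdict   # old verdict still holds
        else:
            rerun.append((test, mutant))       # the edit may change the verdict
    return reused, rerun

if __name__ == "__main__":
    old_results = {("t1", "m1"): "killed", ("t2", "m1"): "survived",
                   ("t2", "m2"): "killed"}
    test_coverage = {"t1": {"s1", "s2"}, "t2": {"s3", "s4"}}
    changed_stmts = {"s4"}                     # the new version edits s4 only
    reused, rerun = plan_mutation_run(old_results, test_coverage, changed_stmts)
    print(reused)   # the t1/m1 verdict is carried over
    print(rerun)    # both t2 pairs are scheduled for re-execution
```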
Partition-Based Regression Verification
"... Abstract—Regression verification (RV) seeks to guarantee the absence of regression errors in a changed program version. This paper presents Partition-based Regression Verification (PRV): an approach to RV based on the gradual exploration of differential input partitions. A differential input partiti ..."
Abstract
-
Cited by 7 (1 self)
- Add to MetaCart
(Show Context)
Regression verification (RV) seeks to guarantee the absence of regression errors in a changed program version. This paper presents Partition-based Regression Verification (PRV): an approach to RV based on the gradual exploration of differential input partitions. A differential input partition is a subset of the common input space of two program versions that serves as a unit of verification. Instead of proving the absence of regression for the complete input space at once, PRV verifies differential partitions in a gradual manner. If the exploration is interrupted, PRV retains partial verification guarantees at least for the explored differential partitions. This is crucial in practice, as verifying the complete input space can be prohibitively expensive. Experiments show that PRV provides a useful alternative to state-of-the-art regression test generation techniques. During the exploration, PRV generates test cases which can expose different behaviour across two program versions. However, while test cases are generally single points in the common input space, PRV can verify entire partitions and, moreover, gives feedback that allows programmers to relate a behavioural difference to the syntactic changes that contribute to it. Keywords: software verification, testing and analysis
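The partition-by-partition verdict structure can be sketched as below, under assumptions made for illustration: two tiny program versions, partitions found by brute force over a small input domain rather than derived symbolically, and output equality as the regression criterion.

```python
# Illustration-only sketch: group the common input space by the paths the two
# versions take, then check each partition as a unit and record its verdict.

def v_old(x):
    return x * 2 if x >= 0 else -x        # old version

def v_new(x):
    return x + x if x >= 0 else -x + 1    # new version: regression when x < 0

def verify_partitions(domain):
    # In general the key is the pair of paths taken in the two versions; here
    # both versions branch the same way, on the sign of x.
    partitions = {}
    for x in domain:
        partitions.setdefault("x >= 0" if x >= 0 else "x < 0", []).append(x)
    verdicts = {}
    for condition, inputs in partitions.items():
        witness = next((x for x in inputs if v_old(x) != v_new(x)), None)
        verdicts[condition] = ("verified" if witness is None
                               else "regression, e.g. x = %d" % witness)
    return verdicts

if __name__ == "__main__":
    # Each partition's verdict stands on its own, so stopping early still
    # leaves a partial guarantee for the partitions already explored.
    print(verify_partitions(range(-5, 6)))
```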
Regression Tests to Expose Change Interaction Errors
"... Changes often introduce program errors, and hence recent software testing literature has focused on generating tests which stress changes. In this paper, we argue that changes cannot be treated as isolated program artifacts which are stressed via testing. Instead, it is the complex dependency across ..."
Abstract
-
Cited by 6 (1 self)
- Add to MetaCart
(Show Context)
Changes often introduce program errors, and hence recent software testing literature has focused on generating tests which stress changes. In this paper, we argue that changes cannot be treated as isolated program artifacts which are stressed via testing. Instead, it is the complex dependency across multiple changes which introduces subtle errors. Furthermore, the complex dependence structures that need to be exercised to expose such errors ensure that they remain undiscovered even in well-tested and deployed software. We motivate our work based on a well-tested and stable project, GNU Coreutils, where we found that one third of the regressions take more than two years to be fixed, and that two thirds of such long-standing regressions are introduced due to change interactions for the utilities we investigated. To combat change interaction errors, we first define a notion