Results 1 - 10 of 82
DSD-Crasher: A hybrid analysis tool for bug finding
In ISSTA, 2006
Cited by 90 (5 self)
DSD-Crasher is a bug finding tool that follows a three-step approach to program analysis: D. Capture the program’s intended execution behavior with dynamic invariant detection. The derived invariants exclude many unwanted values from the program’s input domain. S. Statically analyze the program within the restricted input domain to explore many paths. D. Automatically generate test cases that focus on reproducing the predictions of the static analysis. Thereby confirmed results are feasible. This three-step approach yields benefits compared to past two-step combinations in the literature. In our evaluation with third-party applications, we demonstrate higher precision over tools that lack a dynamic step and higher efficiency over tools that lack a static step.
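The D-S-D loop described above can be sketched in miniature. Everything here is invented for illustration: the toy `target` function, the min/max range standing in for dynamic invariant detection, and an exhaustive enumeration standing in for the static step.

```python
# Hypothetical sketch of DSD-Crasher's three-step approach, reduced to a
# single integer parameter. All names here are invented for illustration.

def target(n):
    # Toy function under test: crashes (ZeroDivisionError) when n == 7.
    return 100 // (n - 7)

def infer_invariant(observed_inputs):
    # Step D: "dynamic invariant detection" reduced to a min/max range
    # over inputs seen in passing runs. This restricts the input domain.
    return min(observed_inputs), max(observed_inputs)

def static_candidates(lo, hi):
    # Step S stand-in: a static analysis would predict crashing inputs
    # within the restricted domain; here we simply enumerate it.
    return range(lo, hi + 1)

def confirm(candidates):
    # Step D: generate and run concrete tests, so every reported bug is
    # feasible rather than a static false positive.
    crashes = []
    for n in candidates:
        try:
            target(n)
        except ZeroDivisionError:
            crashes.append(n)
    return crashes

passing_runs = [1, 3, 12]        # behavior captured from normal executions
lo, hi = infer_invariant(passing_runs)
print(confirm(static_candidates(lo, hi)))   # -> [7]
```

The point of the final step is visible even in this sketch: only inputs that actually reproduce the failure are reported.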
Mining API patterns as partial orders from source code: From usage scenarios to specifications
In Proc. 6th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE), 2007
Cited by 87 (9 self)
A software system interacts with third-party libraries through various APIs. Using these library APIs often requires following certain usage patterns. Furthermore, ordering rules (specifications) exist between APIs, and these rules govern the secure and robust operation of the system using these APIs. But these patterns and rules may not be well documented by the API developers. Previous approaches mine frequent association rules, itemsets, or subsequences that capture API call patterns shared by API client code. However, these frequent API patterns cannot completely capture some useful orderings shared by APIs, especially when multiple APIs are involved across different procedures. In this paper, we present a framework to automatically extract usage scenarios among user-specified APIs as partial orders, directly from the source code (API client code). We adapt a model checker to generate interprocedural control-flow-sensitive static traces related to the APIs of interest. Different API usage scenarios are extracted from the static traces by our scenario extraction algorithm and fed to a miner. The miner summarizes different usage scenarios as compact partial orders. Specifications are extracted from the frequent partial orders using our specification extraction algorithm. Our experience of applying the framework to 72 X11 clients with 200K LOC in total has shown that the extracted API partial orders are useful in assisting effective API reuse and checking.
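The mining step can be sketched as follows: treat each static trace as a call sequence and keep an edge (a, b) when a precedes b in every trace containing both. The traces and X11 API names are invented; the actual framework derives interprocedural traces with a model checker and mines frequent, not unanimous, partial orders.

```python
from itertools import combinations

# Invented static traces over a few X11 calls, for illustration only.
traces = [
    ["XOpenDisplay", "XCreateWindow", "XMapWindow", "XCloseDisplay"],
    ["XOpenDisplay", "XCreateWindow", "XCloseDisplay"],
]

def partial_order(traces):
    # Edge (x, y) is kept when x precedes y in every trace containing both.
    apis = sorted({c for t in traces for c in t})
    order = set()
    for a, b in combinations(apis, 2):
        for x, y in ((a, b), (b, a)):
            both = [t for t in traces if x in t and y in t]
            if both and all(t.index(x) < t.index(y) for t in both):
                order.add((x, y))
    return order

edges = partial_order(traces)
print(("XOpenDisplay", "XCloseDisplay") in edges)  # True
```

A partial order is a better summary than a single frequent subsequence here: it records that `XMapWindow` is optional while still capturing the open-before-close constraint.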
Muvi: Automatically inferring multi-variable access correlations and detecting related semantic and concurrency bugs
In Proceedings of the 21st ACM Symposium on Operating Systems Principles (SOSP '07), 2007
Cited by 70 (8 self)
Software defects significantly reduce system dependability. Among various types of software bugs, semantic and concurrency bugs are two of the most difficult to detect. This paper proposes a novel method, called MUVI, that detects an important class of semantic and concurrency bugs. MUVI automatically infers commonly existing multi-variable access correlations through code analysis and then detects two types of related bugs: (1) inconsistent updates: correlated variables are not updated in a consistent way, and (2) multi-variable concurrency bugs: correlated accesses are not protected in the same atomic sections in concurrent programs. We evaluate MUVI on four large applications: Linux, Mozilla, MySQL, and PostgreSQL. MUVI automatically infers more than 6000 variable access correlations with high accuracy (83%). Based on the inferred correlations, MUVI detects 39 new inconsistent-update semantic bugs from the latest versions of these applications, with 17 of them recently confirmed by the developers based on our reports. We also implemented MUVI multi-variable extensions to two representative data race detection methods (lockset and happens-before). Our evaluation on five real-world multi-variable concurrency bugs from Mozilla and MySQL shows that the MUVI extensions correctly identify the root causes of four of the five multi-variable concurrency bugs with 14% additional overhead on average. Interestingly, MUVI also helps detect four new multi-variable concurrency bugs in Mozilla that have never been reported before. None of the nine bugs can be identified correctly by the original race detectors without our MUVI extensions.
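The correlation-inference idea reduces to frequent-pair mining: variables accessed together in enough functions are treated as correlated, and a function touching only one of a correlated pair is suspicious. The function and variable names and the support threshold below are invented for illustration.

```python
from itertools import combinations
from collections import Counter

# Invented per-function variable-access sets, for illustration only.
accesses = {
    "add_node":    {"list_head", "list_len"},
    "remove_node": {"list_head", "list_len"},
    "print_list":  {"list_head", "list_len"},
    "fast_reset":  {"list_head"},            # forgets list_len: suspicious
}

MIN_SUPPORT = 3   # invented threshold: pair must co-occur in >= 3 functions

# Count how often each variable pair is accessed in the same function.
pair_count = Counter()
for vars_touched in accesses.values():
    for pair in combinations(sorted(vars_touched), 2):
        pair_count[pair] += 1

correlated = {p for p, n in pair_count.items() if n >= MIN_SUPPORT}

# Flag functions that access one variable of a correlated pair but not
# the other: a candidate inconsistent update.
violations = [
    (fn, a, b)
    for fn, vs in accesses.items()
    for a, b in correlated
    if (a in vs) != (b in vs)
]
print(violations)  # [('fast_reset', 'list_head', 'list_len')]
```

The same correlation set also drives the concurrency extension: a lockset-style detector can require correlated variables to share one atomic section instead of checking each variable alone.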
/* iComment: Bugs or Bad Comments? */
In Proceedings of the 21st ACM SIGOPS Symposium on Operating Systems Principles, 2007
Cited by 53 (11 self)
Commenting source code has long been a common practice in software development. Compared to source code, comments are more direct, descriptive and easy to understand. Comments and source code provide relatively redundant and independent information regarding a program’s semantic behavior. As software evolves, they can easily grow out of sync, indicating two problems: (1) bugs: the source code does not follow the assumptions and requirements specified by correct program comments; (2) bad comments: comments that are inconsistent with correct code, which can confuse and mislead programmers into introducing bugs in subsequent versions. Unfortunately, as most comments are written in natural language, no solution has been proposed to automatically analyze comments and detect inconsistencies between comments and source code. This paper takes the first step in automatically analyzing comments written in natural language to extract implicit program rules and using these rules to automatically detect inconsistencies between comments and source code, indicating either bugs or bad comments. Our solution, iComment, combines Natural Language Processing (NLP), machine learning, statistics and program analysis techniques to achieve these goals. We evaluate iComment on four large code bases: Linux, Mozilla, Wine and Apache. Our experimental results show that iComment automatically extracts 1832 rules from comments with 90.8-100% accuracy and detects 60 comment-code inconsistencies, 33 new bugs and 27 bad comments, in the latest versions of the four programs. Nineteen of them (12 bugs and 7 bad comments) have already been confirmed by the corresponding developers, while the others are currently being analyzed by the developers.
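The comment-to-rule pipeline can be caricatured with a single template: turn a lock-related comment into a rule, then check the rule against a crude model of the code. The comment text, rule template, and call-site model are all invented; iComment uses NLP and machine learning rather than one regular expression.

```python
import re

# Invented rule template: "caller must hold L before calling F".
RULE = re.compile(r"caller must hold (\w+) before calling (\w+)", re.I)

comments = ["/* Caller must hold dev_lock before calling dev_reset. */"]

# Extract (lock, function) rules from matching comments.
rules = [m.groups() for c in comments if (m := RULE.search(c))]

# Crude stand-in for program analysis: locks held at each call site.
call_sites = [
    {"callee": "dev_reset", "locks_held": {"dev_lock"}},
    {"callee": "dev_reset", "locks_held": set()},   # inconsistency
]

# Each mismatch is either a bug (code is wrong) or a bad comment
# (comment is wrong); the tool reports it, a human decides which.
bugs = [
    (lock, site["callee"])
    for lock, callee in rules
    for site in call_sites
    if site["callee"] == callee and lock not in site["locks_held"]
]
print(bugs)  # [('dev_lock', 'dev_reset')]
```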
Merlin: Specification Inference for Explicit Information Flow Problems
Cited by 48 (7 self)
The last several years have seen a proliferation of static and runtime analysis tools for finding security violations that are caused by explicit information flow in programs. Much of this interest has been caused by the increase in the number of vulnerabilities such as cross-site scripting and SQL injection. In fact, these explicit information flow vulnerabilities commonly found in Web applications now outnumber vulnerabilities such as buffer overruns common in type-unsafe languages such as C and C++. Tools checking for these vulnerabilities require a specification to operate. In most cases the task of providing such a specification is delegated to the user. Moreover, the efficacy of these tools is only as good as the specification. Unfortunately, writing a comprehensive specification presents a major challenge: parts of the specification are easy to miss, leading to missed vulnerabilities; similarly, incorrect specifications may ...
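The dependence on the specification is easy to demonstrate: an explicit-flow checker is essentially reachability from sources to sinks, so a source missing from the specification silently hides a vulnerability. The data-flow graph and node names below are invented for illustration.

```python
# Invented data-flow edges: value flows from key to each target.
flows = {
    "request.param": ["query"],
    "query": ["db.execute"],
}

def vulnerable(sources, sinks):
    # Trivial explicit-flow check: which sinks are reachable from sources?
    found, work = set(), list(sources)
    while work:
        n = work.pop()
        if n in found:
            continue
        found.add(n)
        work += flows.get(n, [])
    return sorted(found & sinks)

full_spec = vulnerable({"request.param"}, {"db.execute"})
holey_spec = vulnerable(set(), {"db.execute"})  # source missing from spec
print(full_spec, holey_spec)  # ['db.execute'] []
```

With the complete specification the SQL-injection-style flow is reported; with the source omitted, the checker reports nothing, which is exactly the failure mode the abstract describes.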
Inferring resource specifications from natural language API documentation
In Proc. 24th ASE, 2009
Cited by 45 (18 self)
Typically, software libraries provide API documentation, through which developers can learn how to use libraries correctly. However, developers may still write code inconsistent with API documentation and thus introduce bugs, as existing research shows that many developers are reluctant to carefully read API documentation. To find those bugs, researchers have proposed various detection approaches based on known specifications. To mine specifications, many approaches have been proposed, and most of them rely on existing client code. Consequently, these mining approaches fail to mine specifications when client code is not available. In this paper, we propose an approach, called Doc2Spec, that infers resource specifications from API documentation. We implemented our approach in a tool and evaluated it on the Javadocs of five libraries. The results show that our approach infers various specifications with relatively high precision, recall, and F-score. We further evaluated the usefulness of inferred specifications by detecting bugs in open-source projects. The results show that specifications inferred by Doc2Spec are useful for detecting real bugs in existing projects.
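A resource specification of the kind described above pairs a creation method with a release method ("a created stream must eventually be closed"). The spec and call traces below are invented; they only show how such a specification, once inferred from documentation, can be checked without any client-code mining.

```python
# Invented resource specification: creation call must be balanced by a
# release call before the trace ends.
spec = {"create": "FileInputStream", "release": "close"}

def violates(trace, spec):
    # Count unreleased creations along one call trace.
    open_count = 0
    for call in trace:
        if call == spec["create"]:
            open_count += 1
        elif call == spec["release"] and open_count:
            open_count -= 1
    return open_count > 0          # resource never released: a leak

good = ["FileInputStream", "read", "close"]
bad  = ["FileInputStream", "read"]          # close missing
print(violates(good, spec), violates(bad, spec))  # False True
```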
Modular Data Structure Verification
EECS Department, Massachusetts Institute of Technology, 2007
Cited by 43 (20 self)
This dissertation describes an approach for automatically verifying data structures, focusing on techniques for automatically proving formulas that arise in such verification. I have implemented this approach with my colleagues in a verification system called Jahob. Jahob verifies properties of Java programs with dynamically allocated data structures. Developers write Jahob specifications in classical higher-order logic (HOL); Jahob reduces the verification problem to deciding the validity of HOL formulas. I present a new method for proving HOL formulas by combining automated reasoning techniques. My method consists of 1) splitting formulas into individual HOL conjuncts, 2) soundly approximating each HOL conjunct with a formula in a more tractable fragment and 3) proving the resulting approximation using a decision procedure or a theorem prover. I present three concrete logics; for each logic I show how to use it to approximate HOL formulas, and how to decide the validity of formulas in this logic. First, I present an approximation of HOL based on a translation to first-order logic, which enables the use of existing resolution-based theorem provers. Second, I present an approximation of HOL based on field constraint analysis, a new technique that enables ...
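The split-approximate-prove method can be sketched as a dispatcher: break a verification condition into conjuncts, classify each into a tractable fragment, and hand it to a per-fragment prover. Formulas are plain strings here and the provers are stubs that accept everything; the classification heuristics are invented. Jahob works on real HOL and calls external theorem provers and decision procedures.

```python
def fragment_of(conjunct):
    # Invented syntactic heuristics for routing a conjunct to a fragment.
    if "card(" in conjunct:
        return "BAPA"          # sets with cardinality constraints
    if "." in conjunct:
        return "FOL"           # field dereferences -> first-order encoding
    return "ARITH"             # plain arithmetic

# Stub provers, one per fragment; real ones would be external tools.
provers = {
    "BAPA":  lambda f: True,
    "FOL":   lambda f: True,
    "ARITH": lambda f: True,
}

def prove(formula):
    # Step 1: split into conjuncts. Steps 2-3: approximate each in a
    # fragment and discharge it with that fragment's prover. The whole
    # formula is valid only if every conjunct is proved.
    conjuncts = [c.strip() for c in formula.split("&")]
    return all(provers[fragment_of(c)](c) for c in conjuncts)

print(prove("card(content) = size & x.next != null & size >= 0"))  # True
```

The key soundness point survives even in the sketch: each conjunct may go to a different prover, but a failure of any single conjunct fails the whole formula.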
Toward Automated Detection of Logic Vulnerabilities in Web Applications
Cited by 41 (4 self)
Web applications are the most common way to make services and data available on the Internet. Unfortunately, with the increase in the number and complexity of these applications, there has also been an increase in the number and complexity of vulnerabilities. Current techniques to identify security problems in web applications have mostly focused on input validation flaws, such as cross-site scripting and SQL injection, with much less attention devoted to application logic vulnerabilities. Application logic vulnerabilities are an important class of defects that are the result of faulty application logic. These vulnerabilities are specific to the functionality of particular web applications, and, thus, they are extremely difficult to characterize and identify. In this paper, we propose a first step toward the automated detection of application logic vulnerabilities. To this end, we first use dynamic analysis and observe the normal operation of a web application to infer a simple set of behavioral specifications. Then, leveraging the knowledge about the typical execution paradigm of web applications, we filter the learned specifications to reduce false positives, and we use model checking over symbolic input to identify program paths that are likely to violate these specifications under specific conditions, indicating the presence of a certain type of web application logic flaw. We developed a tool, called Waler, based on our ideas, and we applied it to a number of web applications, finding previously unknown logic vulnerabilities.
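The two phases can be sketched directly: learn candidate equality invariants from states observed during normal operation, then look for a program path whose state violates one. The web-application states below are invented; Waler infers likely invariants dynamically and checks them with model checking over symbolic input rather than with a concrete state.

```python
# Phase 1 input: invented states observed while exercising a
# "delete account" page normally.
observed = [
    {"session_user": 7, "deleted_user": 7},
    {"session_user": 3, "deleted_user": 3},
]

# Keep every equality invariant that holds in all observed states.
keys = observed[0].keys()
invariants = {(a, b) for a in keys for b in keys
              if a < b and all(s[a] == s[b] for s in observed)}

# Phase 2 stand-in: a state on a path the dynamic runs never exercised,
# e.g. a forged request deleting someone else's account.
unchecked_path_state = {"session_user": 3, "deleted_user": 9}

violations = [(a, b) for a, b in invariants
              if unchecked_path_state[a] != unchecked_path_state[b]]
print(violations)  # [('deleted_user', 'session_user')]
```

The flagged invariant ("the deleted user is always the session user") is exactly the kind of application-specific behavioral specification that generic input-validation checkers cannot express.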
Path-Sensitive Inference of Function Precedence Protocols
2007
Cited by 37 (1 self)
Function precedence protocols define ordering relations among function calls in a program. In some instances, precedence protocols are well understood (e.g., a call to pthread_mutex_init must always be present on all program paths before a call to pthread_mutex_lock). Oftentimes, however, these protocols are neither well documented nor easily derived. As a result, protocol violations can lead to subtle errors that are difficult to identify and correct. In this paper, we present CHRONICLER, a tool that applies scalable inter-procedural path-sensitive static analysis to automatically infer accurate function precedence protocols. CHRONICLER computes precedence relations based on a program’s control-flow structure, integrates these relations into a repository, and analyzes them using sequence mining techniques to generate a collection of feasible precedence protocols. Deviations from these protocols found in the program are tagged as violations and represent potential sources of bugs. We demonstrate CHRONICLER’s effectiveness by deriving protocols for a collection of benchmarks ranging in size from 66K to 2M lines of code. Our results not only confirm the existence of bugs in these programs due to precedence protocol violations, but also highlight the importance of path sensitivity for accuracy and scalability.
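The mining core reduces to a confidence computation over control-flow paths: if `init` precedes `lock` on enough of the paths containing `lock`, infer the protocol and tag the remaining paths as violations. The paths and the 0.75 confidence threshold are invented for illustration.

```python
# Invented control-flow paths, each a sequence of calls.
paths = [
    ["pthread_mutex_init", "pthread_mutex_lock", "pthread_mutex_unlock"],
    ["pthread_mutex_init", "work", "pthread_mutex_lock"],
    ["pthread_mutex_init", "pthread_mutex_lock"],
    ["pthread_mutex_lock", "work"],        # init missing: likely a bug
]

def mine(paths, before, after, threshold=0.75):
    # Of the paths that call `after`, how many call `before` first?
    relevant = [p for p in paths if after in p]
    ok = [p for p in relevant
          if before in p and p.index(before) < p.index(after)]
    confidence = len(ok) / len(relevant)
    violations = [p for p in relevant if p not in ok]
    return confidence >= threshold, violations

holds, bad = mine(paths, "pthread_mutex_init", "pthread_mutex_lock")
print(holds, bad)  # True [['pthread_mutex_lock', 'work']]
```

Using a confidence threshold rather than unanimity is what lets the protocol survive its own violations: the buggy fourth path does not suppress the inferred rule, it is reported against it.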
Context-Based Detection of Clone-Related Bugs
In Proceedings of the 6th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE '07), 10 pp., 2007
Cited by 36 (4 self)
Studies show that programs contain much similar code, commonly known as clones. One of the main reasons for introducing clones ...