## Precise Data Flow Analysis in the Presence of Correlated Method Calls

### Citations

454 | Precise Interprocedural Data Flow Analysis via Graph Reachability
- Reps, Horwitz, et al.
- 1995
Citation Context ... paths. A solution to this IDE problem can be mapped back to the solution space of the original IFDS problem. We formalize the approach, prove it correct, and report on an implementation in the WALA analysis framework. 1 Introduction A control-flow graph (CFG) is an over-approximation of the possible flows of control in concrete executions of a program. It may contain infeasible paths that cannot occur at runtime. The precision of a data-flow analysis algorithm depends on its ability to detect and disregard such infeasible paths. The Interprocedural Finite Distributive Subset (IFDS) algorithm [16] is a general data-flow analysis algorithm that avoids infeasible interprocedural paths in which calls and returns to/from functions are not properly matched. The Interprocedural Distributive Environment (IDE) algorithm [18] has the same property, but supports a broader range of data-flow problems. This paper presents an approach to data-flow analysis that avoids a type of infeasible path that arises in object-oriented programs when two or more methods are dynamically dispatched on the same receiver object. If the method This research was supported by the Natural Sciences and Engineering Resea... |

395 | The DaCapo benchmarks: Java benchmarking development and analysis
- Blackburn, Garner, et al.
- 2006
Citation Context ...the discussion in this section has focused on the specific problem of taint analysis, our technique generally applies to any data-flow-analysis problem that can be expressed in the IFDS framework. This includes many common analysis tasks such as reaching definitions, constant propagation, slicing, typestate analysis, pointer analysis, and lightweight shape analysis. 2.1 Occurrences of Correlated Calls How often do correlated calls occur in practice? To assess the benefit of the correlated-calls analysis, we counted the number of correlated calls that occur in programs of the DaCapo benchmarks [3], using the WALA framework [5]. Our goal was to obtain an upper bound on the number of redundant IFDS-result nodes that could be potentially removed by our analysis. The results are shown in the Technical Report [15]. In these programs, on average, 3% of all call sites C are polymorphic call sites C_P. Out of these polymorphic call sites, a significant fraction (39%) are correlated call sites C. We also see that, on average, each correlated-call receiver is involved in approximately three correlated calls. 2.2 An Example from the Scala Collections Library The Scala collections library conta... |

368 | Two approaches to interprocedural data flow analysis problems
- Sharir, Pnueli
- 1981
Citation Context ...lower bound. The symbols ⊥ and ⊤ are used to denote the greatest lower bound of S and of the empty set, respectively. We denote a map m as a set of pairs of keys and values, with each key appearing at most once. For a map m, m(k) is the value paired with the key k. We denote by m[x → y] a map that maps x to y and every other key k to m(k). 3.2 IFDS The IFDS framework [16] is a precise and efficient algorithm for data-flow analysis that has been used to solve a variety of data-flow analysis problems [4,9,12,22]. The IFDS framework is an instance of the functional approach to data-flow analysis [19] because it constructs summaries of the effects of called procedures. The IFDS framework is applicable to interprocedural data-flow problems whose domain consists of subsets of a finite set D, and whose data-flow functions are distributive. A function f is distributive if f(x₁ ⊓ x₂) = f(x₁) ⊓ f(x₂). The IFDS algorithm is notable because it computes a meet-over-valid paths solution in polynomial time. Most other interprocedural analysis algorithms are either: (i) imprecise due to invalid paths, (ii) general but do not run in polynomial time [7,19], or (iii) handle a very specific set of problem... |
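The distributivity condition quoted above can be checked by brute force on a small domain. The following is a minimal sketch, assuming the meet is set union (the case the paper's presentation uses); the fact names, `gen_kill`, and `needs_both` are hypothetical and chosen only for illustration.

```python
from itertools import combinations

# Hypothetical finite set of data-flow facts D.
D = frozenset({"x", "y", "z"})


def powerset(s):
    """Return every subset of s as a frozenset."""
    items = sorted(s)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


def gen_kill(gen, kill):
    """Build a classic gen/kill transfer function on subsets of D."""
    gen, kill = frozenset(gen), frozenset(kill)
    return lambda facts: (facts - kill) | gen


def needs_both(facts):
    """Adds 'z' only when both 'x' and 'y' hold; this is not distributive."""
    return facts | {"z"} if {"x", "y"} <= facts else facts


def is_distributive(f, domain):
    """Check f(a ∪ b) == f(a) ∪ f(b) for all subsets a, b of the domain."""
    subsets = powerset(domain)
    return all(f(a | b) == f(a) | f(b) for a in subsets for b in subsets)


transfer = gen_kill(gen={"x"}, kill={"y"})
print(is_distributive(transfer, D))    # True
print(is_distributive(needs_both, D))  # False
```

Gen/kill functions pass the check, while a function whose output depends on two facts being present together fails it, which is why such analyses fall outside the IFDS framework.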

165 | Parameterized object sensitivity for points-to analysis for Java
- Milanova, Rountev, et al.
- 2005
Citation Context ...ible paths between dynamically dispatched method calls, their approach eliminates infeasible paths between reads and writes of different properties of an object. The approach differs from ours in that it targets points-to analysis rather than IFDS analyses, in that it targets infeasible paths due to different property names rather than different dynamically dispatched methods, and in that it employs context sensitivity to improve precision. Our approach superficially resembles, but is orthogonal to, context sensitivity, including the CPA algorithm [1] and such variations as object sensitivity [11]. Context-sensitive points-to analysis is orthogonal to our work because it analyzes the flow of data (pointers), whereas we analyze control flow paths. Also, object-sensitive points-to analysis is flow-insensitive, while IFDS and IDE are flow-sensitive analyses. Note that our transformation only makes sense in a flow-sensitive setting since a flow-insensitive analysis already introduces many infeasible control flow paths. It would be possible to simulate the effect of our correlated calls transformation in the following way inspired by context-sensitivity: we could re-analyze each method in a... |

102 | The interprocedural coincidence theorem.
- Knoop, Steffen
- 1992
Citation Context ... of the functional approach to data-flow analysis [19] because it constructs summaries of the effects of called procedures. The IFDS framework is applicable to interprocedural data-flow problems whose domain consists of subsets of a finite set D, and whose data-flow functions are distributive. A function f is distributive if f(x₁ ⊓ x₂) = f(x₁) ⊓ f(x₂). The IFDS algorithm is notable because it computes a meet-over-valid paths solution in polynomial time. Most other interprocedural analysis algorithms are either: (i) imprecise due to invalid paths, (ii) general but do not run in polynomial time [7,19], or (iii) handle a very specific set of problems [8]. The input to the IFDS algorithm is specified as (G∗, D, F, M_F, ⊓), where G∗ = (N∗, E∗) is the supergraph of the input program with nodes N∗ and edges E∗, D is a finite set of data-flow facts, F is a set of distributive data-flow functions of type 2^D → 2^D, M_F : E∗ → F assigns a data-flow function to each supergraph edge, and ⊓ is the meet operator on the powerset 2^D, either union or intersection. In our presentation, the meet operator will always be union, but all of the results apply dually when the meet is intersection. The output of the... |

80 | TAJ: effective taint analysis of web applications
- Tripp, Pistoia, et al.
Citation Context ...t or . A meet semilattice is a partially ordered set in which every subset only has a greatest lower bound. The symbols ⊥ and ⊤ are used to denote the greatest lower bound of S and of the empty set, respectively. We denote a map m as a set of pairs of keys and values, with each key appearing at most once. For a map m, m(k) is the value paired with the key k. We denote by m[x → y] a map that maps x to y and every other key k to m(k). 3.2 IFDS The IFDS framework [16] is a precise and efficient algorithm for data-flow analysis that has been used to solve a variety of data-flow analysis problems [4,9,12,22]. The IFDS framework is an instance of the functional approach to data-flow analysis [19] because it constructs summaries of the effects of called procedures. The IFDS framework is applicable to interprocedural data-flow problems whose domain consists of subsets of a finite set D, and whose data-flow functions are distributive. A function f is distributive if f(x₁ ⊓ x₂) = f(x₁) ⊓ f(x₂). The IFDS algorithm is notable because it computes a meet-over-valid paths solution in polynomial time. Most other interprocedural analysis algorithms are either: (i) imprecise due to invalid paths, (ii) general... |

65 | FlowDroid: Precise context, flow, field, object-sensitive and lifecycle-aware taint analysis for Android apps
- Arzt, Rasthofer, et al.
- 2014
Citation Context ...n 4 presents the correlated-calls transformation, states the correctness properties, and discusses our implementation. Related work is discussed in Sect. 5. Finally, Sect. 6 presents conclusions and directions for future work. 2 Motivation We illustrate our approach using a small example that applies our technique to improve the precision of taint analysis. A taint analysis computes how string values may flow from “sources”, which are typically statements that read untrusted input, to “sinks”, which are typically security-sensitive operations such as calls to a database. In previous research [2,6], taint analysis algorithms have been formulated as IFDS problems. Figure 1 shows a small Java program. The program declares a class A with a subclass B, where A defines methods foo() and bar() that are overridden in B. We assume that secret values are created by an unspecified function secret(), which is called in A.foo() on line 2. Any write to standard output is assumed to be a sink (e.g., the call to System.out.println() in B.bar()). Depending on the number of arguments passed to the program, the main() method of the example program creates either an A-object or a B-object. The program the... |
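The example program described in this excerpt can be approximated as follows. This is a Python analogue rather than the paper's Figure 1, and it rests on assumptions: the excerpt is truncated before it says what main() does next, so the sketch assumes foo() is called and its result is passed to bar() on the same receiver. The names `sink`, `leaks`, and `leaks_on_path` are hypothetical; `sink` stands in for the write to standard output.

```python
SECRET = "secret"
leaks = []  # values that reached the sink while tainted


def secret():
    """Source: produces a secret value."""
    return SECRET


def sink(value):
    """Sink: records the value if it is the secret."""
    if value == SECRET:
        leaks.append(value)


class A:
    def foo(self):
        return secret()      # source is only reachable through A.foo

    def bar(self, s):
        pass                 # A.bar never writes to the sink


class B(A):
    def foo(self):
        return "not secret"  # B.foo returns untainted data

    def bar(self, s):
        sink(s)              # sink is only reachable through B.bar


def leaks_on_path(foo_cls, bar_cls):
    """Simulate one path: foo dispatched to foo_cls, bar to bar_cls."""
    leaks.clear()
    receiver = B()
    value = foo_cls.foo(receiver)
    bar_cls.bar(receiver, value)
    return bool(leaks)


# Every combination of dispatch targets, as a path-insensitive analysis sees it.
paths = {(f.__name__, g.__name__): leaks_on_path(f, g)
         for f in (A, B) for g in (A, B)}

# Correlated calls on one receiver must dispatch to the same class.
feasible = {k: v for k, v in paths.items() if k[0] == k[1]}

print(any(paths.values()))     # True: the path A.foo then B.bar leaks
print(any(feasible.values()))  # False: no feasible path leaks
```

The only leaking combination pairs A.foo with B.bar, which cannot happen at run time when both calls share a receiver. That is the kind of infeasible path the correlated-calls analysis is meant to exclude.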

58 | Concrete Type Inference: Delivering Object-Oriented Applications
- Agesen
- 1996
Citation Context ...eliminates infeasible paths. Instead of infeasible paths between dynamically dispatched method calls, their approach eliminates infeasible paths between reads and writes of different properties of an object. The approach differs from ours in that it targets points-to analysis rather than IFDS analyses, in that it targets infeasible paths due to different property names rather than different dynamically dispatched methods, and in that it employs context sensitivity to improve precision. Our approach superficially resembles, but is orthogonal to, context sensitivity, including the CPA algorithm [1] and such variations as object sensitivity [11]. Context-sensitive points-to analysis is orthogonal to our work because it analyzes the flow of data (pointers), whereas we analyze control flow paths. Also, object-sensitive points-to analysis is flow-insensitive, while IFDS and IDE are flow-sensitive analyses. Note that our transformation only makes sense in a flow-sensitive setting since a flow-insensitive analysis already introduces many infeasible control flow paths. It would be possible to simulate the effect of our correlated calls transformation in the following way inspired by context-se... |

55 | Parallelism for free: Efficient and optimal bitvector analyses for parallel programs
- Knoop, Steffen, et al.
- 1996
Citation Context ...ecause it constructs summaries of the effects of called procedures. The IFDS framework is applicable to interprocedural data-flow problems whose domain consists of subsets of a finite set D, and whose data-flow functions are distributive. A function f is distributive if f(x₁ ⊓ x₂) = f(x₁) ⊓ f(x₂). The IFDS algorithm is notable because it computes a meet-over-valid paths solution in polynomial time. Most other interprocedural analysis algorithms are either: (i) imprecise due to invalid paths, (ii) general but do not run in polynomial time [7,19], or (iii) handle a very specific set of problems [8]. The input to the IFDS algorithm is specified as (G∗, D, F, M_F, ⊓), where G∗ = (N∗, E∗) is the supergraph of the input program with nodes N∗ and edges E∗, D is a finite set of data-flow facts, F is a set of distributive data-flow functions of type 2^D → 2^D, M_F : E∗ → F assigns a data-flow function to each supergraph edge, and ⊓ is the meet operator on the powerset 2^D, either union or intersection. In our presentation, the meet operator will always be union, but all of the results apply dually when the meet is intersection. The output of the IFDS algorithm is, for each node n in the supergraph... |
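To make the shape of the input tuple concrete, the sketch below encodes a tiny graph, a fact set D, and an edge-to-function map M_F, and propagates facts with union as the meet. It is not the IFDS tabulation algorithm: it is a plain single-procedure worklist propagation with no call/return matching, and the graph, the facts `t1`/`t2`, and the `solve` helper are all hypothetical.

```python
from collections import deque

# Hypothetical graph: node -> list of successor nodes.
edges = {
    "entry": ["a", "b"],
    "a": ["exit"],
    "b": ["exit"],
    "exit": [],
}

# Hypothetical finite set of data-flow facts D.
D = frozenset({"t1", "t2"})

# M_F: assigns a (distributive) function on subsets of D to each edge.
M_F = {
    ("entry", "a"): lambda s: s | {"t1"},  # generate t1
    ("entry", "b"): lambda s: s | {"t2"},  # generate t2
    ("a", "exit"): lambda s: s,            # identity
    ("b", "exit"): lambda s: s - {"t2"},   # kill t2
}


def solve(edges, M_F):
    """Propagate facts to a fixed point, merging with union at each node."""
    result = {n: frozenset() for n in edges}
    work = deque(edges)  # visit every node at least once
    while work:
        n = work.popleft()
        for succ in edges[n]:
            out = frozenset(M_F[(n, succ)](result[n]))
            merged = result[succ] | out  # the meet is union
            if merged != result[succ]:
                result[succ] = merged
                work.append(succ)
    return result


result = solve(edges, M_F)
print(sorted(result["exit"]))  # ['t1']
```

Fact `t2` is generated on one branch and killed before the exit, so only `t1` reaches the exit node. A real IFDS solver would additionally restrict propagation to paths with properly matched calls and returns.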

38 | Saving the world wide web from vulnerable JavaScript.
- Guarnieri, Pistoia, et al.
- 2011
Citation Context ...n 4 presents the correlated-calls transformation, states the correctness properties, and discusses our implementation. Related work is discussed in Sect. 5. Finally, Sect. 6 presents conclusions and directions for future work. 2 Motivation We illustrate our approach using a small example that applies our technique to improve the precision of taint analysis. A taint analysis computes how string values may flow from “sources”, which are typically statements that read untrusted input, to “sinks”, which are typically security-sensitive operations such as calls to a database. In previous research [2,6], taint analysis algorithms have been formulated as IFDS problems. Figure 1 shows a small Java program. The program declares a class A with a subclass B, where A defines methods foo() and bar() that are overridden in B. We assume that secret values are created by an unspecified function secret(), which is called in A.foo() on line 2. Any write to standard output is assumed to be a sink (e.g., the call to System.out.println() in B.bar()). Depending on the number of arguments passed to the program, the main() method of the example program creates either an A-object or a B-object. The program the... |

27 | Programming in Scala.
- Odersky, Spoon, et al.
- 2008
Citation Context ...ice, R is much smaller than R, so Lemma 6 is significant for performance. Specifically, the complexity proof for the IDE algorithm requires the implementation of the micro-functions to be efficient according to a list of specific criteria. Our micro-function implementation does satisfy the criteria: Lemma 7. The correlated-call representation of a micro function is efficient according to the IDE criteria [18] and the required operations on micro-functions can be computed in time O(RT ). 4.7 Implementation of the Correlated-Calls Analysis We implemented the correlated-calls analysis in Scala [14]. Our implementation analyzes JVM bytecode compiled from input programs written in Java. We use WALA [5] to retrieve information about an input program, such as its control-flow supergraph and the set of receivers and their types. Since WALA does not contain an implementation of the IDE algorithm, we implemented it from scratch; we are working on contributing our infrastructure to WALA. We tested our correlated-calls analysis using an IFDS taint-analysis as a client analysis. To this end, we converted the IFDS taint analysis into an IDE problem with an implementation of T R . We extensively t... |

24 | Typestate-like analysis of multiple interacting objects.
- Naeem, Lhotak
- 2008
(Show Context)
Citation Context ...t or . A meet semilattice is a partially ordered set in which every subset only has a greatest lower bound. The symbols ⊥ and ⊤ are used to denote the greatest lower bound of S and of the empty set, respectively. We denote a map m as a set of pairs of keys and values, with each key appearing at most once. For a map m, m(k) is the value paired with the key k. We denote by m[x → y] a map that maps x to y and every other key k to m(k). 3.2 IFDS The IFDS framework [16] is a precise and efficient algorithm for data-flow analysis that has been used to solve a variety of data-flow analysis problems [4,9,12,22]. The IFDS framework is an instance of the functional approach to data-flow analysis [19] because it constructs summaries of the effects of called procedures. The IFDS framework is applicable to interprocedural data-flow problems whose domain consists of subsets of a finite set D, and whose data-flow functions are distributive. A function f is distributive if f(x₁ ⊓ x₂) = f(x₁) ⊓ f(x₂). The IFDS algorithm is notable because it computes a meet-over-valid paths solution in polynomial time. Most other interprocedural analysis algorithms are either: (i) imprecise due to invalid paths, (ii) general... |

23 | Correlation tracking for points-to analysis of JavaScript.
- Sridharan, Dolby, et al.
- 2012
(Show Context)
Citation Context ...ons specified by the product line. Rodriguez and Lhotak evaluate a parallelized implementation of the IFDS algorithm using actors [17] that can take advantage of multiple processors. The idea of using correlated calls to remove infeasible paths in data-flow analyses of object-oriented programs was introduced by Tip [21]. The possibility of using IDE to achieve this is mentioned, but not elaborated upon. Our work is the first to present and implement a concrete solution. Recent work on correlation tracking for JavaScript [20] also eliminates infeasible paths. Instead of infeasible paths between dynamically dispatched method calls, their approach eliminates infeasible paths between reads and writes of different properties of an object. The approach differs from ours in that it targets points-to analysis rather than IFDS analyses, in that it targets infeasible paths due to different property names rather than different dynamically dispatched methods, and in that it employs context sensitivity to improve precision. Our approach superficially resembles, but is orthogonal to, context sensitivity, including the CPA algo... |

12 | SPLLIFT: Statically analyzing software product lines in minutes instead of years
- Bodden, Tolêdo, et al.
- 2013
(Show Context)
Citation Context ...t or . A meet semilattice is a partially ordered set in which every subset only has a greatest lower bound. The symbols ⊥ and ⊤ are used to denote the greatest lower bound of S and of the empty set, respectively. We denote a map m as a set of pairs of keys and values, with each key appearing at most once. For a map m, m(k) is the value paired with the key k. We denote by m[x → y] a map that maps x to y and every other key k to m(k). 3.2 IFDS The IFDS framework [16] is a precise and efficient algorithm for data-flow analysis that has been used to solve a variety of data-flow analysis problems [4,9,12,22]. The IFDS framework is an instance of the functional approach to data-flow analysis [19] because it constructs summaries of the effects of called procedures. The IFDS framework is applicable to interprocedural data-flow problems whose domain consists of subsets of a finite set D, and whose data-flow functions are distributive. A function f is distributive if f(x₁ ⊓ x₂) = f(x₁) ⊓ f(x₂). The IFDS algorithm is notable because it computes a meet-over-valid paths solution in polynomial time. Most other interprocedural analysis algorithms are either: (i) imprecise due to invalid paths, (ii) general... |

7 | Practical extensions to the IFDS algorithm.
- Naeem, Lhotak, et al.
- 2010
(Show Context)
Citation Context ...o hours on javac. Our implementation is a research prototype and many opportunities for optimization remain. For the specific combination of this IFDS client analysis and these benchmark programs, tracking correlated calls did not impact precision. 5 Related Work The IFDS algorithm is an instance of the functional approach to data-flow analysis developed by Sharir and Pnueli [19]. IFDS has been used to encode a variety of data-flow problems such as typestate analysis [12,23] and shape analysis [9]. IFDS has been used [2,22] and extended [10] to solve taint-analysis problems. Naeem and Lhotak [13] proposed several extensions of IFDS. In particular, they propose several techniques for improving the algorithm’s efficiency, as well as a technique that improves expressiveness by extending applicability to a wider class of dataflow analysis problems. These extensions are orthogonal to, and could be combined with the approach presented in this paper. Our work differs from theirs by targeting analysis precision, not efficiency or expressiveness. Bodden et al. [4] presents a framework for applying IFDS analyses to software product lines. Their approach enables the analysis of all possible prod... |

6 | On abstraction refinement for program analyses in Datalog.
- Zhang, Mangal, et al.
- 2014
(Show Context)
Citation Context ... twice as long as the equivalence analysis. The absolute times range from a few seconds on the smaller SPEC programs to about two hours on javac. Our implementation is a research prototype and many opportunities for optimization remain. For the specific combination of this IFDS client analysis and these benchmark programs, tracking correlated calls did not impact precision. 5 Related Work The IFDS algorithm is an instance of the functional approach to data-flow analysis developed by Sharir and Pnueli [19]. IFDS has been used to encode a variety of data-flow problems such as typestate analysis [12,23] and shape analysis [9]. IFDS has been used [2,22] and extended [10] to solve taint-analysis problems. Naeem and Lhotak [13] proposed several extensions of IFDS. In particular, they propose several techniques for improving the algorithm’s efficiency, as well as a technique that improves expressiveness by extending applicability to a wider class of dataflow analysis problems. These extensions are orthogonal to, and could be combined with the approach presented in this paper. Our work differs from theirs by targeting analysis precision, not efficiency or expressiveness. Bodden et al. [4] presen... |

2 | A concurrent IFDS dataflow analysis algorithm using actors. Master’s thesis
- Rodriguez
- 2010
(Show Context)
Citation Context ... presented in this paper. Our work differs from theirs by targeting analysis precision, not efficiency or expressiveness. Bodden et al. [4] presents a framework for applying IFDS analyses to software product lines. Their approach enables the analysis of all possible products derived from a product line in a single analysis pass. Like our approach, their approach transforms IFDS problems to IDE problems. The micro-functions keep track of the possible program variations specified by the product line. Rodriguez and Lhotak evaluate a parallelized implementation of the IFDS algorithm using actors [17] that can take advantage of multiple processors. The idea of using correlated calls to remove infeasible paths in data-flow analyses of object-oriented programs was introduced by Tip [21]. The possibility of using IDE to achieve this is mentioned, but not elaborated upon. Our work is the first to present and implement a concrete solution. Recent work on correlation tracking for JavaScript [20] also eliminates infeasible paths. Instead of infeasible paths between dynamically dispatched method calls, their approach eliminat... |

1 | WALA – the TJ Watson libraries for analysis
- Fink, Dolby
- 2012
(Show Context)
Citation Context ...ls. The results of this IDE problem can be mapped back to the data-flow domain of the original IFDS problem, but are more precise than the results of directly applying the IFDS algorithm to the original problem. We present a formalization of the transformation and prove its correctness: specifically, we prove it still soundly considers all paths that are feasible, and that it avoids flow along all paths that are infeasible due to correlated calls. We implemented the correlated-calls transformation and the IDE algorithm in Scala, on top of the WALA framework for static analysis of JVM bytecode [5]. Our prototype implementation was tested extensively by using it to transform an IFDS-based taint analysis into a more precise IDE-based taint analysis, and applying the latter to small example programs with correlated calls. Our prototype along with all tests will be made available to the artifact evaluation committee. The remainder of this paper is organized as follows. Section 2 presents a motivating example. Section 3 reviews the IFDS and IDE algorithms. Section 4 presents the correlated-calls transformation, states the correctness properties, and discusses our implementation. Related wo... |

1 | FlowTwist: efficient context-sensitive inside-out taint analysis for large codebases. In: FSE
- Lerch, Hermann, et al.
- 2014
(Show Context)
Citation Context ...om a few seconds on the smaller SPEC programs to about two hours on javac. Our implementation is a research prototype and many opportunities for optimization remain. For the specific combination of this IFDS client analysis and these benchmark programs, tracking correlated calls did not impact precision. 5 Related Work The IFDS algorithm is an instance of the functional approach to data-flow analysis developed by Sharir and Pnueli [19]. IFDS has been used to encode a variety of data-flow problems such as typestate analysis [12,23] and shape analysis [9]. IFDS has been used [2,22] and extended [10] to solve taint-analysis problems. Naeem and Lhotak [13] proposed several extensions of IFDS. In particular, they propose several techniques for improving the algorithm’s efficiency, as well as a technique that improves expressiveness by extending applicability to a wider class of dataflow analysis problems. These extensions are orthogonal to, and could be combined with the approach presented in this paper. Our work differs from theirs by targeting analysis precision, not efficiency or expressiveness. Bodden et al. [4] presents a framework for applying IFDS analyses to software product lines.... |

1 | Infeasible paths in object-oriented programs.
- Tip
- 2015
(Show Context)
Citation Context ...iver object. If the method [footnote: This research was supported by the Natural Sciences and Engineering Research Council of Canada and the Ontario Ministry of Research and Innovation. © Springer-Verlag Berlin Heidelberg 2015. S. Blazy and T. Jensen (Eds.): SAS 2015, LNCS 9291, pp. 54–71, 2015. DOI: 10.1007/978-3-662-48288-9_4] calls are polymorphic (i.e., the method invoked depends on the run-time type of the receiver), then their dispatch behaviours are correlated, and some of the paths between them are infeasible. A recent paper [21] made this observation but did not present any concrete algorithm to take advantage of it. Our approach transforms an IFDS problem into an IDE problem that precisely accounts for infeasible paths due to correlated calls. The results of this IDE problem can be mapped back to the data-flow domain of the original IFDS problem, but are more precise than the results of directly applying the IFDS algorithm to the original problem. We present a formalization of the transformation and prove its correctness: specifically, we prove it still soundly considers all paths that are feasible, and that it avoi... |