LASE: Locating and Applying Systematic Edits by Learning from Examples, 2013
Cited by 16 (3 self)
Adding features and fixing bugs often require systematic edits that make similar, but not identical, changes to many code locations. Finding all the relevant locations and making the correct edits is a tedious and error-prone process for developers. This paper addresses both problems using edit scripts learned from multiple examples. We design and implement a tool called LASE that (1) creates a context-aware edit script from two or more examples, and uses the script to (2) automatically identify edit locations and to (3) transform the code. We evaluate LASE on an oracle test suite of systematic edits from Eclipse JDT and SWT. LASE finds edit locations with 99% precision and 89% recall, and transforms them with 91% accuracy. We also evaluate LASE on 37 example systematic edits from other open source programs and find LASE is accurate and effective. Furthermore, we confirmed with developers that LASE found edit locations which they missed. Our novel algorithm that learns from multiple examples is critical to achieving high precision and recall; edit scripts created from only one example produce too many false positives, false negatives, or both. Our results indicate that LASE should help developers in automating systematic editing. Whereas most prior work either suggests edit locations or performs simple edits, LASE is the first to do both for nontrivial program edits.
Approximation of RNA multiple structural alignment. In Proc. 17th Combinatorial Pattern Matching (CPM), volume 4009 of LNCS, 2006
Cited by 9 (8 self)
In the context of noncoding RNA (ncRNA) multiple structural alignment, Davydov and Batzoglou introduced in [7] the problem of finding the largest nested linear graph that occurs in a set G of linear graphs, the so-called MaxNLS problem. This problem generalizes both the longest common subsequence problem and the maximum common homeomorphic subtree problem for rooted ordered trees. In the present paper, we give a fast algorithm for finding the largest nested linear subgraph of a linear graph and a polynomial-time algorithm for a fixed number (k) of linear graphs. Also, we strongly strengthen the result of [7] by proving that the problem is NP-complete even if G is composed of nested linear graphs of height at most 2, thereby precisely defining the borderline between tractable and intractable instances of the problem. Of particular importance, we improve the result of [7] by showing that the MaxNLS problem is approximable within ratio O(log m_opt) in O(kn²) running time, where m_opt is the size of an optimal solution. We also present an O(1)-approximation of the MaxNLS problem running in O(kn) time for restricted linear graphs. In particular, for ncRNA-derived linear graphs, a 1/4-approximation is presented.
Type-Safe Diff for Families of Datatypes
Cited by 4 (0 self)
The UNIX diff program finds the difference between two text files using a classic algorithm for determining the longest common subsequence; however, when working with structured input (e.g. program code), we often want to find the difference between tree-like data (e.g. the abstract syntax tree). In a functional programming language such as Haskell, we can represent this data with a family of (mutually recursive) datatypes. In this paper, we describe a functional, datatype-generic implementation of diff (and the associated program patch). Our approach requires advanced type system features to preserve type safety; therefore, we present the code in Agda, a dependently-typed language well-suited to datatype-generic programming. In order to establish the usefulness of our work, we show that its efficiency can be improved with memoization and that it can also be defined in Haskell.
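For orientation, the sequence-level idea that this abstract starts from (a diff derived from a longest common subsequence, plus a matching patch) can be sketched in a few lines of Python. This is only an illustrative sketch of the flat, untyped case, not the paper's datatype-generic Agda or Haskell code; the names `lcs_table`, `diff`, and `patch` and the `keep`/`del`/`ins` operation labels are assumptions chosen for this example.

```python
def lcs_table(a, b):
    """Return t where t[i][j] is the LCS length of a[i:] and b[j:]."""
    t = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) - 1, -1, -1):
        for j in range(len(b) - 1, -1, -1):
            if a[i] == b[j]:
                t[i][j] = t[i + 1][j + 1] + 1
            else:
                t[i][j] = max(t[i + 1][j], t[i][j + 1])
    return t


def diff(a, b):
    """Return an edit script of (op, item) pairs that turns a into b."""
    t = lcs_table(a, b)
    i = j = 0
    script = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            # Equal heads are always part of some longest common subsequence.
            script.append(("keep", a[i]))
            i += 1
            j += 1
        elif t[i + 1][j] >= t[i][j + 1]:
            script.append(("del", a[i]))
            i += 1
        else:
            script.append(("ins", b[j]))
            j += 1
    # Whatever remains on either side is deleted or inserted wholesale.
    script.extend(("del", x) for x in a[i:])
    script.extend(("ins", y) for y in b[j:])
    return script


def patch(a, script):
    """Apply an edit script produced by diff to the sequence a."""
    out = []
    i = 0
    for op, item in script:
        if op == "keep":
            out.append(a[i])
            i += 1
        elif op == "del":
            i += 1
        else:  # "ins"
            out.append(item)
    return out
```

Calling `patch(old, diff(old, new))` should reproduce `new` for any two lists. The untyped script here can be applied to the wrong input without complaint, which is the kind of mismatch a type-safe formulation is meant to rule out.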
Fast Algorithms for Computing Tree LCS
Cited by 1 (1 self)
The LCS of two rooted, ordered, and labeled trees F and G is the largest forest that can be obtained from both trees by deleting nodes. We present algorithms for computing tree LCS which exploit the sparsity inherent to the tree LCS problem. Assuming G is smaller than F, our first algorithm runs in time O(r · height(F) · height(G) · lg lg |G|), where r is the number of pairs (v ∈ F, w ∈ G) such that v and w have the same label. Our second algorithm runs in time O(Lr lg r · lg lg |G|), where L is the size of the LCS of F and G. For this algorithm we present a novel three dimensional alignment graph. Our third algorithm is intended for the constrained variant of the problem in which only nodes with zero or one children can be deleted. For this case we obtain an O(rh lg lg |G|) time algorithm, where h = height(F) + height(G).
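To make the definition of tree LCS concrete, the following Python sketch computes the LCS size with a plain memoized recursion over rightmost roots. It is not one of the sparse algorithms the abstract describes and makes no claim to their running times; the tree encoding (a `(label, children)` pair with `children` a tuple) and the names `forest_lcs` and `tree_lcs` are assumptions made for illustration.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def forest_lcs(f, g):
    """Size of the largest forest obtainable from both f and g by deleting nodes.

    A tree is a pair (label, children) and a forest is a tuple of trees,
    so every argument is hashable and can be memoized.
    """
    if not f or not g:
        return 0
    (v_label, v_kids), (w_label, w_kids) = f[-1], g[-1]
    # Delete the rightmost root of f: its children take its place.
    best = forest_lcs(f[:-1] + v_kids, g)
    # Delete the rightmost root of g in the same way.
    best = max(best, forest_lcs(f, g[:-1] + w_kids))
    if v_label == w_label:
        # Keep both roots: children pair with children, the rest with the rest.
        kept = 1 + forest_lcs(v_kids, w_kids) + forest_lcs(f[:-1], g[:-1])
        best = max(best, kept)
    return best


def tree_lcs(t1, t2):
    """LCS size of two rooted, ordered, labeled trees."""
    return forest_lcs((t1,), (t2,))
```

For example, with `F = ("a", (("b", ()), ("c", (("d", ()),))))` and `G = ("a", (("c", (("d", ()),)), ("e", ())))`, deleting `b` from F and `e` from G leaves the same three-node tree, so `tree_lcs(F, G)` evaluates to 3.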
M.: Lase: an example-based program transformation tool for locating and applying systematic edits. In: Proceedings of the 2013 International Conference on Software Engineering, ICSE ’13, 2013
Cited by 1 (1 self)
Adding features and fixing bugs in software often require systematic edits which are similar, but not identical, changes to many code locations. Finding all edit locations and editing them correctly is tedious and error-prone. In this paper, we demonstrate an Eclipse plugin called LASE that (1) creates context-aware edit scripts from two or more examples, and uses these scripts to (2) automatically identify edit locations and (3) transform the code. In LASE, users can view syntactic edit operations and corresponding context for each input example. They can also choose a different subset of the examples to adjust the abstraction level of inferred edits. When LASE locates target methods matching the inferred edit context and suggests customized edits, users can review and correct LASE’s edit suggestion. These features can reduce developers' burden in repetitively applying similar edits to different methods. The tool’s video demonstration is available at
Generating, Locating, and Applying Systematic Edits by Learning from Example(s), 2012
Programmers make systematic edits (similar, but not identical, changes to multiple places) during software development and maintenance. Finding all the correct locations and making correct edits is a tedious and error-prone process. Existing tools for automating systematic edits are limited because they do not support edit generation, edit location suggestion, or edit application at the same time, except for specialized or trivial edits. However, since many similar changes are needed in locations that contain similar context (the surrounding dependent code and syntactic structures), there is an opportunity to automate the systematic editing process by inferring edit scripts and characterizing their context from code that developers changed already. The challenge is that we need to abstract and generalize from example(s) in order to create an edit script that is correct in many different contexts. This thesis seeks to substantially improve the efficiency and correctness of automatic systematic program transformation. (1) We design and implement Sydit to generate an abstract, context-aware edit script from a single changed method, and apply it to user-selected target method(s). This approach correctly performs many edits, but we show that the edit scripts from one example are not well suited to finding new locations. (2) We thus design and implement Lase to generate a partially abstract, context-aware edit script from multiple changed
Some Lower and Upper Bounds for Tree Edit Distance, 2008
In this report I describe my results on the Tree Edit Distance problem [13, 27]. The edit distance between two ordered rooted trees with vertex labels is the minimum cost of transforming one tree into the other by a sequence of elementary operations consisting of deleting and relabeling existing nodes, as well as inserting new nodes. Tree Edit Distance has applications in many fields such as computer vision, computational biology and compiler optimization. I describe an algorithm that computes the edit distance between two trees of sizes n and m, where m < n, and runs in O(nm²(1 + log(n/m))) = O(n³) time and O(nm) space. The previously best known algorithm for this problem, which is due to Philip Klein [22], runs in O(m²n log n) = O(n³ log n) time and O(mn) space. Next, a matching lower bound is proved for the family of decomposition strategy algorithms, which includes the previous fastest algorithms for this problem. The best previously known lower bound for this family was Ω(n² log² n). Finally, I describe recent results on the Longest Common Subtree problem. This is an interesting special case of Tree Edit Distance in which only insertions and deletions are considered (i.e., the cost of all relabeling operations is infinite, and the cost of any insertion or deletion is 1). I describe a few algorithms for this problem, the fastest of which runs in O(Lr log r log log m), where L is the size of the LCS (L ≤ m) and r is the number of pairs of vertices with matching labels, one from each tree (r ≤ nm). These algorithms combine techniques from sparse string LCS (Longest Common Subsequence), with Tree Edit Distance algorithms. The tree edit distance paper [13] is a joint work with Erik Demaine, Benjamin Rossman