L. Wills. Automated program recognition by graph parsing. MIT Technical Report 1358, MIT, AI Laboratory, 1993.
....parsing algorithms proposed so far are either unable to recognize interesting languages of graphs, or tend to be inefficient when applied to graphs with a large number of nodes and edges. Another problem is that nearly all known graph grammar parsing algorithms [2, 3, 4, 5, 6, 7] deal only with context-free productions. A context-free grammar requires that only a single non-terminal is allowed on the left-hand side of a production [8]. A context-sensitive graph grammar, on the other hand, allows left-hand and right-hand graphs of a production to have an arbitrary number of ....
Wills, L. M. (1992) Automated Program Recognition by Graph Parsing. PhD Thesis, MIT Artificial Intelligence Laboratory, Cambridge, MA, Technical Report 1358.
....through indexing, adding search or construction programs, and so on. One type of learning distribution is not considered in this chapter: when designers embody knowledge in tools or environments. For instance, populating a plan-matching program's database using knowledge engineering (e.g. Wills [707], Chin et al. [127]) certainly redistributes learning, but in these cases the tool developer has a special relationship to the user. This relationship is such that it is awkward, to say the least, to view the developer-user activity as a collaborative learning effort (but see Section 6.7 for more ....
....in the environment and in the problem situation. Cliche recognition. Experienced programmers can rapidly recognize cliched patterns (see e.g. McKeithen et al. [403]). Much of the research in relation to this ability has been concerned with the recognition of the so-called programming plans [51, 241, 609, 707] and programming idioms [6, 611]. However, the findings probably generalize for all relevant recurrent structures (text structures, control flow structures [266], design patterns, architectural patterns, etc.). As one would expect, these structures would need to be relevant to the programming ....
Wills, L. M. Automated program recognition by graph parsing. Tech. Rep. TR-1358, MIT, Artificial Intelligence Laboratory, 1992. PhD Thesis.
....question answering systems that depend on a manually populated database describing the software system. This approach is typified by the Lassie system [7]. 2. Plan-driven, algorithmic program understanders or recognisers. Two examples of this type are the Programmer's Apprentice [28] and GRASPR [32]. 3. Model-driven, plausible reasoning systems. Examples of this type include DM-TAO [3], IRENE [17] and HB-CA [9, 10]. Biggerstaff et al. claim that systems using approaches 1 and 2 are good at completely deriving concepts within small-scale programs but cannot deal with large-scale programs ....
L. M. Wills. Automated Program Recognition by Graph Parsing. PhD Thesis, AI Lab, Massachusetts
....most graph grammar parsing algorithms proposed so far are either unable to recognize interesting languages of graphs or tend to be inefficient when applied to graphs with a large number of nodes and edges. Another problem is that nearly all known graph grammar parsing algorithms [7][8][17] deal only with context-free productions. A context-free grammar requires that only a single non-terminal is allowed on the left-hand side of a production [19]. A context-sensitive graph grammar, on the other hand, allows left-hand and right-hand graphs of a production to have an arbitrary number of ....
L.M. Wills, Automated program recognition by graph parsing, PhD Thesis, MIT AI Lab, Cambridge, Massachusetts, Technical Report 1358, 1992.
....[LHB91, Zim90]. A powerful reverse engineering system may include knowledge-based techniques to cope with the complexity of program semantics and with the needed abstraction steps. Knowledge-based reverse engineering systems enable some kind of human-like program understanding [RiW90, Tra94a, Wil90, Wil92]. Usually, such systems are built around a complex knowledge base of programming concepts and constructs [Ric85, SKW85, RiW90]. This paper describes a knowledge-based reverse engineering system for the analysis of C programs and for the generation of their knowledge-based representation. ....
Wills, L.M., Automated Program Recognition by Graph Parsing, AI-TR 1358, MIT, 1992.
....constraint representation and processing, and the concurrent refinement modules that enhance the power of knowledge representation and processing. Programmed graph grammar parsing in the present approach is also related to the program understanding approach from the same Programmer's Apprentice [23,24] but extends parsing by taking into account heuristic knowledge. XRL also offers some of the facilities of the V language [18]. One conceptual difference between PA and the approach from this paper is the metaphor taken into account. As emphasized in [21], the current system is not directed ....
L.M. Wills, "Automated Program Recognition by Graph Parsing", PhD Thesis, (Also AI-TR 1358), MIT, 1992.
....which nodes represent operations and edges show the control and data flows between them. The explicit control and data flow representation abstracts away from the syntactic details of a program. This plan formalism is used in developing a program understanding tool called the Recognizer [Wills87, Wills92]. The Recognizer is a prototype that automatically finds all occurrences of a given set of commonly used data structures and algorithms, called cliches, in a program. It builds a hierarchical description of the program in terms of the cliches it finds and gives an informal description of the ....
L. M. Wills, "Automated program recognition by graph parsing," Tech. Rep. AI-TR-1358, MIT-AILAB, July 1992.
....benefit from pointer analysis. Abstract data type identification techniques [8][9][31] group together routines based on their signature and their accesses to global variables. If accesses are made through pointers, some of them may be missed. Classical plan recognition techniques [33][38][49] have usually been presented as applied to languages without pointers, such as Cobol [33] and Lisp [38][49]. Plans [33] consist of components and constraints, and constraints may involve flow dependences. For languages like C, in the presence of pointers, points-to results are needed to verify data ....
....routines based on their signature and their accesses to global variables. If accesses are made through pointers, some of them may be missed. Classical plan recognition techniques [33][38][49] have usually been presented as applied to languages without pointers, such as Cobol [33] and Lisp [38][49]. Plans [33] consist of components and constraints, and constraints may involve flow dependences. For languages like C, in the presence of pointers, points-to results are needed to verify data dependences involving locations accessed through pointers. Architecture recovery techniques may also use ....
L. Wills, "Automated program recognition by graph parsing". PhD Dissertation, MIT, 1992.
....connector. Obviously many slightly different code fragments may be matched by a specific cliché. Hence some general features have to be recognized within source code to signal that a certain cliché is being implemented by that code. Moreover, the statements forming a cliché are usually spread [17, 18], i.e. they are not necessarily close to each other nor even in the same procedure. In the program concept recognition research community, clichés were originally developed using a plan representation [1, 2], where a plan is a collection of components and constraints. Fig. 4 shows a plan ....
L. M. Wills, "Automated Program Recognition by Graph Parsing", Technical Report 1358, MIT Artificial Intelligence Lab., PhD Thesis, July 1992.
.... of the approaches explored in the area of program understanding is trying to identify instances of program concepts at the algorithmic and data structure level using plans, which are abstract representations embodying the knowledge about program concepts, their components and constraints [22,27,40,46]. Other approaches investigated structural analysis and in-the-large re-documentation of software systems, by building tools that support methods for identifying, re-organizing, and documenting layered subsystem hierarchies [7,10,23,47]. Recently, some works that address the problem of ....
....is usually based on recognizing aggregates of statements in the source, which exhibit specific relations, that may be syntactic or may involve control and data flow. Such source code structures represent standard or stereotypical ways to implement algorithmic computations or data structures [46]. These aggregates are called clichés and represent a sort of knowledge about problem solving. Examples of clichés are list enumeration, binary search, or sorted list and priority queues. Clichés implementing more abstract structures may be composed of other clichés: the overall result of ....
[Article contains additional citation context not shown here]
L. Wills. Automated program recognition by graph parsing. PhD Dissertation, MIT, 1992.
....Reengineering, System renovation, Patterns, Plans, Clichés, Native pattern language, Year 2000 problem, Leap year problem. 1 Introduction Recognition of problematic code is a prerequisite to reengineering it. Therefore, much research has been carried out in the area of pattern recognition. See [25] for an overview. It is well known that real-world problems do not have the habit of residing in one location. This makes recognition using lexical technology not always an optimal methodology, but see [20] for generation of source model extractors from lexical specifications. A powerful approach ....
L. M. Wills. Automated Program Recognition by Graph Parsing. PhD thesis, MIT, 1992.
....and diverse reverse engineering contributions we encountered similar scaffolding of the underlying ASTs of source programs with standard information like control flow and data flow information, but not of the source programs themselves. See for instance the literature on program plan recognition [42, 43, 44]. We do not scaffold the AST, but we scaffold the source text, then we parse the resulting scaffolded code. After a (small) transformation step the code and the scaffolding can be unparsed, inspected and/or modified if necessary. Moreover, lexical tools and humans can easily add scaffolding to ....
L. M. Wills. Automated Program Recognition by Graph Parsing. PhD thesis, MIT, 1992.
....of problematic code. For example, for Year 2000 related remediation, it is necessary to accurately identify exposures to faulty Year 2000 reasoning in large software systems. The need to recognize such code has resulted in an ongoing interest in software pattern recognition in both the academic [54, 55, 57] and industrial reengineering communities. See [23, 31] for an overview of the important players in the Year 2000 arena, for instance, TechForce or Reasoning. It is well known that real-world problems do not have the habit of residing in one location. This makes recognition using lexical technology ....
....Complexity Many papers on pattern matching focus on other issues rather than the design of the accompanying language. One central issue is the complexity of the algorithms used. The matching algorithm underlying the so-called constraint satisfaction problem as used in [17] is known to be NP-hard [24, 54, 58]. We will give some information on complexity issues, as well. In our case, since we omitted a formal definition of patterns, we cannot elaborate on the associated complexity issues in depth. We give some pointers though. First we emphasize that the constraint satisfaction problem is a different ....
L.M. Wills. Automated Program Recognition by Graph Parsing. PhD thesis, MIT, 1992.
....approaches to the problems of window organization and object layout. 6 Related and Further Research The majority of work on reverse engineering has focused on parsing the system code in various types of call graphs [10] and based on them, on extracting higher level, abstract structures [4]. While these methods were purely syntactic originally, more recently they have incorporated application-specific information such as a designer's model of the system design [11] or a domain object model [2]. To our knowledge, there has been no work that uses the traces of the dynamic interaction ....
Wills, L., Automated Program Recognition by Graph Parsing, MIT-AI-TR 1358, July 1992.
....different search methods, or to understand how the addition or deletion of certain types of domain specific knowledge may affect performance. I am unaware of concrete examples or experiments which might suggest that these approaches might scale up for specific uses in large sources. However, both [Wills, 1992], and [Quilici, 1994] present empirical results promising in identifying partial mappings from sources of up to 1,000 lines to a small library of program plans. 1.3 My Approach to Modeling Program Understanding This work is part of a research effort structured towards: 1) unifying previous ....
....as provided by Rigi [Muller et al. 1994]. I understand a visualization tool of this kind to form the basis of a code understanding decision support system in which automated program understanding tools may be embedded. 1.5.2 Software Repository In accordance with the work upon which I build [Wills, 1992, Quilici, 1994, Kozaczynski and Ning, 1994], the existence of a software repository is assumed from which program plans of a domain-specific or domain-independent nature are situated. Such a repository could be populated through the use of existing commercial class and template libraries in ....
[Article contains additional citation context not shown here]
L. M. Wills. Automated program recognition by Graph Parsing. PhD thesis, MIT, July 1992.
....to annotate code makes this approach unworkable for large software systems. 2.5.2 Finding higher level abstractions Automated program recognition systems parse a program and try to match the code to known clichés to gain an understanding of the program [Letovsky, 1988, Wills, 1990, Lutz, 1992, Wills, 1992]. These systems are still in their infancy and only handle short lengths of code. However, it is conceivable that they will evolve to handle larger systems in the future. Once such a system understands a program, it can in principle decide the best visualization forms to present and assume the ....
Wills, L. M. (1992). Automated Program Recognition by Graph Parsing. Technical Report AI-TR 1358, MIT Artificial Intelligence Laboratory.
.... translator generated the following code: void HORNER (tagSAPReg Reg) Tstm(14,12, Reg[13] ucp 12) Reg) Reg[12] sw = Reg[15] sw ; Reg[7] pv = COEF[0] Reg[5] sw = sWord )Reg[7] ucp; Reg[9] sw = 0 ; LOOP: if ( Reg[9] sw) Reg[2] sw) goto OUT; Reg[9] sw = 1; Reg[7] sw = 4; Tmult( Reg[4],Reg[3] sw) Reg[5] sw = sWord ) Reg[7] ucp) goto LOOP ; OUT: Reg[0] sw = Reg[5] sw Tlm(1,12, Reg[13] ucp 24) Reg) return; As can be seen from this example, the simulating translator relies on an array of a union type to simulate the assembly registers, and literally translates ....
.... that intervenes between the setting of the condition code by the compare (CH) instruction and its subsequent use by the branch instructions (BH, BNH) CH R7,0(R5,R4) SRL R2,1 BH CGADD BNH CGSUB The simulating translator generates the following code for this fragment: if (Reg[7] sw = SH(Reg[4].ucp Reg[5] sw) CC = CZero; else if (Reg[7] sw SH(Reg[4] ucp Reg[5] sw) CC = COne; else CC = CTwo; Reg[2] uw = 1; if (CC 0x4) goto CGADD; if (CC 0x3) goto CGSUB; Bogart, in contrast, analyses the data flow of the condition code and can therefore generate the following code: r2sh = 1; ....
[Article contains additional citation context not shown here]
L. M. Wills, "Automated program recognition by graph parsing," Technical Report 1358, MIT Artificial Intelligence Lab., July 1992. PhD thesis.
.... locating all instances of a hypothesized program plan (such as a schema augmented with data and control constraints) in an internal representation of program source code (such as an AST augmented with data and control flow information). Unfortunately, this process is exponential in the worst case [2, 3, 6, 13] and has been proven to be NP-hard [16]. In particular, recognizing instances of a particular program plan in a given source code is O(S^A), where S is the size of the source and A is the number of plan actions. It is therefore an open question whether it is possible to develop efficient ....
.... for plan instances on a group of similarly sized programs for performing a particular task [3, 5]. When earlier studies have taken efficiency into account, it has been to identify what factors, such as constraint ordering and constraint strength, are critical to the performance of the recognizer [6, 13]. The actual performance of various plan recognizers, however, has previously been limited to anecdotal discussions. Our work appears to be unique in its focus on experiments that enable us to draw conclusions about how well these plan recognition algorithms scale. 9 Conclusion Despite program ....
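The worst-case bound quoted in the context above can be illustrated with a toy brute-force matcher. This is only a sketch under simplifying assumptions and is not the algorithm of any cited system: statements are modelled as bare kind labels, and the names match_plan, source, plan and in_order are hypothetical. It tries every way of binding A plan actions to S source statements, so it examines S^A candidate bindings.

```python
from itertools import product


def match_plan(statements, plan_actions, constraints):
    """Exhaustively bind each plan action to one source statement.

    statements   -- list of statement kinds, one per source statement (toy model)
    plan_actions -- list of statement kinds that the plan requires
    constraints  -- predicates over a binding (a tuple of statement indices)
    Returns (matches, examined), where examined counts candidate bindings.
    """
    matches = []
    examined = 0
    # Every action may be bound to any statement: S ** A candidate bindings.
    for binding in product(range(len(statements)), repeat=len(plan_actions)):
        examined += 1
        kinds_agree = all(
            statements[i] == action for i, action in zip(binding, plan_actions)
        )
        if kinds_agree and all(check(binding) for check in constraints):
            matches.append(binding)
    return matches, examined


# Hypothetical five-statement source and three-action plan.
source = ["init", "loop", "accumulate", "print", "init"]
plan = ["init", "loop", "accumulate"]
in_order = lambda b: b[0] < b[1] < b[2]  # stand-in for a control-flow constraint

matches, examined = match_plan(source, plan, [in_order])
print(matches, examined)  # prints [(0, 1, 2)] 125
```

With S = 5 and A = 3 the loop visits 5^3 = 125 bindings to find a single match, which is why constraint ordering and pruning matter for recognizers of this kind.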
[Article contains additional citation context not shown here]
L. M. Wills. Automated Program Recognition By Graph Parsing. PhD thesis, MIT, July 1992.
....understanding. In this direction, an explicit library of programming plan templates and concepts is constructed, and various top-down and bottom-up search strategies are utilized to implement the mapping process. Notable examples are Quilici[7], Kozaczynski and Ning[3], Rich and Waters[8], and Wills[10]. To some extent, all are aimed at improving the effectiveness of the mapping process through heuristic knowledge. Our work stems directly from the aforementioned approaches. Our observation is that different pieces of the source code are related to each other by various constraints, and the ....
....problem as constraint satisfaction, and applying known constraint satisfaction algorithms. In our experiment, we haven't utilized the full range of constraints inherent in a program source code, such as those derived from program parsing, a technique employed by Kozaczynski and Ning[3] and Wills[10]. More extensive consideration is given to the specific use of these constraints in [11]. We expect the empirical results to improve further with use of these constraints. Usability We envision our system as one part of a programmer's assistant toolset. For the MAP CSP problem, a programmer could ....
L. M. Wills. Automated program recognition by Graph Parsing. AI Laboratory Technical Report 1358, MIT, 1992.
....In this direction, an explicit library of programming plan templates and concepts is constructed, and various top-down and bottom-up search strategies are utilized to implement the mapping process. Notable examples are Quilici[22], Kozaczynski and Ning[11], Rich and Waters[30], and Wills[36, 37]. To some extent, all are aimed at improving the effectiveness of the mapping process through heuristic knowledge. The basis for such heuristic approaches has been the assumed intractability of the complete understanding problem in general. In [47] not only is program understanding shown to be ....
....approach, or to understand how the addition or deletion of certain types of domain-specific knowledge may affect performance. We are unaware of concrete examples or experiments which might suggest that these approaches might scale up for specific uses in large sources. One exception might be Wills[37], who presents empirical results which seem promising in identifying partial mappings of reasonably sized legacy sources to a library of program plans. The work presented in this paper is part of the initial phase of work focused on demonstrating that an effective approach to partial program ....
[Article contains additional citation context not shown here]
L. M. Wills. Automated program recognition by Graph Parsing. PhD thesis, MIT, July 1992.
....world. This is justified by the fact that programmers generally use only a limited number of stereotyped combinations of maps to describe UIs. Matching program plans against clichés has been discussed in [Kozaczynski and Ning, 1989], [Rich and Waters, 1990], [Hartman, 1991] and [Wills, 1992]. Our approach would follow the same philosophy, but apply it in a different domain. Another interesting direction is that of introducing new design concepts which are totally absent from the character-oriented design, such as multi-thread dialogues and concurrency. Another important, practical ....
Wills, L. M. 1992. Automated program recognition by graph parsing. Technical Report 1358, MIT Artificial Intelligence Laboratory.
.... Chapter 1 A Constraint Satisfaction Framework for Evaluating Program Understanding Algorithms Introduction Over the past decade, researchers have proposed and implemented a wide variety of plan-based program understanding algorithms (Quilici 1994, Kozaczynski & Ning 1994, Kozaczynski & Ning 1992, Wills 1992, Wills 1990, Hartman 1991, Johnson 1986). While some of these research efforts have presented promising empirical results in mapping plan libraries to reasonably sized (up to 1000 lines) legacy source code (Wills 1992, Quilici & Chin 1995, Woods & Yang 1995b), none have been clearly ....
.... algorithms (Quilici 1994, Kozaczynski & Ning 1994, Kozaczynski & Ning 1992, Wills 1992, Wills 1990, Hartman 1991, Johnson 1986). While some of these research efforts have presented promising empirical results in mapping plan libraries to reasonably sized (up to 1000 lines) legacy source code (Wills 1992, Quilici & Chin 1995, Woods & Yang 1995b), none have been clearly demonstrated either analytically or empirically as scaling up for use in understanding real-world legacy systems. In addition, little work has been done in comparing the relative performance of these approaches or analyzing in ....
[Article contains additional citation context not shown here]
Wills, L. M. (1992), Automated program recognition by Graph Parsing, PhD thesis, MIT.
....Template Library contains frequently used data structures and algorithms. Such components are called clichés. Lists, trees, search and sort algorithms, for example, may all be called clichés as they are frequently used in all kinds of applications. Similar definitions can also be found in [30, 50]. Since the intention of this thesis was the replacement of only those clichés in the original source code where an equivalent implementation could be found in the Standard Template Library, in this thesis the term cliché will only be used for data structures and algorithms that have been ....
....iteratively the user gives input to the application in order to influence the recognition process. LaSSIE [15], Desire [6]. • Plan-driven: This class of tools is based on patterns, which are specified in a particular pattern language. LSME [31], SCRUPLE [37], TAWK [18], GRASPR [50], PAT [19], Programmer's Apprentice [40], Proust, Talus [35]. Since plan-driven approaches are the most common ones, this thesis only deals with this kind of tools to reflect the state of the art. In the following, we will discuss them. 2.1 Plan-driven Tools The ....
[Article contains additional citation context not shown here]
L.M. Wills, Automated Program Recognition by Graph Parsing, Massachusetts Institute of Technology, Jul. 1992.
....of certain abstract syntax elements. At a very low level of abstraction, Paul & Prakash (1993) describe a similar recognition approach based upon matching syntactic clichés with source constructs in legacy abstract syntax trees annotated with semantic relations. Wills (Rich & Waters 1990, Wills 1990, Wills 1992a, Wills 1992b) outlined an approach to recognition in which stereotypical program or data structures known as clichés are represented as a type of graph grammar. A source program is translated into an intermediate representation as a flow graph. These flow graphs are parsed so as to ....
.... the existence of adequate blocking or slicing algorithms yielding either simple structural functions derived from the program's abstract syntax tree supplemented with semantic information such as call calling, control flow, and data flow relationships (Kozaczynski & Ning 1992, Quilici 1994, Wills 1990). More complex approaches that attempt to derive related code portions based on additional information such as dynamic program flow (execution) traces, other sophisticated analysis such as similarity analysis (Schwanke & Hanson 1994) or relatedness measures in problem decomposition (Yang 1995) ....
[Article contains additional citation context not shown here]
Wills, L. M. (1992b), Automated program recognition by Graph Parsing, PhD thesis, MIT. WWW available as http://www.cc.gatech.edu/cogsci/faculty/wills/phd-thesis.html.
[Article contains additional citation context not shown here]
Wills, L. M. (1992a), Automated program recognition by Graph Parsing, AI Laboratory Technical Report 1358, MIT.
....Canfora, and Cimitile [Canfora94][Canfora92][Cimitile96] describe efficient algorithms for analyzing the control and data flow in order to identify binding conditions on program variables. Versions of programs with altered binding conditions become candidates for portability failures. In [Wills93], a program understanding system that uses attributed data flow sub-graphs to represent programs and programming plans is presented. Comparison is performed by matching sub-graphs and by checking constraints involving control dependencies and other program attributes. In [Gra92] a program ....
Wills, L.M., "Automated Program Recognition by Graph Parsing", MIT Technical Report 1358, MIT, AI Laboratory, 1993
....and get the maintainer's input. Modularization techniques like the ones used in this paper can produce static architectural views. Informal information available in identifiers and comments [Bigg93] can serve as a heuristic to choose between alternative component decompositions. Plan recognition [Will92] could be used to identify the implementation of algorithms typical of an architectural style [Perr92]. A few teams have already started to work on reverse engineering to the architectural level [Gall96, Tone96]. Harris et al. [Harr96] and Fiutem et al. [Fiut96] use clichés to recognize ....
L. M. Wills. Automated Program Recognition by Graph Parsing. Technical Report 1358, MIT Artificial Intelligence Laboratory, July 1992.
.... 45] in this case tools such as symbolic executors [47, 84] and theorem provers [4, 19, 110] can be required to map the specification onto the code) or can be given in terms of a set of test cases capturing the behaviour of the abstraction [71, 134, 135] or can be encapsulated in a knowledge base [2, 85, 90, 109, 113, 137] in terms of a library of programming plans [124] or clichés [115]. The first two families of methods are also called mass methods, because they are applied to one or more systems and produce a large set of candidate modules. In the following we will refer to METMOD and METTYP methods as ....
.... in a formal way [29, 45] or as a set of test cases capturing the behaviour of the abstraction [71, 134, 135]. A widely used approach is to build up a library of programming plans [124] or clichés [115] (each of which corresponds to an abstraction) and to search for their instances in code [2, 85, 90, 109, 113, 137]. In general, a specification-driven method consists in instantiating a model (P, CF, S, sf : P × S → 2^CF) where: • P is a set of programs. • CF is a set of code fragments of programs in P. • S is a set of specifications. • sf is a selection function. Given a program p ∈ P and ....
[Article contains additional citation context not shown here]
L. Wills, "Automated program recognition by graph parsing", Ph.D. Thesis, MIT, Cambridge, Massachusetts, U.S.A., 1992.
....In this direction, an explicit library of programming plan templates and concepts is constructed, and various top-down and bottom-up search strategies are utilized to implement the mapping process. Notable examples are Quilici[18], Kozaczynski and Ning[8], Rich and Waters[23], and Wills[27, 28]. To some extent, all are aimed at improving the effectiveness of the mapping process through heuristic knowledge. In Figure 2 a subset of expert knowledge about a particular application domain is represented in a fragment of a hierarchical library of program templates. One possible mapping is ....
.... [Figure 2: Conceptualizing source with a plan library.] examples or experiments which might suggest that these approaches might scale up for specific uses in large sources. One exception might be Wills[28], who presents empirical results promising in identifying partial mappings of reasonably sized legacy sources to a library of program plans. The work presented in this paper is part of the initial phase of work focused on demonstrating that an effective approach to partial program understanding is ....
[Article contains additional citation context not shown here]
L. M. Wills. Automated program recognition by Graph Parsing. PhD thesis, MIT, July 1992.
....architecture of the plan recognition system. The number of plans needed and the completeness of the resulting plans depends on the recognition technology used. Simply using an AST annotated with flow representation allows us to ignore differences in variation in the order of divisions and tests [19]. In addition, simple expression simplification and reordering techniques (e.g. always using less than for comparisons rather than greater than, treating nested IFs as ANDs, simplifying negated conditions by switching the IF and ELSE branches, and so on) allow us to ignore many other variations. ....
L. M. Wills. Automated Program Recognition by Graph Parsing. PhD thesis, MIT, 1992.
....more general algorithms for parsing the expression DAGs. 4 Algorithm Recognition The internal representation used by the Convex compilers lends itself nicely to conversion to the format used by the Programmer's Apprentice project for plans. We plan to implement this conversion and use Wills' [Will92] graph parsing approach to identify algorithms in the user code. We also plan to explore parallel implementations of graph parsing algorithms. Our experience with the Application Compiler has shown us that compiler users are willing to work with a different compilation paradigm if it has the ....
L. Wills, "Automated Program Recognition by Graph Parsing", Ph.D. Thesis, A.I. Technical Report 1358, MIT Artificial Intelligence Lab., 1992.
....In this direction, an explicit library of programming plan templates and concepts is constructed, and various top-down and bottom-up search strategies are utilized to implement the mapping process. Notable examples are Quilici[21], Kozaczynski and Ning[10], Rich and Waters[26], and Wills[31, 32]. To some extent, all are aimed at improving the effectiveness of the mapping process through heuristic knowledge. The basis for such heuristic approaches has been the assumed intractability of the complete understanding problem in general. In [40] not only is program understanding shown to be ....
....approach, or to understand how the addition or deletion of certain types of domain-specific knowledge may affect performance. We are unaware of concrete examples or experiments which might suggest that these approaches might scale up for specific uses in large sources. One exception might be Wills[32], who presents empirical results which seem promising in identifying partial mappings of reasonably sized legacy sources to a library of program plans. The work presented in this paper is part of the initial phase of work focused on demonstrating that an effective approach to partial program ....
[Article contains additional citation context not shown here]
L. M. Wills. Automated program recognition by Graph Parsing. PhD thesis, MIT, July 1992.
....all graph grammar parsing algorithms presented so far are either unable to recognize interesting languages of graphs or tend to be hopelessly inefficient when applied to graphs with a large number of nodes and edges. Another problem is that nearly all known graph grammar parsing algorithms [2,3,4,5,6,7] deal only with context free productions. This makes them difficult to specify a large portion of VPLs. A context sensitive graph grammar, on the other hand, allows left and right side graphs of a production to have arbitrary number of nodes and edges, like a layered graph grammar proposed ....
L.M. Wills, Automated Program Recognition by Graph Parsing, PhD Thesis, MIT Artificial Intelligence Laboratory, Cambridge, Massachusetts, Technical Report 1358, 1992.
....the code, or 3. replacing understood code portions with generic application code or calls to other code libraries. There have been a variety of methods proposed which partially solve the program understanding problem, primarily as parts of a supposed interactive assistant or maintenance toolset[11, 13, 14, 5, 9, 10]. Each of these approaches attempts to integrate perceptions or recognitions of particular abstracted program plan templates into an overall understanding of the source in terms of a particularly configured library of pre existing knowledge about how programs (in general or in a particular domain) ....
....is observed in the source code. An example of an inference test is also shown in Figure 1, where the existence of loop initialize string is inferred when an instance of loop through character array is near a related instance of copy character in the source code. In a different approach, Wills [11, 13, 14] models stereotypical program or data structures (cliches) as a type of flow graph grammar and parses 2 legacy source represented as a flow graph. Each successful partial parse represents one explanation of part of the source program. 3 Complexity Analysis Our survey of the approaches to ....
L. M. Wills. Automated program recognition by Graph Parsing. PhD thesis, MIT, July 1992.
....program's code and design. 1 Introduction The standard goal of most program understanding efforts is a tool that takes source code and extracts all of its underlying design. The standard approach to this design extraction is to try to recognize the instances of a library of known code patterns [7, 19, 5, 16, 4, 13, 8, 11, 6]. Unfortunately, there are two fundamental problems with trying to apply this approach to large, real world legacy systems: 1. This approach requires enormous libraries of code patterns, since each new domain requires its own set of domain specific patterns. 2. This approach is doomed to ....
....mechanism for automatically extracting some of this object oriented design from the source code. Significant technical progress has been made in the area of program understanding in dealing with specific problems such as code variability and in finding algorithms for locating patterns in the code [7, 19, 20, 5, 8, 11, 6]. These previous approaches, however, have made two key assumptions: 1. The library of code patterns needed to automatically extract a program's design is small. 2. The knowledge base describing a program's design that results from this process is static. Neither of these assumptions holds in ....
L.M. Wills. "Automated Program Recognition by Graph Parsing", Ph.D. Thesis, Technical Report 1358, MIT Artificial Intelligence Lab, Cambridge, MA, 1992.
....desired set of goals, in which case there might be some techniques for modifying the plans and/or combining more than one plan to achieve the objectives. Once such plans are identified, they are connected to the actual program code fragments. The second approach, known as the bottom up approach [15, 26, 27], also has a library of programming plans but in this case the analysis starts with the actual program and tries to find out which of the plans it matches. From these matched plans, the system infers higher level goals of the program being reverse engineered. In both these approaches, we need to ....
Wills, L. Automated Program Recognition by Graph Parsing. AI Memo no. 634, Ph.D. Thesis, The Artificial Intelligence Laboratory, Massachusetts Institute of Technology, 545 Technology Square, Cambridge, MA 02139, 1992.
....all graph grammar parsing algorithms suggested up to now are either unable to recognize interesting languages of graphs, or tend to be hopelessly inefficient when applied to graphs with more than a few dozen nodes and edges 1 . Even worse, all currently known graph grammar parsing algorithms [10, 16, 12, 21, 3, 22, 7] deal with context free productions only (where the left hand side is a single non terminal node). This might be sufficient from the theoretical point of view 2 . But in practice it would be quite useful to allow arbitrary graphs in the left hand side of a production, which might even share a ....
....algorithm 2 by the less strict equality: V (G c ) V (G) E(G c ) E(G) We will not discuss these examples however, as this section is too long already. 6 Related work Up to now, only a handful of proposals are published on how to parse graph like data structures generated by graph grammars [10, 16, 12, 21] or related formalisms like plex grammars [3], relational fringe grammars [22], or picture layout grammars [7]. These approaches fall into two classes with respect to the overall organization of the parsing algorithm. On one side, we have Earley style [5] approaches [3, 22] which start at a single ....
[Article contains additional citation context not shown here]
L.M. Wills. Automated Program Recognition by Graph Parsing. PhD thesis, MIT Artificial Intelligence Laboratory, Cambridge, Massachusetts, 1992. Technical Report 1358.
....CSP framework. This initial work shows that the CSP approach compares favorably with at least several of these existing algorithms (Quilici and Woods, 1997). There are, however, a wide variety of different program understanding algorithms that we haven't yet explored, such as those used by GRASPR (Wills, 1992; Wills, 1990). In addition, we have also now found a more effective constraint evaluation algorithm, which suggests that we should redo these earlier efforts. 6.2. Alternatives to H MAP CSP H MAP CSP is just one approach to program understanding, and it may not be the most efficient way to make ....
Wills, L. M. (1992). Automated program recognition by Graph Parsing. PhD thesis, MIT, Department of Computer Science.
....for understanding legacy, i.e. sequential applications. We refer the interested reader to Appendix A for a thorough investigation and evaluation of some of the techniques employed in sequential and distributed applications. A common feature for several techniques is the use of a plan library [3, 18, 25, 34, 35, 36, 37, 38]. We use a pattern library, similar to a plan library, to search for patterns. Most of the techniques that employ plan libraries use the library to search for programming concepts in the source code, i.e. they perform a static analysis of the application [22, 23, 24]. We search for patterns ....
....Apprentice [26]. This tool should provide support during specification, design, implementation and program analysis. The part of the Apprentice responsible for design recovery is the Recognizer. Wills describes this Recognizer, also called the Graph-based System for Program Recognition (GRASPR) [34], which when given a library of cliches, finds all occurrences in a program and builds a hierarchical description of the program in terms of the cliches it finds. There are a number of steps GRASPR takes when recognizing cliches as shown in Figure A.3. First, GRASPR translates the program into a ....
Linda M. Wills. Automated program recognition by graph parsing. Technical Report 1358, MIT Artificial Intelligence Lab., July 1992. Ph.D. Thesis.
....to modify the knowledge base (programming patterns) as well as to use inferential services by asking questions. Various representations of programming knowledge and system models as well as inferential features influenced the development of different knowledge based software analysers: • (Wills 1992) studies a graph parsing approach to automating program recognition in which programs are represented as attributed dataflow graphs and a library of cliches is encoded as an attributed grammar. A graph parsing algorithm is used to recognise cliches in the code. • (Quilici 1994) represents ....
Wills, L. M. (1992), Automated Program Recognition by Graph Parsing, PhD thesis, MIT.
No context found.
L. Wills. Automated program recognition by graph parsing. Technical Report 1358, MIT Artificial Intelligence Lab., July 1992. PhD Thesis.
....recognition, provides a short cut to understanding a program's design. It bypasses complex reasoning about how behaviors and properties arise from certain combinations of language primitives. Several researchers have shown the feasibility and usefulness of automating recognition, most recently [1, 9, 10, 12, 15, 16, 24, 25]. A primary motivation for automating recognition is to facilitate tasks requiring program understanding, such as maintaining, debugging, and reusing software. We have developed an experimental recognition system, called GRASPR (GRAph-based System for Program Recognition) [25] to automate ....
....10, 12, 15, 16, 24, 25]. A primary motivation for automating recognition is to facilitate tasks requiring program understanding, such as maintaining, debugging, and reusing software. We have developed an experimental recognition system, called GRASPR (GRAph-based System for Program Recognition) [25], to automate program recognition. Given a program and a library of clichés, GRASPR finds all instances of the clichés in a program. 1 It can generate multiple views of a program as well as near miss recognitions of clichés. It can also recognize clichés in programs even if they are ....
[Article contains additional citation context not shown here]
L. Wills. Automated program recognition by graph parsing. Technical Report 1358, MIT Artificial Intelligence Lab., July 1992. PhD Thesis.
....Linda M. Wills College of Computing Georgia Institute of Technology Atlanta, Georgia 30332 0280 linda cc.gatech.edu http: www.cc. gatech.edu aimosaic faculty wills.html My research 1 focuses on program recognition techniques for automating software understanding [Rich and Wills, 1990; Wills, 1990; 1992; 1993]. I have developed an automated technique for recognizing standard programming plans (or clichés) in program, in which a graph grammar formalism is used for representing programs and clichés. Attributed graph parsing is used to automate program recognition [Wills, 1992; 1994] ....
....1990; Wills, 1990; 1992; 1993]. I have developed an automated technique for recognizing standard programming plans (or clichés) in program, in which a graph grammar formalism is used for representing programs and clichés. Attributed graph parsing is used to automate program recognition [Wills, 1992; 1994]. Recently, I have been collaborating with researchers at NASA Ames to apply software understanding ideas to assisting component based software synthesis. NASA Ames researchers have developed an automated synthesis system, called Amphion, which uses formal methods to generate Fortran ....
L. Wills. Automated program recognition by graph parsing. Technical Report 1358, MIT Artificial Intelligence Lab., July 1992. PhD Thesis.
....definitions in [18, 28, 34]. Note that a plan is not necessarily stereotypical or used repeatedly; it may be novel or idiosyncratic. Following [28, 34] we reserve the term cliché for a plan that represents a standard, stereotypical form, which can be detected by recognition techniques, such as [12, 17, 16, 25, 29, 40]. SUBROUTINE NPEDLN ( A, B, C, LINEPT, LINEDR, PNEAR, DIST ) C IF ( RETURN ( THEN RETURN ELSE CALL CHKIN ( NPEDLN ) END IF C CALL UNORM ( LINEDR, UDIR, MAG ) IF ( MAG .EQ. 0 ) THEN CALL SETMSG( Line direction vector is the zero vector. CALL SIGERR( SPICE(ZEROVECTOR) CALL CHKOUT( ....
....Techniques for detecting interleaving and disentangling interleaved plans are likely to build on existing program comprehension and maintenance techniques. 5.1 The Role of Recognition When what is interleaved is familiar (i.e. stereotypical, frequently used plans), cliché recognition (e.g. [12, 15, 16, 17, 25, 29, 40]) is a useful detection mechanism. 3 In fact, most recognition systems deal explicitly with the recognition of clichés that are interleaved in specific ways with unrecognizable code or other clichés. One of the key features of GRASPR [40], for instance, is its ability to deal with ....
[Article contains additional citation context not shown here]
L. Wills. Automated program recognition by graph parsing. Technical Report 1358, MIT Artificial Intelligence Lab., July 1992. PhD Thesis.
....an efficient means of reconstructing and understanding a program's design. It bypasses complex reasoning about how behaviors and properties arise from certain combinations of language primitives. Several researchers have shown the feasibility and usefulness of automating recognition, most recently [9, 10, 12, 13, 20, 25, 26]. A primary motivation for automating recognition is to facilitate tasks requiring program understanding, such as maintaining, debugging, and reusing software. We have developed an experimental recognition system, called GRASPR (GRAph-based System for Program Recognition) [26] to automate ....
....10, 12, 13, 20, 25, 26]. A primary motivation for automating recognition is to facilitate tasks requiring program understanding, such as maintaining, debugging, and reusing software. We have developed an experimental recognition system, called GRASPR (GRAph-based System for Program Recognition) [26], to automate program recognition. Given a program and a library of clichés, GRASPR finds all instances of the clichés in a program. It can generate multiple views of a program as well as near miss recognitions of clichés. It can also recognize clichés in programs even if they are surrounded ....
[Article contains additional citation context not shown here]
L. Wills. Automated program recognition by graph parsing. Technical Report 1358, MIT Artificial Intelligence Lab., July 1992. PhD Thesis.
....certain stereotypical constructs (called clichés [19]) and map them back to a domain concept (such as an ellipsoid or an orthogonal projection) then we can use our knowledge of geometry to further understand and describe a program. This requires the use of cliché recognition techniques (e.g. [12, 17, 26]). It also raises the question of describing the domain itself in Refine, and consequently being able to reason about it. We would also like to look at architectural issues. In particular, the analyses we performed were fairly low level, and, in fact, the SPICELIB software itself has a fairly ....
L. Wills. Automated program recognition by graph parsing. Technical Report 1358, MIT Artificial Intelligence Lab., July 1992. PhD Thesis.
.... The third dimension is the familiarity of the plans interleaved: are they clichés (i.e. stereotypical, frequently used plans) or are they unfamiliar plans (i.e. novel, idiosyncratic, or not used repeatedly)? When what is interleaved is familiar (i.e. a cliché), cliché recognition (e.g. [10, 12, 13, 14, 18, 28]) is a useful detection mechanism. 3 In fact, most recognition systems deal explicitly with the recognition of clichés that are interleaved in specific ways with unrecognizable code or other clichés. One of the key features of GRASPR [28], for instance, is its ability to deal with ....
....cliché recognition (e.g. [10, 12, 13, 14, 18, 28]) is a useful detection mechanism. 3 In fact, most recognition systems deal explicitly with the recognition of clichés that are interleaved in specific ways with unrecognizable code or other clichés. One of the key features of GRASPR [28], for instance, is its ability to deal with delocalization and redistribution type function sharing optimizations. KBEmacs [21, 26] uses a simple, special purpose recognition strategy to segment loops within programs. This is based on detecting coarse patterns of data and control flow at the ....
[Article contains additional citation context not shown here]
L. Wills. Automated program recognition by graph parsing. Technical Report 1358, MIT Artificial Intelligence Lab., July 1992. PhD Thesis.
....gains efficiency without permanently sacrificing completeness. Furthermore, the knowledge about the program, the requirements on recognition power, and the resources available typically change as the tasks are being performed. We have developed an experimental recognition system, called GRASPR [19], which when given a library of clichés, finds all instances of clichés in a program. It can generate multiple views of a program as well as near miss recognitions of clichés. It has a flexible, adaptable control structure that can accept advice and guidance from external agents. GRASPR is ....
....makes control and search strategies explicit. 1.1 Previous Recognition Work Several researchers have shown the feasibility of automating recognition and the usefulness of its results, most recently Bertels [1], Hartman [6], Johnson [8], Letovsky [10], Murray [12], Ning [13], and Wills [18]. See [19] for a more detailed description of these systems and earlier research in this area. All existing recognition systems are isolated, standalone systems which are not expected to interact with people or with other reverse engineering techniques. They all are committed to a rigid control strategy, ....
[Article contains additional citation context not shown here]
L. Wills. Automated program recognition by graph parsing. Technical Report 1358, MIT Artificial Intelligence Lab., July 1992. PhD Thesis.
No context found.
L. Wills. Automated program recognition by graph parsing. MIT Technical Report 1358, MIT, AI Laboratory, 1993.
No context found.
L. Wills. Automated program recognition by graph parsing. MIT Technical Report 1358, MIT, AI Laboratory, 1993.