A Survey on Software Clone Detection Research
- SCHOOL OF COMPUTING TR 2007-541, QUEEN’S UNIVERSITY
, 2007
Cited by 131 (17 self)
Code duplication, that is, copying a code fragment and reusing it by pasting with or without modifications, is a well-known code smell in software maintenance. Several studies show that about 5% to 20% of a software system can consist of duplicated code, which is basically the result of copying existing code fragments and reusing them by pasting with or without minor modifications. One of the major shortcomings of such duplicated fragments is that if a bug is detected in a code fragment, all the other fragments similar to it should be investigated to check for the possible existence of the same bug. Refactoring of the duplicated code is another prime issue in software maintenance, although several studies claim that refactoring certain clones is not desirable and that there is a risk in removing them. However, it is also widely agreed that clones should at least be detected.
In this paper, we survey the state of the art in clone detection research. First, we describe the clone terms commonly used in the literature along with their corresponding mappings to the commonly used clone types. Second, we provide a review of the existing clone taxonomies, detection approaches and experimental evaluations of clone detection tools. Applications of clone detection research to other domains of software engineering, and at the same time how other domains can assist clone detection research, are also pointed out. Finally, this paper concludes by pointing out several open problems related to clone detection research.
A survey and taxonomy of approaches for mining software repositories in the context of software evolution
, 2007
Tracking Code Clones in Evolving Software
Cited by 71 (1 self)
Code clones are generally considered harmful in software development, and the predominant approach is to try to eliminate them through refactoring. However, recent research has provided evidence that it may not always be practical, feasible, or cost-effective to eliminate certain clone groups. We propose a technique for tracking clones in evolving software. Our technique relies on the concept of abstract clone region descriptors (CRDs), which describe clone regions within methods in a robust way that is independent from the exact text of the clone region or its location in a file. We present our definition of CRDs, and describe a complete clone tracking system capable of producing CRDs from the output of a clone detection tool, notifying developers of modifications to clone regions, and supporting the simultaneous editing of clone regions. We report on two experiments and a case study conducted to assess the performance and usefulness of our approach.
Automatic Inference of Structural Changes for Matching Across Program Versions
, 2007
Cited by 48 (12 self)
Mapping code elements in one version of a program to corresponding code elements in another version is a fundamental building block for many software engineering tools. Existing tools that match code elements or identify structural changes (refactorings and API changes) between two versions of a program have two limitations that we overcome. First, existing tools cannot easily disambiguate among many potential matches or refactoring candidates. Second, it is difficult to use these tools' results for various software engineering tasks due to an unstructured representation of results. To overcome these limitations, our approach represents structural changes as a set of high-level change rules, automatically infers likely change rules and determines method-level matches based on the rules. By applying our tool to several open source projects, we show that our tool identifies matches that are difficult to find using other approaches and produces more concise results than other approaches. Our representation can serve as a better basis for other software engineering tools.
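The idea of determining method-level matches from high-level change rules can be illustrated with a small sketch. This is not the authors' tool: the `ChangeRule` class, its glob-style `scope`, and the `match_methods` helper are hypothetical names chosen for illustration, and the sketch only applies rules that are given to it, whereas the approach described above also infers the rules automatically.

```python
import fnmatch
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class ChangeRule:
    """Hypothetical rule: for every method name matching `scope`,
    replace the first occurrence of `old` with `new`."""
    scope: str  # glob pattern over fully qualified method names
    old: str
    new: str

    def apply(self, name: str) -> str:
        # Names outside the rule's scope are returned unchanged.
        if fnmatch.fnmatchcase(name, self.scope):
            return name.replace(self.old, self.new, 1)
        return name


def match_methods(old_methods: Iterable[str],
                  new_methods: Iterable[str],
                  rules: List[ChangeRule]) -> Dict[str, str]:
    """Map each old method to its counterpart in the new version,
    if applying the rules yields a name that exists there."""
    new_set = set(new_methods)
    matches = {}
    for method in old_methods:
        candidate = method
        for rule in rules:
            candidate = rule.apply(candidate)
        if candidate in new_set:
            matches[method] = candidate
    return matches


if __name__ == "__main__":
    # Assumed example: package "chart" was renamed to "plot".
    rule = ChangeRule(scope="chart.*", old="chart.", new="plot.")
    old = ["chart.Axis.draw", "chart.Legend.draw", "util.Log.write"]
    new = ["plot.Axis.draw", "plot.Legend.draw", "util.Log.write"]
    for before, after in match_methods(old, new, [rule]).items():
        print(before, "->", after)
```

A single rule here accounts for every renamed method in the package, which hints at why a rule-based representation can be more concise than a flat list of individual matches.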
Discovering and representing systematic code changes
- In ICSE ’09
, 2009
Cited by 40 (9 self)
Software engineers often inspect program differences when reviewing others' code changes, when writing check-in comments, or when determining why a program behaves differently from expected behavior after modification. Program differencing tools that support these tasks are limited in their ability to group related code changes or to detect potential inconsistencies in those changes. To overcome these limitations and to complement existing approaches, we built Logical Structural Diff (LSdiff), a tool that infers systematic structural differences as logic rules. LSdiff notes anomalies from systematic changes as exceptions to the logic rules. We conducted a focus group study with professional software engineers in a large E-commerce company; we also compared LSdiff’s results with textual differences and with structural differences without rules. Our evaluation suggests that LSdiff complements existing differencing tools by grouping code changes that form systematic change patterns regardless of their distribution throughout the code, and its ability to discover anomalies shows promise in detecting inconsistent changes.
Characterizing and understanding development sessions
- IN: PROCEEDINGS OF ICPC
, 2007
Cited by 21 (14 self)
The understanding of development sessions, the phases during which a developer actively modifies a software system, is a valuable asset for program comprehension, since the sessions directly impact the current state and future evolution of a software system. Such information is usually lost by state-of-the-art versioning systems, because of the checkin/checkout model they rely on: a developer must explicitly commit his changes to the repository. Since this happens at arbitrary and sometimes long intervals, recovering the changes between two commits is difficult and inaccurate, and recovering the order of the changes is impossible. We have implemented an evolution monitoring prototype which records every semantic change performed on a system, and is able to completely reconstruct development sessions. In this paper we use this fine-grained information to understand and characterize the development sessions as they were carried out on two object-oriented systems.
Clone Region Descriptors: Representing and Tracking Duplication in Source Code
Cited by 19 (0 self)
Source code duplication, commonly known as code cloning, is considered an obstacle to software maintenance because changes to a cloned region often require consistent changes to other regions of the source code. Research has provided evidence that the elimination of clones may not always be practical, feasible, or cost-effective. We present a clone management approach that describes clone regions in a robust way that is independent from the exact text of clone regions or their location in a file, and that provides support for tracking clones in evolving software. Our technique relies on the concept of abstract clone region descriptors (CRDs), which describe clone regions using a combination of their syntactic, structural, and lexical information. We present our definition of CRDs, and describe a clone tracking system capable of producing CRDs from the output of different clone detection tools, notifying developers of modifications to clone regions, and supporting updates to the documented clone relationships. We evaluated the performance and usefulness of our approach across three clone detection tools and five subject systems, and the results indicate that CRDs are a practical and robust representation for tracking code clones in evolving software.
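To make the notion of a location-independent descriptor more concrete, here is a minimal sketch. The field names (`class_name`, `method_signature`, `blocks`, `anchor`) and the `track` helper are assumptions made for illustration and are not taken from the paper's definition of CRDs; the point is only that a region is identified by where it sits in the program structure, not by line numbers or exact text.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class BlockDescriptor:
    kind: str    # syntactic information, e.g. "for", "if", "try"
    anchor: str  # lexical information, e.g. the block's condition text


@dataclass(frozen=True)
class CloneRegionDescriptor:
    """Hypothetical descriptor: no line numbers, no region text."""
    file: str
    class_name: str
    method_signature: str
    blocks: Tuple[BlockDescriptor, ...]  # nesting path, outermost first


@dataclass(frozen=True)
class Region:
    """A concrete region in one version of the code."""
    start_line: int
    end_line: int
    descriptor: CloneRegionDescriptor


def track(crd: CloneRegionDescriptor,
          new_version_regions: List[Region]) -> List[Region]:
    """Find the regions in a newer version that the descriptor still
    identifies, wherever they now sit in the file."""
    return [r for r in new_version_regions if r.descriptor == crd]


if __name__ == "__main__":
    crd = CloneRegionDescriptor(
        file="Parser.java",
        class_name="Parser",
        method_signature="parse(String)",
        blocks=(BlockDescriptor("for", "i < tokens.length"),
                BlockDescriptor("if", "tokens[i] == null")),
    )
    # In the newer version the same block has moved down 30 lines.
    newer = [Region(start_line=40, end_line=48, descriptor=crd)]
    for region in track(crd, newer):
        print(region.start_line, region.end_line)
```

Exact equality is the simplest possible matching policy; a practical tracker would need some tolerance for renamed methods or edited conditions, which this sketch does not attempt.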
Tracking Your Changes: a Language-Independent Approach
Cited by 18 (7 self)
The availability of powerful differencing algorithms is crucial to track the evolution of source code, for example with the purpose of monitoring clones or vulnerable statements. In this paper we present a language-independent approach to track the evolution of code fragments, based on a novel differencing algorithm, that overcomes limitations of the Unix diff. We show how the algorithm is able to track the evolution of code elements in real-world software systems with acceptable precision, and provide examples—such as clone tracking and vulnerability tracking—where the algorithm has been successfully applied.
Supporting the investigation and planning of pragmatic reuse tasks
- in Proc. Int’l Conf. Softw. Eng., 2007
Cited by 17 (7 self)
Software reuse has long been promoted as a means to increase developer productivity; however, reusing source code is difficult in practice and tends to be performed in an ad hoc manner. This is problematic because poor decisions can be made either to attempt an unwise, overly complex reuse task, or to avoid a reuse task that would have saved time and effort. This paper describes a lightweight tool that supports the investigation and planning of pragmatic reuse tasks. The tool helps developers to identify the dependencies from the source code they wish to reuse, and to decide how to deal with those dependencies. Questions about pragmatic reuse are evaluated through a survey of industrial developers. The tool is evaluated through the planning and execution of reuse tasks by industrial developers.
Automatically Identifying Changes that Impact Code-to-Design Traceability
Cited by 15 (5 self)
An approach is presented that automatically determines if a given source code change impacts the design (i.e., UML class diagram) of the system. This allows code-to-design traceability to be consistently maintained as the source code evolves. The approach uses lightweight analysis and syntactic differencing of the source code changes to determine if the change alters the class diagram in the context of abstract design. The intent is to support both the simultaneous updating of design documents with code changes and bringing old design documents up to date with current code given the change history. An efficient tool was developed to support the approach and is applied to an open source system (i.e., HippoDraw). The results are evaluated and compared against manual inspection by human experts. The tool performs better than (error-prone) manual inspection.