Empirical validation of object-oriented metrics on open source software for fault prediction
- IEEE Transactions on Software Engineering
, 2005
Cited by 165 (5 self)
Abstract: Open source software systems are becoming increasingly important these days. Many companies are investing in open source projects and lots of them are also using such software in their own work. But, because open source software is often developed with a different management style than the industrial ones, the quality and reliability of the code need to be studied. Hence, the characteristics of the source code of these projects need to be measured to obtain more information about it. This paper describes how we calculated the object-oriented metrics given by Chidamber and Kemerer to illustrate how fault-proneness detection of the source code of the open source Web and e-mail suite called Mozilla can be carried out. We checked the values obtained against the number of bugs found in its bug database, called Bugzilla, using regression and machine learning methods to validate the usefulness of these metrics for fault-proneness prediction. We also compared the metrics of several versions of Mozilla to see how the predicted fault-proneness of the software system changed during its development cycle. Index Terms: Fact extraction, metrics validation, reverse engineering, open source software, fault-proneness detection, Mozilla, Bugzilla, C++, compiler wrapping, Columbus.
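As a rough illustration of what two of the Chidamber and Kemerer metrics measure, the sketch below computes depth of inheritance tree (DIT) and number of children (NOC) for a small class hierarchy. The hierarchy, the class names, and the function names are hypothetical and are not taken from Mozilla or from the paper's tooling; single inheritance is assumed for simplicity.

```python
# Hypothetical hierarchy: class -> direct parent (None marks a root class).
# Names are illustrative only and assume single inheritance.
PARENT = {
    "Node": None,
    "Element": "Node",
    "Text": "Node",
    "HTMLElement": "Element",
}


def depth_of_inheritance(cls, parent=PARENT):
    """DIT: number of ancestors between cls and the root of its hierarchy."""
    depth = 0
    while parent[cls] is not None:
        cls = parent[cls]
        depth += 1
    return depth


def number_of_children(cls, parent=PARENT):
    """NOC: number of classes that inherit directly from cls."""
    return sum(1 for p in parent.values() if p == cls)


def ck_summary(parent=PARENT):
    """Map each class to its (DIT, NOC) pair."""
    return {
        c: (depth_of_inheritance(c, parent), number_of_children(c, parent))
        for c in parent
    }
```

Per-class values like these would be the inputs to a regression or machine learning model of fault-proneness; the modelling step itself is not sketched here.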
Combining Formal Concept Analysis with Information Retrieval for Concept Location in Source Code
- in Proc. of ICPC'07
, 2007
Cited by 70 (18 self)
The paper addresses the problem of concept location in source code by presenting an approach which combines Formal Concept Analysis (FCA) and Latent Semantic Indexing (LSI). In the proposed approach, LSI is used to map the concepts expressed in queries written by the programmer to relevant parts of the source code, presented as a ranked list of search results. Given the ranked list of source code elements, our approach selects the most relevant attributes from these documents and organizes the results in a concept lattice, generated via FCA. The approach is evaluated in a case study on concept location in the source code of Eclipse, an industrial-size integrated development environment. The results of the case study show that the proposed approach is effective in organizing different concepts and their relationships present in the subset of the search results. The proposed concept location method outperforms the simple ranking of the search results, reducing the programmers’ effort.
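To make the FCA step concrete, here is a minimal sketch that enumerates the formal concepts (extent, intent pairs) of a tiny context mapping source code elements to attributes. The method names and terms are hypothetical, the LSI ranking that would select them is not shown, and the brute-force enumeration over attribute subsets is only practical for very small contexts; it is not the algorithm used in the paper.

```python
from itertools import combinations

# Hypothetical formal context: source code element -> set of attributes (terms).
CONTEXT = {
    "openFile": {"file", "open"},
    "saveFile": {"file", "save"},
    "openProject": {"project", "open"},
}


def extent(attrs, context):
    """Objects that have every attribute in attrs."""
    return frozenset(o for o, a in context.items() if attrs <= a)


def intent(objs, context):
    """Attributes shared by every object in objs (all attributes if objs is empty)."""
    shared = set().union(*context.values())
    for o in objs:
        shared &= context[o]
    return frozenset(shared)


def formal_concepts(context):
    """Enumerate (extent, intent) pairs by closing every attribute subset."""
    attributes = sorted(set().union(*context.values()))
    concepts = set()
    for r in range(len(attributes) + 1):
        for subset in combinations(attributes, r):
            objs = extent(set(subset), context)
            concepts.add((objs, intent(objs, context)))
    return concepts
```

Ordering the resulting concepts by inclusion of their extents gives the concept lattice, in which, for example, the concept for "file" sits above the more specific concepts for "file, open" and "file, save".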
Automatic mining of source code repositories to improve bug finding techniques
- IEEE Transactions on Software Engineering
, 2005
Cited by 62 (1 self)
Abstract: We describe a method to use the source code change history of a software project to drive and help to refine the search for bugs. Based on the data retrieved from the source code repository, we implement a static source code checker that searches for a commonly fixed bug and uses information automatically mined from the source code repository to refine its results. By applying our tool, we have identified a total of 178 warnings that are likely bugs in the Apache Web server source code and a total of 546 warnings that are likely bugs in Wine, an open-source implementation of the Windows API. We show that our technique is more effective than the same static analysis that does not use historical data from the source code repository. Index Terms: Testing tools, version control, configuration control, debugging aids.
Leveraged quality assessment using information retrieval techniques
- In Proceedings of the 14th IEEE International Conference on Program Comprehension
, 2006
Analyzing multiple configurations of a C program
- ICSM ’05: Proceedings of the 21st IEEE International Conference on Software Maintenance, IEEE Computer Society
, 2005
Cited by 21 (2 self)
Preprocessor conditionals are heavily used in C programs since they allow the source code to be configured for different platforms or capabilities. However, preprocessor conditionals, as well as other preprocessor directives, are not part of the C language. They need to be evaluated and removed, and so a single configuration selected, before parsing can take place. Most analysis and program understanding tools run on this preprocessed version of the code so their results are based on a single configuration. This paper describes the approach of CRefactory, a refactoring tool for C programs. A refactoring tool cannot consider only a single configuration: changing the code for one configuration may break the rest of the code. CRefactory analyses the program for all possible configurations simultaneously. CRefactory also preserves preprocessor directives and integrates them in the internal representations. The paper also presents metrics from two case studies to show that CRefactory’s program representation is practical.
IRiSS - A Source Code Exploration Tool
- in Industrial and Tool Proceedings of 21st IEEE International Conference on Software Maintenance (ICSM'05)
, 2005
Cited by 18 (10 self)
IRiSS (Information Retrieval based Software Search) is a software exploration tool that uses an indexing engine based on an information retrieval method. IRiSS is implemented as an add-in to the Visual Studio.NET development environment and it allows the user to search a C++ project for the implementation of concepts formulated as natural language queries. The results of the query are presented as a ranked list of software methods or classes, ordered by their similarity to the user query. A second component of IRiSS provides another searching method based on regular expression matching. This method is based on the existing “find” feature from the Visual Studio environment and it has an improved format for the display of the search results.
Open Source Software Evolution and Its Dynamics
, 2006
Cited by 14 (0 self)
This thesis undertakes an empirical study of software evolution by analyzing open source software (OSS) systems. The main purpose is to aid in understanding OSS evolution. The work centers on collecting large quantities of structural data cost-effectively and analyzing such data to understand software evolution dynamics (the mechanisms and causes of change and growth). We propose a multipurpose systematic approach to extracting program facts (e.g., function calls). This approach is supported by a suite of C and C++ program extractors, which cover different steps in the program build process and handle both source and binary code. We present several heuristics to link facts extracted from individual files into a combined system model of reasonable accuracy. We extract historical sequences of system models to aid software evolution analysis. We propose that software evolution can be viewed as Punctuated Equilibrium (i.e., long periods of small changes interrupted occasionally by large avalanche changes). We develop two approaches to study such dynamical behavior. One approach uses the evolution spectrograph to visualize file level changes to the implemented system structure. The other approach relies on automated software clustering techniques to recover system design changes. We discuss lessons learned from using these approaches. We present a new perspective on software evolution dynamics. From this perspective, an evolving software system responds to external events (e.g., new functional requirements) according to Self-Organized Criticality (SOC). The SOC dynamics is characterized by the following: (1) the probability distribution of change sizes is a power law; and (2) the time series of change exhibits long range correlations with power law behavior. We present empirical evidence that SOC occurs in open source software systems.
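One common way to check the first property, a power-law distribution of change sizes, is to estimate the exponent by maximum likelihood. The sketch below uses the standard continuous estimator alpha = 1 + n / sum(ln(x_i / x_min)); this is a generic technique chosen for illustration, not necessarily the analysis used in the thesis, and the function name and the `x_min` cutoff are assumptions.

```python
import math


def power_law_exponent(sizes, x_min=1.0):
    """Continuous maximum-likelihood estimate of a power-law exponent.

    Only observations with x >= x_min are used:
    alpha = 1 + n / sum(ln(x / x_min)).
    """
    tail = [x for x in sizes if x >= x_min]
    log_sum = sum(math.log(x / x_min) for x in tail)
    if log_sum == 0:
        raise ValueError("need at least one observation above x_min")
    return 1.0 + len(tail) / log_sum
```

Here `sizes` would be something like the number of changed lines or files per commit. A point estimate alone does not show that the data follow a power law; a goodness-of-fit test against alternatives such as the log-normal distribution would still be needed.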
An interactive reverse engineering environment for large-scale C++ code
- In Proc. ACM SoftVis
, 2008
Cited by 12 (2 self)
Few toolsets for reverse-engineering and understanding of C++ code provide parsing and fact extraction, querying, analysis and code metrics, navigation, and visualization of source-code-level facts in a way which is as easy-to-use as integrated development environments (IDEs) are for forward engineering. We present an interactive reverse-engineering environment (IRE) for C and C++ which allows users to set up the fact extraction process, apply user-written queries and metrics, and visualize combined query results, metrics, code text, and code structure. Our IRE tightly couples a fast, tolerant C++ fact extractor, an open query system, and several scalable dense-pixel visualizations in a novel way, offering an easy way to analyze and examine large code bases. We illustrate our IRE with several examples, focusing on the added value of the integrated, visual reverse-engineering approach.
Concept location using formal concept analysis and information retrieval
- ACM Transactions on Software Engineering and Methodology (TOSEM)
Cited by 11 (1 self)
The paper addresses the problem of concept location in source code by proposing an approach that combines
Refactoring trends across n versions of n Java open source systems: an empirical study
, 2005
Cited by 3 (0 self)
In the past few years, refactoring has emerged as an important consideration in the maintenance and evolution of software. Yet very little empirical evidence exists to show whether developers actively undertake refactoring, or whether, as Fowler suggests, the benefits of refactoring are not short-term but too ‘long-term’ [8]. In this paper, we describe an empirical study of multiple versions of a range of open source Java systems in an attempt to understand whether refactoring does occur and, if so, which types of refactoring were most (and least) common. Fifteen refactorings were chosen as a basis (on seven Java systems) and the code analysed using an automated tool. Results confirmed that refactoring did take place, but the majority were of the simpler, less complex type. Interestingly, the most common refactorings empirically identified were those which, according to Fowler (and from a dependency graph of the ‘seventy two’ original refactorings), were central to larger, more involved refactorings. One conclusion from the study is thus that developer time and effort for relatively large restructuring and testing of refactored code is prohibitive; making small, simple changes is preferred. A further conclusion from our study is that refactoring didn’t occur in the earliest or latest versions of the systems we investigated.