Results 1 - 10 of 13
Predicting Re-opened Bugs: A Case Study on the Eclipse Project
Cited by 3 (2 self)
Bug fixing accounts for a large amount of the software maintenance resources. Generally, bugs are reported, fixed, verified and closed. However, in some cases bugs have to be re-opened. Re-opened bugs increase maintenance costs, degrade the overall user-perceived quality of the software and lead to unnecessary rework by busy practitioners. In this paper, we study and predict re-opened bugs through a case study on the Eclipse project. We structure our study along 4 dimensions: 1) the work habits dimension (e.g., the weekday on which the bug was initially closed), 2) the bug report dimension (e.g., the component in which the bug was found), 3) the bug fix dimension (e.g., the amount of time it took to perform the initial fix) and 4) the team dimension (e.g., the experience of the bug fixer). Our case study on the Eclipse Platform 3.0 project shows that the comment and description text, the time it took to fix the bug, and the component the bug was found in are the most important factors in determining whether a bug will be re-opened. Based on these dimensions we create decision trees that predict whether a bug will be re-opened after its closure. Using a combination of our dimensions, we can build explainable prediction models that can achieve 62.9% precision and 84.5% recall when predicting whether a bug will be re-opened.
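For readers unfamiliar with the two metrics quoted in this abstract, the following is a minimal sketch of how precision and recall are computed for the "re-opened" class. The labels are hypothetical toy data, not results from the study, and the function name is an illustrative assumption.

```python
def precision_recall(actual, predicted):
    """Return (precision, recall) for the positive class (True = re-opened)."""
    pairs = list(zip(actual, predicted))
    true_pos = sum(1 for a, p in pairs if a and p)
    false_pos = sum(1 for a, p in pairs if not a and p)
    false_neg = sum(1 for a, p in pairs if a and not p)
    # Guard against division by zero when a class is never predicted or never occurs.
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall


# Hypothetical toy labels: True means the bug was (or is predicted to be) re-opened.
actual = [True, True, True, False, False, False, True, False]
predicted = [True, True, False, True, False, False, True, False]

precision, recall = precision_recall(actual, predicted)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

Precision is the fraction of bugs predicted as re-opened that really were re-opened; recall is the fraction of actually re-opened bugs that the model caught.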
Predicting Buggy Changes Inside an Integrated Development Environment
2007
Cited by 2 (1 self)
We present a tool that predicts whether the software under development inside an IDE has a bug. An IDE plugin performs this prediction, using the Change Classification technique to classify source code changes as buggy or clean during the editing session. Change Classification uses Support Vector Machines (SVM), a machine learning classifier algorithm, to classify changes to projects mined from their configuration management repository. This technique, besides being language independent and relatively accurate, can (a) classify a change immediately upon its completion and (b) use features extracted solely from the change delta (added, deleted) and the source code to predict buggy changes. Thus, integrating change classification within an IDE can predict potential bugs in the software as the developer edits the source code, ideally reducing the amount of time spent on fixing bugs later. To this end, we have developed a Change Classification plugin for Eclipse based on a client-server architecture, described in this paper.
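The abstract mentions features extracted from the change delta (added, deleted). As a rough illustration only, the sketch below turns a unified diff into bag-of-words counts, keeping added and deleted tokens apart. The feature naming scheme and the function `delta_features` are assumptions made for this example; the abstract does not specify the tool's actual feature set.

```python
import re
from collections import Counter


def delta_features(diff_text):
    """Count identifier-like tokens on added and deleted lines of a unified diff."""
    features = Counter()
    for line in diff_text.splitlines():
        # Skip the file header lines of a unified diff.
        if line.startswith(("+++", "---")):
            continue
        if line.startswith("+"):
            prefix = "added"
        elif line.startswith("-"):
            prefix = "deleted"
        else:
            continue  # context lines and hunk headers are not part of the delta
        for token in re.findall(r"[A-Za-z_]\w*", line[1:]):
            features[f"{prefix}:{token}"] += 1
    return features


# Hypothetical change used only to exercise the function.
sample_diff = """\
--- a/Foo.java
+++ b/Foo.java
@@ -1,2 +1,2 @@
-int x = foo(a);
+int x = foo(a, b);
 return x;
"""

features = delta_features(sample_diff)
print(sorted(features.items()))
```

Vectors like this could then be fed to any classifier, such as the SVM the abstract names; the classifier itself is omitted here.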
Reducing Features to Improve Bug Prediction
Cited by 1 (1 self)
Recently, machine learning classifiers have emerged as a way to predict the existence of a bug in a change made to a source code file. The classifier is first trained on software history data, and then used to predict bugs. Two drawbacks of existing classifier-based bug prediction are potentially insufficient accuracy for practical use, and use of a large number of features. These large numbers of features adversely impact scalability and accuracy of the approach. This paper proposes a feature selection technique applicable to classification-based bug prediction. This technique is applied to predict bugs in software changes, and performance of Naïve Bayes and Support Vector Machine (SVM) classifiers is characterized.
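The abstract does not say which feature selection technique the paper proposes. As a generic illustration of the idea, the sketch below ranks binary features by information gain, one common filter method, and keeps the top k. The feature names and data are hypothetical.

```python
import math


def entropy(labels):
    """Binary entropy (in bits) of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return -sum(q * math.log2(q) for q in (p, 1 - p) if q > 0)


def information_gain(column, labels):
    """Reduction in label entropy after splitting on a binary feature column."""
    present = [y for x, y in zip(column, labels) if x]
    absent = [y for x, y in zip(column, labels) if not x]
    n = len(labels)
    remainder = (len(present) / n) * entropy(present) + (len(absent) / n) * entropy(absent)
    return entropy(labels) - remainder


def select_top_k(columns, labels, k):
    """Return the k feature names with the highest gain (ties broken by name)."""
    ranked = sorted(columns, key=lambda name: (-information_gain(columns[name], labels), name))
    return ranked[:k]


# Hypothetical data: 1 = buggy change, 0 = clean change.
labels = [1, 1, 0, 0]
columns = {
    "has_null_check": [1, 1, 0, 0],  # perfectly separates the classes
    "touches_ui": [1, 0, 1, 0],      # carries no information about the label
    "long_change": [1, 1, 1, 0],     # partially informative
}

print(select_top_k(columns, labels, 2))
```

A classifier such as Naïve Bayes or an SVM would then be trained on the reduced feature set only.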
How Do Fixes Become Bugs? A Comprehensive Characteristic Study on Incorrect Fixes in Commercial and Open Source Operating Systems
Cited by 1 (0 self)
Software bugs affect system reliability. When a bug is exposed in the field, developers need to fix it. Unfortunately, the bug-fixing process can also introduce errors, which leads to buggy patches that further aggravate the damage to end users and erode software vendors' reputation. This paper presents a comprehensive characteristic study on incorrect bug-fixes from large operating system code bases including Linux, OpenSolaris, FreeBSD and also a mature commercial OS developed and evolved over the last 12 years, investigating not only the mistake patterns during bug-fixing but also the possible human reasons in the development process when these incorrect bug-fixes were introduced. Our major findings include: (1) At least 14.8% to 24.4% of sampled fixes for post-release bugs in these large OSes are incorrect and have affected end users. (2) Among several common bug types, concurrency bugs are the most difficult to fix correctly: 39% of concurrency bug fixes are incorrect. (3) Developers and reviewers for incorrect fixes usually do not have enough knowledge about the involved code. For example, 27% of the incorrect fixes are made by developers who have never touched the source code files associated with the fix. Our results provide useful guidelines to design new tools and also to improve the development process. Based on our findings, the commercial software vendor whose OS code we evaluated is building a tool to improve the bug fixing and code reviewing process.
Analyzing the Impact of Change in Multi-threaded Programs
2009
We introduce a technique for debugging multi-threaded C programs and analyzing the impact of source code changes, and its implementation in the prototype tool Direct. Our approach uses a combination of source code instrumentation and runtime management. The source code along with a test harness is instrumented to monitor Operating System (OS) and user defined function calls. All concurrency control primitives are tracked through the OS functions. Optionally, Direct can track some concurrency related data of interest. Direct keeps track of an abstract global state that combines the abstract states of every thread, including the sequence of function calls and concurrency primitives executed. The runtime manager can insert delays, provoking thread interleavings that may exhibit bugs, and that may be difficult to reach otherwise. The runtime manager collects an approximation of the reachable state space and uses this approximation to assess the impact of change in a new version of the program.
Kenyon-Web: Reconfigurable Web-based Feature Extractor
Research on Mining Software Repositories (MSR) has yielded fruitful results in many Software Engineering areas including software change comprehension, bug prediction, and developer network recovery. When performing MSR research, the first task is to extract features corresponding to source code details from repositories. Since reusable feature extraction tools are not available, each MSR research group builds their own extraction tool, a duplication of effort. We introduce a reusable feature extractor, Kenyon-web, for MSR research. Kenyon-web is fully reconfigurable, pluggable, and serves most MSR related tasks. In this report, we show the architecture of Kenyon-web and demonstrate its utility by showcasing a sample MSR task.
Information Systems III
While the financial consequences of software errors on the developer's side have been explored extensively, the cost arising for the end user has been largely neglected. One reason is the difficulty of linking errors in the code with emerging failure behavior of the software. The problem becomes even more difficult when trying to predict failure probabilities based on models or code metrics. In this paper we take a first step towards a cost prediction model by exploring the possibilities of modeling the financial consequences of already identified software failures. Firefox, a well-known open source software product, is used as a test subject. Historically identified failures are modeled using fault trees. To identify expenses, usage profiles are employed to depict the interaction with the system. The presented approach demonstrates the possibility of modeling failure cost for an organization using a specific software product by establishing a relationship between user behavior, software failures, and cost. As future work, an extension with software error prediction techniques as well as an empirical validation of the model is planned.
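To make the combination of fault trees, usage, and cost more concrete, here is a minimal sketch of evaluating a fault tree with AND/OR gates and scaling the top-event probability by a usage count and a per-failure cost. It assumes independent basic events; the tree shape, probabilities, and cost figures are hypothetical and not taken from the paper, whose actual model may differ.

```python
from math import prod


def probability(node):
    """Probability of a fault-tree node, assuming independent basic events.

    A node is ("basic", p), ("and", [children]) or ("or", [children]).
    """
    kind, payload = node
    if kind == "basic":
        return payload
    child_probs = [probability(child) for child in payload]
    if kind == "and":
        # All children must fail.
        return prod(child_probs)
    if kind == "or":
        # At least one child fails.
        return 1.0 - prod(1.0 - p for p in child_probs)
    raise ValueError(f"unknown node kind: {kind}")


def expected_cost(tree, uses, cost_per_failure):
    """Expected failure cost over a number of uses of the affected feature."""
    return probability(tree) * uses * cost_per_failure


# Hypothetical tree: the failure occurs if both inner events happen, or a third event alone.
crash_tree = ("or", [
    ("and", [("basic", 0.5), ("basic", 0.2)]),
    ("basic", 0.1),
])

print(probability(crash_tree))
print(expected_cost(crash_tree, uses=1000, cost_per_failure=2.0))
```

In this reading, a usage profile would supply the `uses` figure per feature, so that frequently exercised failure paths dominate the total.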
Naresh Kumar Nagwani,
Suffix Tree Clustering (STC) is one of the popular text clustering algorithms. STC has a number of applications, the most popular being web document clustering. Software bug data contains a number of attributes such as bug-id, summary (title), description, comments, status, and version. Most of the important attributes hold text data. Since software bug repositories consist mostly of text data, STC can be applied to create clusters of software bug records. In this paper, the STC algorithm is used for software bug classification. First, clusters are created from the bug repositories, and then a label is assigned to each cluster, indicating the class of the cluster. An STC implementation is available as part of the Carrot2 framework. The designed technique is evaluated using common clustering parameters.
An Integration Resolution Algorithm for Mining Multiple Branches in Version Control Systems
The high cost of software maintenance necessitates methods to improve the efficiency of the maintenance process. Such methods typically need a vast amount of knowledge about a system, which is often mined from software repositories. Collecting this data becomes a challenge if the system was developed using multiple code branches. In this paper we present an integration resolution algorithm that facilitates data collection across multiple code branches. The algorithm tracks code integrations across different branches and associates code changes in the main development branch with corresponding changes in other branches. We provide evidence for the practical relevance of this algorithm during the development of the Windows Vista Service Pack 2. Keywords: Algorithms, Management, Measurement

