Results 1 - 10
of
67
Characterizing and Predicting Which Bugs Get Fixed: An Empirical Study of Microsoft Windows
"... We performed an empirical study to characterize factors that affect which bugs get fixed in Windows Vista and Windows 7, focusing on factors related to bug report edits and relationships between people involved in handling the bug. We found that bugs reported by people with better reputations were m ..."
Abstract
-
Cited by 37 (4 self)
- Add to MetaCart
(Show Context)
We performed an empirical study to characterize factors that affect which bugs get fixed in Windows Vista and Windows 7, focusing on factors related to bug report edits and relationships between people involved in handling the bug. We found that bugs reported by people with better reputations were more likely to get fixed, as were bugs handled by people on the same team and working in geographical proximity. We reinforce these quantitative results with survey feedback from 358 Microsoft employees who were involved in Windows bugs. Survey respondents also mentioned additional qualitative influences on bug fixing, such as the importance of seniority and interpersonal skills of the bug reporter. Informed by these findings, we built a statistical model to predict the probability that a new bug will be fixed (the first known one, to the best of our knowledge). We trained it on Windows Vista bugs and got a precision of 68 % and recall of 64 % when predicting Windows 7 bug fixes. Engineers could use such a model to prioritize bugs during triage, to estimate developer workloads, and to decide which bugs should be closed or migrated to future product versions. Categories and Subject Descriptors:
Fine-grained Incremental Learning and Multi-feature Tossing Graphs to Improve Bug Triaging
"... Abstract—Software bugs are inevitable and bug fixing is a difficult, expensive, and lengthy process. One of the primary reasons why bug fixing takes so long is the difficulty of accurately assigning a bug to the most competent developer for that bug kind or bug class. Assigning a bug to a potential ..."
Abstract
-
Cited by 24 (6 self)
- Add to MetaCart
(Show Context)
Abstract—Software bugs are inevitable and bug fixing is a difficult, expensive, and lengthy process. One of the primary reasons why bug fixing takes so long is the difficulty of accurately assigning a bug to the most competent developer for that bug kind or bug class. Assigning a bug to a potential developer, also known as bug triaging, is a labor-intensive, time-consuming and faultprone process if done manually. Moreover, bugs frequently get reassigned to multiple developers before they are resolved, a process known as bug tossing. Researchers have proposed automated techniques to facilitate bug triaging and reduce bug tossing using machine learning-based prediction and tossing graphs. While these techniques achieve good prediction accuracy for triaging and reduce tossing paths, they are vulnerable to several issues: outdated training sets, inactive developers, and imprecise, singleattribute tossing graphs. In this paper we improve triaging accuracy and reduce tossing path lengths by employing several techniques such as refined classification using additional attributes and intra-fold updates during training, a precise ranking function for recommending potential tossees in tossing graphs, and multi-feature tossing graphs. We validate our approach on two large software projects, Mozilla and Eclipse, covering 856,259 bug reports and 21 cumulative years of development. We demonstrate that our techniques can achieve up to 83.62% prediction accuracy in bug triaging. Moreover, we reduce tossing path lengths to 1.5–2 tosses for most bugs, which represents a reduction of up to 86.31 % compared to original tossing paths. Our improvements have the potential to significantly reduce the bug fixing effort, especially in the context of sizable projects with large numbers of testers and developers.
Blending Conceptual and Evolutionary Couplings to Support Change Impact Analysis in Source Code
- in Proc. of 17th IEEE Working Conference on Reverse Engineering (WCRE'10
"... Abstract — The paper presents an approach that combines conceptual and evolutionary techniques to support change impact analysis in source code. Information Retrieval (IR) is used to derive conceptual couplings from the source code in a single version (release) of a software system. Evolutionary cou ..."
Abstract
-
Cited by 23 (8 self)
- Add to MetaCart
(Show Context)
Abstract — The paper presents an approach that combines conceptual and evolutionary techniques to support change impact analysis in source code. Information Retrieval (IR) is used to derive conceptual couplings from the source code in a single version (release) of a software system. Evolutionary couplings are mined from source code commits. The premise is that such combined methods provide improvements to the accuracy of impact sets. A rigorous empirical assessment on the changes of the open source systems Apache
Security Versus Performance Bugs A Case Study on Firefox
- in Proceedings of the 8th Working Conference on Mining Software Repositories
, 2011
"... A good understanding of the impact of different types of bugs on various project aspects is essential to improve soft-ware quality research and practice. For instance, we would expect that security bugs are fixed faster than other types of bugs due to their critical nature. However, prior research h ..."
Abstract
-
Cited by 20 (3 self)
- Add to MetaCart
(Show Context)
A good understanding of the impact of different types of bugs on various project aspects is essential to improve soft-ware quality research and practice. For instance, we would expect that security bugs are fixed faster than other types of bugs due to their critical nature. However, prior research has often treated all bugs as similar when studying various aspects of software quality (e.g., predicting the time to fix a bug), or has focused on one particular type of bug (e.g., security bugs) with little comparison to other types. In this paper, we study how different types of bugs (performance and security bugs) differ from each other and from the rest of the bugs in a software project. Through a case study on the Firefox project, we find that security bugs are fixed and triaged much faster, but are reopened and tossed more frequently. Furthermore, we also find that security bugs in-volve more developers and impact more files in a project. Our work is the first work to ever empirically study perfor-mance bugs and compare it to frequently-studied security bugs. Our findings highlight the importance of considering the different types of bugs in software quality research and practice.
Predicting Re-opened Bugs: A Case Study on the Eclipse Project
"... Abstract—Bug fixing accounts for a large amount of the software maintenance resources. Generally, bugs are reported, fixed, verified and closed. However, in some cases bugs have to be re-opened. Re-opened bugs increase maintenance costs, degrade the overall user-perceived quality of the software and ..."
Abstract
-
Cited by 20 (6 self)
- Add to MetaCart
(Show Context)
Abstract—Bug fixing accounts for a large amount of the software maintenance resources. Generally, bugs are reported, fixed, verified and closed. However, in some cases bugs have to be re-opened. Re-opened bugs increase maintenance costs, degrade the overall user-perceived quality of the software and lead to unnecessary rework by busy practitioners. In this paper, we study and predict re-opened bugs through a case study on the Eclipse project. We structure our study along 4 dimensions: 1) the work habits dimension (e.g., the weekday on which the bug was initially closed on), 2) the bug report dimension (e.g., the component in which the bug was found) 3) the bug fix dimension (e.g., the amount of time it took to perform the initial fix) and 4) the team dimension (e.g., the experience of the bug fixer). Our case study on the Eclipse Platform 3.0 project shows that the comment and description text, the time it took to fix the bug, and the component the bug was found in are the most important factors in determining whether a bug will be re-opened. Based on these dimensions we create decision trees that predict whether a bug will be re-opened after its closure. Using a combination of our dimensions, we can build explainable prediction models that can achieve 62.9 % precision and 84.5% recall when predicting whether a bug will be re-opened. I.
Integrated impact analysis for managing software changes
- in Proceedings of the 34th IEEE/ACM International Conference on Software Engineering (ICSE’12
"... Abstract — The paper presents an adaptive approach to perform impact analysis from a given change request to source code. Given a textual change request (e.g., a bug report), a single snapshot (release) of source code, indexed using Latent Semantic Indexing, is used to estimate the impact set. Shoul ..."
Abstract
-
Cited by 19 (12 self)
- Add to MetaCart
(Show Context)
Abstract — The paper presents an adaptive approach to perform impact analysis from a given change request to source code. Given a textual change request (e.g., a bug report), a single snapshot (release) of source code, indexed using Latent Semantic Indexing, is used to estimate the impact set. Should additional contextual information be available, the approach configures the best-fit combination to produce an improved impact set. Contextual information includes the execution trace and an initial source code entity verified for change. Combinations of information retrieval, dynamic analysis, and data mining of past source code commits are considered. The research hypothesis is that these combinations help counter the precision or recall deficit of individual techniques and improve the overall accuracy. The tandem operation of the three techniques sets it apart from other related solutions. Automation along with the effective utilization of two key sources of developer knowledge, which are often overlooked in impact analysis at the change request level, is achieved. To validate our approach, we conducted an empirical evaluation on four open source software systems. A benchmark consisting of a number of maintenance issues, such as feature requests and bug fixes, and their associated source code changes was established by manual examination of these systems and their change history. Our results indicate that there are combinations formed from the augmented developer contextual information that show statistically significant improvement over stand-alone approaches. I.
Characterizing and Predicting Which Bugs Get Reopened
, 2012
"... Fixing bugs is an important part of the software development process. An underlying aspect is the effectiveness of fixes: if a fair number of fixed bugs are reopened, it could indicate instability in the software system. To the best of our knowledge there has been on little prior work on understand ..."
Abstract
-
Cited by 17 (1 self)
- Add to MetaCart
Fixing bugs is an important part of the software development process. An underlying aspect is the effectiveness of fixes: if a fair number of fixed bugs are reopened, it could indicate instability in the software system. To the best of our knowledge there has been on little prior work on understanding the dynamics of bug reopens. Towards that end, in this paper, we characterize when bug reports are reopened by using the Microsoft Windows operating system project as an empirical case study. Our analysis is based on a mixedmethods approach. First, we categorize the primary reasons for reopens based on a survey of 358 Microsoft employees. We then reinforce these results with a large-scale quantitative study of Windows bug reports, focusing on factors related to bug report edits and relationships between people involved in handling the bug. Finally, we build statistical models to describe the impact of various metrics on reopening bugs ranging from the reputation of the opener to how the bug was found.
Transfer Defect Learning
"... Abstract—Many software defect prediction approaches have been proposed and most are effective in within-project prediction settings. However, for new projects or projects with limited training data, it is desirable to learn a prediction model by using sufficient training data from existing source pr ..."
Abstract
-
Cited by 17 (3 self)
- Add to MetaCart
(Show Context)
Abstract—Many software defect prediction approaches have been proposed and most are effective in within-project prediction settings. However, for new projects or projects with limited training data, it is desirable to learn a prediction model by using sufficient training data from existing source projects and then apply the model to some target projects (cross-project defect prediction). Unfortunately, the performance of crossproject defect prediction is generally poor, largely because of feature distribution differences between the source and target projects. In this paper, we apply a state-of-the-art transfer learning approach, TCA, to make feature distributions in source and target projects similar. In addition, we propose a novel transfer defect learning approach, TCA+, by extending TCA. Our experimental results for eight open-source projects show that TCA+ significantly improves cross-project prediction performance. Index Terms—cross-project defect prediction, transfer learning, empirical software engineering I.
Assigning change requests to software developers
, 2011
"... The paper presents an approach to recommend a ranked list of expert developers to assist in the implementation of software change requests (e.g., bug reports and feature requests). An Information Retrieval (IR)-based concept location technique is first used to locate source code entities, e.g., file ..."
Abstract
-
Cited by 16 (6 self)
- Add to MetaCart
The paper presents an approach to recommend a ranked list of expert developers to assist in the implementation of software change requests (e.g., bug reports and feature requests). An Information Retrieval (IR)-based concept location technique is first used to locate source code entities, e.g., files and classes, relevant to a given textual description of a change request. The previous commits from version control repositories of these entities are then mined for expert developers. The role of the IR method in selectively reducing the mining space is different from previous approaches that textually index past change requests and/or commits. The approach is evaluated on change requests from three open-source systems: ArgoUML, Eclipse, andKOffice, across a range of accuracy criteria. The results show that the overall accuracies of the correctly recommended developers are between 47 and 96 % for bug reports, and between 43 and 60% for feature requests. Moreover, comparison results with two other recommendation alternatives show that the presented approach outperforms them with a substantial margin. Project leads or developers can use this approach in maintenance tasks immediately after the receipt of a change request in a free-form text.
Fuzzy set and cache-based approach for bug triaging
- In Proceedings of the 19th ACM SIGSOFT Symposium and the 13th European Conference on Foundations of Software Engineering, ESEC/FSE ’11
, 2011
"... Bug triaging aims to assign a bug to the most appropriate fixer. That task is crucial in reducing time and efforts in a bug fixing process. In this paper, we propose Bugzie, a novel approach for automatic bug triaging based on fuzzy set and cache-based modeling of the bug-fixing expertise of devel-o ..."
Abstract
-
Cited by 15 (0 self)
- Add to MetaCart
(Show Context)
Bug triaging aims to assign a bug to the most appropriate fixer. That task is crucial in reducing time and efforts in a bug fixing process. In this paper, we propose Bugzie, a novel approach for automatic bug triaging based on fuzzy set and cache-based modeling of the bug-fixing expertise of devel-opers. Bugzie considers a software system to have multiple technical aspects, each of which is associated with technical terms. For each technical term, it uses a fuzzy set to repre-sent the developers who are capable/competent of fixing the bugs relevant to the corresponding aspect. The fixing corre-lation of a developer toward a technical term is represented by his/her membership score toward the corresponding fuzzy set. The score is calculated based on the bug reports that (s)he has fixed, and is updated as the newly fixed bug reports are available. For a new bug report, Bugzie combines the fuzzy sets corresponding to its terms and ranks the develop-ers based on their membership scores toward that combined fuzzy set to find the most capable fixers. Our empirical re-sults show that Bugzie achieves significantly higher accuracy and time efficiency than existing state-of-the-art approaches.