Automatically Finding Patches Using Genetic Programming
Cited by 33 (8 self)
Automatic program repair has been a longstanding goal in software engineering, yet debugging remains a largely manual process. We introduce a fully automated method for locating and repairing bugs in software. The approach works on off-the-shelf legacy applications and does not require formal specifications, program annotations or special coding practices. Once a program fault is discovered, an extended form of genetic programming is used to evolve program variants until one is found that both retains required functionality and also avoids the defect in question. Standard test cases are used to exercise the fault and to encode program requirements. After a successful repair has been discovered, it is minimized using structural differencing algorithms and delta debugging. We describe the proposed method and report experimental results demonstrating that it can successfully repair ten different C programs totaling 63,000 lines in under 200 seconds, on average.
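The abstract above says test cases both encode required functionality and exercise the fault. A minimal sketch of how a candidate variant might be scored against such tests follows. The function names, the test representation as (arguments, expected result) pairs, and the weights are illustrative assumptions, not details taken from the paper.

```python
def passes(variant, test):
    """Return True if the variant produces the expected result for one test."""
    args, expected = test
    try:
        return variant(*args) == expected
    except Exception:
        # A crashing variant simply fails the test.
        return False


def fitness(variant, positive_tests, negative_tests, w_pos=1.0, w_neg=10.0):
    """Weighted count of passing tests (weights are assumed values).

    positive_tests encode required functionality;
    negative_tests exercise the defect.
    """
    pos_score = w_pos * sum(passes(variant, t) for t in positive_tests)
    neg_score = w_neg * sum(passes(variant, t) for t in negative_tests)
    return pos_score + neg_score


def is_repair(variant, positive_tests, negative_tests):
    """A variant counts as a repair only if it passes every test."""
    return all(passes(variant, t) for t in positive_tests + negative_tests)


# Toy example: a buggy "max" and a corrected variant.
buggy = lambda a, b: a if a > b else a
fixed = lambda a, b: a if a > b else b
positive = [((3, 1), 3), ((5, 5), 5)]   # behaviour that must be retained
negative = [((1, 4), 4)]                # input that exposes the fault
```

An evolutionary search would use a score like `fitness` to rank variants and stop once `is_repair` holds for one of them.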
An approach to detecting duplicate bug reports using natural language and execution information
In ICSE ’08: Proceedings of the 30th International Conference on Software Engineering, 2008
Cited by 26 (7 self)
An open source project typically maintains an open bug repository so that bug reports from all over the world can be gathered. When a new bug report is submitted to the repository, a person, called a triager, examines whether it is a duplicate of an existing bug report. If it is, the triager marks it as DUPLICATE and the bug report is removed from consideration for further work. In the literature, there are approaches exploiting only natural language information to detect duplicate bug reports. In this paper we present a new approach that further involves execution information. In our approach, when a new bug report arrives, its natural language information and execution information are compared with those of the existing bug reports. Then, a small number of existing bug reports are suggested to the triager as the most similar bug reports to the new bug report. Finally, the triager examines the suggested bug reports to determine whether the new bug report duplicates an existing bug report. We calibrated our approach on a subset of the Eclipse bug repository and evaluated our approach on a subset of the Firefox bug repository. The experimental results show that our approach can detect 67%-93% of duplicate bug reports in the Firefox bug repository, compared to 43%-72% using natural language information alone.
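The comparison step described above can be illustrated with a small sketch that ranks existing reports by a mix of textual and execution similarity and returns the top few for the triager. The cosine measure, the equal weighting, the report fields (`id`, `text`, `trace`), and the function names are assumptions made for illustration; the abstract does not specify how the two kinds of information are combined.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def suggest_duplicates(new, existing, k=3, text_weight=0.5):
    """Return ids of the k existing reports most similar to the new one."""
    new_text = Counter(new["text"].lower().split())
    new_trace = Counter(new["trace"])
    scored = []
    for report in existing:
        text_sim = cosine(new_text, Counter(report["text"].lower().split()))
        exec_sim = cosine(new_trace, Counter(report["trace"]))
        # Assumed combination rule: a simple weighted average.
        score = text_weight * text_sim + (1 - text_weight) * exec_sim
        scored.append((score, report["id"]))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [report_id for _, report_id in scored[:k]]


existing_reports = [
    {"id": 1, "text": "crash when opening bookmarks menu",
     "trace": ["openMenu", "loadBookmarks", "render"]},
    {"id": 2, "text": "page fails to print",
     "trace": ["print", "layout"]},
    {"id": 3, "text": "browser crash on bookmarks",
     "trace": ["openMenu", "loadBookmarks"]},
]
new_report = {"text": "bookmarks menu crash",
              "trace": ["openMenu", "loadBookmarks", "render"]}
suggestions = suggest_duplicates(new_report, existing_reports, k=2)
```

The final duplicate decision stays with the triager; the sketch only narrows the list of candidates.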
Duplicate Bug Reports Considered Harmful... Really?
Cited by 15 (6 self)
In a survey we found that most developers have experienced duplicated bug reports; however, only a few considered them a serious problem. This contradicts popular wisdom that considers bug duplicates a serious problem for open source projects. In the survey, developers also pointed out that the additional information provided by duplicates helps to resolve bugs more quickly. In this paper, we therefore propose to merge bug duplicates, rather than treating them separately. We quantify the amount of information that is added for developers and show that automatic triaging can be improved as well. In addition, we discuss the different reasons why users submit duplicate bug reports in the first place.
Automated Duplicate Detection for Bug Tracking Systems
Cited by 11 (1 self)
Bug tracking systems are important tools that guide the maintenance activities of software developers. The utility of these systems is hampered by an excessive number of duplicate bug reports: in some projects as many as a quarter of all reports are duplicates. Developers must manually identify duplicate bug reports, but this identification process is time-consuming and exacerbates the already high cost of software maintenance. We propose a system that automatically classifies duplicate bug reports as they arrive to save developer time. This system uses surface features, textual semantics, and graph clustering to predict duplicate status. Using a dataset of 29,000 bug reports from the Mozilla project, we perform experiments that include a simulation of a real-time bug reporting environment. Our system is able to reduce development cost by filtering out 8% of duplicate bug reports while allowing at least one report for each real defect to reach developers.
A discriminative model approach for accurate duplicate bug report retrieval
In Proceedings of the 32nd ACM/IEEE International Conference on Software Engineering (ICSE’10) - Volume
Cited by 4 (2 self)
Bug repositories are usually maintained in software projects. Testers or users submit bug reports to identify various issues with systems. Sometimes two or more bug reports correspond to the same defect. To address the problem of duplicate bug reports, a person called a triager needs to manually label these bug reports as duplicates and link them to their "master" reports for subsequent maintenance work. However, in practice a considerable number of duplicate bug reports are sent daily; requesting triagers to manually label these bugs could be highly time-consuming. To address this issue, several techniques have recently been proposed that use various similarity-based metrics to detect candidate duplicate bug reports for manual verification. Automating triaging has proved challenging, as two reports of the same bug could be written in various ways. There is still much room for improvement in the accuracy of the duplicate detection process. In this paper, we leverage recent advances in using discriminative models for information retrieval to detect duplicate bug reports more accurately. We have validated our approach on three large software bug repositories from Firefox, Eclipse, and OpenOffice. We show that our technique could result in 17–31%, 22–26%, and 35–43% relative improvement over state-of-the-art techniques in OpenOffice, Firefox, and Eclipse datasets respectively, using commonly available natural language information only.
Modeling Bug Report Quality
2007
Software developers spend a significant portion of their resources handling user-submitted bug reports. For software that is widely deployed, the number of bug reports typically outstrips the resources available to triage them. As a result, some reports may be dealt with too slowly or not at all. We present a descriptive model of bug report quality based on a statistical analysis of surface features of over 27,000 publicly available bug reports for the Mozilla Firefox project. The model predicts whether a bug report is triaged within a given amount of time. Our analysis of this model has implications for bug reporting systems and suggests features that should be emphasized when composing bug reports. We evaluate our model empirically based on its hypothetical performance as an automatic filter of incoming bug reports. Our results show that our model performs significantly better than chance in terms of precision and recall. In addition, we show that our model can reduce the overall cost of software maintenance in a setting where the average cost of addressing a bug report is more than 2% of the cost of ignoring an important bug report.
A Human Study of Fault Localization Accuracy
Localizing and repairing defects are critical software engineering activities. Not all programs and not all bugs are equally easy to debug, however. We present formal models, backed by a human study involving 65 participants (from both academia and industry) and 1830 total judgments, relating various software- and defect-related features to human accuracy at locating errors. Our study involves example code from Java textbooks, helping us to control for both readability and complexity. We find that certain types of defects are much harder for humans to locate accurately. For example, humans are over five times more accurate at locating “extra statements” than “missing statements” based on experimental observation. We also find that, independent of the type of defect involved, certain code contexts are harder to debug than others. For example, humans are over three times more accurate at finding defects in code that provides an array abstraction than in code that provides a tree abstraction. We identify and analyze code features that are predictive of human fault localization accuracy. Finally, we present a formal model of debugging accuracy based on those source code features that have a statistically significant correlation with human performance.
Merging Duplicate Bug Reports by Sentence Clustering
Duplicate bug reports are often unwelcome because it takes many man-hours to identify them as duplicates, mark them as such, and eventually discard them. During this time, no progress is made on the program in question, so the effort is an overhead that should be minimized. Considerable research has been carried out to alleviate this problem. Many methods have been proposed for bug report categorization and duplicate bug report detection. However, a duplicate bug report can often provide additional information about a problem, which could help resolve the bug faster. We propose that duplicate bug reports be merged when possible instead of being discarded, so that maximum information is captured. We propose a clustering-based algorithm to group together similar sentences and create a union of bug reports considered duplicates of each other.
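The merging idea above can be sketched as follows: sentences from all duplicate reports are grouped by similarity, and one sentence per group is kept, so the merged report is a union without repeated statements. This sketch uses a greedy single-pass grouping with Jaccard similarity over word sets; that choice, the threshold value, and the function names are assumptions, since the abstract does not name the clustering algorithm used.

```python
import re


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def merge_reports(reports, threshold=0.5):
    """Merge duplicate reports, keeping one sentence per group of similar ones."""
    kept = []  # (token set, original sentence) for each group's first sentence
    for report in reports:
        # Naive sentence split on terminal punctuation followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", report.strip()):
            tokens = set(re.findall(r"[a-z0-9]+", sentence.lower()))
            if not tokens:
                continue
            # Skip the sentence if it is close to one that is already kept.
            if any(jaccard(tokens, seen) >= threshold for seen, _ in kept):
                continue
            kept.append((tokens, sentence))
    return [sentence for _, sentence in kept]


merged = merge_reports([
    "The app crashes on startup. I use Windows 10.",
    "App crashes on startup. It happens after the latest update.",
])
```

Here the second report's first sentence is dropped as a near-repeat, while its extra detail about the update is retained in the merged result.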
Information Systems III
While the financial consequences of software errors on the developer’s side have been explored extensively, the cost arising for the end user has been largely neglected. One reason is the difficulty of linking errors in the code with the resulting failure behavior of the software. The problem becomes even more difficult when trying to predict failure probabilities based on models or code metrics. In this paper we take a first step towards a cost prediction model by exploring the possibilities of modeling the financial consequences of already identified software failures. Firefox, a well-known open source application, is used as a test subject. Historically identified failures are modeled using fault trees. To identify expenses, usage profiles are employed to depict the interaction with the system. The presented approach demonstrates that failure cost can be modeled for an organization using a specific software product by establishing a relationship between user behavior, software failures, and cost. As future work, an extension with software error prediction techniques and an empirical validation of the model are planned.
3 Micro
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

