Expertise Browser: A Quantitative Approach to Identifying Expertise
- In Proceedings of the International Conference on Software Engineering (ICSE 2002)
, 2002
Cited by 85 (12 self)
Finding relevant expertise is a critical need in collaborative software engineering, particularly in geographically distributed developments. We introduce a tool that uses data from change management systems to locate people with desired expertise. It uses a quantification of experience, and presents evidence to validate this quantification as a measure of expertise. The tool enables developers, for example, to easily distinguish someone who has worked only briefly in a particular area of the code from someone who has more extensive experience, and to locate people with broad expertise throughout large parts of the product, such as modules or even subsystems. In addition, it allows a user to discover expertise profiles for individuals or organizations. Data from a deployment of the tool in a large software development organization show that newer, remote sites tend to use the tool for expertise location more frequently. Larger, more established sites used the tool to find expertise profiles for people or organizations. We conclude by describing extensions that provide continuous awareness of ongoing work and an interactive, quantitative resume.
An empirical study of global software development: distance and speed
- In ICSE ’01: Proceedings of the 23rd International Conference on Software Engineering
, 2001
Cited by 84 (12 self)
Global software development is rapidly becoming the norm for technology companies. Previous qualitative research suggests that multi-site development may increase development cycle time. We use both survey data and data from the source code change management system to model the extent of delay in a multi-site software development organization, and explore several possible mechanisms for this delay. We also measure differences in same-site and cross-site communication patterns, and analyze the relationship of these variables to delay. Our results show that compared to same-site work, cross-site work takes much longer, and requires more people for work of equal size and complexity. We also report a strong relationship between delay in cross-site work and the degree to which remote colleagues are perceived to help out when workloads are heavy. We discuss implications of our findings for collaboration technology for distributed software development.
Automating the measurement of open source projects
- In Proceedings of the 3rd Workshop on Open Source Software Engineering
, 2003
Cited by 42 (1 self)
The proliferation of open source projects raises a number of vital economic, social, and software engineering questions that are the subject of intense research. Based on experience analyzing numerous open source and commercial projects, we propose a set of tools to support extraction and validation of software project data. Such tools would streamline empirical investigation of open source projects and make it possible to test existing and new theories about the nature of open source projects. Our software includes tools to extract and summarize information from mailing lists, CVS logs, ChangeLog files, and defect tracking databases. More importantly, it cross-links records from various data sources and identifies all contributors for a software change. We illustrate some of the capabilities by analyzing data from the Ximian Evolution project.
An Empirical Study of Speed and Communication in Globally-Distributed Software Development
- IEEE Transactions on Software Engineering
, 2003
Cited by 41 (8 self)
Global software development is rapidly becoming the norm for technology companies. Previous qualitative research suggests that distributed development may increase development cycle time for individual work items (modification requests). We use both data from the source code change management system and survey data to model the extent of delay in a distributed software development organization and explore several possible mechanisms for this delay. One key finding is that distributed work items appear to take about two and one-half times as long to complete as similar items where all the work is colocated. The data strongly suggest a mechanism for the delay, i.e., that distributed work items involve more people than comparable same-site work items, and the number of people involved is strongly related to the calendar time to complete a work item. We replicate the analysis of change data in a different organization with a different product and different sites and confirm our main findings. We also report survey results showing differences between same-site and distributed social networks, testing several hypotheses about characteristics of distributed social networks that may be related to delay. We discuss implications of our findings for practices and collaboration technology that have the potential for dramatically speeding distributed software development. Index Terms: Global development, collaboration, delay, speed, awareness, informal communication.
Understanding and Predicting Effort in Software Projects
- In 2003 International Conference on Software Engineering
, 2002
Cited by 26 (8 self)
We set out to answer a question we were asked by software project management: how much effort remains to be spent on a specific software project and how will that effort be distributed over time? To answer this question, we propose a model based on the concept that each modification to software may cause repairs at some later time. We investigate its theoretical properties and its application to several projects in Avaya to predict and plan development resource allocation. Our model presents a novel unified framework to investigate and predict effort, schedule, and defects of a software project. The results of applying the model confirm a fundamental relationship between the new feature and defect repair changes and demonstrate its predictive properties.
Toward understanding the rhetoric of small source code changes
- IEEE Transactions on Software Engineering
, 2005
Cited by 25 (7 self)
Understanding the impact of software changes has been a challenge since software systems were first developed. With the increasing size and complexity of systems, this problem has become more difficult. There are many ways to identify the impact of changes on the system from the plethora of software artifacts produced during development and maintenance. We present the analysis of the software development process using change and defect history data. Specifically, we address the problem of small changes. The studies revealed that (1) there is less than 4 percent probability that a one-line change will introduce an error in the code; (2) nearly 10 percent of all changes made during the maintenance of the software under consideration were one-line changes; (3) the phenomenon of change differs for additions, deletions and modifications as well as for the number of lines affected.
Predicting faults from cached history
- In Proceedings of the 29th International Conference on Software Engineering
, 2007
Cited by 21 (4 self)
We analyze the version history of 7 software systems to predict the most fault-prone entities and files. The basic assumption is that faults do not occur in isolation, but rather in bursts of several related faults. Therefore, we cache locations that are likely to have faults: starting from the location of a known (fixed) fault, we cache the location itself, any locations changed together with the fault, recently added locations, and recently changed locations. By consulting the cache at the moment a fault is fixed, a developer can detect likely fault-prone locations. This is useful for prioritizing verification and validation resources on the most fault-prone files or entities. In our evaluation of seven open source projects with more than 200,000 revisions, the cache selects 10% of the source code files; these files account for 73%-95% of faults, a significant advance beyond the state of the art.
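The caching idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `FaultCache`, its method names, the fixed capacity, and the least-recently-used eviction rule are all assumptions made for illustration, since the abstract does not say how entries are replaced.

```python
from collections import OrderedDict


class FaultCache:
    """Hypothetical fixed-size cache of locations (e.g. file paths) suspected to be fault-prone."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._entries = OrderedDict()  # least recently touched location comes first

    def _touch(self, location):
        # Refresh an existing entry, or insert a new one and evict the
        # least recently touched entry if the cache is full (assumed LRU policy).
        if location in self._entries:
            self._entries.move_to_end(location)
            return
        if len(self._entries) >= self.capacity:
            self._entries.popitem(last=False)
        self._entries[location] = True

    def record_change(self, added=(), changed=()):
        # Recently added and recently changed locations enter the cache.
        for location in list(added) + list(changed):
            self._touch(location)

    def record_fault_fix(self, faulty, co_changed=()):
        # Returns True if the faulty location was already cached (a "hit"),
        # then caches it together with the locations changed alongside it.
        hit = faulty in self._entries
        self._touch(faulty)
        for location in co_changed:
            self._touch(location)
        return hit

    def contents(self):
        return list(self._entries)
```

Replaying a project's history through `record_change` and `record_fault_fix` and counting hits would give a rough hit rate for a chosen capacity; the figures reported in the abstract come from the paper's own evaluation, not from this sketch.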
Classifying Software Changes: Clean or Buggy?
, 2008
Cited by 14 (5 self)
This paper introduces a new technique for predicting latent software bugs, called change classification. Change classification uses a machine learning classifier to determine whether a new software change is more similar to prior buggy changes or clean changes. In this manner, change classification predicts the existence of bugs in software changes. The classifier is trained using features (in the machine learning sense) extracted from the revision history of a software project stored in its software configuration management repository. The trained classifier can classify changes as buggy or clean, with a 78 percent accuracy and a 60 percent buggy change recall on average. Change classification has several desirable qualities: 1) The prediction granularity is small (a change to a single file), 2) predictions do not require semantic information about the source code, 3) the technique works for a broad array of project types and programming languages, and 4) predictions can be made immediately upon the completion of a change. Contributions of this paper include a description of the change classification approach, techniques for extracting features from the source code and change histories, a characterization of the performance of change classification across 12 open source projects, and an evaluation of the predictive power of different groups of features.
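As a rough illustration of classifying changes as buggy or clean from features of past changes, here is a small sketch. It uses a naive Bayes model over bag-of-words tokens, which is an assumption chosen for brevity; the abstract does not name the classifier or the exact features, and the class `ChangeClassifier` and its methods are hypothetical.

```python
import math
from collections import Counter


class ChangeClassifier:
    """Hypothetical naive Bayes classifier over tokens extracted from a change."""

    LABELS = ("buggy", "clean")

    def __init__(self):
        self.token_counts = {label: Counter() for label in self.LABELS}
        self.change_counts = Counter()

    def train(self, tokens, label):
        # tokens: words taken from the change (e.g. added lines, log message).
        self.token_counts[label].update(tokens)
        self.change_counts[label] += 1

    def classify(self, tokens):
        vocabulary = set()
        for counts in self.token_counts.values():
            vocabulary.update(counts)
        total_changes = sum(self.change_counts.values())
        best_label, best_score = None, -math.inf
        for label in self.LABELS:
            # Log prior and log likelihoods, both with add-one smoothing.
            score = math.log((self.change_counts[label] + 1) / (total_changes + len(self.LABELS)))
            total_tokens = sum(self.token_counts[label].values())
            for token in tokens:
                count = self.token_counts[label][token]
                score += math.log((count + 1) / (total_tokens + len(vocabulary) + 1))
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

In practice the training labels would have to come from the revision history (for example, by tracing fixes back to the changes that introduced the bug), which this sketch leaves out.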
Improving evolvability through refactoring
- In MSR ’05: Proceedings of the 2005 international workshop on Mining software repositories
, 2005
Cited by 13 (1 self)
Refactoring is one means of improving the structure of existing software. Locations for the application of refactoring are often based on subjective perceptions such as "bad smells", which are vague suspicions of design shortcomings. We exploit historical data extracted from repositories such as CVS and focus on change couplings: if some software parts change at the same time very often over several releases, this data can be used to point to candidates for refactoring. We adopt the concept of bad smells and provide additional change smells. Such a smell is hardly visible in the code, but easy to spot when viewing the change history. Our approach enables the detection of such smells allowing an engineer to apply refactoring on these parts of the source code to improve the evolvability of the software. For that, we analyzed the history of a large industrial system for a period of 15 months, proposed spots for refactorings based on change couplings, and performed them with the developers. After observing the system for another 15 months we finally analyzed the effectiveness of our approach. Our results support our hypothesis that the combination of change dependency analysis and refactoring is applicable and effective.
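A change coupling of the kind described here can be approximated by counting how often two files appear in the same commit. The sketch below assumes the history is already available as a list of per-commit file lists; the function name `change_couplings` and the `min_count` threshold are illustrative choices, not taken from the paper.

```python
from collections import Counter
from itertools import combinations


def change_couplings(commits, min_count=2):
    """Count how often each pair of files changed in the same commit.

    commits: iterable of iterables of file names, one per commit (assumed input shape).
    Returns a dict mapping (file_a, file_b) to a co-change count, keeping only
    pairs that co-changed at least min_count times.
    """
    pair_counts = Counter()
    for files in commits:
        # Sort and deduplicate so that each pair has a single canonical key.
        for pair in combinations(sorted(set(files)), 2):
            pair_counts[pair] += 1
    return {pair: n for pair, n in pair_counts.items() if n >= min_count}
```

Pairs with high counts that live in different modules would be natural candidates to inspect for refactoring, in the spirit of the change smells the abstract mentions.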
Towards understanding the rhetoric of small changes
- In the International Workshop on Mining Repositories
, 2004
Cited by 13 (0 self)
Understanding the impact of software changes has been a challenge since software systems were first developed. With the increasing size and complexity of systems, this problem has become more difficult. There are many ways to identify change impact from the plethora of software artifacts produced during development and maintenance. We present the analysis of the software development process using change and defect history data. Specifically, we address the problem of small changes. The studies revealed that (1) there is less than 4 percent probability that a one-line change will introduce an error in the code; (2) nearly 10 percent of all changes made during the maintenance of the software under consideration were one-line changes; (3) the phenomenon of change differs for additions, deletions and modifications as well as for the number of lines affected.

