Lionel C. Briand, William M. Thomas, and Christopher J. Hetmanski. Modeling and managing risk early in software development. In Proceedings of the Fifteenth International Conference on Software Engineering, pages 55--65. IEEE, May 1993.
....and the maintainability measures. Based on the notion that a combination of independent variables might better explain high change effort than only a single variable, we planned to analyze multiple variables in combination using a machine learning technique called Optimized Set Reduction (OSR) [BTH93, BBH93]. OSR finds patterns in the independent (explanatory) variables which reliably predict values of a single dependent variable. The OSR approach is insensitive to the scale of the data, but requires a large data set, ideally several hundred points. We planned to apply the OSR technique to the full ....
....and implementation effort against other independent variables were similarly random, which discouraged us from computing univariate correlations. Results of multivariate analyses. Because we had data for several hundred changes in the acceptance test phase, we were able to apply the OSR technique [BTH93, BBH93]. Based on the results achieved when working with the maintenance data, we restricted the data set to the error corrections. All analyses took the approach of trying to identify whether the error corrections (changes) would be inexpensive or expensive, where inexpensive was defined as requiring ....
[Article contains additional citation context not shown here]
Lionel C. Briand, William M. Thomas, and Christopher J. Hetmanski. Modeling and managing risk early in software development. In Proceedings of the Fifteenth International Conference on Software Engineering, pages 55--65. IEEE, May 1993.
....and the maintainability measures. Based on the notion that a combination of independent variables might better explain high change effort than only a single variable, we planned to analyze multiple variables in combination using a machine learning technique called Optimized Set Reduction (OSR) [BTH93, BBH93]. OSR finds patterns in the independent (explanatory) variables which reliably predict values of a single dependent variable. The OSR approach is insensitive to the scale of the data, but requires a large data set, ideally several hundred points. We planned to apply the OSR technique to the full ....
....in acceptance test against other independent variables were similarly random, which discouraged us from computing univariate correlations. Results of multivariate analyses. Because we had data for several hundred changes in the acceptance test phase, we were able to apply the OSR technique [BTH93, BBH93]. Based on the results achieved when working with the maintenance data, we restricted the data set to the error corrections. All analyses took the approach of trying to identify whether the error corrections (changes) would be inexpensive or expensive, where inexpensive was defined as requiring ....
[Article contains additional citation context not shown here]
Lionel C. Briand, William M. Thomas, and Christopher J. Hetmanski. Modeling and managing risk early in software development. In Proceedings of the 15th International Conference on Software Engineering, pages 55--65. IEEE, May 1993.
....of a high risk component varies depending on the context of the study. For example, a high risk component is one that contains any faults found during testing [14][75], one that contains any faults found during operation [72], or one that is costly to correct after an error has been found [3][13][1]. The identification of high risk components allows an organization to take mitigating actions, such as focus defect detection activities on high risk components, for example optimally allocating testing resources [56] or redesign components that are likely to cause field failures or be ....
L. Briand, W. Thomas, and C. Hetmanski: "Modeling and Managing Risk Early in Software Development". In Proceedings of the International Conference on Software Engineering, pages 55-65, 1993.
....accuracy values. For example, see [1][4]. The consequence of this is that $p_h = 0.5$. It is unrealistic to assume that in R D exactly half of the components will be high risk, and therefore p h p . In other studies a sample is constructed to ensure equal numbers of high and low risk components [8]. In other studies the median split is performed on the training data set, and the same cutoff value is used on the hold out sample which can be a different system within the same organization. As is clear in [16], median splits on one system do not result in the same prevalence for another ....
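The median-split effect described in the excerpt above can be illustrated with a minimal sketch. The fault counts and the helper name `prevalence_above` are hypothetical and are not taken from any of the cited studies; the sketch only shows that a cutoff fixed at the training median yields a prevalence of one half on the training data but not necessarily on a hold out sample.

```python
from statistics import median


def prevalence_above(values, cutoff):
    """Return the fraction of components whose value is strictly above the cutoff."""
    return sum(v > cutoff for v in values) / len(values)


# Hypothetical fault counts per component (illustrative only).
training = [1, 2, 3, 4, 5, 6, 7, 8]
holdout = [1, 1, 2, 2, 3, 3, 9, 10]

# The median split is computed on the training system only.
cutoff = median(training)

# By construction, half of the training components are "high risk".
train_prev = prevalence_above(training, cutoff)

# Reusing the same cutoff on another system gives a different prevalence.
holdout_prev = prevalence_above(holdout, cutoff)

print(cutoff, train_prev, holdout_prev)
```

With these made-up numbers the training prevalence is one half, while the same cutoff labels only a quarter of the hold out components as high risk.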
L. Briand, W. Thomas, and C. Hetmanski: "Modeling and Managing Risk Early in Software Development". In Proceedings of the International Conference on Software Engineering, pages 55-65, 1993.
....of a high risk component varies depending on the context of the study. For example, a high risk component is one that contains any faults found during testing [14][74], one that contains any faults found during operation [71], or one that is costly to correct after an error has been found [3][13][1]. The identification of high risk components allows an organization to take mitigating actions, such as focus defect detection activities on high risk components, for example optimally allocating testing resources [55] or redesign components that are likely to cause field failures or be costly ....
L. Briand, W. Thomas, and C. Hetmanski: "Modeling and Managing Risk Early in Software Development". In Proceedings of the International Conference on Software Engineering, pages 55-65, 1993.
....The definition of a high risk component varies depending on the context. For example, a high risk component is one that contains any faults found during testing [11][82], one that contains any faults found during operation [75], or one that is costly to correct after a fault has been found [1][5][12]. Recent evidence suggests that most faults are found in only a few of a system's components [41][67][91][95]. If these few components can be identified early, then an organization can take mitigating actions, such as focus fault detection activities on high risk components, for example by ....
L. Briand, W. Thomas, and C. Hetmanski, "Modeling and Managing Risk Early in Software Development," in Proceedings of the International Conference on Software Engineering, pp. 55-65, 1993.
....2. Prior and related work. There is a small, but growing body of work in which ML is applied in the software quality prediction task. Selby and Porter [Porter90] have used ID3 to identify the attributes that are the best predictor of interface errors likely to be encountered during maintenance. [Briand93] built cost models for isolation and correction using an approach that combines statistical and machine learning classification methods [Briand92]. In this work, values of metrics were used for the first time to represent the properties of the code believed relevant to the task. The metrics were ....
Briand, L.; Thomas, W. M.; Hetmanski, C. J. Modeling and managing risk early in software development. In Proceedings of the IEEE 15th International Conference on Software Engineering, Baltimore, May 1993.
....of components that may be problem prone in the early operational phase. Several authors have published models that attempt to relate some early software metrics, such as the size of the code, Halstead length, or cyclomatic number, to the failure proneness of a program [Kho90, Mun92, Bri93]. A more process oriented approach is discussed in [Vou93, Lyu96]. Highly correlated nature of the early software verification and testing events may require the use of a more sophisticated, time series, approach [Sin92]. [Figure: Weibull Model vs. Empirical Data] ....
L.C. Briand, W.M. Thomas and C.J. Hetmanski, "Modeling and Managing Risk Early in Software Development," Proc. 15th ICSE, pp 55-65, 1993.
....selected. As given earlier, SME initially had only a single model parameter: implementation language (either FORTRAN or Ada). In 1993, a Euclidean distance clustering algorithm [9] was added as a second alternative model. Additional empirical models are planned. Optimized Set Reduction (OSR) [4] is one such approach to include. Alternative clustering approaches are being considered. 6 Current status and conclusions. The use of collected data on past projects as predictors of future project behavior is a growing phenomenon in software development. However, development environments vary ....
L. Briand, W. Thomas, and C. Hetmanski. Modeling and managing risk early in software development. In Proc. of the 15th Int'l Conf. on Software Engineering, May 1993.
....testing phases with the effects observed in the later phases, such as the early operational phase. Several authors have published models that attempt to relate some early software metrics, such as the size of the code, Halstead length, or cyclomatic number, to the failure proneness of the program [Mun92, Bri93]. This paper is concerned with some issues related to the use of software reliability engineering (SRE) indicators available during early software testing of a large multi-component software system to: i) Quantify component quality expressed, for example, as the number of failures (or problem ....
....which is expressed in terms of failures per test case, or the number of unique failures per unique test case. 3. Risk Modeling. This section addresses the use of early testing information in identification of components that may be problem prone in the field. Several approaches have been proposed [Kho90, Mun92, Bri93]. Highly correlated nature of the early software verification and testing events may require the use of a more sophisticated, time series, approach [Sin92]. We illustrate some of the issues through a risk model [Ehr85, Boe89]. 3.1 Process States. At the end of a non operational testing phase an ....
L.C. Briand, W.M. Thomas and C.J. Hetmanski, "Modeling and Managing Risk Early in Software Development," Proc. 15th ICSE, pp 55-65, 1993.
....Based on the notion that a combination of independent variables might better explain high change effort than only a single variable, we planned to analyze multiple variables in combination using a machine learning technique called Optimized Set Reduction (OSR) [BTH93, BBH93]. OSR finds patterns in the independent (explanatory) variables which reliably predict values of a single dependent variable. The OSR approach is insensitive to the scale of the data, but requires a large data set, ideally several hundred points. We planned to apply the OSR technique to the full ....
....random, which discouraged us from computing univariate correlations. [Appeared in Proc. 19th Software Engineering Workshop, December 1994.] Results of multivariate analyses. Because we had data for several hundred changes in the acceptance test phase, we were able to apply the OSR technique [BTH93, BBH93]. Based on the results achieved when working with the maintenance data, we restricted the data set to the error corrections. All analyses took the approach of trying to identify whether the error corrections (changes) would be inexpensive or expensive, where inexpensive was defined as requiring ....
[Article contains additional citation context not shown here]
Lionel C. Briand, William M. Thomas, and Christopher J. Hetmanski. Modeling and managing risk early in software development. In Proceedings of the Fifteenth International Conference on Software Engineering, pages 55--65. IEEE, May 1993.
....classification examples. This paper compares the following modeling techniques covering all the three classification paradigms: Principal component analysis, which has been often used in the software engineering field to improve the accuracy of discriminant models [22] or regression models [5][6][17]; Discriminant analysis, which has been previously applied to detect fault prone programs [22]; Logistic regression, which has been included in empirical comparisons between models identifying high risk components [5][6]; Logical classification models, which have been extensively ....
.... the accuracy of discriminant models [22] or regression models [5][6][17]; Discriminant analysis, which has been previously applied to detect fault prone programs [22]; Logistic regression, which has been included in empirical comparisons between models identifying high risk components [5][6]; Logical classification models, which have been extensively used in software engineering issues, such as the identification of high risk modules [5][6][23][24][30] or the detection of reusable software components [9]; Layered neural networks, which have been already applied in ....
[Article contains additional citation context not shown here]
L. C. Briand, W. M. Thomas, and C. J. Hetmanski, "Modeling and managing risk early in software development", in Proceedings of the 15th International Conference on Software Engineering, Baltimore, Maryland, May 1993, pp.55-65.
....kinds of modeling techniques. Multiple linear regression analysis has been used to predict the number of corrective changes [13, 14]. Discriminant analysis has been applied to detect fault prone modules [16, 19]. Logistic regression has been used for modeling to identify high risk components [3, 4]. Principal component analysis has often been used to improve the accuracy of discriminant models [15, 19] or regression models [3, 4, 14]. Logical classification models have been used extensively to identify high risk modules [3, 4, 20, 21, 27] and reusable software components [8]. Layered neural ....
.... 14]. Discriminant analysis has been applied to detect fault prone modules [16, 19]. Logistic regression has been used for modeling to identify high risk components [3, 4]. Principal component analysis has often been used to improve the accuracy of discriminant models [15, 19] or regression models [3, 4, 14]. Logical classification models have been used extensively to identify high risk modules [3, 4, 20, 21, 27] and reusable software components [8]. Layered neural networks have already been applied to building reliability growth models [11, 12], to predicting the gross change [16] and the degree of ....
[Article contains additional citation context not shown here]
Briand, L. C., Thomas, W. M., and Hetmanski, C. J., Modeling and managing risk early in software development, in Proc. 15th Int. Conf. Software Eng., 55-65, 1993.
....quality achieved after the components classified as high risk have undergone a verification activity. We suppose that the verification will be exhaustive enough to find the faults of all the components which are actually high risk. We measure this criterion using the completeness measure [BTH93], which is the percentage of faulty components that have actually been classified as such by the model. Quality is achieved by increasing the cost of verification due to an extra effort in inspection and testing for the components which have been flagged as high risk. We measure the verification ....
....the simplest to the most complex, is worthwhile only if there is a local process to select metrics which are valid as predictors. Principal component analysis does not always produce a better input for predictive models. The domain metrics have often been used in the software engineering field [BBH93, BTH93, MK92, KLM93] to reduce the dimensions of a metric space when the metrics have a strong relationship between them, and obtain a smaller number of orthogonal domain metrics to be used as input to regression and discriminant analysis models. In our study, we built two classification models for both discriminant ....
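The completeness measure quoted in the first excerpt above (the percentage of faulty components that the model actually classified as such) can be sketched as follows. This is only an illustration of that one-sentence definition; the function name `completeness` and the component labels are hypothetical and do not come from the cited paper.

```python
def completeness(actual_faulty, predicted_high_risk):
    """Percentage of actually faulty components that the model flagged as high risk."""
    faulty = set(actual_faulty)
    if not faulty:
        raise ValueError("completeness is undefined when there are no faulty components")
    caught = faulty & set(predicted_high_risk)
    return 100.0 * len(caught) / len(faulty)


# Hypothetical component identifiers (illustrative only).
faulty_components = {"A", "B", "C", "D"}
flagged_by_model = {"A", "B", "E"}

# Two of the four faulty components were flagged by the model.
score = completeness(faulty_components, flagged_by_model)
print(score)
```

Note that component "E" is flagged but not faulty; it does not affect completeness, which is why the excerpt pairs this measure with a separate verification cost.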
L. C. Briand, W. M. Thomas, and C. J. Hetmanski, "Modeling and managing risk early in software development", in Proceedings of the 15th International Conference on Software Engineering, Baltimore, Maryland, May 1993, pp.55-65.
....our research on a smaller set of strategies and concepts. A number of studies have been published on software design measures in recent years. It has been shown that system architecture has an impact on maintainability and fault proneness [26][24][38][30][39][16][40][41][43][1][17][2][44]. These studies have attempted to capture the design attributes affecting the ease of maintaining and debugging a software system. Most of the design measures are based on information flow between subroutines or declaration counts. We think that, even though they provide interesting ....
L.C. Briand, W. Thomas, and C. Hetmanski, "Modeling and Managing Risk Early in Software Development," Int'l Conf. Software Eng., Maryland, May 1993.
....the system and its constituent parts. In this paper, we will focus on high level design metrics for software systems. A number of studies have been published on software design metrics in recent years. It has been shown that system architecture has an impact on maintainability and error proneness [HK84, G86, R87, R90, S90, SB91, Z91, AE92, BTH93, BBH93]. These studies have attempted to capture the design characteristics affecting the ease of maintaining and debugging a software system. Most of the design metrics are based on information flow between subroutines or declaration counts. We think that, even though it provides an interesting insight ....
L. Briand, W. Thomas and C. Hetmanski, "Modeling and Managing Risk Early in Software Development," International Conference on Software Engineering, Maryland, May 1993.
....(i.e. dependent variable) such as project effort. $X_1, \ldots, X_n$ could be the set of project characteristics (e.g. team experience, product size) driving effort. Another type of prediction model would be: $p(e) = f(X_1, \ldots, X_n)$ (2), where $p(e)$ is the probability of occurrence of event $e$ [BTH93], e.g. fault detection in a software component. $X_1, \ldots, X_n$ could be the set of component internal attributes (e.g. complexity, coupling) used to explain the occurrence of faults. Many techniques may be used to build such prediction models such as: Regression analysis [BTH93, BMB94b] ....
....of event $e$ [BTH93], e.g. fault detection in a software component. $X_1, \ldots, X_n$ could be the set of component internal attributes (e.g. complexity, coupling) used to explain the occurrence of faults. Many techniques may be used to build such prediction models such as: Regression analysis [BTH93, BMB94b]; Inductive algorithms, e.g. classification trees, Optimized Set Reduction [BBH93]; Neural networks [KPM92]. Depending on the type of data to be used, the intended use of the model and the profile of the future users of the model, different techniques should be used [Bri93]. Different ....
L. Briand, W. Thomas, and C. Hetmanski, "Modeling and Managing Risk Early in Software Development", in Proceedings of the 15th International Conference on Software Engineering, pages 55-65, Maryland, May 1993.
....and concepts. [Footnote 1: Object based systems differ from object oriented systems in that inheritance is not allowed.] A number of studies have been published on software design measures in recent years. It has been shown that system architecture has an impact on maintainability and fault proneness [HK84, G86, R87, IS88, R90, BRBD90, S90, SB91, Z91, AE92, BTH93, BBH93, ZEWH95]. These studies have attempted to capture the design attributes affecting the ease of maintaining and debugging a software system. Most of the design measures are based on information flow between subroutines or declaration counts. We think that, even though they provide interesting insights into ....
L. Briand, W. Thomas and C. Hetmanski, "Modeling and Managing Risk Early in Software Development," International Conference on Software Engineering, Maryland, May 1993.
No context found.
L.C. Briand, W.M. Thomas and C.J. Hetmanski, "Modeling and Managing Risk Early in Software Development," Proc. 15th ICSE, pp 55-65, 1993.