An Approach to Improving Parametric Estimation Models in the Case of Violation of Assumptions Based upon Risk Analysis

Thumbnail Image


TR 139.pdf (1.2 MB)
No. of downloads: 411

Publication or External Link







In this work, we show the mathematical reasons why parametric models fall short of providing correct estimates and define an approach that overcomes the causes of these shortfalls. The approach aims at improving parametric estimation models when any regression model assumption is violated for the data being analyzed. Violations can be that, the errors are x-correlated, the model is not linear, the sample is heteroscedastic, or the error probability distribution is not Gaussian. If data violates the regression assumptions and we do not deal with the consequences of these violations, we cannot improve the model and estimates will be incorrect forever. The novelty of this work is that we define and use a feed-forward multi-layer neural network for discrimination problems to calculate prediction intervals (i.e. evaluate uncertainty), make estimates, and detect improvement needs. The primary difference from traditional methodologies is that the proposed approach can deal with scope error, model error, and assumption error at the same time. The approach can be applied for prediction, inference, and model improvement over any situation and context without making specific assumptions. An important benefit of the approach is that, it can be completely automated as a stand-alone estimation methodology or used for supporting experts and organizations together with other estimation techniques (e.g., human judgment, parametric models). Unlike other methodologies, the proposed approach focuses on the model improvement by integrating the estimation activity into a wider process that we call the Estimation Improvement Process as an instantiation of the Quality Improvement Paradigm. This approach aids mature organizations in learning from their experience and improving their processes over time with respect to managing their estimation activities. To provide an exposition of the approach, we use an old NASA COCOMO data set to (1) build an evolvable neural network model and (2) show how a parametric model, e.g, a regression model, can be improved and evolved with the new project data.


Index Terms- Multi-layer feed-forward neural networks, non-linear regression, curvilinear component analysis, Bayesian learning, prediction intervals for neural networks, risk analysis and management, learning organizations, software cost prediction, integrated software engineering environment, quality improvement paradigm, estimation improvement paradigm, bayesian discrimination function, TAME system