Latent Failures and Mixed Distributions: Using Mixed Distributions and Cost Modeling to Optimize the Management of Systems with Weak Latent Defect Subpopulations

Touw, Anduin E
Sandborn, Peter
Under most reliability model assumptions, all failures in a population are considered to come from the same distribution. Each individual failure time is assumed to provide information about the likely failure times of all other devices in the population. However, from time to time, process variation or an unexpected event will lead to the development of a weak subpopulation while the other devices remain durable. This dissertation explores estimation techniques for that situation. Ideally, when such situations arise, the weak subpopulation could be identified by determining the root cause and sequestering the affected devices. Often, however, for practical reasons the overall population remains a mixture of the weak and strong subpopulations, and there may be no non-destructive way to identify the weak devices. If the defect is not inspectable, statistical estimation methods must be used, either with or without root cause information, to quantify the reliability risk to the population and to develop appropriate screening. The accuracy of these estimates may be critical to the management of the product, but estimation in these circumstances is difficult. The mixed Weibull distribution is a common form for modeling latent failures. However, estimating the mixed Weibull parameters is not straightforward. The number of parameters involved, and frequently the sparseness of the data, can lead to estimation biases and instabilities that produce misleading results. Bayesian techniques can stabilize these estimates through the priors, but there is no closed-form conjugate family for the Weibull distribution. Using Monte Carlo simulation, this dissertation examines bias and random error for three estimation techniques: standard maximum likelihood estimation, the Trunsored method, and Bayesian estimation.
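As an illustration of the mixed Weibull form mentioned above, the following minimal sketch evaluates the cumulative failure probability of a two-subpopulation mixture. It is not taken from the dissertation: the function names and every parameter value (weak fraction, shapes, scales) are hypothetical and were chosen only to show how a small weak subpopulation can dominate early failures.

```python
import math


def weibull_cdf(t, shape, scale):
    """Two-parameter Weibull CDF: F(t) = 1 - exp(-(t / scale) ** shape)."""
    if t <= 0:
        return 0.0
    return 1.0 - math.exp(-((t / scale) ** shape))


def mixed_weibull_cdf(t, p_weak, weak, strong):
    """CDF of a two-component mixture.

    p_weak is the fraction of weak devices; weak and strong are
    (shape, scale) tuples for the two subpopulations.
    """
    return (p_weak * weibull_cdf(t, *weak)
            + (1.0 - p_weak) * weibull_cdf(t, *strong))


# Hypothetical values, for illustration only.
P_WEAK = 0.05              # 5% of devices carry the latent defect
WEAK = (0.8, 100.0)        # (shape, scale in hours): early failures
STRONG = (3.0, 50000.0)    # (shape, scale in hours): wear-out

for hours in (10, 100, 1000, 10000):
    fraction_failed = mixed_weibull_cdf(hours, P_WEAK, WEAK, STRONG)
    print(f"{hours:>6} h  cumulative fraction failed = {fraction_failed:.5f}")
```

Even this simple mixture has five parameters (one mixing fraction plus a shape and a scale per subpopulation), which gives a sense of why sparse failure data can make the estimates unstable.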
To determine how errors in the estimation methods affect decisions about screening, a cost model was developed by generalizing existing screening cost models to include the impact of schedule slippage cost and capacity. The cost model was used to determine the optimal screen length based on total life-cycle cost. The estimated optimal screen length for each method was compared to the true optimal screen length, and recommendations were developed about when each estimation method is appropriate.
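The idea of choosing a screen length by minimizing total cost can be sketched with a deliberately simplified model. The sketch below is not the dissertation's cost model: it omits schedule slippage and capacity entirely, and the cost terms, the function names, and all numeric values are assumptions made for illustration. It charges a per-hour screening cost, a scrap cost for units that fail during the screen, and a field failure cost for units that fail during a fixed mission life after the screen.

```python
import math


def weibull_cdf(t, shape, scale):
    """Two-parameter Weibull CDF."""
    if t <= 0:
        return 0.0
    return 1.0 - math.exp(-((t / scale) ** shape))


def mixture_cdf(t, p_weak, weak, strong):
    """Two-subpopulation mixed Weibull CDF; weak/strong are (shape, scale)."""
    return (p_weak * weibull_cdf(t, *weak)
            + (1.0 - p_weak) * weibull_cdf(t, *strong))


def cost_per_unit(screen_hours, p_weak, weak, strong,
                  cost_per_hour, scrap_cost, field_cost, mission_hours):
    """Simplified expected cost per unit entering the screen (hypothetical)."""
    failed_in_screen = mixture_cdf(screen_hours, p_weak, weak, strong)
    failed_by_end = mixture_cdf(screen_hours + mission_hours,
                                p_weak, weak, strong)
    failed_in_field = failed_by_end - failed_in_screen
    return (cost_per_hour * screen_hours
            + scrap_cost * failed_in_screen
            + field_cost * failed_in_field)


# Hypothetical inputs, for illustration only.
PARAMS = dict(p_weak=0.05, weak=(0.8, 100.0), strong=(3.0, 50000.0),
              cost_per_hour=0.05, scrap_cost=10.0, field_cost=1000.0,
              mission_hours=10000.0)

# Coarse grid search over candidate screen lengths (hours).
GRID = range(0, 2001, 10)
best_hours = min(GRID, key=lambda h: cost_per_unit(h, **PARAMS))

print("best screen length on the grid (h):", best_hours)
print("cost with no screen:", round(cost_per_unit(0, **PARAMS), 2))
print("cost at best length:", round(cost_per_unit(best_hours, **PARAMS), 2))
```

Running the same search once with the true distribution parameters and once with estimated ones gives two screen lengths that can be compared, which is the kind of comparison the abstract describes.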