TREATMENT OF INFLUENTIAL OBSERVATIONS IN THE CURRENT EMPLOYMENT STATISTICS SURVEY
In many establishment surveys, the sample contains a fraction of observations that may seriously affect the survey estimates. Such influential observations can appear in the sample because of imperfections in the survey design, which cannot fully account for the dynamic and heterogeneous nature of the population of businesses. An observation may become influential because of a relatively large survey weight, an extreme value, or a combination of the two. We propose a Winsorized estimator with a choice of cutoff points that guarantees that the resulting mean squared error is lower than the variance of the original survey-weighted estimator. This estimator is based on very unrestrictive modeling assumptions and can be safely used when the sample is sufficiently large. We consider a different approach when the sample is small. Estimation from small samples generally relies on strict model assumptions, so robustness here is understood as insensitivity of an estimator to model misspecification or to the appearance of outliers. The proposed approach is a slight modification of the classical linear mixed model as applied to small area estimation: the underlying distribution of the random error term is a scale mixture of two normal distributions. This setup can describe outliers in individual observations. It is also suitable for the more general situation in which units from two distinct populations are pooled for estimation. The mixture group indicator is not observed, so the probabilities that an observation comes from the group with the smaller or the larger variance are estimated from the data. These conditional probabilities can serve as the basis for a formal test of outlyingness at the area level. Simulations are carried out to compare several alternative estimators under different scenarios. The performance of the bootstrap method for constructing prediction confidence intervals is also investigated using simulations.
We also compare the proposed method with alternative existing methods in a study using data from the Current Employment Statistics Survey conducted by the U.S. Bureau of Labor Statistics.
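To make the idea of Winsorization concrete, the following is a minimal sketch of a one-sided Winsorized survey-weighted total. It is not the estimator proposed here: the cutoff is simply supplied by the caller, whereas the abstract describes a specific choice of cutoff points that is not reproduced in this sketch. The function names, the sample values, and the cutoff of 50 are all hypothetical.

```python
def weighted_total(values, weights):
    """Ordinary survey-weighted total: sum of weight * value."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * y for y, w in zip(values, weights))


def winsorized_total(values, weights, cutoff):
    """Survey-weighted total after capping each value at `cutoff`.

    Assumption: one-sided Winsorization with a fixed, user-supplied
    cutoff. The rule for choosing the cutoff is not part of this sketch.
    """
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * min(y, cutoff) for y, w in zip(values, weights))


# Hypothetical sample: three typical units and one extreme value.
values = [10.0, 12.0, 9.0, 500.0]
weights = [20.0, 20.0, 20.0, 20.0]

print(weighted_total(values, weights))          # dominated by the extreme unit
print(winsorized_total(values, weights, 50.0))  # extreme unit capped at 50
```

Capping the extreme value trades a small bias for a potentially large reduction in variance, which is why the choice of cutoff matters for the mean squared error.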
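The role of the conditional probabilities in a scale mixture of two normal distributions can be illustrated with a short sketch. It assumes a zero-mean error term, known mixture parameters, and a single residual, and applies Bayes' rule to obtain the probability that the residual came from the larger-variance component. In the approach described above these quantities are estimated from the data within a linear mixed model, and the test is carried out at the area level; none of that is reproduced here. All names and numerical values are hypothetical.

```python
import math


def normal_pdf(x, sd):
    """Density of a zero-mean normal distribution with standard deviation sd."""
    return math.exp(-0.5 * (x / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def prob_large_variance(residual, p_large, sd_small, sd_large):
    """Conditional probability that `residual` belongs to the large-variance group.

    Assumption: the error is drawn from N(0, sd_large^2) with prior
    probability p_large and from N(0, sd_small^2) otherwise.
    """
    num = p_large * normal_pdf(residual, sd_large)
    den = num + (1.0 - p_large) * normal_pdf(residual, sd_small)
    return num / den


# Hypothetical parameters: 10% of errors come from the wide component.
for r in (0.0, 2.0, 10.0):
    print(r, prob_large_variance(r, p_large=0.1, sd_small=1.0, sd_large=5.0))
```

A residual near zero is assigned a low probability of belonging to the large-variance group, while a residual far in the tail is assigned a probability close to one, which is what makes these probabilities usable as an outlier diagnostic.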