Human Development & Quantitative Methodology
Permanent URI for this community: http://hdl.handle.net/1903/2248
The departments within the College of Education were reorganized and renamed as of July 1, 2011. This department incorporates the former departments of Measurement, Statistics & Evaluation; Human Development; and the Institute for Child Study.
Search Results (14 results)
Item Accuracy and consistency in discovering dimensionality by correlation constraint analysis and common factor analysis (2009) Tractenberg, Rochelle Elaine; Hancock, Gregory R.; Measurement, Statistics and Evaluation; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)

An important application of multivariate analysis is the estimation of the underlying dimensions of an instrument or set of variables. Estimation of dimensions is often pursued with the objective of finding the single factor or dimension to which each observed variable belongs or by which it is most strongly influenced. This can involve estimating the loadings of observed variables on a pre-specified number of factors, achieved by common factor analysis (CFA) of the covariance or correlational structure of the observed variables. Another method, correlation constraint analysis (CCA), operates on the determinants of all 2x2 submatrices of the covariance matrix of the variables. CCA software also determines whether partialling out the effects of any observed variable affects the observed correlations, making CCA the only exploratory method that can specifically rule out (or identify) observed variables as the cause of correlations among other observed variables. CFA estimates the strengths of associations between factors, hypothesized to underlie or cause observed correlations, and the observed variables; CCA does not estimate factor loadings but can uncover mathematical evidence of the causal relationships hypothesized between factors and observed variables. These are philosophically and analytically diverse methods for estimating the dimensionality of a set of variables, and each can be useful in understanding the simple structure in multivariate data. This dissertation studied the performance of these methods in uncovering the dimensionality of simulated data under conditions of varying sample size and model complexity, the presence of a weak factor, and correlated vs.
independent factors. CCA was sensitive to these conditions (i.e., performed significantly worse when they were present), omitting more factors and omitting and mis-assigning more indicators. CFA was also found to be sensitive to all but one condition (whether factors were correlated or not) in terms of omitting factors; it was sensitive to all conditions in terms of omitting and mis-assigning indicators, and it also found extra factors depending on the number of factors in the population, the purity of factors, and the presence of a weak factor. This is the first study of CCA in data with these specific features of complexity, which are common in multivariate data.

Item AN INVESTIGATION OF GROWTH MIXTURE MODELS WHEN DATA ARE COLLECTED WITH UNEQUAL SELECTION PROBABILITIES: A MONTE CARLO STUDY (2009) Hamilton, Jennifer; Hancock, Gregory R.; Measurement, Statistics and Evaluation; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)

As researchers begin to use Growth Mixture Models (GMM) with data from nationally representative samples, it becomes increasingly critical for researchers to understand the difficulties associated with modeling data that come from complex sample designs. If researchers naively apply GMM to nationally representative data sets without adjusting for the way in which the sample was selected, the resulting parameter estimates, standard errors, and tests of significance may not be trustworthy. Therefore, the objective of the current study was to quantify the accuracy of parameter estimates and class assignment when subjects are sampled with unequal probabilities of selection. To this end, a series of Monte Carlo simulations empirically investigated the ability of GMM to recover known growth parameters of distinct populations when various adjustments are applied to the statistical model.
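The core of such a design, drawing a sample from a mixed growth population with unequal selection probabilities and forming inverse-probability weights, can be sketched as follows (all population values, class proportions, and selection probabilities are illustrative, not the study's actual conditions):

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical population with two latent growth classes of equal size.
N = 20_000
klass = rng.integers(0, 2, N)                     # latent class: 0 or 1

# Linear growth trajectories that differ by class (values are illustrative).
t = np.arange(4)                                  # four measurement occasions
intercept = np.where(klass == 1, 20.0, 10.0) + rng.normal(0, 1.0, N)
slope = np.where(klass == 1, 3.0, 1.0) + rng.normal(0, 0.3, N)
y = intercept[:, None] + slope[:, None] * t + rng.normal(0, 1.0, (N, t.size))

# Disproportionate selection: class-1 members are heavily oversampled.
p_select = np.where(klass == 1, 0.20, 0.05)
in_sample = rng.random(N) < p_select
w = 1.0 / p_select[in_sample]                     # inverse-probability weights

naive = klass[in_sample].mean()                        # ignores the design
weighted = np.sum(w * klass[in_sample]) / np.sum(w)    # design-weighted

print(f"naive class-1 share:    {naive:.3f}")     # biased far above 0.5
print(f"weighted class-1 share: {weighted:.3f}")  # close to the true 0.5
```

The naive estimate reflects the sample, not the population; the weighted estimate recovers the population class proportion, which is the intuition behind the weighting adjustment compared in the study.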
Specifically, the current research compared the performance of GMM that 1) ignores the sample design; 2) accounts for the sample design via weighting; 3) accounts for the sample design via explicitly modeling the stratification variable; and 4) accounts for the sample design by using weights and modeling the stratification variable. Results suggested that a model-based approach does not improve the accuracy of parameter estimates when individuals are sampled with disproportionate sampling probabilities. Not only did this method often fail to converge; when it did converge, the parameter estimates exhibited an unacceptable amount of bias. The weighted model performed best of all the models tested, but still resulted in parameter estimates with unacceptably high percentages of bias. It is possible that the distributions of the manifest variables overlap too much, and the aggregate distribution may be unimodal, making it potentially difficult to distinguish among the latent classes and thus affecting the accuracy of parameter estimates. In sum, the current research indicates that GMM should not be used when data are sampled with disproportionate probabilities. Researchers should therefore attend to the study design and data collection strategies when considering the use of a Growth Mixture Model in the analysis phase.

Item An Integrated Item Response Model for Evaluating Individual Students' Growth in Educational Achievement (2009) Koran, Jennifer; Hancock, Gregory R.; Measurement, Statistics and Evaluation; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)

Measuring continuous change or growth in individual students' academic abilities over time currently requires several statistical models or transformations to move from data representing a student's correct or incorrect responses on individual test items to inferences about the form and quantity of changes in the student's underlying ability.
This study proposed and investigated a single integrated model of underlying growth within an Item Response Theory framework as a potential alternative to this approach. A Monte Carlo investigation explored parameter recovery for marginal maximum likelihood estimates via the Expectation-Maximization algorithm under variations of several conditions, including the form of the underlying growth trajectory, the amount of inter-individual variation in the rate(s) of growth, the sample size, the number of items at each time point, and the selection of items administered across time points. A real data illustration with mathematics assessment data from the Early Childhood Longitudinal Study showed the practical use of this integrated model for measuring gains in academic achievement. Overall, this exploration of an integrated model approach contributed to a better understanding of the appropriate use of growth models to draw valid inferences about students' academic growth over time.

Item Finite Mixture Model Specifications Accommodating Treatment Nonresponse in Experimental Research (2009) Wasko, John A.; Hancock, Gregory R.; Measurement, Statistics and Evaluation; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)

For researchers exploring causal inference with simple two-group experimental designs, common statistical methods yield confounded results and are unsuitable in cases of treatment nonresponse. In signal processing, researchers have successfully extracted multiple signals from data streams with Gaussian mixture models, and their use is well suited to researchers in this predicament. While the mathematics underpinning the models in either application remains unchanged, there are stark differences. In signal processing, results are definitively evaluated by assessing whether extracted signals are interpretable.
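The Gaussian-mixture extraction idea can be illustrated with a minimal EM sketch for a treatment group containing responders and nonresponders (component locations, proportions, and the 30/70 split are hypothetical, not the dissertation's settings):

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical treatment-group outcome: 30% nonresponders centered at 0,
# 70% responders shifted to 2 (all values illustrative).
x = np.concatenate([rng.normal(0.0, 1.0, 300),
                    rng.normal(2.0, 1.0, 700)])

def em_two_gauss(x, n_iter=200):
    """Minimal EM for a univariate two-component Gaussian mixture (a sketch,
    not production code: fixed iteration count, no convergence check)."""
    mu = np.array([x.min(), x.max()])      # crude, well-separated starts
    sd = np.array([x.std(), x.std()])
    mix = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each point
        dens = mix * np.exp(-0.5 * ((x[:, None] - mu) / sd) ** 2) \
               / (sd * np.sqrt(2 * np.pi))
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: update mixing proportions, means, and standard deviations
        nk = resp.sum(axis=0)
        mix = nk / x.size
        mu = (resp * x[:, None]).sum(axis=0) / nk
        sd = np.sqrt((resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk)
    return mix, mu, sd

mix, mu, sd = em_two_gauss(x)
print("mixing proportions:", np.round(mix, 2))
print("component means:   ", np.round(mu, 2))
```

Even with substantial overlap between components, the mixture recovers the two response subpopulations that a single-mean analysis would blur together.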
Such obvious feedback is unavailable to researchers seeking causal inference, who instead rely on empirical evidence from inferential statements regarding mean differences, as in analysis of variance (ANOVA). Two-group experimental designs do provide an added benefit by anchoring treatment nonrespondents' distributional response properties to those of the control group. Obtaining empirical evidence supporting treatment nonresponse, however, can be extremely challenging. First, if nonresponse indeed exists, then basic population means, ANOVA, or repeated measures tests cannot be used because of a violation of the identical distribution property required for each method. Second, the mixing parameter, or proportion of nonresponse, is bounded between 0 and 1, and so does not follow the normal distribution theory that would enable inference by common methods. This dissertation introduces and evaluates the performance of an information-based methodology as a more extensible and informative alternative to statistical tests of population means while addressing treatment nonresponse. Gaussian distributions are not required under this methodology, which simultaneously provides empirical evidence through model selection regarding treatment nonresponse, equality of population means, and equality of variance hypotheses. The use of information criteria as an omnibus assessment of a set of mixture and non-mixture models within a maximum likelihood framework eliminates the need for a Neyman-Pearson framework of probabilistic inferences on individual parameter estimates. This dissertation assesses performance in recapturing population conditions with respect to hypothesis conclusions, parameter accuracy, and class membership.
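The information-criteria style of omnibus model comparison can be sketched in its simplest non-mixture form: competing Gaussian models for a two-group experiment, each scored by AIC and BIC at its maximum-likelihood estimates (group sizes, effect size, and the candidate set are illustrative, not the dissertation's models):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical two-group experiment with a genuine treatment effect.
ctrl = rng.normal(0.0, 1.0, 100)
trt = rng.normal(0.8, 1.0, 100)
both = np.concatenate([ctrl, trt])
n = both.size

def gauss_ll(x, mu, var):
    """Gaussian log-likelihood of sample x at mean mu and variance var."""
    return -0.5 * x.size * np.log(2 * np.pi * var) - np.sum((x - mu) ** 2) / (2 * var)

# Pooled MLE of the common variance under the separate-means model.
pooled = (np.sum((ctrl - ctrl.mean()) ** 2) + np.sum((trt - trt.mean()) ** 2)) / n

# Each candidate model evaluated at its MLEs: (log-likelihood, free parameters).
candidates = {
    "equal means, equal variances": (gauss_ll(both, both.mean(), both.var()), 2),
    "separate means, equal variance": (
        gauss_ll(ctrl, ctrl.mean(), pooled) + gauss_ll(trt, trt.mean(), pooled), 3),
    "separate means, separate variances": (
        gauss_ll(ctrl, ctrl.mean(), ctrl.var()) + gauss_ll(trt, trt.mean(), trt.var()), 4),
}

aic = {name: 2 * k - 2 * ll for name, (ll, k) in candidates.items()}
bic = {name: k * np.log(n) - 2 * ll for name, (ll, k) in candidates.items()}

for name in candidates:
    print(f"{name:36s} AIC={aic[name]:8.2f}  BIC={bic[name]:8.2f}")
print("AIC choice:", min(aic, key=aic.get))
```

Ranking the whole candidate set at once answers the mean-equality and variance-equality questions jointly, without a per-parameter significance test; mixture candidates would simply be additional rows in the comparison.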
More complex extensions addressing multiple treatments, multiple responses within a treatment, a priori consideration of covariates, and multivariate responses within a latent framework are also introduced.

Item Testing for Differentially Functioning Indicators Using Mixtures of Confirmatory Factor Analysis Models (2009) Mann, Heather Marie; Hancock, Gregory R.; Measurement, Statistics and Evaluation; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)

Heterogeneity in measurement model parameters across known groups can be modeled and tested using multigroup confirmatory factor analysis (CFA). When it is not reasonable to assume that parameters are homogeneous for all observations in a manifest group, mixture CFA models are appropriate. Mixture CFA models can add theoretically important unmeasured characteristics to capture heterogeneity and have the potential to be used to test measurement invariance. The current study investigated the ability of mixture CFA models to identify differences in factor loadings across latent classes when there is no mean separation in either the latent or the measured variables. Using simulated data from models with known parameters, parameter recovery, classification accuracy, and the power of the likelihood-ratio test were evaluated as impacted by model complexity, sample size, latent class proportions, magnitude of factor loading differences, percentage of noninvariant factor loadings, and pattern of noninvariant factor loadings.
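The kind of population studied here, two latent classes with identical means but partially noninvariant loadings, is easy to generate; the sketch below (loadings, class sizes, and the one-factor structure are illustrative, not the study's design) shows why such classes are separable only through covariance structure:

```python
import numpy as np

rng = np.random.default_rng(3)

def one_factor_sample(n, loadings, rng):
    """Draw from a one-factor model x = loadings * eta + eps, all means zero
    (unit factor variance and unit unique variances; illustrative values)."""
    eta = rng.normal(0.0, 1.0, n)
    eps = rng.normal(0.0, 1.0, (n, loadings.size))
    return eta[:, None] * loadings + eps

lam1 = np.full(6, 0.8)                              # class 1: invariant loadings
lam2 = np.array([0.8, 0.8, 0.8, 0.4, 0.4, 0.4])     # class 2: 50% noninvariant

x1 = one_factor_sample(5000, lam1, rng)
x2 = one_factor_sample(5000, lam2, rng)

# No mean separation: every indicator is centered at zero in both classes ...
print("class means:", np.round(x1.mean(0), 2), np.round(x2.mean(0), 2))
# ... so the classes differ only in covariance structure, e.g. cov(x1, x6):
print("cov(1,6) by class:", round(np.cov(x1.T)[0, 5], 2), round(np.cov(x2.T)[0, 5], 2))
```

With equal means everywhere, a mixture CFA must recover class membership from differences like these implied covariances (0.8 x 0.8 = 0.64 vs. 0.8 x 0.4 = 0.32 in this sketch), which is exactly why larger samples and larger loading differences help.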
Results suggested that mixture CFA models may be a viable option for testing the invariance of measurement model parameters but that, without impact or differences in measurement intercepts, larger sample sizes, more noninvariant factor loadings, and larger amounts of heterogeneity are needed to distinguish different latent classes and successfully estimate their parameters.

Item Effects of Model Selection on the Coverage Probability of Confidence Intervals in Binary-Response Logistic Regression (2008-07-24) Zhang, Dongquan; Dayton, C. Mitchell; Measurement, Statistics and Evaluation; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)

While model selection is viewed as a fundamental task in data analysis, it has considerable effects on subsequent inference. In applied statistics, it is common to carry out a data-driven approach to model selection and draw inference conditional on the selected model, as if that model were given a priori. Parameter estimates following this procedure, however, generally do not reflect uncertainty about the model structure. As far as confidence intervals are concerned, it is often misleading to report estimates based upon the conventional 1−α level without considering possible post-model-selection impact. This paper addresses the coverage probability of confidence intervals for logit coefficients in binary-response logistic regression. We conduct simulation studies to examine the performance of the automatic model selectors AIC and BIC, and their subsequent effects on the actual coverage probability of interval estimates. Important considerations (e.g., model structure, covariate correlation) that may have key influence are investigated. This study contributes a quantitative understanding of how post-model-selection confidence intervals perform, in terms of coverage, in binary-response logistic regression models.
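The post-model-selection coverage problem can be reproduced with a small Monte Carlo sketch (the data-generating values, AIC-only selection, and Wald intervals are illustrative choices, not the paper's design):

```python
import numpy as np

rng = np.random.default_rng(8)

def fit_logit(X, y, n_iter=30):
    """Newton-Raphson maximum likelihood for logistic regression; returns
    coefficient estimates, Wald standard errors, and the log-likelihood."""
    b = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ b))
        H = X.T @ (X * (p * (1 - p))[:, None])     # information matrix
        b = b + np.linalg.solve(H, X.T @ (y - p))
    p = 1.0 / (1.0 + np.exp(-X @ b))
    H = X.T @ (X * (p * (1 - p))[:, None])
    se = np.sqrt(np.diag(np.linalg.inv(H)))
    ll = np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))
    return b, se, ll

# Hypothetical truth: both covariates matter, and they are correlated.
beta1, beta2, n, reps = 0.5, 0.35, 200, 300
covered = 0
for _ in range(reps):
    x1 = rng.normal(size=n)
    x2 = 0.8 * x1 + np.sqrt(1 - 0.8 ** 2) * rng.normal(size=n)
    eta = beta1 * x1 + beta2 * x2
    y = (rng.random(n) < 1 / (1 + np.exp(-eta))).astype(float)
    ones = np.ones(n)
    full = np.column_stack([ones, x1, x2])
    reduced = np.column_stack([ones, x1])
    models = [fit_logit(full, y), fit_logit(reduced, y)]
    aics = [2 * m[0].size - 2 * m[2] for m in models]
    b, se, _ = models[int(np.argmin(aics))]        # AIC-selected model
    covered += abs(b[1] - beta1) < 1.96 * se[1]    # Wald 95% CI for x1's logit

print(f"post-selection coverage for beta1: {covered / reps:.3f}")  # below 0.95
```

Because the interval is built from whichever model AIC happened to pick, and the reduced model's x1 coefficient is biased by the omitted correlated covariate, the nominal 95% interval undercovers, which is the phenomenon the paper quantifies.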
A major conclusion was that, while the actual coverage probability is usually below the nominal level, there is no simple, predictable pattern in how, or how far, it may fall. The coverage probability varies given the effects of multiple factors: (1) While the model structure always plays a role of paramount importance, the covariate correlation significantly affects the interval's coverage, with higher correlation tending to produce lower coverage probability. (2) No evidence shows that AIC inevitably outperforms BIC in terms of achieving higher coverage probability, or vice versa. The model selector's performance depends upon the uncertain model structure and/or the unknown parameter vector θ. (3) While the effect of sample size is intriguing, a larger sample size does not necessarily yield asymptotically more accurate inference on interval estimates. (4) Although the binary threshold of the logistic model may affect the coverage probability, this effect is less important. It is more likely to become substantial with an unrestricted model when extreme values along the dimensions of other factors (e.g., small sample size, high covariate correlation) are observed.

Item INVESTIGATING DIFFERENTIAL ITEM FUNCTION AMPLIFICATION AND CANCELLATION IN APPLICATION OF ITEM RESPONSE TESTLET MODELS (2007-05-24) Bao, Han; Dayton, Mitchell; Measurement, Statistics and Evaluation; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)

Instead of presenting only discrete multiple-choice items, many educational tests provide context by grouping items into testlets (Wainer & Kiely, 1987) or item bundles (Rosenbaum, 1988) marked by shared common stimulus materials. In these cases, one might doubt standard item response theory's usual assumption of local independence among items.
Plausible causes of local dependence might be test takers' different levels of background knowledge necessary to understand the common passage, as a considerable amount of mental processing may be required to read and understand the stimulus, and different persons' learning experiences. Here, the local dependence can be viewed as additional dimensions beyond the latent traits. Furthermore, from the multidimensional differential item functioning (DIF) point of view, different distributions of testlet dimensions among examinee subpopulations (race, gender, etc.) could be the cognitive cause of individual differences in test performance. When the testlet effect and idiosyncratic features of individual items are both considered to be sources of DIF, it is interesting to investigate the phenomena of DIF amplification and cancellation resulting from the interactive effects of these two factors. This dissertation presented a study based on a multiple-group testlet item response theory model developed by Li et al. (2006) to examine in detail different situations of DIF amplification and cancellation at the item and testlet level, using testlet characteristic curve procedures with signed/unsigned area indices and a logistic regression procedure. The testlet DIF model was estimated using a hierarchical Bayesian framework with the Markov chain Monte Carlo (MCMC) method implemented in the computer software WinBUGS. The simulation study investigated all of the possible conditions of DIF amplification and cancellation attributed to the person-testlet interaction effect and to individual item characteristics.
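The signed and unsigned area indices mentioned above can be computed numerically as the area between reference- and focal-group item characteristic curves; the sketch below (with hypothetical 2PL parameters) also shows why both indices are needed, since crossing DIF cancels in the signed area:

```python
import numpy as np

def icc(theta, a, b):
    """Two-parameter logistic item characteristic curve."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

theta = np.linspace(-4, 4, 2001)
dt = theta[1] - theta[0]

def area_indices(p_ref, p_foc):
    """Signed and unsigned areas between reference- and focal-group ICCs,
    integrated numerically over the theta grid."""
    d = p_ref - p_foc
    return np.sum(d) * dt, np.sum(np.abs(d)) * dt

# Uniform DIF: same discrimination, focal item harder by 0.5 (illustrative).
signed_u, unsigned_u = area_indices(icc(theta, 1.2, 0.0), icc(theta, 1.2, 0.5))

# Crossing (nonuniform) DIF: discriminations differ, difficulties equal.
signed_c, unsigned_c = area_indices(icc(theta, 0.8, 0.0), icc(theta, 1.6, 0.0))

print(f"uniform DIF:  signed={signed_u:.3f}  unsigned={unsigned_u:.3f}")
print(f"crossing DIF: signed={signed_c:.3f}  unsigned={unsigned_c:.3f}")
# For crossing DIF the signed area cancels to ~0 even though DIF is present,
# which is why the unsigned index is needed to detect it.
```

The same cancellation logic operates at the testlet level, where item-level DIF in opposite directions can amplify or cancel in the aggregate.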
Real data analysis indicated the existence of a testlet effect, with differences in the mean and/or variance of the testlet distribution between manifest groups attributable to the different contexts or natures of the passages, as well as an interaction of the testlet effect with manifest groups of examinees such as gender or ethnicity.

Item Factors Influencing The Mixture Index of Model Fit in Contingency Tables Showing Independence (2006-11-13) Pan, Xuemei; Dayton, C. Mitchell; Measurement, Statistics and Evaluation; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)

Traditional methods for evaluating contingency table models based on chi-square statistics, or on quantities derived from them, are not attractive in many applied research settings. The two-point mixture index of fit, pi-star, introduced by Rudas, Clogg and Lindsay (RCL; 1994), provides a new way to represent goodness-of-fit for contingency tables. This study: (a) evaluated several techniques for dealing with sampling zeros when computing pi-star in contingency tables when the independence assumption holds; (b) investigated the performance of the estimate in various combinations of conditions, as a function of different sizes of tables, different marginal distributions, and different sample sizes; and (c) compared the standard error and confidence interval of pi-star estimated by the method proposed by RCL with the "true" standard error based on empirical simulations in various scenarios, especially when sample sizes are small and pi-star is close to zero. The goals of this study were achieved by Monte Carlo simulation methods, which were then applied to two real data examples. The first is a 6 by 3 cross-classification of fatal crashes by speed limit and land use, with 37,295 cases, based on 2004 USDOT traffic data; the second is a 4 by 4 cross-classification of eye color and hair color, with 592 cases, reported in RCL.
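The index itself can be computed numerically from its definition: pi-star is the smallest mixing weight pi such that the table's distribution P decomposes as (1−pi)Q + pi·R with Q satisfying independence and R unrestricted, which is equivalent to 1 minus the maximum over product distributions Q of the minimum cell ratio p_ij/q_ij. The brute-force optimizer below is an illustrative sketch of that characterization, not the RCL algorithm, and the example tables are hypothetical:

```python
import numpy as np
from scipy.optimize import minimize

def pi_star_independence(table, n_starts=20, seed=0):
    """Brute-force numerical estimate of the two-point mixture index of fit
    (pi-star) for the independence model, via
    pi* = 1 - max over product distributions Q of min_ij p_ij / q_ij."""
    p = table / table.sum()
    r, c = p.shape
    rng = np.random.default_rng(seed)

    def neg_min_ratio(z):
        a = np.exp(z[:r]); a /= a.sum()          # row marginals (softmax)
        b = np.exp(z[r:]); b /= b.sum()          # column marginals (softmax)
        return -np.min(p / np.outer(a, b))

    best = -np.inf
    for _ in range(n_starts):
        res = minimize(neg_min_ratio, rng.normal(0.0, 1.0, r + c),
                       method="Nelder-Mead")
        best = max(best, -res.fun)
    return 1.0 - best

# An exactly independent table: pi-star should be (numerically) zero.
indep = np.outer([0.2, 0.3, 0.5], [0.4, 0.6]) * 1000
# A clearly dependent 2x2 table: a sizable fraction sits outside independence.
dep = np.array([[30.0, 10.0], [10.0, 30.0]])

print(round(pi_star_independence(indep), 3))   # ~ 0
print(round(pi_star_independence(dep), 3))     # ~ 0.333
```

Read directly, pi-star is the smallest fraction of the population that must be set aside for the independence model to fit the remainder perfectly.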
Results suggest that pi-star is positively biased from zero, by 2.98% to 40.86% in the conditions studied, when the independence assumption holds. Replacing zeros with larger flattening values results in smaller pi-star. For each table size, pi-star is smallest when the row and column marginal distributions are all extremely dispersed. For tables with all extremely dispersed, and most slightly dispersed, marginal distributions, with small sample size and small table size, the structural zero technique is superior to other sampling-zero techniques. The lower bound for pi-star using the RCL method is generally close to the "true" estimate based on empirical parametric simulation. However, under some circumstances the RCL method underestimates the lower bound value, although the magnitude is relatively small and the difference shrinks as the sample size increases. This study will provide guidance for researchers in the use of this important method for interpreting models fit to contingency tables.

Item EFFECT OF CATEGORIZATION ON TYPE I ERROR AND POWER IN ORDINAL INDICATOR LATENT MEANS MODELS FOR BETWEEN-SUBJECTS DESIGNS (2006-07-28) Choi, Jaehwa; Hancock, Gregory R.; Measurement, Statistics and Evaluation; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)

Due to the superiority of latent means models (LMM) over the modeling of means on a single measured variable (ANOVA) or on a composite (MANOVA) in terms of power and effect size estimation, LMM is starting to be recognized as a powerful modeling technique. By conducting group difference (e.g., treatment effect) testing at the latent level, LMM enables analysis of the consequences of measurement error in the measured variable(s). LMM has been developed for both interval indicators (IILMM; Jöreskog & Goldberger, 1975; Muthén, 1989; Sörbom, 1974) and ordinal indicators (OILMM; Jöreskog, 2002).
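A small sketch shows what is at stake when continuous latent scores are cut into ordered categories and then treated as metric: the observed correlation is attenuated, and more coarsely with fewer categories (the 0.6 latent correlation and equal-frequency cutting rule below are illustrative, not the study's conditions):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 50_000

# Hypothetical continuous "latent" scores with a true correlation of 0.6.
z1 = rng.normal(size=n)
z2 = 0.6 * z1 + np.sqrt(1 - 0.6 ** 2) * rng.normal(size=n)

def categorize(x, k):
    """Cut a continuous variable into k ordered categories at equal-frequency
    thresholds (one simple categorization rule among many)."""
    cuts = np.quantile(x, np.linspace(0, 1, k + 1)[1:-1])
    return np.digitize(x, cuts)

r = {k: np.corrcoef(categorize(z1, k), categorize(z2, k))[0, 1] for k in (2, 3, 5, 7)}
for k, val in r.items():
    print(f"{k} categories: r = {val:.3f}")          # attenuated below 0.6
print(f"continuous:   r = {np.corrcoef(z1, z2)[0, 1]:.3f}")
```

This attenuation, strongest with two categories and shrinking as categories are added, is the mechanism behind the categorization effects on Type I error and power that the dissertation examines.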
Recently, effect size estimates, post hoc power estimates, and a priori sample size determination for LMM have been developed for interval indicators (Hancock, 2001). Considering the frequent analysis of ordinal data in the social and behavioral sciences, it seems most appropriate that these measures and methods be extended to LMM involving such data, the OILMM. However, unlike the IILMM, OILMM power analysis involves various additional issues regarding the ordinal indicators. This research begins by illustrating various aspects of the OILMM: options for handling the metric level of ordinal variables, options for estimating the OILMM, and the nature of ordinal data (e.g., number of categories, categorization rules). This research also proposes a test statistic for OILMM power analysis parallel to the IILMM results of Hancock (2001). The main purpose of this research is to examine the effect of categorization (focused mostly on the options for handling ordinal indicators and the number of ordinal categories) on Type I error and power in the OILMM, based on the proposed measures and OILMM test statistic. A simulation study is conducted for the two-population between-subjects design. A numerical study is also provided, using potentially useful statistics and indices, to help in understanding the consequences of categorization, especially when one treats ordinal data as if they had metric properties.

Item Robust Means Modeling: An Alternative to Hypothesis Testing Of Mean Equality in the Between-Subjects Design under Variance Heterogeneity and Nonnormality (2006-07-23) FAN, WEIHUA; Hancock, Gregory R.; Measurement, Statistics and Evaluation; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)

The study describes the various alternatives to the between-subjects ANOVA F test that have performed reasonably well in the literature under different experimental conditions of sample size, variance ratio, and nonnormality.
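One widely used alternative of this kind is Welch's heteroscedastic one-way ANOVA, which drops the homogeneity-of-variance assumption by precision-weighting the group means; a self-contained implementation (with hypothetical example groups) is sketched below:

```python
import numpy as np
from scipy import stats

def welch_anova(groups):
    """Welch's heteroscedastic one-way ANOVA, one common alternative to the
    classical F test when group variances are unequal."""
    k = len(groups)
    n = np.array([len(g) for g in groups], dtype=float)
    m = np.array([np.mean(g) for g in groups])
    v = np.array([np.var(g, ddof=1) for g in groups])
    w = n / v                                # precision weights
    mw = np.sum(w * m) / np.sum(w)           # weighted grand mean
    num = np.sum(w * (m - mw) ** 2) / (k - 1)
    tmp = np.sum((1 - w / np.sum(w)) ** 2 / (n - 1))
    den = 1 + 2 * (k - 2) / (k ** 2 - 1) * tmp
    f = num / den
    df1, df2 = k - 1, (k ** 2 - 1) / (3 * tmp)
    return f, stats.f.sf(f, df1, df2)

rng = np.random.default_rng(9)
# Hypothetical data: equal means, strongly unequal variances and group sizes.
groups = [rng.normal(0, 1, 40), rng.normal(0, 2, 20), rng.normal(0, 6, 10)]
f, p = welch_anova(groups)
print(f"Welch F = {f:.3f}, p = {p:.3f}")
```

Unlike the classical F test, each group's mean enters with weight n/s², so a small, high-variance group cannot dominate the test statistic.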
Drawing from structural equation modeling (SEM), the robust means modeling (RMM) approach is developed, in which the assumption of variance homogeneity is not part of the model or its estimation. Specifically, univariate structured means modeling (SMM) is applied to the independent-groups design with robust estimation strategies, such as Browne's (1982, 1984) asymptotically distribution-free (ADF) estimator and its alternatives for nonnormal continuous variables, in order to achieve robustness to the biasing effects of nonnormality. A Monte Carlo simulation is conducted to compare the Type I error rate and power of the ANOVA-based methods as well as the proposed RMM approaches. Various factors, including variance inequality, pairings of sample size with group variance, and degree of nonnormality, are manipulated in the simulation. The results show that the proposed RMM methods are indeed superior to the ANOVA-based methods across conditions, especially when the distribution is asymmetric and nonnormal.
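The failure mode that motivates such robust alternatives is easy to reproduce: under a true null with unequal variances negatively paired with group sizes, the classical F test's Type I error rate climbs well above its nominal level. The Monte Carlo sketch below uses illustrative settings, not the dissertation's actual design:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2024)

# Null is true (all means 0), but variances are unequal and negatively paired
# with group sizes: the small groups carry the large variances.
reps, alpha = 2000, 0.05
rejections = 0
for _ in range(reps):
    g1 = rng.normal(0, 4, 10)    # small group, large variance
    g2 = rng.normal(0, 4, 10)
    g3 = rng.normal(0, 1, 60)    # large group, small variance
    rejections += stats.f_oneway(g1, g2, g3).pvalue < alpha

print(f"empirical Type I error at alpha=0.05: {rejections / reps:.3f}")  # well above 0.05
```

Reversing the pairing (large groups with large variances) makes the classical test conservative instead; either way, the nominal alpha is not maintained, which is the gap the RMM approaches are designed to close.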