A Latent Factor Approach for Social Network Analysis
Sweet, Tracy M.
MetadataПоказать полную информацию
Social network data consist of entities and the relation of information between pairs of entities. Observations in a social network are dyadic and interdependent. Therefore, making appropriate statistical inferences from a network requires specifications of dependencies in a model. Previous studies suggested that latent factor models (LFMs) for social network data can account for stochastic equivalence and transitivity simultaneously, which are the two primary dependency patterns that are observed social network data in real-world social networks. One particular LFM, the additive and multiplicative effects network model (AME) accounts for the heterogeneity of second-order dependencies at the actor level. However, all current latent variable models have not considered the heterogeneity of third-order dependencies, actor-level transitivity for example. Failure to model third-order dependency heterogeneity may result in worse fits to local network structures, which in turn may result in biased parameter inferences and may negatively influence the goodness-of-fit and prediction performance of a model. Motivated by such a gap in the literature, this dissertation proposes to incorporate a correlation structure between the sender and receiver latent factors in the AME to account for the distribution of actor-level transitivity. The proposed model is compared with the existing AME in both simulation studies real-world data. Models are evaluated via multiple goodness-of-fit techniques, including mean squared error, parameter coverage rate, information criteria, receiver-operation curve (ROC) based on K-fold cross-validation or full data, and posterior predictive checking. This work may also contribute to the literature of goodness-of-fit methods to network models, which is an area that has not been unified. Both the simulation studies and real-world data analyses showed that adding the correlation structure provides a better fit as well as higher prediction accuracy to network data. The proposed method has equal or similar performance to the AME when the underlying correlation is zero, with regard to mean-squared error of probability of ties and widely applicable information criteria. The present study did not find any significant impact of the correlation term on the node-level covariate’s coefficient estimation. Future studies include investigating more types of covariates, subgroup related covariate effects is an example.