A Latent Factor Approach for Social Network Analysis

Date

2019

Abstract

Social network data consist of entities and relational information between pairs of entities. Observations in a social network are dyadic and interdependent. Therefore, making appropriate statistical inferences from a network requires specifying dependencies in a model. Previous studies suggested that latent factor models (LFMs) for social network data can account for stochastic equivalence and transitivity simultaneously, the two primary dependency patterns observed in real-world social networks. One particular LFM, the additive and multiplicative effects network model (AME), accounts for the heterogeneity of second-order dependencies at the actor level. However, no current latent variable model considers the heterogeneity of third-order dependencies, such as actor-level transitivity. Failure to model third-order dependency heterogeneity may result in worse fits to local network structures, which in turn may bias parameter inferences and degrade the goodness-of-fit and prediction performance of a model.

Motivated by this gap in the literature, this dissertation proposes to incorporate a correlation structure between the sender and receiver latent factors in the AME to account for the distribution of actor-level transitivity. The proposed model is compared with the existing AME in both simulation studies and real-world data analyses. Models are evaluated via multiple goodness-of-fit techniques, including mean squared error, parameter coverage rate, information criteria, receiver operating characteristic (ROC) curves based on K-fold cross-validation or full data, and posterior predictive checking. This work may also contribute to the literature on goodness-of-fit methods for network models, an area that has not been unified.

Both the simulation studies and the real-world data analyses showed that adding the correlation structure provides a better fit to network data as well as higher prediction accuracy. The proposed method performs equally or similarly to the AME when the underlying correlation is zero, with regard to the mean squared error of tie probabilities and the widely applicable information criterion (WAIC). The present study did not find any significant impact of the correlation term on the estimation of the node-level covariate’s coefficient. Future studies could investigate more types of covariates, such as subgroup-related covariate effects.
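
As a rough illustration of the idea summarized in the abstract, the sketch below simulates a binary directed network from a probit AME-style model in which each actor's sender and receiver latent factors are correlated, and then computes a simple actor-level transitivity measure. It is not the dissertation's implementation: the function names `simulate_ame` and `actor_transitivity`, the parameter `rho`, the coordinate-wise bivariate normal correlation structure, the independent additive effects, and all numeric defaults are assumptions made for illustration only.

```python
import numpy as np


def simulate_ame(n=200, rho=0.5, dim=2, intercept=-1.0, seed=0):
    """Simulate a binary directed network from a probit AME-style model.

    Assumption (hypothetical structure): each coordinate pair (u_ik, v_ik)
    is bivariate standard normal with correlation rho.
    """
    rng = np.random.default_rng(seed)
    # Additive sender (a) and receiver (b) effects, independent for simplicity.
    a = rng.normal(0.0, 0.5, n)
    b = rng.normal(0.0, 0.5, n)
    # Multiplicative sender factors u and correlated receiver factors v.
    u = rng.normal(size=(n, dim))
    v = rho * u + np.sqrt(1.0 - rho**2) * rng.normal(size=(n, dim))
    # Linear predictor: intercept + a_i + b_j + u_i' v_j.
    eta = intercept + a[:, None] + b[None, :] + u @ v.T
    # Probit link: a tie exists when the latent value exceeds zero.
    y = (eta + rng.normal(size=(n, n)) > 0).astype(int)
    np.fill_diagonal(y, 0)  # no self-ties
    return y, u, v


def actor_transitivity(y):
    """Per-actor share of directed two-paths i->j->k (k != i) closed by i->k."""
    y = np.asarray(y, dtype=float)
    paths = y @ y  # paths[i, k] = number of j with i->j and j->k
    np.fill_diagonal(paths, 0.0)
    closed = (paths * y).sum(axis=1)
    total = paths.sum(axis=1)
    # Actors with no two-paths get a transitivity of 0.
    return np.divide(closed, total, out=np.zeros_like(total), where=total > 0)


if __name__ == "__main__":
    for rho in (0.0, 0.6):
        y, _, _ = simulate_ame(rho=rho)
        t = actor_transitivity(y)
        print(f"rho={rho}: mean={t.mean():.3f}, sd={t.std():.3f}")
```

Running the script for several values of `rho` gives a quick way to see how the spread of actor-level transitivity changes under this toy correlation structure; any pattern it shows is specific to these assumed settings and should not be read as a result from the dissertation.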
