Statistical Methods for Analyzing Time Series Data Drawn from Complex Social Systems

dc.contributor.advisor: Girvan, Michelle [en_US]
dc.contributor.advisor: Rand, William [en_US]
dc.contributor.author: Darmon, David [en_US]
dc.date.accessioned: 2015-09-18T06:03:02Z
dc.date.available: 2015-09-18T06:03:02Z
dc.date.issued: 2015 [en_US]
dc.identifier: https://doi.org/10.13016/M2V93N
dc.identifier.uri: http://hdl.handle.net/1903/17111
dc.description.abstract: The rise of human interaction in digital environments has led to an abundance of behavioral traces. These traces allow for model-based investigation of human-human and human-machine interaction 'in the wild.' Stochastic models allow us to both predict and understand human behavior. In this thesis, we present statistical procedures for learning such models from the behavioral traces left in digital environments. First, we develop a non-parametric method for smoothing time series data corrupted by serially correlated noise. The method determines the simplest smoothing of the data that simultaneously gives the simplest residuals, where simplicity of the residuals is measured by their statistical complexity. We find that complexity-regularized regression outperforms generalized cross-validation in the presence of serially correlated noise. Next, we cast the task of modeling individual-level user behavior on social media into a predictive framework. We demonstrate the performance of two contrasting approaches, computational mechanics and echo state networks, on a heterogeneous data set drawn from user behavior on Twitter. We demonstrate that the behavior of users can be well-modeled as processes with self-feedback. We find that the two modeling approaches perform very similarly for most users, but that users for whom the two methods differ in performance highlight the challenges faced in applying predictive models to dynamic social data. We then expand the predictive problem of the previous work to modeling the aggregate behavior of large collections of users. We use three models, corresponding to seasonal, aggregate autoregressive, and aggregation-of-individual approaches, and find that the performance of the methods at predicting times of high activity depends strongly on the tradeoff between true and false positives, with no method dominating. Our results highlight the challenges and opportunities involved in modeling complex social systems, and demonstrate how influencers interested in forecasting potential user engagement can use complexity modeling to make better decisions. Finally, we turn from a predictive to a descriptive framework, and investigate how well user behavior can be attributed to time of day, self-memory, and social inputs. The models allow us to describe how a user processes their past behavior and their social inputs. We find that despite the diversity of observed user behavior, most models inferred fall into a small subclass of all possible finitary processes. Thus, our work demonstrates that user behavior, while quite complex, belies simple underlying computational structures. [en_US]
dc.language.iso: en [en_US]
dc.title: Statistical Methods for Analyzing Time Series Data Drawn from Complex Social Systems [en_US]
dc.type: Dissertation [en_US]
dc.contributor.publisher: Digital Repository at the University of Maryland [en_US]
dc.contributor.publisher: University of Maryland (College Park, Md.) [en_US]
dc.contributor.department: Applied Mathematics and Scientific Computation [en_US]
dc.subject.pqcontrolled: Statistics [en_US]
dc.subject.pqcontrolled: Applied mathematics [en_US]
dc.subject.pqcontrolled: Sociology [en_US]
dc.subject.pquncontrolled: behavior modeling [en_US]
dc.subject.pquncontrolled: complex systems [en_US]
dc.subject.pquncontrolled: computational mechanics [en_US]
dc.subject.pquncontrolled: social media [en_US]

