USING SOCIAL MEDIA AS A DATA SOURCE IN PUBLIC HEALTH RESEARCH
Researchers have increasingly looked to social media data as a less intrusive and more scalable means of measuring population health and well-being than traditional public health data sources. In this dissertation, I outline three studies that use social media as a data source to answer public health research questions and to compare social media data with traditional public health data sources. In Study #1, I develop, from geotagged Twitter data, a predictive model for identifying food deserts in the United States using the linguistic constructs found in food-related tweets. The results suggest that the food-ingestion language found in tweets, summarized as census tract-level measures of food sentiment and healthiness, is associated with census tract-level food desert status. The results also suggest that including food-ingestion language derived from tweets in classification models that predict food desert status improves model performance over baseline models that include only socio-economic characteristics. In Study #2, I evaluate whether attitudes toward COVID-19 vaccines collected from the Household Pulse Survey (HPS) can be predicted using attitudes extracted from Twitter. The results reveal that attitudes toward COVID-19 vaccines found in tweets explain 61-72% of the variability in the percentage of HPS respondents who were vaccine hesitant or compliant. The results also reveal statistically significant relationships between perceptions expressed on Twitter and those reported in the survey. In Study #3, I examine whether supplementing COVID-19 vaccine uptake forecast models with the attitudes found in tweets improves on baseline models that use only historical vaccination data. The results reveal that forecast models combining historical vaccination data with COVID-19 vaccine attitudes found in tweets reduce RMSE by as much as 9% relative to the baseline models. Together, the studies outlined in this dissertation suggest that Twitter data carries a valuable signal for public health research.