Evolving Language Patterns on the Web: Community Influence, Model Adaptation, and Bias Mitigation

Zhou, Yuhang

dc.contributor.advisor	Ai, Wei	en_US
dc.contributor.author	Zhou, Yuhang	en_US
dc.contributor.department	Information Studies	en_US
dc.contributor.publisher	Digital Repository at the University of Maryland	en_US
dc.contributor.publisher	University of Maryland (College Park, Md.)	en_US
dc.date.accessioned	2025-08-08T12:08:37Z
dc.date.issued	2025	en_US
dc.description.abstract	The language of the Web is in constant evolution, shaped by the dynamic interplay between social interaction, symbolic innovation, and shifting cultural norms. Online communities actively drive this evolution by introducing novel expressions, reinterpreting existing tokens, and constructing meanings that challenge traditional linguistic assumptions. Among the most illustrative of these transformations are emojis, visual symbols whose interpretations vary widely across users and contexts. As emojis evolve from standardized Unicode definitions into contextually rich social symbols, they reveal the complexity and fluidity of digital communication. However, this rapid pace of linguistic change presents major challenges for both human communication and natural language processing (NLP) systems, which struggle to adapt to the semantic drift of tokens over time. This dissertation investigates the interconnections between language evolution, social meaning construction, and computational modeling. It centers on three key areas that reflect different facets of this linguistic transformation. First, we examine the diffusion and semantic adaptation of newly introduced emojis in digital discourse. By analyzing usage patterns and leveraging large language models (LLMs), we develop an interpretation framework to decode the evolving meanings of new emojis and assess their impact on downstream NLP tasks. Second, we explore how biased associations embedded in training data lead to spurious correlations at the concept level. We demonstrate that LLMs tend to internalize these associations, which can skew their predictions and reinforce societal stereotypes. By identifying the mechanisms behind such biases, we highlight the importance of mitigating shortcut learning in both pre-training and fine-tuning stages. Third, we investigate how emojis, originally designed for neutral or positive expression, are repurposed for offensive communication. We develop a multi-step LLM-based pipeline to identify and replace offensive emojis in social media content while preserving the original semantic intent. Our human evaluations demonstrate that this approach reduces perceived offensiveness without sacrificing clarity or meaning. Together, these three investigations provide a comprehensive account of how language evolves in digital environments and of how NLP systems can better keep pace. Our findings underscore the need for adaptive, socially aware computational frameworks that account for linguistic fluidity, community-specific conventions, and evolving symbolic practices. By aligning NLP models more closely with the dynamics of human communication, this dissertation contributes to the development of more inclusive, responsive, and semantically grounded language technologies.	en_US
dc.identifier	https://doi.org/10.13016/w2ze-uzal
dc.identifier.uri	http://hdl.handle.net/1903/34231
dc.language.iso	en	en_US
dc.subject.pqcontrolled	Information science	en_US
dc.subject.pqcontrolled	Communication	en_US
dc.subject.pqcontrolled	Computer science	en_US
dc.subject.pquncontrolled	Computational Social Science	en_US
dc.subject.pquncontrolled	Large Language Models	en_US
dc.subject.pquncontrolled	Natural Language Processing	en_US
dc.subject.pquncontrolled	Social Media Mining	en_US
dc.title	Evolving Language Patterns on the Web: Community Influence, Model Adaptation, and Bias Mitigation	en_US
dc.type	Dissertation	en_US

Files

Original bundle

Name: Zhou_umd_0117E_25092.pdf
Size: 10.17 MB
Format: Adobe Portable Document Format

Collections

UMD Theses and Dissertations
Information Studies Theses and Dissertations