Classifying Bias in Large Multilingual Corpora via Crowdsourcing and Topic Modeling
dc.contributor.advisor | Zajic, David | |
dc.contributor.author | Caljean, Brianna | |
dc.contributor.author | Calvert, Katherine | |
dc.contributor.author | Chang, Ashley | |
dc.contributor.author | Frank, Elliot | |
dc.contributor.author | Garay Jáuregui, Rosana | |
dc.contributor.author | Palo, Geoffrey | |
dc.contributor.author | Rinker, Ryan | |
dc.contributor.author | Weakly, Gareth | |
dc.contributor.author | Wolfrey, Nicolette | |
dc.contributor.author | Zhang, William | |
dc.date.accessioned | 2018-06-22T17:38:58Z | |
dc.date.available | 2018-06-22T17:38:58Z | |
dc.date.issued | 2018 | |
dc.description.abstract | Our project extends previous algorithmic approaches to finding bias in large text corpora. We used multilingual topic modeling to examine language-specific bias in the English, Spanish, and Russian versions of Wikipedia. In particular, we placed Spanish articles discussing the Cold War on a Russian-English viewpoint spectrum based on similarity in topic distribution. We then crowdsourced human annotations of Spanish Wikipedia articles for comparison to the topic model. Our hypothesis was that human annotators and topic modeling algorithms would provide correlated results for bias. However, that was not the case. Our annotators indicated that humans were more perceptive of sentiment in article text than of topic distribution, which suggests that our classifier provides a different perspective on a text’s bias. | en_US |
dc.identifier | https://doi.org/10.13016/M2R49GC7C | |
dc.identifier.uri | http://hdl.handle.net/1903/20668 | |
dc.language.iso | en_US | en_US |
dc.relation.isAvailableAt | Digital Repository at the University of Maryland | |
dc.relation.isAvailableAt | Gemstone Program, University of Maryland (College Park, Md) | |
dc.subject | Gemstone Team BIASES | en_US |
dc.title | Classifying Bias in Large Multilingual Corpora via Crowdsourcing and Topic Modeling | en_US |
dc.type | Thesis | en_US |
Files
Original bundle
- Name: BIASES Thesis.pdf
- Size: 1.46 MB
- Format: Adobe Portable Document Format