Digital Words: Moving Forward with Measuring the Readability of Online Texts

dc.contributor.author: Redmiles, Elissa M.
dc.contributor.author: Maszkiewicz, Lisa
dc.contributor.author: Hwang, Emily
dc.contributor.author: Kuchhal, Dhruv
dc.contributor.author: Liu, Everest
dc.contributor.author: Morales, Miraida
dc.contributor.author: Peskov, Denis
dc.contributor.author: Rao, Sudha
dc.contributor.author: Stevens, Rock
dc.contributor.author: Gligoric, Kristina
dc.contributor.author: Kross, Sean
dc.contributor.author: Mazurek, Michelle L.
dc.contributor.author: Daumé, Hal III
dc.date.accessioned: 2018-10-28T14:25:25Z
dc.date.available: 2018-10-28T14:25:25Z
dc.date.issued: 2018-10-26
dc.description.abstract[en_US]: The readability of a digital text can influence people’s information acquisition (Wikipedia articles), online security (how-to articles), and even health (WebMD). Readability metrics can also alter search rankings and are used to evaluate AI system performance. However, prior work on measuring readability has significant gaps, especially for HCI applications. Prior work has (a) focused on grade-school texts, (b) ignored domain-specific, jargon-heavy texts (e.g., health advice), and (c) failed to compare metrics, especially in the context of scaling to use with online corpora. This paper addresses these shortcomings by comparing well-known readability measures and a novel domain-specific approach across four different corpora: crowd-worker generated stories, Wikipedia articles, security and privacy advice, and health information. We evaluate the convergent, discriminant, and content validity of each measure and detail tradeoffs in domain-specificity and participant burden. These results provide a foundation for more accurate readability measurements in HCI.
dc.identifier: https://doi.org/10.13016/M2B853N3M
dc.identifier.uri: http://hdl.handle.net/1903/21456
dc.language.iso[en_US]: en_US
dc.relation.ispartofseries: UM Computer Science Department;CS-TR-5061
dc.title[en_US]: Digital Words: Moving Forward with Measuring the Readability of Online Texts
dc.type[en_US]: Technical Report

Files

Original bundle
Name: CS-TR-5061.pdf
Size: 1.47 MB
Format: Adobe Portable Document Format