Topic Modeling for Wikipedia Link Disambiguation
Skaggs, Bradley Alan
Getoor, Lise C
Many articles in the online encyclopedia Wikipedia have hyperlinks to ambiguous article titles. To improve the reader experience, any link to an ambiguous title should be replaced with a link to one of the unambiguous meanings. We propose a novel statistical topic model, which we refer to as the Link Text Topic Model (LTTM), that can suggest new link targets for existing ambiguous links in Wikipedia articles. For evaluation, we develop a method for extracting ground truth from snapshots of Wikipedia at different points in time. We evaluate LTTM on this ground truth, and demonstrate its superiority over existing link- and content-based approaches. Finally, we build a web service that uses LTTM to suggest unambiguous articles for human editors wanting to fix ambiguous links.
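The abstract does not spell out how ground truth is extracted from snapshots. One plausible reading is that a link pointing to an ambiguous title in an earlier snapshot, which a later snapshot replaces with a link to an unambiguous title under the same link text, yields a labeled example. The sketch below illustrates that reading only; the function names, the link-matching rule, and the `ambiguous_titles` set are assumptions made for illustration, not the method described in the thesis.

```python
import re

# Matches [[Target]] and [[Target|label]] wiki links; a "#section" suffix is ignored.
WIKILINK = re.compile(r"\[\[([^\[\]|#]+)(?:#[^\[\]|]*)?(?:\|([^\[\]]*))?\]\]")


def extract_links(wikitext):
    """Return {link text: target title} for one article revision (hypothetical helper)."""
    links = {}
    for match in WIKILINK.finditer(wikitext):
        target = match.group(1).strip()
        label = (match.group(2) or target).strip()
        # Keep the first target seen for each link text.
        links.setdefault(label, target)
    return links


def ground_truth(old_text, new_text, ambiguous_titles):
    """Pair each ambiguous link in the old revision with its later unambiguous target.

    Assumption: two links correspond if they share the same link text.
    """
    old_links = extract_links(old_text)
    new_links = extract_links(new_text)
    pairs = []
    for label, old_target in old_links.items():
        new_target = new_links.get(label)
        if (
            old_target in ambiguous_titles
            and new_target is not None
            and new_target not in ambiguous_titles
        ):
            pairs.append((label, old_target, new_target))
    return pairs


old = "The orbit of [[Mercury]] is close to the [[Sun]]."
new = "The orbit of [[Mercury (planet)|Mercury]] is close to the [[Sun]]."
pairs = ground_truth(old, new, ambiguous_titles={"Mercury"})
```

Here `pairs` holds one triple of link text, original ambiguous target, and the target an editor later chose. A real pipeline would also need to handle redirects, repeated link text within an article, and links that were removed rather than fixed.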