Topic Modeling for Wikipedia Link Disambiguation

dc.contributor.advisor: Getoor, Lise C [en_US]
dc.contributor.author: Skaggs, Bradley Alan [en_US]
dc.contributor.department: Computer Science [en_US]
dc.contributor.publisher: Digital Repository at the University of Maryland [en_US]
dc.contributor.publisher: University of Maryland (College Park, Md.) [en_US]
dc.date.accessioned: 2012-02-17T07:11:01Z
dc.date.available: 2012-02-17T07:11:01Z
dc.date.issued: 2011 [en_US]
dc.description.abstract: Many articles in the online encyclopedia Wikipedia have hyperlinks to ambiguous article titles. To improve the reader experience, any link to an ambiguous title should be replaced with a link to one of the unambiguous meanings. We propose a novel statistical topic model, which we refer to as the Link Text Topic Model (LTTM), that can suggest new link targets for existing ambiguous links in Wikipedia articles. For evaluation, we develop a method for extracting ground truth from snapshots of Wikipedia at different points in time. We evaluate LTTM on this ground truth, and demonstrate its superiority over existing link- and content-based approaches. Finally, we build a web service that uses LTTM to suggest unambiguous articles for human editors wanting to fix ambiguous links. [en_US]
dc.identifier.uri: http://hdl.handle.net/1903/12383
dc.subject.pqcontrolled: Computer science [en_US]
dc.subject.pquncontrolled: disambiguation [en_US]
dc.subject.pquncontrolled: link prediction [en_US]
dc.subject.pquncontrolled: topic modeling [en_US]
dc.subject.pquncontrolled: wikipedia [en_US]
dc.title: Topic Modeling for Wikipedia Link Disambiguation [en_US]
dc.type: Thesis [en_US]

Files

Original bundle

Name: Skaggs_umd_0117N_12844.pdf
Size: 803.07 KB
Format: Adobe Portable Document Format