PubMed related articles: a probabilistic topic-based model for content similarity

dc.contributor.author: Lin, Jimmy
dc.contributor.author: Wilbur, W John
dc.date.accessioned: 2021-12-06T19:19:01Z
dc.date.available: 2021-12-06T19:19:01Z
dc.date.issued: 2007-10-30
dc.description.abstract: We present a probabilistic topic-based model for content similarity called pmra that underlies the related article search feature in PubMed. Whether or not a document is about a particular topic is computed from term frequencies, modeled as Poisson distributions. Unlike previous probabilistic retrieval models, we do not attempt to estimate relevance; rather, our focus is "relatedness", the probability that a user would want to examine a particular document given known interest in another. We also describe a novel technique for estimating parameters that does not require human relevance judgments; instead, the process is based on the existence of MeSH® in MEDLINE®. The pmra retrieval model was compared against bm25, a competitive probabilistic model that shares theoretical similarities. Experiments using the test collection from the TREC 2005 genomics track show a small but statistically significant improvement of pmra over bm25 in terms of precision. Our experiments suggest that the pmra model provides an effective ranking algorithm for related article search. [en_US]
dc.description.uri: https://doi.org/10.1186/1471-2105-8-423
dc.identifier: https://doi.org/10.13016/gfgf-atuw
dc.identifier.citation: Lin, J., Wilbur, W.J. PubMed related articles: a probabilistic topic-based model for content similarity. BMC Bioinformatics 8, 423 (2007). [en_US]
dc.identifier.uri: http://hdl.handle.net/1903/28203
dc.language.iso: en_US [en_US]
dc.publisher: Springer Nature [en_US]
dc.relation.isAvailableAt: College of Information Studies [en_us]
dc.relation.isAvailableAt: Information Studies [en_us]
dc.relation.isAvailableAt: Digital Repository at the University of Maryland [en_us]
dc.relation.isAvailableAt: University of Maryland (College Park, MD) [en_us]
dc.subject: Information Retrieval [en_US]
dc.subject: MeSH [en_US]
dc.subject: Retrieval Model [en_US]
dc.subject: Related Article [en_US]
dc.subject: Test Collection [en_US]
dc.title: PubMed related articles: a probabilistic topic-based model for content similarity [en_US]
dc.type: Article [en_US]

Files

Original bundle

Name: 1471-2105-8-423.pdf
Size: 1.52 MB
Format: Adobe Portable Document Format