PubMed related articles: a probabilistic topic-based model for content similarity
dc.contributor.author | Lin, Jimmy | |
dc.contributor.author | Wilbur, W John | |
dc.date.accessioned | 2021-12-06T19:19:01Z | |
dc.date.available | 2021-12-06T19:19:01Z | |
dc.date.issued | 2007-10-30 | |
dc.description.abstract | We present a probabilistic topic-based model for content similarity called pmra that underlies the related article search feature in PubMed. Whether or not a document is about a particular topic is computed from term frequencies, modeled as Poisson distributions. Unlike previous probabilistic retrieval models, we do not attempt to estimate relevance; rather, our focus is "relatedness", the probability that a user would want to examine a particular document given known interest in another. We also describe a novel technique for estimating parameters that does not require human relevance judgments; instead, the process is based on the existence of MeSH® in MEDLINE®. The pmra retrieval model was compared against bm25, a competitive probabilistic model that shares theoretical similarities. Experiments using the test collection from the TREC 2005 genomics track show a small but statistically significant improvement of pmra over bm25 in terms of precision. Our experiments suggest that the pmra model provides an effective ranking algorithm for related article search. | en_US |
dc.description.uri | https://doi.org/10.1186/1471-2105-8-423 | |
dc.identifier | https://doi.org/10.13016/gfgf-atuw | |
dc.identifier.citation | Lin, J., Wilbur, W.J. PubMed related articles: a probabilistic topic-based model for content similarity. BMC Bioinformatics 8, 423 (2007). | en_US |
dc.identifier.uri | http://hdl.handle.net/1903/28203 | |
dc.language.iso | en_US | en_US |
dc.publisher | Springer Nature | en_US |
dc.relation.isAvailableAt | College of Information Studies | en_us |
dc.relation.isAvailableAt | Information Studies | en_us |
dc.relation.isAvailableAt | Digital Repository at the University of Maryland | en_us |
dc.relation.isAvailableAt | University of Maryland (College Park, MD) | en_us |
dc.subject | Information Retrieval | en_US |
dc.subject | MeSH | en_US |
dc.subject | Retrieval Model | en_US |
dc.subject | Related Article | en_US |
dc.subject | Test Collection | en_US |
dc.title | PubMed related articles: a probabilistic topic-based model for content similarity | en_US |
dc.type | Article | en_US |
Files
Original bundle
- Name: 1471-2105-8-423.pdf
- Size: 1.52 MB
- Format: Adobe Portable Document Format