Is searching full text more effective than searching abstracts?

dc.contributor.author: Lin, Jimmy
dc.date.accessioned: 2021-11-30T15:51:51Z
dc.date.available: 2021-11-30T15:51:51Z
dc.date.issued: 2009-02-03
dc.description.abstract: With the growing availability of full-text articles online, scientists and other consumers of the life sciences literature now have the ability to go beyond searching bibliographic records (title, abstract, metadata) to directly access full-text content. Motivated by this emerging trend, I posed the following question: is searching full text more effective than searching abstracts? This question is answered by comparing text retrieval algorithms on MEDLINE® abstracts, full-text articles, and spans (paragraphs) within full-text articles using data from the TREC 2007 genomics track evaluation. Two retrieval models are examined: BM25 and the ranking algorithm implemented in the open-source Lucene search engine. Experiments show that treating an entire article as an indexing unit does not consistently yield higher effectiveness compared to abstract-only search. However, retrieval based on spans, or paragraph-sized segments of full-text articles, consistently outperforms abstract-only search. Results suggest that highest overall effectiveness may be achieved by combining evidence from spans and full articles. Users searching full text are more likely to find relevant articles than those searching only abstracts. This finding affirms the value of full-text collections for text retrieval and provides a starting point for future work in exploring algorithms that take advantage of rapidly growing digital archives. Experimental results also highlight the need to develop distributed text retrieval algorithms, since full-text articles are significantly longer than abstracts and may require the computational resources of multiple machines in a cluster. The MapReduce programming model provides a convenient framework for organizing such computations. [en_US]
dc.description.uri: https://doi.org/10.1186/1471-2105-10-46
dc.identifier: https://doi.org/10.13016/1eo6-j815
dc.identifier.citation: Lin, J. Is searching full text more effective than searching abstracts?. BMC Bioinformatics 10, 46 (2009). [en_US]
dc.identifier.uri: http://hdl.handle.net/1903/28175
dc.language.iso: en_US [en_US]
dc.publisher: Springer Nature [en_US]
dc.relation.isAvailableAt: College of Information Studies [en_us]
dc.relation.isAvailableAt: Information Studies [en_us]
dc.relation.isAvailableAt: Digital Repository at the University of Maryland [en_us]
dc.relation.isAvailableAt: University of Maryland (College Park, MD) [en_us]
dc.subject: Retrieval Model [en_US]
dc.subject: Mean Average Precision [en_US]
dc.subject: Ranking Algorithm [en_US]
dc.subject: Test Collection [en_US]
dc.subject: Inverted Index [en_US]
dc.title: Is searching full text more effective than searching abstracts? [en_US]
dc.type: Article [en_US]

Files

Original bundle
Name: 1471-2105-10-46.pdf
Size: 395.53 KB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.57 KB
Format: Item-specific license agreed upon to submission