University of Maryland Libraries
Digital Repository at the University of Maryland

    Is searching full text more effective than searching abstracts?

    1471-2105-10-46.pdf (395.5Kb)

    External Link(s)
    https://doi.org/10.1186/1471-2105-10-46
    Date
    2009-02-03
    Author
    Lin, Jimmy
    Citation
    Lin, J. Is searching full text more effective than searching abstracts? BMC Bioinformatics 10, 46 (2009).
    DRUM DOI
    https://doi.org/10.13016/1eo6-j815
    Abstract
    With the growing availability of full-text articles online, scientists and other consumers of the life sciences literature now have the ability to go beyond searching bibliographic records (title, abstract, metadata) to directly access full-text content. Motivated by this emerging trend, I posed the following question: is searching full text more effective than searching abstracts? This question is answered by comparing text retrieval algorithms on MEDLINE® abstracts, full-text articles, and spans (paragraphs) within full-text articles using data from the TREC 2007 genomics track evaluation. Two retrieval models are examined: BM25 and the ranking algorithm implemented in the open-source Lucene search engine. Experiments show that treating an entire article as an indexing unit does not consistently yield higher effectiveness compared to abstract-only search. However, retrieval based on spans, or paragraph-sized segments of full-text articles, consistently outperforms abstract-only search. Results suggest that the highest overall effectiveness may be achieved by combining evidence from spans and full articles. Users searching full text are more likely to find relevant articles than those searching only abstracts. This finding affirms the value of full-text collections for text retrieval and provides a starting point for future work in exploring algorithms that take advantage of rapidly growing digital archives. Experimental results also highlight the need to develop distributed text retrieval algorithms, since full-text articles are significantly longer than abstracts and may require the computational resources of multiple machines in a cluster. The MapReduce programming model provides a convenient framework for organizing such computations.
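    To make the span-versus-article comparison in the abstract concrete, here is a minimal Python sketch of BM25 scoring applied to paragraph-sized spans, with each article then ranked by its best span and optionally blended with an article-level score. It is an illustration only, not the paper's implementation: the parameter values k1=1.2 and b=0.75 are conventional defaults rather than settings from the source, the IDF variant is one common formulation, and the names best_span_per_article and combine, the weight lam, and the toy data are all hypothetical.

```python
import math
from collections import Counter


def bm25_scores(query, units, k1=1.2, b=0.75):
    """Score each tokenized unit (abstract, article, or span) against the query.

    k1 and b are conventional defaults, assumed here for illustration.
    """
    n = len(units)
    if n == 0:
        return []
    # Average unit length drives BM25 length normalization.
    avg_len = (sum(len(u) for u in units) / n) or 1.0
    # Document frequency: number of units that contain each term.
    df = Counter(term for u in units for term in set(u))
    scores = []
    for u in units:
        tf = Counter(u)
        score = 0.0
        for term in query:
            if tf[term] == 0:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(u) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def best_span_per_article(span_scores, span_to_article):
    """Hypothetical aggregation: rank an article by its highest-scoring span."""
    best = {}
    for score, article in zip(span_scores, span_to_article):
        best[article] = max(best.get(article, 0.0), score)
    return best


def combine(article_score, span_score, lam=0.5):
    """Hypothetical linear blend of article-level and span-level evidence."""
    return lam * article_score + (1 - lam) * span_score


# Toy data: three spans drawn from two articles, "A" and "B".
spans = [
    ["gene", "expression", "in", "yeast"],
    ["methods", "and", "materials"],
    ["protein", "binding", "assay"],
]
span_to_article = ["A", "A", "B"]
query = ["gene", "expression"]

span_scores = bm25_scores(query, spans)
article_ranking = best_span_per_article(span_scores, span_to_article)
```

    The same bm25_scores function can be called with whole articles or abstracts as the units, which is the indexing-unit choice the abstract compares. Under these assumptions, combine is one simple way to blend the two sources of evidence.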
    URI
    http://hdl.handle.net/1903/28175
    Collections
    • Information Studies Research Works

    DRUM is brought to you by the University of Maryland Libraries
    University of Maryland, College Park, MD 20742-7011 (301)314-1328.
    Please send us your comments.
    Web Accessibility
     

     
