WHO, WHAT, WHEN, WHERE, AND WHY? QUANTIFYING AND UNDERSTANDING BIOMEDICAL DATA REUSE

dc.contributor.advisor: Shilton, Katie (en_US)
dc.contributor.author: Federer, Lisa M (en_US)
dc.contributor.department: Library & Information Services (en_US)
dc.contributor.publisher: Digital Repository at the University of Maryland (en_US)
dc.contributor.publisher: University of Maryland (College Park, Md.) (en_US)
dc.date.accessioned: 2019-06-20T05:35:15Z
dc.date.available: 2019-06-20T05:35:15Z
dc.date.issued: 2019 (en_US)
dc.description.abstract: Since the mid-2000s, new data sharing mandates have led to an increase in the amount of research data available for reuse. Reuse of data benefits the scientific community and the public by potentially speeding scientific discovery and increasing the return on investment of publicly funded research. However, despite the potential benefits of reuse and the increasing availability of data, research on the impact of data reuse remains sparse. This dissertation provides a deeper understanding of the impacts of shared biomedical research data by exploring who is reusing data and for what purpose. Specifically, this dissertation examines use requests and dataset descriptions from three biomedical repositories that require potential requestors to submit descriptions of their planned reuse. Content analysis of use requests yields insight into who is requesting data and the methods and topics of their planned reuse. Comparing use requests to the descriptions of the original datasets provides insight into the breadth of impact of data reuse, and text mining of the original dataset descriptions helps determine the topics of datasets that are highly reused. This study demonstrates that patterns of reuse differ between dataset types: genomic datasets are more frequently used together in meta-analyses on topics that diverge from the original purpose of collection, while clinical datasets are more often used on their own within a context similar to the one for which they were collected. While requestors come from a range of career stages and from around the world, they are not evenly distributed; most requests come from English-speaking countries, especially the United States. This study also finds that datasets that receive the most requests soon after release continue to be requested more often, and that datasets covering common diseases are requested more than datasets on rare diseases.
These findings have implications for several stakeholders, including funders and institutions developing policies to reward and incentivize data sharing, researchers who share data and those who reuse it, and repositories and data curators who must make choices about which datasets to curate and preserve. (en_US)
dc.identifier: https://doi.org/10.13016/60jd-9hux
dc.identifier.uri: http://hdl.handle.net/1903/21991
dc.language.iso: en (en_US)
dc.subject.pqcontrolled: Information science (en_US)
dc.subject.pquncontrolled: biomedical data (en_US)
dc.subject.pquncontrolled: data reuse (en_US)
dc.subject.pquncontrolled: data sharing (en_US)
dc.subject.pquncontrolled: scientometrics (en_US)
dc.title: WHO, WHAT, WHEN, WHERE, AND WHY? QUANTIFYING AND UNDERSTANDING BIOMEDICAL DATA REUSE (en_US)
dc.type: Dissertation (en_US)

Files

Original bundle

Name: Federer_umd_0117E_19889.pdf
Size: 6.7 MB
Format: Adobe Portable Document Format