Data rescue: An assessment framework for legacy research collections


Widespread investments in the reproducibility and reuse of scientific data have spurred increasing recognition of the potential value of data hiding in unpublished records and collections of legacy research materials, such as scientists’ papers, historical publications, and working files. Recovering usable scientific data from legacy collections constitutes one kind of data rescue: the application of selected data curation processes to data at imminent risk of loss. Given the growing interest in data-intensive science and the growing movement toward computationally amenable collections in memory institutions, the National Agricultural Library and other curation institutions need systematic approaches to processing legacy collections with the specific goal of retrieving reusable or historically valuable scientific data. This white paper reports on research conducted under the auspices of the Digital Curation Fellows Program, a collaborative research initiative of the United States Department of Agriculture’s National Agricultural Library and the University of Maryland College of Information Studies. We offer a framework for assessing collections of scientific records for the purpose of data rescue, developed through research on three case studies of agricultural research collections. This framework aims to guide data rescue initiatives at the National Agricultural Library and other agricultural research centers, and to provide conceptual and practical framing for emerging conversations around data rescue in the agricultural research community and across disciplines.