Endangered Data Preservation: Organizing and Visualizing Thoreau's Botanical Observations
Abstract
Identifying and preserving boutique sets of historical botanical data poses several unique problems in research data management. For example, Henry David Thoreau's journals contain invaluable data on the spring blooming of thousands of species in and around Massachusetts. However, as the years have passed, the botanical names of those species have shifted to their current taxonomic names. The aim of this research was to capture historical botanical data from an at-risk website developed in the early 2000s and transform it into a usable, easily accessible, and manipulable format, essentially creating a "living data set" that can be updated easily as botanical nomenclature shifts. This poster will discuss the issues involved in identifying high-value, at-risk boutique data sets, scraping the website to gather the data, and transforming the data into a usable set.
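The scrape-and-transform workflow described above might be sketched roughly as follows. This is a minimal illustration in Python, not the project's actual code: the table markup, the column layout (species, year, first bloom), the function names, and the sample values are all hypothetical placeholders. The sketch assumes the observations sit in a plain HTML table and that current names are kept in a separate lookup, so the name as originally recorded is never overwritten.

```python
import csv
import io
from html.parser import HTMLParser

# Placeholder markup and values (assumptions, not real observations).
SAMPLE_HTML = """
<table>
<tr><th>Species</th><th>Year</th><th>First bloom</th></tr>
<tr><td>Genus oldname</td><td>1900</td><td>05-01</td></tr>
<tr><td>Genus stable</td><td>1900</td><td>05-02</td></tr>
</table>
"""


class TableRowParser(HTMLParser):
    """Collect the text of each <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:  # header rows contain only <th>, so they stay empty
                self.rows.append(self._row)
            self._row = None


def scrape_rows(html):
    """Return the data rows of the table as lists of strings."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    return parser.rows


def to_records(rows, synonyms):
    """Pair each recorded name with its current name from the lookup."""
    return [
        {
            "recorded_name": name,
            # Fall back to the recorded name when no newer name is known.
            "current_name": synonyms.get(name, name),
            "year": int(year),
            "first_bloom": bloom,
        }
        for name, year, bloom in rows
    ]


def to_csv(records):
    """Serialize the records to CSV text."""
    fields = ["recorded_name", "current_name", "year", "first_bloom"]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields)
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


rows = scrape_rows(SAMPLE_HTML)
records = to_records(rows, {"Genus oldname": "Genus newname"})
csv_text = to_csv(records)
```

Keeping the synonym lookup separate from the scraped rows is one way to make the set "living": when nomenclature changes, only the lookup needs editing, and the export can be regenerated without touching the historical record.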