Deriving Vegetation Variables from Satellite Observations using a Data-driven Approach
Abstract
Land remote sensing techniques offer unprecedented spatiotemporal coverage of key vegetation indicators used in climatological modeling and understanding, such as the fraction of vegetation cover (fCover). However, temporal inconsistencies and noise from snow cover, atmospheric contaminants, and viewing and illumination geometry hinder their applicability, necessitating validation against a more limited set of ground-based observations (GBOV). This research assesses three regression models (Cubist, XGBoost, and random forest) for predicting ground-measured fCover from satellite-derived features. Ground measurements of fCover from 43 National Ecological Observatory Network (NEON) sites were processed at 20 m spatial resolution to provide labels for training, validation, and testing. These labels were then upscaled to 500 m to align with the high-spatial-resolution land surface reflectance data provided by the Visible Infrared Imaging Radiometer Suite (VIIRS) daily surface reflectance (VNP09GA) product. When evaluated against unseen data, the random forest regression model showed the best agreement (R-squared = 0.912, MAE = 0.043), followed by the XGBoost regressor (R-squared = 0.910, MAE = 0.043) and the Cubist model (R-squared = 0.904, MAE = 0.047). Applying the random forest model to the 2023 VIIRS data for the East Coast produced estimates consistent with the expected annual phenological cycle. Limitations of the NEON site measurements may reduce global representativeness and introduce biases into the regression models. Future work should focus on directly validating performance and representativeness against existing global products, such as GEOV3 and MODIS, and at the more globally representative BELMANIP2 sites.
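As a rough illustration of two steps mentioned in the abstract, the sketch below aggregates a 20 m fCover grid to 500 m and computes the two reported agreement metrics. The abstract does not state how the upscaling was done, so the block-averaging approach, the function names, and the factor of 25 (500 m / 20 m) are assumptions made for illustration only, not the authors' method.

```python
import numpy as np

def upscale_block_mean(fine, factor=25):
    """Average non-overlapping factor x factor blocks, ignoring NaN pixels.

    Assumption: 20 m pixels are aggregated to 500 m by a simple block mean
    (factor = 500 / 20 = 25). The report may use a different scheme.
    """
    fine = np.asarray(fine, dtype=float)
    rows, cols = fine.shape
    if rows % factor or cols % factor:
        raise ValueError("grid dimensions must be multiples of factor")
    # Split each axis into (block index, position within block).
    blocks = fine.reshape(rows // factor, factor, cols // factor, factor)
    return np.nanmean(blocks, axis=(1, 3))

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

def mean_absolute_error(y_true, y_pred):
    """Mean of the absolute differences between labels and predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))

# Synthetic example: a 50 x 50 grid of 20 m pixels becomes a 2 x 2 grid at 500 m.
fine_grid = np.full((50, 50), 0.2)
fine_grid[:25, :25] = 0.8
coarse_grid = upscale_block_mean(fine_grid)
```

Here `coarse_grid` holds one fCover value per 500 m cell, which could then serve as a label paired with the reflectance features for that cell. The two metric functions would be applied to held-out labels and model predictions.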
Notes
This report was written at the completion of the 2025 CISESS Summer Internship Program.