Parameterized and Machine Learning Methods for Estimating Evapotranspiration from Satellite Data

The studies in this dissertation evaluate and improve parametric and machine learning regression methods for estimating evapotranspiration from remote sensing data. The dissertation has three main parts. The first part assesses parametric regression methods for obtaining evapotranspiration from a vegetation index and other variables. Including more variables tends to improve results, but the form of the regression formula does not make a large difference. Algorithm performance is weaker for wetland and agricultural sites than for other land cover types. Retraining the algorithms for those surface types results in some improvement. The second part evaluates ten machine learning techniques for retrieving evapotranspiration from surface radiation and several other variables. The best results are obtained by using all available input variables to train the bootstrap aggregation tree, random kernel, and two- and three-hidden-layer neural network algorithms. Performance is again weaker for wetland and agricultural surface types than for other surface types. However, training the machine learning algorithms separately with data from those surface types does not significantly improve performance. The third part further refines the machine learning algorithms and applies the bootstrap aggregation tree method to generate evapotranspiration maps of the continental United States for 2012. Separating snow and non-snow data points improves performance. Performance is similar for all tested algorithms on the validation data set, but the bootstrap aggregation tree performs best on an independent test data set. Monthly mean maps of the continental United States are generated for the drought year 2012 using the bootstrap aggregation tree.
Mapped evapotranspiration levels are lower than those shown in comparison data sets for the growing season in the eastern United States, a result of a low bias at high evapotranspiration values. Retraining with the training data set weighted towards higher evapotranspiration values reduces this discrepancy but does not eliminate it. These results indicate that machine learning evapotranspiration estimates depend strongly on the composition of the training data set.