Assessing the performance of the Gaussian Process Regression algorithm to fill gaps in the time-series of daily actual evapotranspiration of different crops in temperate and continental zones using ground and remotely sensed data
- Authors: Dario De Caro; Matteo Ippolito; Marcella Cannarozzo; Giuseppe Provenzano; Giuseppe Ciraolo
- Year of publication: 2023
- Type: Journal article
- OA Link: http://hdl.handle.net/10447/620173
Abstract
Knowledge of crop evapotranspiration is crucial for several hydrological processes, including those related to the management of agricultural water sources. In particular, estimates of actual evapotranspiration fluxes within fields are essential for managing irrigation strategies that save water and preserve water resources. Among the indirect methods for estimating actual evapotranspiration (ETa), the eddy covariance (EC) method provides continuous measurements of latent heat flux (LE). However, time series of EC measurements sometimes contain gaps caused by sensor malfunctions. To this end, Machine Learning (ML) techniques could be a powerful tool for filling possible gaps in the time series. In this paper, the Gaussian Process Regression (GPR) algorithm was applied to fill gaps in daily actual evapotranspiration. The technique was tested on six plots (two in Italy, three in the United States of America, and one in Canada) with different crops and climatic conditions, in order to assess the suitability of the ML model in various contexts. The available climate variables differed from site to site; therefore, the performance of the method was investigated on the basis of the information available at each site. Initially, ground and reanalysis data were compared where both were available, as were two different satellite products where both were available. Then, the GPR model was tested. The mean and covariance functions were set by considering a database of climate variables, soil water status measurements, and remotely sensed vegetation indices. Five different combinations of variables were then analyzed to verify the suitability of the ML approach when limited input data are available or when the weather variables are replaced with reanalysis data. Cross-validation was used to assess the performance of the procedure.
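As an illustration of the gap-filling idea, the following is a minimal sketch of Gaussian Process Regression written with the Python standard library only. It is not the authors' implementation: the squared-exponential kernel, the constant prior mean, the fixed hyperparameters (`length_scale`, `signal_var`, `noise_var`), the helper names, and the synthetic single-feature series are all assumptions made for the example, since the abstract does not specify them.

```python
import math

def rbf_kernel(x1, x2, length_scale=1.0, signal_var=1.0):
    """Squared-exponential covariance between two feature vectors (assumed kernel)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x1, x2))
    return signal_var * math.exp(-0.5 * sq_dist / length_scale ** 2)

def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

def cho_solve(L, b):
    """Solve (L L^T) x = b by forward, then backward substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gpr_fill(X_obs, y_obs, X_gap, length_scale=1.0, signal_var=1.0, noise_var=1e-3):
    """GP posterior mean at the gap points, using a constant prior mean
    equal to the average of the observed values (an assumption)."""
    n = len(X_obs)
    prior_mean = sum(y_obs) / n
    # Covariance of the observations, with the noise variance on the diagonal.
    K = [[rbf_kernel(X_obs[i], X_obs[j], length_scale, signal_var)
          + (noise_var if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    alpha = cho_solve(cholesky(K), [y - prior_mean for y in y_obs])
    return [prior_mean + sum(rbf_kernel(xg, X_obs[i], length_scale, signal_var) * alpha[i]
                             for i in range(n))
            for xg in X_gap]

# Synthetic daily series (hypothetical values, not data from the paper).
days = list(range(30))
features = [[d / 10.0] for d in days]            # one scaled predictor per day
eta_true = [3.0 + math.sin(f[0]) for f in features]
gap_days = [10, 11, 12]                          # days treated as missing

X_obs = [features[d] for d in days if d not in gap_days]
y_obs = [eta_true[d] for d in days if d not in gap_days]
X_gap = [features[d] for d in gap_days]

eta_filled = gpr_fill(X_obs, y_obs, X_gap)
for d, value in zip(gap_days, eta_filled):
    print(f"day {d}: filled={value:.3f}, withheld={eta_true[d]:.3f}")
```

Withholding known values and comparing them with the filled ones, as done here, is the basic idea behind a cross-validation check. In practice the feature vectors would hold several predictors (for example weather variables, soil water status, and vegetation indices), and the hyperparameters would normally be estimated from the data, for instance with a library such as scikit-learn's `GaussianProcessRegressor`.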
Model performance was assessed using the following statistical indicators: Root Mean Square Error (RMSE), coefficient of determination (R²), Mean Absolute Error (MAE), regression coefficient (b), and Nash-Sutcliffe Efficiency coefficient (NSE). The fairly high NSE values and the low RMSE values confirm the suitability of the proposed algorithm.
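The indicators listed above can be computed as in the following sketch. The abstract does not give the formulas, so two definitions are assumptions: R² is taken as the squared Pearson correlation between observed and predicted values, and b as the slope of a regression of predicted on observed values forced through the origin. The function name `performance_indicators` and the sample numbers are hypothetical.

```python
import math

def performance_indicators(observed, predicted):
    """Return RMSE, MAE, R2, b and NSE for paired observed/predicted values."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n

    sq_err = sum((p - o) ** 2 for o, p in zip(observed, predicted))
    rmse = math.sqrt(sq_err / n)
    mae = sum(abs(p - o) for o, p in zip(observed, predicted)) / n

    # Sums of squares and cross-products around the means.
    ss_obs = sum((o - mean_obs) ** 2 for o in observed)
    ss_pred = sum((p - mean_pred) ** 2 for p in predicted)
    cross = sum((o - mean_obs) * (p - mean_pred)
                for o, p in zip(observed, predicted))

    # Assumption: R2 as the squared Pearson correlation coefficient.
    r2 = cross ** 2 / (ss_obs * ss_pred)
    # Assumption: b as the slope of predicted = b * observed (no intercept).
    b = (sum(o * p for o, p in zip(observed, predicted))
         / sum(o ** 2 for o in observed))
    # Nash-Sutcliffe efficiency: 1 is a perfect fit; values at or below 0
    # mean the model is no better than the mean of the observations.
    nse = 1.0 - sq_err / ss_obs

    return {"RMSE": rmse, "MAE": mae, "R2": r2, "b": b, "NSE": nse}

# Hypothetical daily ETa values in mm/day, for illustration only.
obs = [1.0, 2.0, 3.0, 4.0]
sim = [1.1, 1.9, 3.2, 3.8]
scores = performance_indicators(obs, sim)
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

RMSE and MAE are expressed in the units of the data, while R², b and NSE are dimensionless, which is why the two groups are usually read together.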