Machine Learning Using Hyperspectral Data Inaccurately Predicts Plant Traits Under Spatial Dependency

A. Duarte Rocha (Corresponding Author), T.A. Groen, A.K. Skidmore, R. Darvishzadeh, L. Willemen

Research output: Contribution to journal › Article › Academic › Peer-reviewed

20 Citations (Scopus)
106 Downloads (Pure)


Spectral, temporal and spatial dimensions are difficult to model together when predicting in situ plant traits from remote sensing data. Therefore, machine learning algorithms based solely on spectral dimensions are often used as predictors, even when there is a strong effect of spatial or temporal autocorrelation in the data. A significant reduction in prediction accuracy is expected when algorithms are trained using a sequence in space or time that is unlikely to be observed again. The ensuing inability to generalise creates a necessity for ground-truth data for every new area or period, provoking the propagation of “single-use” models. This study assesses the impact of spatial autocorrelation on the generalisation of plant trait models predicted with hyperspectral data. Leaf Area Index (LAI) data generated at increasing levels of spatial dependency are used to simulate hyperspectral data using Radiative Transfer Models. Machine learning regressions to predict LAI at different levels of spatial dependency are then tuned (determining the optimum model complexity) using cross-validation as well as the NOIS method. The results show that cross-validated prediction accuracy tends to be overestimated when spatial structures present in the training data are fitted (or learned) by the model.
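The core effect the abstract describes — random cross-validation overestimating accuracy when training samples are spatially autocorrelated — can be illustrated with a minimal sketch. This is not the paper's code or the NOIS method; the synthetic data, block construction, and model choice are all illustrative assumptions. It contrasts random K-fold CV with spatially blocked CV on a target that varies smoothly over space:

```python
# Minimal sketch (illustrative, not the paper's method): random K-fold
# cross-validation can overestimate accuracy for a spatially
# autocorrelated target, compared with spatially blocked CV.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold, GroupKFold, cross_val_score

rng = np.random.default_rng(0)
n = 400
coords = rng.uniform(0, 10, size=(n, 2))  # sample locations

# Spatially autocorrelated target: smooth function of location plus noise
y = np.sin(coords[:, 0]) + np.cos(coords[:, 1]) + 0.1 * rng.standard_normal(n)

# Noisy predictor plus coordinates that leak the spatial structure
X = np.column_stack([y + rng.standard_normal(n), coords])

model = RandomForestRegressor(n_estimators=100, random_state=0)

# Random K-fold: spatially correlated neighbours end up split across
# training and test folds, so the model can "interpolate" the test data
r2_random = cross_val_score(
    model, X, y, cv=KFold(5, shuffle=True, random_state=0)
).mean()

# Spatially blocked CV: contiguous blocks (here, bands along the
# x-coordinate) are held out whole, forcing spatial extrapolation
blocks = np.digitize(coords[:, 0], bins=np.linspace(0, 10, 6)[1:-1])
r2_blocked = cross_val_score(model, X, y, cv=GroupKFold(5), groups=blocks).mean()

print(f"random CV R^2:  {r2_random:.2f}")
print(f"blocked CV R^2: {r2_blocked:.2f}")
```

The gap between the two R² scores mirrors the paper's finding: the randomly cross-validated score rewards the model for memorising spatial structure that will not recur in a new area.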
Original language: English
Article number: 1263
Pages (from-to): 1-20
Number of pages: 20
Journal: Remote Sensing
Issue number: 8
Publication status: Published - 1 Aug 2018
