Machine Learning Using Hyperspectral Data Inaccurately Predicts Plant Traits Under Spatial Dependency

A. Duarte Rocha (Corresponding Author), T.A. Groen, A.K. Skidmore, R. Darvishzadeh, L. Willemen

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

Spectral, temporal and spatial dimensions are difficult to model together when predicting in situ plant traits from remote sensing data. Therefore, machine learning algorithms solely based on spectral dimensions are often used as predictors, even when there is a strong effect of spatial or temporal autocorrelation in the data. A significant reduction in prediction accuracy is expected when algorithms are trained using a sequence in space or time that is unlikely to be observed again. The ensuing inability to generalise creates a necessity for ground-truth data for every new area or period, provoking the propagation of “single-use” models. This study assesses the impact of spatial autocorrelation on the generalisation of plant trait models predicted with hyperspectral data. Leaf Area Index (LAI) data generated at increasing levels of spatial dependency are used to simulate hyperspectral data using Radiative Transfer Models. Machine learning regressions to predict LAI at different levels of spatial dependency are then tuned (determining the optimum model complexity) using cross-validation as well as the NOIS method. The results show that cross-validated prediction accuracy tends to be overestimated when spatial structures present in the training data are fitted (or learned) by the model
Original language: English
Article number: 1263
Pages (from-to): 1-20
Number of pages: 20
Journal: Remote Sensing
Volume: 10
Issue number: 8
DOI: 10.3390/rs10081263
Publication status: Published - 1 Aug 2018

Keywords

  • ITC-ISI-JOURNAL-ARTICLE
  • ITC-GOLD

Cite this

@article{5e227e80ea59464dbfd9d13f14a39ec7,
title = "Machine Learning Using Hyperspectral Data Inaccurately Predicts Plant Traits Under Spatial Dependency",
abstract = "Spectral, temporal and spatial dimensions are difficult to model together when predicting in situ plant traits from remote sensing data. Therefore, machine learning algorithms solely based on spectral dimensions are often used as predictors, even when there is a strong effect of spatial or temporal autocorrelation in the data. A significant reduction in prediction accuracy is expected when algorithms are trained using a sequence in space or time that is unlikely to be observed again. The ensuing inability to generalise creates a necessity for ground-truth data for every new area or period, provoking the propagation of “single-use” models. This study assesses the impact of spatial autocorrelation on the generalisation of plant trait models predicted with hyperspectral data. Leaf Area Index (LAI) data generated at increasing levels of spatial dependency are used to simulate hyperspectral data using Radiative Transfer Models. Machine learning regressions to predict LAI at different levels of spatial dependency are then tuned (determining the optimum model complexity) using cross-validation as well as the NOIS method. The results show that cross-validated prediction accuracy tends to be overestimated when spatial structures present in the training data are fitted (or learned) by the model",
keywords = "ITC-ISI-JOURNAL-ARTICLE, ITC-GOLD",
author = "{Duarte Rocha}, A. and T.A. Groen and A.K. Skidmore and R. Darvishzadeh and L. Willemen",
year = "2018",
month = "8",
day = "1",
doi = "10.3390/rs10081263",
language = "English",
volume = "10",
pages = "1--20",
journal = "Remote Sensing",
issn = "2072-4292",
publisher = "Multidisciplinary Digital Publishing Institute",
number = "8",
}

Machine Learning Using Hyperspectral Data Inaccurately Predicts Plant Traits Under Spatial Dependency. / Duarte Rocha, A. (Corresponding Author); Groen, T.A.; Skidmore, A.K.; Darvishzadeh, R.; Willemen, L.

In: Remote Sensing, Vol. 10, No. 8, 1263, 01.08.2018, p. 1-20.

TY - JOUR
T1 - Machine Learning Using Hyperspectral Data Inaccurately Predicts Plant Traits Under Spatial Dependency
AU - Duarte Rocha, A.
AU - Groen, T.A.
AU - Skidmore, A.K.
AU - Darvishzadeh, R.
AU - Willemen, L.
PY - 2018/8/1
Y1 - 2018/8/1
N2 - Spectral, temporal and spatial dimensions are difficult to model together when predicting in situ plant traits from remote sensing data. Therefore, machine learning algorithms solely based on spectral dimensions are often used as predictors, even when there is a strong effect of spatial or temporal autocorrelation in the data. A significant reduction in prediction accuracy is expected when algorithms are trained using a sequence in space or time that is unlikely to be observed again. The ensuing inability to generalise creates a necessity for ground-truth data for every new area or period, provoking the propagation of “single-use” models. This study assesses the impact of spatial autocorrelation on the generalisation of plant trait models predicted with hyperspectral data. Leaf Area Index (LAI) data generated at increasing levels of spatial dependency are used to simulate hyperspectral data using Radiative Transfer Models. Machine learning regressions to predict LAI at different levels of spatial dependency are then tuned (determining the optimum model complexity) using cross-validation as well as the NOIS method. The results show that cross-validated prediction accuracy tends to be overestimated when spatial structures present in the training data are fitted (or learned) by the model
AB - Spectral, temporal and spatial dimensions are difficult to model together when predicting in situ plant traits from remote sensing data. Therefore, machine learning algorithms solely based on spectral dimensions are often used as predictors, even when there is a strong effect of spatial or temporal autocorrelation in the data. A significant reduction in prediction accuracy is expected when algorithms are trained using a sequence in space or time that is unlikely to be observed again. The ensuing inability to generalise creates a necessity for ground-truth data for every new area or period, provoking the propagation of “single-use” models. This study assesses the impact of spatial autocorrelation on the generalisation of plant trait models predicted with hyperspectral data. Leaf Area Index (LAI) data generated at increasing levels of spatial dependency are used to simulate hyperspectral data using Radiative Transfer Models. Machine learning regressions to predict LAI at different levels of spatial dependency are then tuned (determining the optimum model complexity) using cross-validation as well as the NOIS method. The results show that cross-validated prediction accuracy tends to be overestimated when spatial structures present in the training data are fitted (or learned) by the model
KW - ITC-ISI-JOURNAL-ARTICLE
KW - ITC-GOLD
UR - https://ezproxy2.utwente.nl/login?url=https://webapps.itc.utwente.nl/library/2018/isi/groen_mac.pdf
U2 - 10.3390/rs10081263
DO - 10.3390/rs10081263
M3 - Article
VL - 10
SP - 1
EP - 20
JO - Remote Sensing
JF - Remote Sensing
SN - 2072-4292
IS - 8
M1 - 1263
ER -