A two-point machine learning method for the spatial prediction of soil pollution

Bingbo Gao, A. Stein, Jinfeng Wang*

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review

21 Citations (Scopus)
232 Downloads (Pure)

Abstract

Heavy metal pollution of soil is a worldwide problem. It is affected by many natural and human factors through heterogeneous relationships, so accurate prediction at unobserved locations from a limited number of observations remains a challenge. This study proposes a two-point machine learning method that uses the information in both spatial neighbors and high-dimensional covariates to improve prediction accuracy. The method models the difference between pairs of points, predicts the concentration differences between observation points and unobserved points, and uses those predicted differences for neighbor selection. This supervised learning method integrates both spatial autocorrelation and property similarity. A case study of soil Pb shows that the method can greatly improve prediction accuracy across different sample sizes. The improvement varies with sample size and tends to decrease as the sample size increases. Compared with ordinary kriging, kriging with external drift, random forest, and random forest-based regression kriging, the average improvements in RMSE are 1.49, 0.95, 0.93 and 0.62, respectively, and in MAE are 1.29, 1.17, 0.87 and 0.65, respectively. In the future, the method may be applied to the spatial prediction of other earth system variables, and the supervised learning method can be adjusted to new applications.
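
The two-point idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the learner, the pair features, or the neighbor selection rule. The sketch assumes a plain least-squares model as a stand-in for the machine learning model, assumes pair features made of coordinate offsets, distance, and covariate differences, and assumes that neighbors are the observations with the smallest predicted absolute difference. All function names are hypothetical.

```python
import numpy as np

def pair_features(c_a, x_a, c_b, x_b):
    """Features for point pairs (a, b): coordinate offset, distance, covariate difference."""
    d = c_a - c_b                                   # broadcasts if c_a is a single point
    dist = np.linalg.norm(d, axis=1, keepdims=True)
    return np.hstack([d, dist, x_a - x_b])

def make_pairs(coords, covs, values):
    """Training set built from all ordered pairs (i, j), i != j, of observed points."""
    n = len(values)
    i, j = np.where(~np.eye(n, dtype=bool))
    feats = pair_features(coords[i], covs[i], coords[j], covs[j])
    target = values[i] - values[j]                  # the modelled quantity is a difference
    return feats, target

def fit_difference_model(feats, target):
    """Assumed stand-in learner: linear least squares with an intercept."""
    design = np.hstack([feats, np.ones((len(feats), 1))])
    weights, *_ = np.linalg.lstsq(design, target, rcond=None)
    return weights

def predict_difference(weights, feats):
    return np.hstack([feats, np.ones((len(feats), 1))]) @ weights

def predict_point(weights, coords, covs, values, c0, x0, k=5):
    """Predict at an unobserved point from its k selected neighbors."""
    feats = pair_features(c0, x0, coords, covs)
    diff = predict_difference(weights, feats)       # predicted z(target) - z(observation i)
    chosen = np.argsort(np.abs(diff))[:k]           # assumed rule: smallest predicted |difference|
    return float(np.mean(values[chosen] + diff[chosen]))

# Synthetic data for illustration only (not the soil Pb data of the study).
rng = np.random.default_rng(0)
coords = rng.uniform(0.0, 10.0, size=(30, 2))
covs = rng.normal(size=(30, 2))
values = 2.0 * covs[:, 0] - covs[:, 1] + 3.0

feats, target = make_pairs(coords, covs, values)
weights = fit_difference_model(feats, target)

c0 = np.array([5.0, 5.0])
x0 = np.array([0.3, -0.7])
true_value = 2.0 * x0[0] - x0[1] + 3.0
pred = predict_point(weights, coords, covs, values, c0, x0, k=5)
```

Because the synthetic values are exactly linear in the covariates, the difference between any two points is exactly linear in the pair features, so the stand-in model recovers the target value up to floating-point error. With real data, a nonlinear learner such as a random forest would presumably take the place of the least-squares fit.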

Original language: English
Article number: 102742
Pages (from-to): 1-10
Number of pages: 10
Journal: International Journal of Applied Earth Observation and Geoinformation
Volume: 108
Publication status: Published - Apr 2022

Keywords

  • Soil heavy metal
  • Spatial heterogeneity
  • Spatial prediction
  • Two point machine learning
