TY - JOUR
T1 - A benchmark dataset and workflow for landslide susceptibility zonation
AU - Alvioli, Massimiliano
AU - Loche, Marco
AU - Jacobs, Liesbet
AU - Grohmann, Carlos H.
AU - Abraham, Minu Treesa
AU - Gupta, Kunal
AU - Satyam, Neelima
AU - Scaringi, Gianvito
AU - Bornaetxea, Txomin
AU - Rossi, Mauro
AU - Marchesini, Ivan
AU - Lombardo, Luigi
AU - Moreno, Mateo
AU - Steger, Stefan
AU - Camera, Corrado A.S.
AU - Bajni, Greta
AU - Samodra, Guruh
AU - Wahyudi, Erwin Eko
AU - Susyanto, Nanang
AU - Sinčić, Marko
AU - Gazibara, Sanja Bernat
AU - Sirbu, Flavius
AU - Torizin, Jewgenij
AU - Schüßler, Nick
AU - Mirus, Benjamin B.
AU - Woodard, Jacob B.
AU - Aguilera, Héctor
AU - Rivera-Rivera, Jhonatan
N1 - Publisher Copyright:
© 2024 The Author(s)
PY - 2024/11
Y1 - 2024/11
N2 - Landslide susceptibility expresses the spatial likelihood of landslide occurrence in a given geographical area and is a relevant tool for mitigating the impact of landslides worldwide. As such, it is the subject of countless scientific studies. Many methods exist for generating a susceptibility map, most falling under the umbrella of statistical or machine learning approaches. These models try to solve a classification problem: given a collection of spatial variables and their combinations associated with landslide presence or absence, a model is trained and tested to reproduce the target outcome, and eventually applied to unseen data. Unlike many fields of science that use machine learning for specific tasks, landslide susceptibility has no reference dataset against which to assess the performance of a given method. Here, we propose a benchmark dataset consisting of 7360 slope units covering an area of about 4,100 km2 in Central Italy. Using the dataset, we tried to answer two open questions in landslide research: (1) what effect does human variability have on the creation of susceptibility models; (2) how can we develop a reproducible workflow that allows meaningful model comparisons within the landslide susceptibility research community. With these questions in mind, we released a preliminary version of the dataset, along with a “call for collaboration” aimed at collecting different calculations based on the proposed data, leaving freedom of implementation to the respondents. Contributions differed in many respects, including classification methods, choice of predictors, implementation of training/validation, and performance assessment. That feedback suggested refining the initial dataset and constraining the implementation workflow. The result is a final benchmark dataset and landslide susceptibility maps obtained with many classification methods.
Values of the area under the receiver operating characteristic curve obtained with the final benchmark dataset were rather similar, an effect of the constraints on training, cross-validation, and use of data. Brier score results, instead, show larger variability, ascribed to differences in model predictive ability. Correlation plots show similarities between results of different methods applied by the same group, ascribed to a residual implementation dependence. We stress that the experiment did not intend to select the “best” method, but only to establish a first benchmark dataset and workflow that may serve as a standard reference for calculations by other scholars. The experiment is, to our knowledge, the first of its kind for landslide susceptibility modeling. The data and workflow presented here comparatively assess the performance of independent methods for landslide susceptibility, and we suggest the benchmark approach as a best practice for quantitative research in the geosciences.
AB - Landslide susceptibility expresses the spatial likelihood of landslide occurrence in a given geographical area and is a relevant tool for mitigating the impact of landslides worldwide. As such, it is the subject of countless scientific studies. Many methods exist for generating a susceptibility map, most falling under the umbrella of statistical or machine learning approaches. These models try to solve a classification problem: given a collection of spatial variables and their combinations associated with landslide presence or absence, a model is trained and tested to reproduce the target outcome, and eventually applied to unseen data. Unlike many fields of science that use machine learning for specific tasks, landslide susceptibility has no reference dataset against which to assess the performance of a given method. Here, we propose a benchmark dataset consisting of 7360 slope units covering an area of about 4,100 km2 in Central Italy. Using the dataset, we tried to answer two open questions in landslide research: (1) what effect does human variability have on the creation of susceptibility models; (2) how can we develop a reproducible workflow that allows meaningful model comparisons within the landslide susceptibility research community. With these questions in mind, we released a preliminary version of the dataset, along with a “call for collaboration” aimed at collecting different calculations based on the proposed data, leaving freedom of implementation to the respondents. Contributions differed in many respects, including classification methods, choice of predictors, implementation of training/validation, and performance assessment. That feedback suggested refining the initial dataset and constraining the implementation workflow. The result is a final benchmark dataset and landslide susceptibility maps obtained with many classification methods.
Values of the area under the receiver operating characteristic curve obtained with the final benchmark dataset were rather similar, an effect of the constraints on training, cross-validation, and use of data. Brier score results, instead, show larger variability, ascribed to differences in model predictive ability. Correlation plots show similarities between results of different methods applied by the same group, ascribed to a residual implementation dependence. We stress that the experiment did not intend to select the “best” method, but only to establish a first benchmark dataset and workflow that may serve as a standard reference for calculations by other scholars. The experiment is, to our knowledge, the first of its kind for landslide susceptibility modeling. The data and workflow presented here comparatively assess the performance of independent methods for landslide susceptibility, and we suggest the benchmark approach as a best practice for quantitative research in the geosciences.
KW - Benchmark dataset
KW - Geomorphological mapping
KW - Geomorphometry
KW - Landslide inventory
KW - Landslide susceptibility
KW - Landslide susceptibility mapping
KW - Machine learning
KW - Slope units
KW - Spatial analysis
KW - Statistical modeling
KW - ITC-ISI-JOURNAL-ARTICLE
KW - ITC-HYBRID
U2 - 10.1016/j.earscirev.2024.104927
DO - 10.1016/j.earscirev.2024.104927
M3 - Review article
AN - SCOPUS:85204051068
SN - 0012-8252
VL - 258
JO - Earth-science reviews
JF - Earth-science reviews
M1 - 104927
ER -