TY - JOUR
T1 - Local and global encoder network for semantic segmentation of Airborne laser scanning point clouds
AU - Lin, Yaping
AU - Vosselman, G.
AU - Cao, Yanpeng
AU - Yang, Michael Ying
N1 - Publisher Copyright:
© 2021 The Author(s)
PY - 2021/6/1
Y1 - 2021/6/1
N2 - Interpretation of Airborne Laser Scanning (ALS) point clouds is a critical procedure for producing various geo-information products like 3D city models, digital terrain models and land use maps. In this paper, we present a local and global encoder network (LGENet) for semantic segmentation of ALS point clouds. Adapting the KPConv network, we first extract features by both 2D and 3D point convolutions to allow the network to learn more representative local geometry. Then global encoders are used in the network to exploit contextual information at the object and point level. We design a segment-based Edge Conditioned Convolution to encode the global context between segments. We apply a spatial-channel attention module at the end of the network, which not only captures the global interdependencies between points but also models interactions between channels. We evaluate our method on two ALS datasets, namely the ISPRS benchmark dataset and the DFC2019 dataset. For the ISPRS benchmark dataset, our model achieves state-of-the-art results with an overall accuracy of 0.845 and an average F1 score of 0.737. With regard to the DFC2019 dataset, our proposed network achieves an overall accuracy of 0.984 and an average F1 score of 0.834.
AB - Interpretation of Airborne Laser Scanning (ALS) point clouds is a critical procedure for producing various geo-information products like 3D city models, digital terrain models and land use maps. In this paper, we present a local and global encoder network (LGENet) for semantic segmentation of ALS point clouds. Adapting the KPConv network, we first extract features by both 2D and 3D point convolutions to allow the network to learn more representative local geometry. Then global encoders are used in the network to exploit contextual information at the object and point level. We design a segment-based Edge Conditioned Convolution to encode the global context between segments. We apply a spatial-channel attention module at the end of the network, which not only captures the global interdependencies between points but also models interactions between channels. We evaluate our method on two ALS datasets, namely the ISPRS benchmark dataset and the DFC2019 dataset. For the ISPRS benchmark dataset, our model achieves state-of-the-art results with an overall accuracy of 0.845 and an average F1 score of 0.737. With regard to the DFC2019 dataset, our proposed network achieves an overall accuracy of 0.984 and an average F1 score of 0.834.
KW - ITC-ISI-JOURNAL-ARTICLE
KW - UT-Hybrid-D
KW - ITC-HYBRID
UR - https://ezproxy2.utwente.nl/login?url=https://library.itc.utwente.nl/login/2021/isi/lin_loc.pdf
U2 - 10.1016/j.isprsjprs.2021.04.016
DO - 10.1016/j.isprsjprs.2021.04.016
M3 - Article
VL - 176
SP - 151
EP - 168
JO - ISPRS Journal of Photogrammetry and Remote Sensing
JF - ISPRS Journal of Photogrammetry and Remote Sensing
SN - 0924-2716
ER -