TY - JOUR
T1 - Box-level segmentation supervised deep neural networks for accurate and real-time multispectral pedestrian detection
AU - Cao, Yanpeng
AU - Guan, Dayan
AU - Wu, Yulun
AU - Yang, Jiangxin
AU - Cao, Yanlong
AU - Yang, Michael Ying
PY - 2019/4
Y1 - 2019/4
N2 - Effective fusion of complementary information captured by multi-modal sensors (visible and infrared cameras) enables robust pedestrian detection under various surveillance situations (e.g., daytime and nighttime). In this paper, we present a novel box-level segmentation supervised learning framework for accurate and real-time multispectral pedestrian detection by incorporating features extracted in visible and infrared channels. Specifically, our method takes pairs of aligned visible and infrared images with easily obtained bounding box annotations as input and estimates accurate prediction maps to highlight the existence of pedestrians. It offers two major advantages over existing anchor-box-based multispectral detection methods. First, it overcomes the hyperparameter setting problem that occurs during the training phase of anchor-box-based detectors and obtains more accurate detection results, especially for small and occluded pedestrian instances. Second, it is capable of generating accurate detection results from small-sized input images, improving computational efficiency for real-time autonomous driving applications. Experimental results on the KAIST multispectral dataset show that our proposed method outperforms state-of-the-art approaches in terms of both accuracy and speed.
KW - ITC-ISI-JOURNAL-ARTICLE
KW - Multispectral data
KW - Pedestrian detection
KW - Deep neural networks
KW - Box-level segmentation
KW - Real-time application
UR - https://ezproxy2.utwente.nl/login?url=https://library.itc.utwente.nl/login/2019/isi/yang_box.pdf
UR - https://ezproxy2.utwente.nl/login?url=https://doi.org/10.1016/j.isprsjprs.2019.02.005
U2 - 10.1016/j.isprsjprs.2019.02.005
DO - 10.1016/j.isprsjprs.2019.02.005
M3 - Article
SN - 0924-2716
VL - 150
SP - 70
EP - 79
JO - ISPRS Journal of Photogrammetry and Remote Sensing
JF - ISPRS Journal of Photogrammetry and Remote Sensing
IS - April
ER -