Box-level segmentation supervised deep neural networks for accurate and real-time multispectral pedestrian detection

Yanpeng Cao, Dayan Guan, Yulun Wu, Jiangxin Yang (Corresponding Author), Yanlong Cao, Michael Ying Yang

Research output: Contribution to journal › Article › Academic › peer-review

1 Citation (Scopus)
1 Download (Pure)

Abstract

Effective fusion of the complementary information captured by multi-modal sensors (visible and infrared cameras) enables robust pedestrian detection under various surveillance conditions (e.g., daytime and nighttime). In this paper, we present a novel box-level segmentation supervised learning framework for accurate and real-time multispectral pedestrian detection that incorporates features extracted from both the visible and infrared channels. Specifically, our method takes pairs of aligned visible and infrared images with easily obtained bounding box annotations as input and estimates accurate prediction maps that highlight the presence of pedestrians. It offers two major advantages over existing anchor-box-based multispectral detection methods. Firstly, it avoids the hyperparameter-tuning problem that arises during the training of anchor-box-based detectors and obtains more accurate detection results, especially for small and occluded pedestrian instances. Secondly, it is capable of generating accurate detection results from small-sized input images, improving computational efficiency for real-time autonomous driving applications. Experimental results on the KAIST multispectral dataset show that our proposed method outperforms state-of-the-art approaches in terms of both accuracy and speed.
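The core supervision idea described above — deriving a pixel-wise training target directly from bounding box annotations instead of matching anchor boxes — can be illustrated with a minimal sketch. This is not the authors' implementation; the function name and the (x1, y1, x2, y2) box convention are assumptions for illustration only.

```python
import numpy as np

def boxes_to_mask(boxes, height, width):
    """Rasterize pedestrian bounding boxes into a binary box-level
    segmentation mask: 1 inside any annotated box, 0 elsewhere.

    boxes: iterable of (x1, y1, x2, y2) pixel coordinates,
           with x2/y2 exclusive.
    """
    mask = np.zeros((height, width), dtype=np.uint8)
    for x1, y1, x2, y2 in boxes:
        # Clip each box to the image bounds before filling.
        x1, y1 = max(0, int(x1)), max(0, int(y1))
        x2, y2 = min(width, int(x2)), min(height, int(y2))
        if x2 > x1 and y2 > y1:
            mask[y1:y2, x1:x2] = 1
    return mask
```

A mask like this can serve as a dense per-pixel target for a fully convolutional network, sidestepping anchor design entirely; at inference time, pedestrian locations are recovered from the predicted confidence map.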
Original language: English
Pages (from-to): 70-79
Number of pages: 10
Journal: ISPRS Journal of Photogrammetry and Remote Sensing
Volume: 150
Issue number: April
DOI: 10.1016/j.isprsjprs.2019.02.005
Publication status: Published - April 2019


Keywords

  • ITC-ISI-JOURNAL-ARTICLE
  • Multispectral data
  • Pedestrian detection
  • Deep neural networks
  • Box-level segmentation
  • Real-time application

Cite this

Cao, Y., Guan, D., Wu, Y., Yang, J., Cao, Y., & Yang, M. Y. (2019). Box-level segmentation supervised deep neural networks for accurate and real-time multispectral pedestrian detection. ISPRS Journal of Photogrammetry and Remote Sensing, 150, 70-79. https://doi.org/10.1016/j.isprsjprs.2019.02.005
