Learning convolutional neural networks for object detection with very little training data

Christoph Reinders, Hanno Ackermann, Michael Ying Yang, Bodo Rosenhahn

Research output: Chapter in Book/Report/Conference proceeding › Chapter › Academic › peer-review

Abstract

In recent years, convolutional neural networks have shown great success in various computer vision tasks such as classification, object detection, and scene analysis. These algorithms are usually trained on large datasets consisting of thousands or millions of labeled training examples. The availability of sufficient data, however, limits possible applications. While large amounts of data can be collected quickly, supervised learning additionally requires labels, and labeling is usually very time-consuming and expensive. This chapter addresses the problem of learning with very little labeled data for extracting information about the infrastructure in urban areas. The aim is to recognize particular traffic signs in crowdsourced data in order to collect information that is of interest to cyclists. The presented object detection system is trained with very few training examples. To achieve this, the advantages of convolutional neural networks and random forests are combined to learn a patch-wise classifier. In the next step, the random forest is mapped to a neural network and the classifier is transformed into a fully convolutional network. This significantly accelerates the processing of full images and allows bounding boxes to be predicted. Finally, GPS data is integrated to localize the predictions on the map, and multiple observations are merged to further improve the localization accuracy. Compared with Faster R-CNN and other object detection networks or transfer learning algorithms, the required amount of labeled data is considerably reduced.
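The abstract's step of mapping a random forest to a neural network can be illustrated on a single decision tree. The sketch below uses one well-known construction with two hidden layers (one neuron per split node, one neuron per leaf); it is not necessarily the construction used in the chapter. The toy tree, its thresholds, and all names (`SPLITS`, `LEAF_PATHS`, `build_network`, and so on) are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical toy tree on 2-D inputs (not taken from the chapter):
#   node 0: x[0] <= 0.5 ? go to node 1 : leaf 2 (class 1)
#   node 1: x[1] <= 0.3 ? leaf 0 (class 0) : leaf 1 (class 1)
SPLITS = [(0, 0.5), (1, 0.3)]  # (feature index, threshold) per split node
# Per leaf: split nodes on its path, -1 = left branch (x <= t), +1 = right branch
LEAF_PATHS = [{0: -1, 1: -1}, {0: -1, 1: +1}, {0: +1}]
LEAF_CLASS = [0, 1, 1]
N_FEATURES, N_CLASSES = 2, 2


def tree_predict(x):
    """Reference implementation: walk the toy tree directly."""
    if x[0] <= 0.5:
        return 0 if x[1] <= 0.3 else 1
    return 1


def build_network():
    """Translate the tree into weights of a network with two hidden layers."""
    # Layer 1: neuron i computes x[f_i] - t_i, one neuron per split node.
    W1 = np.zeros((len(SPLITS), N_FEATURES))
    b1 = np.zeros(len(SPLITS))
    for i, (feature, threshold) in enumerate(SPLITS):
        W1[i, feature] = 1.0
        b1[i] = -threshold
    # Layer 2: neuron l fires only if every split on the path to leaf l agrees.
    W2 = np.zeros((len(LEAF_PATHS), len(SPLITS)))
    b2 = np.zeros(len(LEAF_PATHS))
    for leaf, path in enumerate(LEAF_PATHS):
        for node, direction in path.items():
            W2[leaf, node] = direction
        b2[leaf] = -(len(path) - 0.5)
    # Output layer: each leaf votes for its class.
    W3 = np.zeros((N_CLASSES, len(LEAF_PATHS)))
    for leaf, cls in enumerate(LEAF_CLASS):
        W3[cls, leaf] = 1.0
    return W1, b1, W2, b2, W3


def network_predict(x, params):
    """Forward pass with hard threshold activations."""
    W1, b1, W2, b2, W3 = params
    h1 = np.where(W1 @ x + b1 > 0, 1.0, -1.0)  # +1 if x[f] > t, else -1
    h2 = (W2 @ h1 + b2 > 0).astype(float)      # one-hot indicator of the leaf
    return int(np.argmax(W3 @ h2))


params = build_network()
samples = np.random.default_rng(0).random((200, 2))
agree = all(network_predict(x, params) == tree_predict(x) for x in samples)
print("network matches tree on all samples:", agree)
```

With hard thresholds the network reproduces the tree exactly. Replacing them with smooth activations would make the network differentiable, so that it could be trained further with gradient descent; that extension is only noted here as an assumption and is not shown.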

Original language: English
Title of host publication: Multimodal Scene Understanding
Subtitle of host publication: Algorithms, Applications and Deep Learning
Editors: Michael Ying Yang, Bodo Rosenhahn, Vittorio Murino
Publisher: Elsevier
Chapter: 4
Pages: 65-100
Number of pages: 36
ISBN (Electronic): 9780128173589
DOIs: https://doi.org/10.1016/B978-0-12-817358-9.00010-X
Publication status: Published - 2 Aug 2019

Keywords

  • Convolutional neural networks
  • Localization
  • Object detection
  • Random forests

Cite this

    Reinders, C., Ackermann, H., Yang, M. Y., & Rosenhahn, B. (2019). Learning convolutional neural networks for object detection with very little training data. In M. Y. Yang, B. Rosenhahn, & V. Murino (Eds.), Multimodal Scene Understanding: Algorithms, Applications and Deep Learning (pp. 65-100). Elsevier. https://doi.org/10.1016/B978-0-12-817358-9.00010-X