Perception-action based object detection from local descriptor combination and reinforcement learning

Lucas Paletta*, Gerald Fritz, Christin Seifert

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review



This work proposes learning visual encodings of attention patterns that enable sequential attention for object detection in real-world environments. The system embeds a saccadic decision procedure in a cascaded process in which visual evidence is probed at informative image locations. It is based on the extraction of information-theoretic saliency: informative local image descriptors are determined and provide selected foci of interest. The local information, in terms of codebook vector responses, and the geometric information in the shift of attention together contribute to the recognition states of a Markov decision process. A Q-learner then searches for useful actions towards salient locations, developing a strategy of action sequences directed in state space towards information maximization. The method is evaluated in outdoor object recognition and demonstrates efficient performance.
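The Q-learning step mentioned in the abstract can be illustrated with a minimal tabular sketch. This is not the authors' implementation: the eight-direction action set, the string-valued states, the reward values, and the `toy_step` environment are all assumptions made for illustration, standing in for the recognition states (codebook response plus shift of attention) described above.

```python
import random
from collections import defaultdict

# Assumption: shifts of attention are discretized into 8 directions.
ACTIONS = list(range(8))


def q_update(Q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """One tabular Q-learning step:
    Q(s, a) += alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))."""
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])


def choose_action(Q, state, epsilon, rng):
    """Epsilon-greedy choice among the attention shifts."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])


def toy_step(state, action):
    """Hypothetical one-step environment (not from the paper):
    from "start", only shift 3 lands on an informative location."""
    if state == "start" and action == 3:
        return "object", 1.0
    return "background", 0.0


def train(episodes=500, seed=0):
    """Run single-step episodes and return the learned Q-table."""
    rng = random.Random(seed)
    Q = defaultdict(float)  # unseen (state, action) pairs default to 0.0
    for _ in range(episodes):
        state = "start"
        action = choose_action(Q, state, epsilon=0.3, rng=rng)
        next_state, reward = toy_step(state, action)
        q_update(Q, state, action, reward, next_state)
    return Q


Q = train()
best_shift = max(ACTIONS, key=lambda a: Q[("start", a)])
```

In this toy setting the greedy policy settles on the single rewarded shift. In the setting the abstract describes, the reward would instead reflect the information gained at the attended location, and episodes would span several saccades.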

Original language: English
Title of host publication: Image Analysis
Subtitle of host publication: 14th Scandinavian Conference, SCIA 2005, Joensuu, Finland, June 19-22, 2005, Proceedings
Editors: Heikki Kalviainen, Jussi Parkkinen, Arto Kaarna
Place of Publication: Berlin, Heidelberg
Number of pages: 10
ISBN (Electronic): 978-3-540-31566-7
ISBN (Print): 978-3-540-26320-3
Publication status: Published - 2005
Externally published: Yes
Event: 14th Scandinavian Conference on Image Analysis, SCIA 2005 - Joensuu, Finland
Duration: 19 Jun 2005 to 22 Jun 2005
Conference number: 14

Publication series

Name: Lecture Notes in Computer Science
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Conference: 14th Scandinavian Conference on Image Analysis, SCIA 2005
Abbreviated title: SCIA


  • n/a OA procedure

