Contextual Working Memory for Trans-Saccadic Object Recognition Using Reinforcement Learning and Informative Local Descriptors

Lucas Paletta, Christin Seifert, Gerald Fritz

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review



    Previous research on behavioural modelling of saccade-driven image interpretation (Henderson, 1982 Psychological Science 8 51–55) has emphasised the sampling of informative parts under visual attention to guide visual perception. We propose two major innovations to trans-saccadic object recognition. First, we model contextual tuning at the early visual processing stage. Salience in pre-processing is determined from descriptors of local gradient histogram patterns, namely SIFT features (Lowe, 2004 International Journal of Computer Vision 60 91–110). SIFT features are scale-, rotation-, and to a high degree illumination-tolerant, which substantially extends previously used edge features (Rybak et al, 1998 Vision Research 38 2387–2400) or appearance patterns (Paletta et al, 2004 Perception 33 Supplement, 126). Descriptors that are informative with respect to an information-theoretic framework (Fritz et al, 2004, in Proceedings of the International Conference on Pattern Recognition volume 2, pp 15–18) are selected and weighted according to contextual salience. Second, we develop a behavioural strategy for saccade-driven information access that operates on contextually selected features and attention shifts. It is modelled as a partially observable Markovian decision process and represented by a short-term working memory that generates discriminative perception-action sequences. The strategy is learned under exploration and reinforcement feedback using Q-learning, a machine-learning methodology representing operant conditioning. Saccadic targets are selected for attention only in a local neighbourhood of the currently focused descriptor. The learned strategy proposes next actions that maximise expected reward, eg minimisation of entropy in posterior object discrimination. We demonstrate the performance of this sensory-motor context in trans-saccadic outdoor object recognition, efficiently identifying building facades under different viewpoints, distances, and illumination conditions.
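    The learning rule named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a tabular Q-function, a reward equal to the drop in entropy of the posterior over object hypotheses after one saccade, and a working-memory state encoded as a tuple of visited descriptor ids and actions. The names `entropy_reward`, `q_update`, `ACTIONS`, and all numeric values are hypothetical.

```python
import math
from collections import defaultdict


def entropy(posterior):
    """Shannon entropy (in bits) of a discrete posterior over object hypotheses."""
    return -sum(p * math.log2(p) for p in posterior if p > 0.0)


def entropy_reward(prior, posterior):
    """Assumed reward: reduction in posterior entropy produced by one saccade."""
    return entropy(prior) - entropy(posterior)


def q_update(Q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step:
    Q(s, a) += alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))."""
    best_next = max(Q[(next_state, a)] for a in actions)
    Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
    return Q[(state, action)]


# Hypothetical setup: 8 discrete saccade directions within a local neighbourhood.
ACTIONS = tuple(range(8))
Q = defaultdict(float)  # unseen (state, action) pairs start at 0.0

# The state stands in for the short-term working memory: the sequence of
# fixated descriptor ids and saccade actions so far (illustrative values).
state = (17,)             # currently focused descriptor
action = 2                # chosen saccade direction
next_state = (17, 2, 5)   # memory extended by the action and the new descriptor

prior = [0.25, 0.25, 0.25, 0.25]   # belief over four objects before the saccade
posterior = [0.7, 0.1, 0.1, 0.1]   # belief after observing the new descriptor

reward = entropy_reward(prior, posterior)
new_q = q_update(Q, state, action, reward, next_state, ACTIONS)
```

    In this sketch a saccade that sharpens the posterior yields a positive reward, so repeated updates raise the value of perception-action sequences that discriminate between objects quickly.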
    Original language: English
    Title of host publication: European Conference on Visual Perception (ECVP 2005)
    Subtitle of host publication: XXVIII Annual Meeting, A Coruña - Spain 2005
    Place of Publication: Coruña, Spain
    Number of pages: 1
    Publication status: Published - 1 Aug 2005
    Event: 28th European Conference on Visual Perception, ECVP 2005 - Coruña, Spain
    Duration: 22 Aug 2005 – 26 Aug 2005
    Conference number: 28

    Conference: 28th European Conference on Visual Perception, ECVP 2005
    Abbreviated title: ECVP

