An implicit spatiotemporal shape model for human activity localization and recognition

A. Oikonomopoulos, I. Patras, Maja Pantic

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

    22 Citations (Scopus)
    40 Downloads (Pure)

    Abstract

    In this paper we address the problem of localization and recognition of human activities in unsegmented image sequences. The main contribution of the proposed method is the use of an implicit representation of the spatiotemporal shape of the activity, which relies on the spatiotemporal localization of characteristic, sparse 'visual words' and 'visual verbs'. Evidence for the spatiotemporal localization of the activity is accumulated in a probabilistic spatiotemporal voting scheme. The local nature of our voting framework allows us to recover multiple activities that take place in the same scene, as well as activities in the presence of clutter and occlusions. We construct class-specific codebooks using the descriptors in the training set, where we take the spatial co-occurrences of pairs of codewords into account. The positions of the codeword pairs with respect to the object centre, as well as the frame in the training set in which they occur, are subsequently stored in order to create a spatiotemporal model of codeword co-occurrences. During the testing phase, we use mean shift mode estimation to spatially segment the subject that performs the activities in every frame, and the Radon transform to extract the most probable hypotheses concerning the temporal segmentation of the activities within the continuous stream.
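The voting and mode-estimation steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `cast_votes` and `mean_shift_mode`, the Gaussian kernel, and the `bandwidth` value are all assumptions chosen for illustration, and the codeword offsets are taken as given rather than learned from a codebook. Each detected codeword pair votes for a subject centre, and a weighted mean shift iteration then climbs to the densest cluster of votes.

```python
import numpy as np


def cast_votes(positions, offsets, weights):
    """Turn detections into centre votes (hypothetical helper).

    positions: (N, 2) image locations of detected codeword pairs.
    offsets:   (N, 2) stored displacements from each pair to the object centre.
    weights:   (N,)   vote weights, e.g. match probabilities.
    """
    votes = np.asarray(positions, dtype=float) + np.asarray(offsets, dtype=float)
    return votes, np.asarray(weights, dtype=float)


def mean_shift_mode(votes, weights, start, bandwidth=5.0, tol=1e-3, max_iter=100):
    """Weighted mean shift with a Gaussian kernel (an assumed kernel choice).

    Starting from `start`, repeatedly move to the kernel-weighted mean of the
    votes until the shift is smaller than `tol` or `max_iter` is reached.
    """
    x = np.asarray(start, dtype=float)
    for _ in range(max_iter):
        sq_dist = np.sum((votes - x) ** 2, axis=1)
        kernel = weights * np.exp(-0.5 * sq_dist / bandwidth ** 2)
        new_x = (kernel[:, None] * votes).sum(axis=0) / kernel.sum()
        if np.linalg.norm(new_x - x) < tol:
            return new_x
        x = new_x
    return x


if __name__ == "__main__":
    # Five consistent votes around one centre plus two clutter votes.
    positions = np.array([[48.0, 45.0], [55.0, 38.0], [47.0, 41.0],
                          [50.0, 44.0], [52.0, 36.0], [5.0, 85.0], [95.0, 10.0]])
    offsets = np.array([[2.0, -5.0], [-4.0, 2.0], [2.0, -1.0],
                        [0.0, -3.0], [-2.0, 3.0], [5.0, 5.0], [-5.0, -5.0]])
    votes, weights = cast_votes(positions, offsets, np.ones(len(positions)))
    # Start from the plain weighted mean of all votes.
    start = np.average(votes, axis=0, weights=weights)
    print(mean_shift_mode(votes, weights, start))
```

Because the kernel weight decays quickly with distance, isolated clutter votes contribute almost nothing to the update, which is one way to read the abstract's claim that the local nature of the voting copes with clutter and occlusion.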
    Original language: Undefined
    Title of host publication: IEEE International Conference on Computer Vision and Pattern Recognition
    Place of Publication: Los Alamitos
    Publisher: IEEE Computer Society
    Pages: 27-33
    Number of pages: 7
    ISBN (Print): 978-1-4244-3994-2
    DOIs: https://doi.org/10.1109/CVPR.2009.5204262
    Publication status: Published - 2009

    Publication series

    Name
    Publisher: IEEE Computer Society Press
    Volume: 3

    Keywords

    • METIS-264326
    • HMI-HF: Human Factors
    • Temporal segmentation
    • Radon transform
    • activities recovering
    • EWI-17213
    • EC Grant Agreement nr.: FP7/231287
    • visual verbs
    • visual words
    • probabilistic spatiotemporal voting scheme
    • spatial co-occurrences
    • human activity localization
    • human activity recognition
    • mean shift mode estimation
    • training set
    • unsegmented image sequences
    • class-specific codebooks
    • implicit representation
    • implicit spatiotemporal shape model
    • HMI-MI: MULTIMODAL INTERACTIONS
    • IR-69561

    Cite this

    Oikonomopoulos, A., Patras, I., & Pantic, M. (2009). An implicit spatiotemporal shape model for human activity localization and recognition. In IEEE International Conference on Computer Vision and Pattern Recognition (pp. 27-33). [10.1109/CVPR.2009.5204262] Los Alamitos: IEEE Computer Society. https://doi.org/10.1109/CVPR.2009.5204262