Visual Alphabets: Video classification by end users

Menno Israël, Egon van den Broek, Peter van der Putten, Marten J. den Uyl

    Research output: Chapter in Book/Report/Conference proceeding › Chapter › Academic › peer-review

    2 Citations (Scopus)
    40 Downloads (Pure)


    The work presented here introduces a real-time automatic scene classifier within content-based video retrieval. In our envisioned approach, end users such as documentalists, not image processing experts, build classifiers interactively, by simply indicating positive examples of a scene. Classification consists of a two-stage procedure. First, small image fragments called patches are classified. Second, frequency vectors of these patch classifications are fed into a second classifier for global scene classification (e.g., city, portraits, or countryside). The first-stage classifiers can be seen as a set of highly specialized, learned feature detectors, as an alternative to letting an image processing expert determine features a priori. The end user or domain expert thus builds a visual alphabet that can be used to describe the image in features that are relevant for the task at hand. We present results for experiments on a variety of patch and image classes. The scene classifier approach has been successfully applied to other domains of video content analysis, such as content-based video retrieval in television archives, automated sewer inspection, and porn filtering.
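    The two-stage procedure described in the abstract can be sketched in a few lines of Python. This is a minimal illustrative assumption, not the authors' implementation: the patch alphabet ("sky", "grass", "building"), the intensity-threshold patch classifier, and the rule-based scene classifier are all hypothetical stand-ins for the learned classifiers the chapter describes.

    ```python
    # Hedged sketch of the two-stage scene classification described above.
    # Stage 1 labels small image patches; stage 2 classifies the whole scene
    # from the frequency vector of those patch labels (the "visual alphabet").
    # All class names and thresholds here are illustrative assumptions.

    from collections import Counter

    PATCH_CLASSES = ["sky", "grass", "building"]  # example visual alphabet

    def extract_patches(image, size=2):
        """Split a 2-D grid of pixel intensities into non-overlapping size x size patches."""
        patches = []
        for r in range(0, len(image) - size + 1, size):
            for c in range(0, len(image[0]) - size + 1, size):
                patches.append([image[r + dr][c + dc]
                                for dr in range(size) for dc in range(size)])
        return patches

    def classify_patch(patch):
        """Stage 1: toy patch classifier -- label by mean intensity band."""
        mean = sum(patch) / len(patch)
        if mean > 170:
            return "sky"
        if mean > 85:
            return "building"
        return "grass"

    def frequency_vector(labels):
        """Normalised histogram of patch labels: the features for stage 2."""
        counts = Counter(labels)
        total = len(labels)
        return [counts[c] / total for c in PATCH_CLASSES]

    def classify_scene(freq):
        """Stage 2: toy scene classifier over the patch-frequency vector."""
        sky, grass, building = freq
        if building >= 0.5:
            return "city"
        if grass >= 0.5:
            return "countryside"
        return "other"

    # Usage: a synthetic 4x4 "image" of pixel intensities
    image = [[200, 210, 60, 50],
             [205, 220, 55, 40],
             [30, 20, 200, 210],
             [25, 35, 215, 205]]
    labels = [classify_patch(p) for p in extract_patches(image)]
    scene = classify_scene(frequency_vector(labels))
    ```

    In the chapter's setting, both stages are learned from the end user's positive examples rather than hand-coded as above; the point of the sketch is only the data flow: patches → patch labels → frequency vector → scene label.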
    Original language: Undefined
    Title of host publication: Multimedia Data Mining and Knowledge Discovery
    Editors: Valery A. Petrushin, Latifur Khan
    Place of Publication: London
    Number of pages: 22
    ISBN (Print): 978-1-84628-436-6
    Publication status: Published - 2007

    Publication series

    Name: Chapter 10


    • IR-58737
    • METIS-243090
    • Image Processing
    • Classification
    • content
    • scenes
    • Video Retrieval
    • Real Time
    • EWI-20854
    • patches
