Classifying visemes for automatic lipreading

Michiel Visser, Mannes Poel, Antinus Nijholt

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

    12 Citations (Scopus)

    Abstract

    Automatic lipreading is automatic speech recognition that uses only visual information. The relevant data in a video signal is isolated and features are extracted from it. From a sequence of feature vectors, where every vector represents one video image, a sequence of higher-level semantic elements is formed. These semantic elements are "visemes", the visual equivalent of "phonemes". The developed prototype uses a Time Delayed Neural Network to classify the visemes.
    Original language: Undefined
    Title of host publication: International Workshop Text, Speech and Dialogue (TSD'99)
    Editors: Vaclav Matousek, Pavel Mautner, Jana Ocelikova, Petr Sojka
    Place of Publication: Berlin
    Publisher: Springer
    Pages: 349-352
    Number of pages: 4
    ISBN (Print): 3-540-66494-7
    Publication status: Published - 1 Sept 1999
    Event: 2nd Text, Speech & Dialogue Workshop, TSD 1999 - Plzen, Czech Republic
    Duration: 13 Sept 1999 → 17 Sept 1999
    Conference number: 2

    Publication series

    Name: Lecture Notes in Computer Science
    Publisher: Springer Verlag
    Volume: 1692
    ISSN (Print): 0302-9743

    Workshop

    Workshop: 2nd Text, Speech & Dialogue Workshop, TSD 1999
    Abbreviated title: TSD 1999
    Country/Territory: Czech Republic
    City: Plzen
    Period: 13/09/99 → 17/09/99

    Keywords

    • EWI-9759
    • IR-64013
    • METIS-119592
    • HMI-MI: MULTIMODAL INTERACTIONS
