Abstract
Automatic lipreading is automatic speech recognition that uses only visual information. The relevant data in a video signal is isolated and features are extracted from it. From a sequence of feature vectors, where every vector represents one video image, a sequence of higher-level semantic elements is formed. These semantic elements are "visemes", the visual equivalent of "phonemes". The developed prototype uses a Time Delayed Neural Network to classify the visemes.
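The classification step described in the abstract can be illustrated with a minimal sketch of a time-delay neural network (TDNN) layer, in which every output frame is computed from a short window of consecutive feature vectors using weights shared across time. This is not the prototype's implementation: the function names `time_delay_layer` and `classify_visemes`, the layer sizes, the `tanh` activation, and the random (untrained) weights are all assumptions chosen for illustration.

```python
import numpy as np

def time_delay_layer(frames, weights, bias):
    """One time-delay layer: each output step sees `delays` consecutive frames.

    frames:  (T, D) array, one D-dimensional feature vector per video image
    weights: (delays, D, H) array, shared across all time positions
    bias:    (H,) array
    returns: (T - delays + 1, H) array of hidden activations
    """
    delays = weights.shape[0]
    steps = frames.shape[0] - delays + 1
    out = np.empty((steps, weights.shape[2]))
    for t in range(steps):
        window = frames[t:t + delays]  # (delays, D) slice of the sequence
        out[t] = np.tanh(np.einsum("kd,kdh->h", window, weights) + bias)
    return out

def classify_visemes(frames, w1, b1, w2, b2):
    """Return one viseme index per time window (hypothetical label set)."""
    hidden = time_delay_layer(frames, w1, b1)
    scores = hidden @ w2 + b2  # (steps, n_visemes)
    return scores.argmax(axis=1)

# All sizes below are illustrative assumptions, not values from the paper.
T, D, H, N_VISEMES, DELAYS = 20, 8, 16, 5, 3
rng = np.random.default_rng(0)
frames = rng.normal(size=(T, D))  # stand-in for extracted visual features
w1 = rng.normal(size=(DELAYS, D, H)) * 0.1
b1 = np.zeros(H)
w2 = rng.normal(size=(H, N_VISEMES)) * 0.1
b2 = np.zeros(N_VISEMES)

labels = classify_visemes(frames, w1, b1, w2, b2)
```

Because the same weights are applied at every time position, shifting the input sequence by one frame shifts the output by one step, which is the property that makes a TDNN tolerant to where in the video a viseme occurs.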
| Original language | Undefined |
|---|---|
| Title of host publication | International Workshop Text, Speech and Dialogue (TSD'99) |
| Editors | Vaclav Matousek, Pavel Mautner, Jana Ocelikova, Petr Sojka |
| Place of Publication | Berlin |
| Publisher | Springer |
| Pages | 349-352 |
| Number of pages | 4 |
| ISBN (Print) | 3-540-66494-7 |
| Publication status | Published - 1 Sept 1999 |
| Event | 2nd Text, Speech & Dialogue Workshop, TSD 1999, Plzen, Czech Republic. Duration: 13 Sept 1999 → 17 Sept 1999. Conference number: 2 |
Publication series
| Name | Lecture Notes in Computer Science |
|---|---|
| Publisher | Springer Verlag |
| Volume | 1692 |
| ISSN (Print) | 0302-9743 |
Workshop
| Workshop | 2nd Text, Speech & Dialogue Workshop, TSD 1999 |
|---|---|
| Abbreviated title | TSD 1999 |
| Country/Territory | Czech Republic |
| City | Plzen |
| Period | 13/09/99 → 17/09/99 |
Keywords
- EWI-9759
- IR-64013
- METIS-119592
- HMI-MI: MULTIMODAL INTERACTIONS