We are developing a dialogue manager (DM) for a multimodal interactive Question Answering (QA) system. Our QA system presents answers using text and pictures, and the user may pose follow-up questions using text or speech, while indicating screen elements with the mouse. We developed a corpus of multimodal follow-up questions for this system. This paper describes a detailed analysis of this corpus and its impact on the implementation of our system. We found that users pose two major types of follow-up questions: regular questions, which may be reformulated so that a QA system can answer them, and questions asking about a specific picture. We found that users often indicate screen elements with the mouse, even in cases where doing so may be considered redundant, and that these mouse gestures appear to correspond closely to regular anaphors in the utterance. We also found that users indicate screen elements with the mouse in only a limited number of ways. We argue that our QA system will need to annotate its pictures with information about the visual elements that make up each picture. This enables appropriate anaphor resolution and the answering of identity questions about these elements. We present our first results on follow-up question handling and deictic reference resolution, using annotations we made for the pictures in the corpus.
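The idea of annotating pictures with their visual elements, so that a mouse gesture can be resolved to a referent, can be illustrated with a minimal sketch. It is not the authors' implementation: the `VisualElement` class, the bounding-box representation, the `resolve_click` function, and the example labels are all assumptions made for illustration, and the abstract does not specify how annotations or gestures are actually represented.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class VisualElement:
    """Hypothetical annotation: a labelled rectangular region of a picture."""
    label: str
    x0: float
    y0: float
    x1: float
    y1: float

    def contains(self, x: float, y: float) -> bool:
        # True if the point lies inside (or on the border of) the region.
        return self.x0 <= x <= self.x1 and self.y0 <= y <= self.y1

    def area(self) -> float:
        return (self.x1 - self.x0) * (self.y1 - self.y0)


def resolve_click(elements: List[VisualElement], x: float, y: float) -> Optional[VisualElement]:
    """Return the smallest annotated element under the click, or None.

    Preferring the smallest region is an assumed heuristic: when regions
    are nested, the most specific one is taken as the intended referent.
    """
    hits = [e for e in elements if e.contains(x, y)]
    return min(hits, key=lambda e: e.area()) if hits else None


# Hypothetical annotations for one picture (labels are illustrative only).
annotations = [
    VisualElement("house", 0, 0, 100, 100),
    VisualElement("door", 40, 60, 60, 100),
]

referent = resolve_click(annotations, 50, 80)
print(referent.label if referent else "no annotated element")
```

With annotations of this kind, a question such as "What is this?" accompanied by a click could be answered from the label of the resolved element, which is the sort of identity question the abstract refers to.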
|Title of host publication||Tenth international symposium on social communication|
|Place of Publication||Santiago de Cuba|
|Publisher||Centro de Lingüística Aplicada|
|Number of pages||5|
|Publication status||Published - 26 Jan 2007|
- HMI-SLT: Speech and Language Technology
- HMI-MR: MULTIMEDIA RETRIEVAL
- HMI-MI: MULTIMODAL INTERACTIONS
van Schooten, B. W., & op den Akker, H. J. A. (2007). Multimodal follow-up questions to multimodal answers in a QA system. In Tenth international symposium on social communication (pp. 469-474). Santiago de Cuba: Centro de Lingüística Aplicada.