TY - CPAPER
T1 - Multimodal follow-up questions to multimodal answers in a QA system
AU - van Schooten, B.W.
AU - op den Akker, Hendrikus J.A.
PY - 2007/1/26
Y1 - 2007/1/26
N2 - We are developing a dialogue manager (DM) for a multimodal interactive Question Answering (QA)
system. Our QA system presents answers using text and pictures, and the user may pose follow-up
questions using text or speech, while indicating screen elements with the mouse. We developed a
corpus of multimodal follow-up questions for this system. This paper describes a detailed analysis of
this corpus, and its impact on the implementation of our system.
We found that users pose two major types of follow-up question: regular questions, which can often be
reformulated so as to be answerable by a QA system, and questions about a specific picture. Users
frequently indicate screen elements with the mouse, even when this may be considered redundant, and
these mouse gestures correspond closely to regular anaphors in the utterance. Users also employ only a
limited number of ways to indicate screen elements with the mouse.
We argue that our QA system will need to annotate its pictures with information about the visual
elements they are composed of. This enables appropriate anaphor resolution and the answering of
identity questions about these elements. We present our first results on follow-up question handling and
deictic reference resolution, using annotations we made for the pictures in the corpus.
KW - HMI-SLT: Speech and Language Technology
KW - IR-63429
KW - HMI-MR: MULTIMEDIA RETRIEVAL
KW - HMI-MI: MULTIMODAL INTERACTIONS
KW - METIS-245698
KW - EWI-6907
M3 - Conference contribution
SN - 959-7174-08-1
SP - 469
EP - 474
BT - Tenth international symposium on social communication
PB - Centro de Lingüística Aplicada
CY - Santiago de Cuba
T2 - Tenth international symposium on social communication, Cuba
Y2 - 26 January 2007
ER -