Online Detection Of Vocal Listener Responses With Maximum Latency Constraints

Daniel Neiberg, Khiet Phuong Truong

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

    7 Citations (Scopus)
    13 Downloads (Pure)

    Abstract

    When human listeners utter Listener Responses (e.g. back-channels or acknowledgments) such as 'yeah' and 'mmhmm', interlocutors commonly continue to speak or resume their speech even before the listener has finished his/her response. This type of speech interactivity results in frequent speech overlap which is common in human-human conversation. To allow for this type of speech interactivity to occur between humans and spoken dialog systems, which will result in more human-like continuous and smoother human-machine interaction, we propose an on-line classifier which can classify incoming speech as Listener Responses. We show that it is possible to detect vocal Listener Responses using maximum latency thresholds of 100-500 ms, thereby obtaining equal error rates ranging from 34% to 28% by using an energy-based voice activity detector.
    Original language: Undefined
    Title of host publication: Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
    Place of Publication: USA
    Publisher: IEEE Signal Processing Society
    Pages: 5836-5839
    Number of pages: 4
    ISBN (Print): 978-1-4577-0538-0
    DOIs: https://doi.org/10.1109/ICASSP.2011.5947688
    Publication status: Published - May 2011
    Event: IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2011 - Prague, Czech Republic
    Duration: 22 May 2011 → 27 May 2011

    Publication series

    Name
    Publisher: IEEE Signal Processing Society

    Conference

    Conference: IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2011
    Abbreviated title: ICASSP
    Country: Czech Republic
    City: Prague
    Period: 22/05/11 → 27/05/11

    Keywords

    • METIS-277647
    • EC Grant Agreement nr.: FP7/231287
    • EWI-20186
    • IR-77316

    Cite this

    Neiberg, D., & Truong, K. P. (2011). Online Detection Of Vocal Listener Responses With Maximum Latency Constraints. In Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) (pp. 5836-5839). USA: IEEE Signal Processing Society. https://doi.org/10.1109/ICASSP.2011.5947688