The 2005 AMI System for the Transcription of Speech in Meetings

Thomas Hain, Lukas Burget, John Dines, Giulia Garau, Martin Karafiat, Mike Lincoln, Iain McCowan, Roeland J.F. Ordelman, Darren Moore, Vincent Wan, Steve Renals

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic

    Abstract

    In this paper we describe the 2005 AMI system for the transcription of speech in meetings, used for participation in the 2005 NIST RT evaluations. The system was designed for the speech-to-text part of the evaluations, in particular for transcription of speech recorded with multiple distant microphones and independent headset microphones. System performance was tested on both conference room and lecture-style meetings. Although input sources are processed using different front-ends, the recognition process is based on a unified system architecture. The system operates in multiple passes and makes use of state-of-the-art technologies such as discriminative training, vocal tract length normalisation, heteroscedastic linear discriminant analysis, speaker adaptation with maximum likelihood linear regression and minimum word error rate decoding. We report system performance on the official development and test sets for the NIST RT05s evaluations. The system was jointly developed in less than 10 months by a multi-site team and was shown to achieve very competitive performance.
    Original language: Undefined
    Title of host publication: 2nd International Workshop on Machine Learning for Multimodal Interaction, MLMI 2005
    Place of Publication: Berlin
    Publisher: Springer
    Pages: 450-462
    ISBN (Print): 978-3-540-32549-9
    DOIs: https://doi.org/10.1007/11677482_38
    Publication status: Published - 2005
    Event: 2nd International Workshop on Machine Learning for Multimodal Interaction, MLMI 2005 - Edinburgh, United Kingdom
    Duration: 11 Jul 2005 to 13 Jul 2005
    Conference number: MLMI

    Publication series

    Name: Lecture Notes in Computer Science
    Publisher: Springer Verlag
    Number: 3869
    Volume: 3869
    ISSN (Print): 0302-9743
    ISSN (Electronic): 1611-3349

    Workshop

    Workshop: 2nd International Workshop on Machine Learning for Multimodal Interaction, MLMI 2005
    Country: United Kingdom
    City: Edinburgh
    Period: 11/07/05 to 13/07/05

    Keywords

    • IR-65564
    • METIS-227319
    • EC Grant Agreement nr.: FP6/506811
    • EWI-1829

    Cite this

    Hain, T., Burget, L., Dines, J., Garau, G., Karafiat, M., Lincoln, M., ... Renals, S. (2005). The 2005 AMI System for the Transcription of Speech in Meetings. In 2nd International Workshop on Machine Learning for Multimodal Interaction, MLMI 2005 (pp. 450-462). (Lecture Notes in Computer Science; Vol. 3869, No. 3869). Berlin: Springer. https://doi.org/10.1007/11677482_38