Evaluating ASR Output for Information Retrieval

Laurens Bastiaan van der Werff, W.F.L. Heeren

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review


    Within the context of international benchmarks and collection-specific projects, much work on spoken document retrieval has been done in recent years. In 2000, automatic speech recognition for spoken document retrieval was declared 'solved' for the broadcast news domain. Many collections, however, lie outside this domain, and automatic speech recognition for them may pose specific new challenges. This calls for a method to evaluate automatic speech recognition optimization schemes for these application areas. Traditional measures such as word error rate and story word error rate are not ideal for this purpose. In this paper, three new metrics are proposed. Their behaviour is investigated on a cultural heritage collection, and their performance is compared to traditional measurements on TREC broadcast news data.
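    The traditional word error rate referred to in the abstract is conventionally computed as the word-level Levenshtein distance between the ASR hypothesis and the reference transcript, divided by the number of reference words. As background (a minimal illustrative sketch, not the authors' implementation or the metrics proposed in the paper):

    ```python
    def word_error_rate(reference: str, hypothesis: str) -> float:
        """Conventional WER: word-level edit distance (substitutions,
        insertions, deletions) divided by the reference word count."""
        ref = reference.split()
        hyp = hypothesis.split()
        # Dynamic-programming edit distance over word sequences.
        d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
        for i in range(len(ref) + 1):
            d[i][0] = i  # all reference words deleted
        for j in range(len(hyp) + 1):
            d[0][j] = j  # all hypothesis words inserted
        for i in range(1, len(ref) + 1):
            for j in range(1, len(hyp) + 1):
                cost = 0 if ref[i - 1] == hyp[j - 1] else 1
                d[i][j] = min(
                    d[i - 1][j] + 1,        # deletion
                    d[i][j - 1] + 1,        # insertion
                    d[i - 1][j - 1] + cost, # substitution or match
                )
        return d[len(ref)][len(hyp)] / len(ref)
    ```

    For example, comparing the reference "the cat sat on the mat" against the hypothesis "the cat sit on mat" yields one substitution and one deletion over six reference words, i.e. a WER of 2/6. A per-story variant (story word error rate) applies the same computation within story boundaries; the paper argues that neither reflects retrieval utility well.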
    Original language: Undefined
    Title of host publication: Proceedings of the ACM SIGIR Workshop on Searching Spontaneous Conversational Speech
    Editors: Franciska M.G. de Jong, D.W. Oard, Roeland J.F. Ordelman, S. Raaijmakers
    Place of publication: Enschede
    Publisher: Centre for Telematics and Information Technology (CTIT)
    Number of pages: 8
    ISBN (Print): 978-90-365-2542-8
    Publication status: Published - Jul 2007
    Event: ACM/SIGIR Workshop on Searching Spontaneous Conversational Speech, SSCS 2007 - Amsterdam, Netherlands
    Duration: 27 Jul 2007 - 27 Jul 2007

    Publication series

    Publisher: Centre for Telematics and Information Technology, University of Twente


    Workshop: ACM/SIGIR Workshop on Searching Spontaneous Conversational Speech, SSCS 2007
    Abbreviated title: SSCS


    • Evaluation
    • Lattices
    • Spoken Document Retrieval
    • Automatic Speech Recognition
