Automated Metadata Extraction for Semantic Access to Spoken Word Archives

Franciska M.G. de Jong, W.F.L. Heeren, Adrianus J. van Hessen, Roeland J.F. Ordelman, Antinus Nijholt

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic


    Archival practice is shifting from the analogue to the digital world. Spoken word archives are a specific subset of heritage collections that poses interesting challenges for the field of language and speech technology. Given the enormous backlog of unannotated materials at audiovisual archives and the generally global (coarse-grained) level of item description, both collection disclosure and item access are at risk, and (semi-)automated methods for analysis and annotation may help to increase the use and reuse of these rich content collections. Several HMI projects have investigated the interplay between evolving user scenarios and user requirements for spoken audio collections on the one hand, and the potential of automatic annotation and search technology to improve accessibility and search paradigms on the other. In this paper we present an overview of the state of the art in metadata generation for audio content and explain the crucial importance of involving user groups in the design of research agendas and road maps for novel applications in this domain.
    Original language: Undefined
    Title of host publication: Proceedings 12th International Symposium on Social Communication
    Editors: L. Ruiz Miyares, M.R. Alvarez Silva
    Place of Publication: Santiago de Cuba, Cuba
    Publisher: Centre for Applied Linguistics
    Number of pages: 10
    ISBN (Print): 978-959-7174-19-6
    Publication status: Published - 17 Jan 2011
    Event: 12th International Symposium on Social Communication 2011 - Santiago de Cuba, Cuba
    Duration: 17 Jan 2011 to 21 Jan 2011
    Conference number: 12

    Publication series

    Publisher: Centro de Lingüística Aplicada


    Conference: 12th International Symposium on Social Communication 2011
    City: Santiago de Cuba


    • METIS-277425
    • IR-75826
    • EWI-18431
    • HMI-SLT: Speech and Language Technology
