Automated Metadata Extraction for Semantic Access to Spoken Word Archives

    Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; Academic


    Abstract

    Archival practice is shifting from the analogue to the digital world. Spoken word archives are a specific subset of heritage collections that poses interesting challenges for the field of language and speech technology. Given the enormous backlog of unannotated materials at audiovisual archives and the generally coarse level of item description, both collection disclosure and item access are at risk, and (semi-)automated methods for analysis and annotation may help to increase the use and reuse of these rich content collections. Several HMI projects have investigated the interplay between evolving user scenarios and user requirements for spoken audio collections on the one hand, and the potential of automatic annotation and search technology for improved accessibility and search paradigms on the other. In this paper we present an overview of the state of the art in metadata generation for audio content and explain the crucial importance of involving user groups in the design of research agendas and road maps for novel applications in this domain.
    Original language: Undefined
    Title of host publication: Proceedings 12th International Symposium on Social Communication
    Editors: L. Ruiz Miyares, M.R. Alvarez Silva
    Place of Publication: Santiago de Cuba, Cuba
    Publisher: Centre for Applied Linguistics
    Pages: 896-905
    Number of pages: 10
    ISBN (Print): 978-959-7174-19-6
    Publication status: Published - 17 Jan 2011
    Event: 12th International Symposium on Social Communication 2011 - Santiago de Cuba, Cuba
    Duration: 17 Jan 2011 to 21 Jan 2011
    Conference number: 12

    Publication series

    Name
    Publisher: Centro de Lingüística Aplicada

    Conference

    Conference: 12th International Symposium on Social Communication 2011
    Country: Cuba
    City: Santiago de Cuba
    Period: 17/01/11 to 21/01/11

    Keywords

    • METIS-277425
    • IR-75826
    • EWI-18431
    • HMI-MR: MULTIMEDIA RETRIEVAL
    • HMI-SLT: Speech and Language Technology

    Cite this

    de Jong, F. M. G., Heeren, W. F. L., van Hessen, A. J., Ordelman, R. J. F., & Nijholt, A. (2011). Automated Metadata Extraction for Semantic Access to Spoken Word Archives. In L. Ruiz Miyares, & M. R. Alvarez Silva (Eds.), Proceedings 12th International Symposium on Social Communication (pp. 896-905). Santiago de Cuba, Cuba: Centre for Applied Linguistics.