Abstract
Semantic access to multimedia content in audiovisual archives depends to a large extent on the quantity and quality of the metadata, particularly the content descriptions attached to individual items. However, given the growing amount of material created daily and the digitization of existing analogue collections, traditional manual annotation places heavy demands on resources, especially for large audiovisual archives. One way to address this challenge is to introduce (semi-)automatic annotation techniques for generating and/or enhancing metadata. The NWO-funded CATCH-CHOICE project has investigated the extraction of keywords from textual resources related to the TV programs to be archived (context documents), in collaboration with the Dutch audiovisual archive, Sound and Vision. Besides the program descriptions published by broadcasters on their websites, Automatic Speech Recognition (ASR) techniques from the CATCH-CHoral project also provide textual resources that might be relevant for suggesting keywords. This paper investigates the suitability of ASR for generating such keywords, which we evaluate against manual annotations of the documents and against keywords automatically generated from context documents.
Original language | English |
---|---|
Title of host publication | Actes de la 16ème Conférence sur le Traitement Automatique des Langues Naturelles, TALN 2009, Senlis (France) |
Place of Publication | Paris |
Publisher | Association pour le Traitement Automatique des Langues (ATALA) |
Number of pages | 10 |
Publication status | Published - 2009 |
Event | 16ème Conférence sur le Traitement Automatique des Langues Naturelles, TALN 2009 - Senlis, France. Duration: 24 Jun 2009 → 26 Jun 2009. Conference number: 16 |
Conference
Conference | 16ème Conférence sur le Traitement Automatique des Langues Naturelles, TALN 2009 |
---|---|
Abbreviated title | TALN |
Country/Territory | France |
City | Senlis |
Period | 24/06/09 → 26/06/09 |
Keywords
- EWI-16997
- Keyword extraction
- Automatic Speech Recognition
- Audiovisual Documents
- IR-68941
- METIS-264237