Relevance of ASR for the Automatic Generation of Keywords Suggestions for TV programs

Véronique Malaisé, Luit Gazendam, Willemijn Heeren, Roeland Ordelman, Hennie Brugman

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; Academic; peer-review

Abstract

Semantic access to multimedia content in audiovisual archives depends to a large extent on the quantity and quality of the metadata, particularly the content descriptions attached to individual items. However, given the growing amount of material created daily and the digitization of existing analogue collections, traditional manual annotation puts heavy demands on resources, especially for large audiovisual archives. One way to address this challenge is to introduce (semi-)automatic annotation techniques for generating and/or enhancing metadata. The NWO-funded CATCH-CHOICE project has investigated the extraction of keywords from textual resources related to the TV programs to be archived (context documents), in collaboration with the Dutch audiovisual archives, Sound and Vision. Besides the program descriptions published by broadcasters on their websites, Automatic Speech Recognition (ASR) techniques from the CATCH-CHoral project also provide textual resources that might be relevant for suggesting keywords. This paper investigates the suitability of ASR for generating such keywords, which we evaluate against manual annotations of the documents and against keywords automatically generated from context documents.
Original language: English
Title of host publication: Actes de la 16ème Conférence sur le Traitement Automatique des Langues Naturelles, TALN 2009, Senlis (France)
Place of Publication: Paris
Publisher: Association pour le Traitement Automatique des Langues (ATALA)
Number of pages: 10
Publication status: Published - 2009
Event: 16ème Conférence sur le Traitement Automatique des Langues Naturelles, TALN 2009 - Senlis, France
Duration: 24 Jun 2009 to 26 Jun 2009
Conference number: 16

Conference

Conference: 16ème Conférence sur le Traitement Automatique des Langues Naturelles, TALN 2009
Abbreviated title: TALN
Country: France
City: Senlis
Period: 24/06/09 to 26/06/09

Keywords

  • EWI-16997
  • Keyword extraction
  • Automatic Speech Recognition
  • Audiovisual Documents
  • IR-68941
  • METIS-264237

Cite this

Malaisé, V., Gazendam, L., Heeren, W., Ordelman, R., & Brugman, H. (2009). Relevance of ASR for the Automatic Generation of Keywords Suggestions for TV programs. In Actes de la 16ème Conférence sur le Traitement Automatique des Langues Naturelles, TALN 2009, Senlis (France). Paris: Association pour le Traitement Automatique des Langues (ATALA).