Evaluating Unsupervised Thesaurus-based Labeling of Audiovisual Content in an Archive Production Environment

Victor de Boer, Roeland J.F. Ordelman, Josefien Schuurman

Abstract

In this paper we report on a two-stage evaluation of unsupervised labeling of audiovisual content using collateral text data sources to investigate how such an approach can provide acceptable results for given requirements with respect to archival quality, authority and service levels to external users. We conclude that with parameter settings that are optimized using a rigorous evaluation of precision and accuracy, the quality of automatic term-suggestion is sufficiently high. We furthermore provide an analysis of the term extraction after being taken into production, where we focus on performance variation with respect to term types and television programs. Having implemented the procedure in our production work-flow allows us to gradually develop the system further and to also assess the effect of the transformation from manual to automatic annotation from an end-user perspective. Additional future work will be on deploying different information sources including annotations based on multimodal video analysis such as speaker recognition and computer vision.
Pages (from-to): 189-201
Number of pages: 13
Journal: International Journal on Digital Libraries
Volume: 17
Issue number: 3
DOI: 10.1007/s00799-016-0182-6
State: Published - Sep 2016

Keywords

  • EWI-27061
  • HMI-MR: MULTIMEDIA RETRIEVAL
  • Thesaurus
  • Practice-oriented evaluation
  • METIS-318455
  • Audiovisual archives
  • Information Extraction
  • IR-100719
  • Audiovisual access

Cite this

de Boer, Victor; Ordelman, Roeland J.F.; Schuurman, Josefien / Evaluating Unsupervised Thesaurus-based Labeling of Audiovisual Content in an Archive Production Environment.

In: International Journal on Digital Libraries, Vol. 17, No. 3, 09.2016, p. 189-201.

Research output: Scientific - peer-review, Article

@article{1c9642873ac0420dae48fe2fb0399c8f,
title = "Evaluating Unsupervised Thesaurus-based Labeling of Audiovisual Content in an Archive Production Environment",
abstract = "In this paper we report on a two-stage evaluation of unsupervised labeling of audiovisual content using collateral text data sources to investigate how such an approach can provide acceptable results for given requirements with respect to archival quality, authority and service levels to external users. We conclude that with parameter settings that are optimized using a rigorous evaluation of precision and accuracy, the quality of automatic term-suggestion is sufficiently high. We furthermore provide an analysis of the term extraction after being taken into production, where we focus on performance variation with respect to term types and television programs. Having implemented the procedure in our production work-flow allows us to gradually develop the system further and to also assess the effect of the transformation from manual to automatic annotation from an end-user perspective. Additional future work will be on deploying different information sources including annotations based on multimodal video analysis such as speaker recognition and computer vision.",
keywords = "EWI-27061, HMI-MR: MULTIMEDIA RETRIEVAL, Thesaurus, Practice-oriented evaluation, METIS-318455, Audiovisual archives, Information Extraction, IR-100719, Audiovisual access",
author = "{de Boer}, Victor and Ordelman, {Roeland J.F.} and Josefien Schuurman",
note = "Open access",
year = "2016",
month = "9",
doi = "10.1007/s00799-016-0182-6",
volume = "17",
pages = "189--201",
number = "3",
journal = "International Journal on Digital Libraries",
}

TY - JOUR

T1 - Evaluating Unsupervised Thesaurus-based Labeling of Audiovisual Content in an Archive Production Environment

AU - de Boer, Victor

AU - Ordelman, Roeland J.F.

AU - Schuurman, Josefien

N1 - Open access

PY - 2016/9

Y1 - 2016/9

AB - In this paper we report on a two-stage evaluation of unsupervised labeling of audiovisual content using collateral text data sources to investigate how such an approach can provide acceptable results for given requirements with respect to archival quality, authority and service levels to external users. We conclude that with parameter settings that are optimized using a rigorous evaluation of precision and accuracy, the quality of automatic term-suggestion is sufficiently high. We furthermore provide an analysis of the term extraction after being taken into production, where we focus on performance variation with respect to term types and television programs. Having implemented the procedure in our production work-flow allows us to gradually develop the system further and to also assess the effect of the transformation from manual to automatic annotation from an end-user perspective. Additional future work will be on deploying different information sources including annotations based on multimodal video analysis such as speaker recognition and computer vision.

KW - EWI-27061

KW - HMI-MR: MULTIMEDIA RETRIEVAL

KW - Thesaurus

KW - Practice-oriented evaluation

KW - METIS-318455

KW - Audiovisual archives

KW - Information Extraction

KW - IR-100719

KW - Audiovisual access

U2 - 10.1007/s00799-016-0182-6

DO - 10.1007/s00799-016-0182-6

M3 - Article

VL - 17

SP - 189

EP - 201

IS - 3

ER -