Filtering the Unknown: Speech Activity Detection in Heterogeneous Video Collections

M.A.H. Huijbregts, Chuck Wooters, Roeland J.F. Ordelman

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, Academic, peer-review


Abstract

In this paper we discuss the speech activity detection system that we used for detecting speech regions in the Dutch TRECVID video collection. The system is designed to filter non-speech, such as music or sound effects, out of the signal without the use of predefined non-speech models. Because the system trains its models on-line, it is robust to out-of-domain data. The speech activity error rate on an out-of-domain test set, recordings of English conference meetings, was 4.4%. The overall error rate on twelve randomly selected five-minute TRECVID fragments was 11.5%.
Original language: English
Title of host publication: Proceedings of Interspeech 2007
Place of Publication: Antwerp
Publisher: International Speech Communication Association (ISCA)
Pages: FrC.P3-4
Number of pages: 4
ISBN (Print): 1990-9772
Publication status: Published - 27 Aug 2007
Event: 8th Annual Conference of the International Speech Communication Association, INTERSPEECH 2007 - Antwerp, Belgium
Duration: 27 Aug 2007 to 31 Aug 2007
Conference number: 8
https://www.interspeech2007.org/

Publication series

Publisher: International Speech Communication Association
Number: LNCS4549
ISSN (Print): 1990-9772

Conference

Conference: 8th Annual Conference of the International Speech Communication Association, INTERSPEECH 2007
Abbreviated title: INTERSPEECH
Country: Belgium
City: Antwerp
Period: 27/08/07 to 31/08/07

Fingerprint

Acoustic waves

Keywords

  • IR-64329
  • Speech activity detection
  • EC Grant Agreement nr.: FP6/027685
  • METIS-241881
  • EC Grant Agreement nr.: FP6/027413
  • EWI-11003
  • EC Grant Agreement nr.: FP6/506811

Cite this

Huijbregts, M. A. H., Wooters, C., & Ordelman, R. J. F. (2007). Filtering the Unknown: Speech Activity Detection in Heterogeneous Video Collections. In Proceedings of Interspeech 2007 (pp. FrC.P3-4). Antwerp: International Speech Communication Association (ISCA).