Filtering the Unknown: Speech Activity Detection in Heterogeneous Video Collections

M.A.H. Huijbregts, Chuck Wooters, Roeland J.F. Ordelman

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

    Abstract

    In this paper we discuss the speech activity detection system that we used to detect speech regions in the Dutch TRECVID video collection. The system is designed to filter non-speech, such as music or sound effects, out of the signal without the use of predefined non-speech models. Because the system trains its models on-line, it is robust to out-of-domain data. The speech activity error rate on an out-of-domain test set, recordings of English conference meetings, was 4.4%. The overall error rate on twelve randomly selected five-minute TRECVID fragments was 11.5%.
    Original language: English
    Title of host publication: Proceedings of Interspeech 2007
    Place of Publication: Antwerp
    Publisher: International Speech Communication Association (ISCA)
    Pages: FrC.P3-4
    Number of pages: 4
    ISBN (Print): 1990-9772
    Publication status: Published - 27 Aug 2007
    Event: 8th Annual Conference of the International Speech Communication Association, INTERSPEECH 2007 - Antwerp, Belgium
    Duration: 27 Aug 2007 to 31 Aug 2007
    Conference number: 8
    https://www.interspeech2007.org/

    Publication series

    Name
    Publisher: International Speech Communication Association
    Number: LNCS4549
    ISSN (Print): 1990-9772

    Conference

    Conference: 8th Annual Conference of the International Speech Communication Association, INTERSPEECH 2007
    Abbreviated title: INTERSPEECH
    Country: Belgium
    City: Antwerp
    Period: 27/08/07 to 31/08/07
    Internet address

    Keywords

    • IR-64329
    • Speech activity detection
    • EC Grant Agreement nr.: FP6/027685
    • METIS-241881
    • EC Grant Agreement nr.: FP6/027413
    • EWI-11003
    • EC Grant Agreement nr.: FP6/506811

    Cite this

    Huijbregts, M. A. H., Wooters, C., & Ordelman, R. J. F. (2007). Filtering the Unknown: Speech Activity Detection in Heterogeneous Video Collections. In Proceedings of Interspeech 2007 (pp. FrC.P3-4). Antwerp: International Speech Communication Association (ISCA).