The Geneva Minimalistic Acoustic Parameter Set (GeMAPS) for Voice Research and Affective Computing

Florian Eyben, Klaus Scherer, Björn Schuller, Johan Sundberg, Elisabeth André, Carlos Busso, Laurence Devillers, Julien Epps, Petri Laukka, Shrikanth Narayanan, Khiet Phuong Truong

  • 82 Citations

Abstract

Work on voice sciences over recent decades has led to a proliferation of acoustic parameters that are used quite selectively and are not always extracted in a similar fashion. With many independent teams working in different research areas, shared standards become an essential safeguard to ensure compliance with state-of-the-art methods allowing appropriate comparison of results across studies and potential integration and combination of extraction and recognition systems. In this paper we propose a basic standard acoustic parameter set for various areas of automatic voice analysis, such as paralinguistic or clinical speech analysis. In contrast to a large brute-force parameter set, we present a minimalistic set of voice parameters here. These were selected based on a) their potential to index affective physiological changes in voice production, b) their proven value in former studies as well as their automatic extractability, and c) their theoretical significance. The set is intended to provide a common baseline for evaluation of future research and eliminate differences caused by varying parameter sets or even different implementations of the same parameters. Our implementation is publicly available with the openSMILE toolkit. Comparative evaluations of the proposed feature set and large baseline feature sets of INTERSPEECH challenges show a high performance of the proposed set in relation to its size.
Original language: Undefined
Pages (from-to): 190-202
Number of pages: 14
Journal: IEEE transactions on affective computing
Volume: 7
Issue number: 2
DOI: 10.1109/TAFFC.2015.2457417
State: Published - Apr 2016

Fingerprint

Acoustics
Speech analysis

Keywords

  • Standard
  • Speech Analysis
  • EWI-26649
  • Acoustic Features
  • IR-98965
  • Geneva Minimalistic Parameter Set
  • Emotion Recognition
  • METIS-315136
  • Affective Computing

Cite this

Eyben, Florian; Scherer, Klaus; Schuller, Björn; Sundberg, Johan; André, Elisabeth; Busso, Carlos; Devillers, Laurence; Epps, Julien; Laukka, Petri; Narayanan, Shrikanth; Truong, Khiet Phuong / The Geneva Minimalistic Acoustic Parameter Set (GeMAPS) for Voice Research and Affective Computing.

In: IEEE transactions on affective computing, Vol. 7, No. 2, 04.2016, p. 190-202.

Research output: Scientific - peer-review; Article

@article{60dffd1c51834c7c996272c419934d7c,
title = "The Geneva Minimalistic Acoustic Parameter Set (GeMAPS) for Voice Research and Affective Computing",
abstract = "Work on voice sciences over recent decades has led to a proliferation of acoustic parameters that are used quite selectively and are not always extracted in a similar fashion. With many independent teams working in different research areas, shared standards become an essential safeguard to ensure compliance with state-of-the-art methods allowing appropriate comparison of results across studies and potential integration and combination of extraction and recognition systems. In this paper we propose a basic standard acoustic parameter set for various areas of automatic voice analysis, such as paralinguistic or clinical speech analysis. In contrast to a large brute-force parameter set, we present a minimalistic set of voice parameters here. These were selected based on a) their potential to index affective physiological changes in voice production, b) their proven value in former studies as well as their automatic extractability, and c) their theoretical significance. The set is intended to provide a common baseline for evaluation of future research and eliminate differences caused by varying parameter sets or even different implementations of the same parameters. Our implementation is publicly available with the openSMILE toolkit. Comparative evaluations of the proposed feature set and large baseline feature sets of INTERSPEECH challenges show a high performance of the proposed set in relation to its size.",
keywords = "Standard, Speech Analysis, EWI-26649, Acoustic Features, IR-98965, Geneva Minimalistic Parameter Set, Emotion Recognition, METIS-315136, Affective Computing",
author = "Florian Eyben and Klaus Scherer and Björn Schuller and Johan Sundberg and Elisabeth André and Carlos Busso and Laurence Devillers and Julien Epps and Petri Laukka and Shrikanth Narayanan and Truong, {Khiet Phuong}",
note = "Open access",
year = "2016",
month = "4",
doi = "10.1109/TAFFC.2015.2457417",
volume = "7",
pages = "190--202",
journal = "IEEE transactions on affective computing",
issn = "1949-3045",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
number = "2",
}

Eyben, F, Scherer, K, Schuller, B, Sundberg, J, André, E, Busso, C, Devillers, L, Epps, J, Laukka, P, Narayanan, S & Truong, KP 2016, 'The Geneva Minimalistic Acoustic Parameter Set (GeMAPS) for Voice Research and Affective Computing', IEEE transactions on affective computing, vol. 7, no. 2, pp. 190-202. DOI: 10.1109/TAFFC.2015.2457417

TY  - JOUR
T1  - The Geneva Minimalistic Acoustic Parameter Set (GeMAPS) for Voice Research and Affective Computing
AU  - Eyben, Florian
AU  - Scherer, Klaus
AU  - Schuller, Björn
AU  - Sundberg, Johan
AU  - André, Elisabeth
AU  - Busso, Carlos
AU  - Devillers, Laurence
AU  - Epps, Julien
AU  - Laukka, Petri
AU  - Narayanan, Shrikanth
AU  - Truong, Khiet Phuong
N1  - Open access
PY  - 2016/4
Y1  - 2016/4
N2  - Work on voice sciences over recent decades has led to a proliferation of acoustic parameters that are used quite selectively and are not always extracted in a similar fashion. With many independent teams working in different research areas, shared standards become an essential safeguard to ensure compliance with state-of-the-art methods allowing appropriate comparison of results across studies and potential integration and combination of extraction and recognition systems. In this paper we propose a basic standard acoustic parameter set for various areas of automatic voice analysis, such as paralinguistic or clinical speech analysis. In contrast to a large brute-force parameter set, we present a minimalistic set of voice parameters here. These were selected based on a) their potential to index affective physiological changes in voice production, b) their proven value in former studies as well as their automatic extractability, and c) their theoretical significance. The set is intended to provide a common baseline for evaluation of future research and eliminate differences caused by varying parameter sets or even different implementations of the same parameters. Our implementation is publicly available with the openSMILE toolkit. Comparative evaluations of the proposed feature set and large baseline feature sets of INTERSPEECH challenges show a high performance of the proposed set in relation to its size.
AB  - Work on voice sciences over recent decades has led to a proliferation of acoustic parameters that are used quite selectively and are not always extracted in a similar fashion. With many independent teams working in different research areas, shared standards become an essential safeguard to ensure compliance with state-of-the-art methods allowing appropriate comparison of results across studies and potential integration and combination of extraction and recognition systems. In this paper we propose a basic standard acoustic parameter set for various areas of automatic voice analysis, such as paralinguistic or clinical speech analysis. In contrast to a large brute-force parameter set, we present a minimalistic set of voice parameters here. These were selected based on a) their potential to index affective physiological changes in voice production, b) their proven value in former studies as well as their automatic extractability, and c) their theoretical significance. The set is intended to provide a common baseline for evaluation of future research and eliminate differences caused by varying parameter sets or even different implementations of the same parameters. Our implementation is publicly available with the openSMILE toolkit. Comparative evaluations of the proposed feature set and large baseline feature sets of INTERSPEECH challenges show a high performance of the proposed set in relation to its size.
KW  - Standard
KW  - Speech Analysis
KW  - EWI-26649
KW  - Acoustic Features
KW  - IR-98965
KW  - Geneva Minimalistic Parameter Set
KW  - Emotion Recognition
KW  - METIS-315136
KW  - Affective Computing
U2  - 10.1109/TAFFC.2015.2457417
DO  - 10.1109/TAFFC.2015.2457417
M3  - Article
VL  - 7
SP  - 190
EP  - 202
JO  - IEEE transactions on affective computing
T2  - IEEE transactions on affective computing
JF  - IEEE transactions on affective computing
SN  - 1949-3045
IS  - 2
ER  -

Eyben F, Scherer K, Schuller B, Sundberg J, André E, Busso C et al. The Geneva Minimalistic Acoustic Parameter Set (GeMAPS) for Voice Research and Affective Computing. IEEE transactions on affective computing. 2016 Apr;7(2):190-202. DOI: 10.1109/TAFFC.2015.2457417