In both speech synthesis and sound coding it is often beneficial to have a measure that predicts whether, and to what extent, two sounds are perceived as different. This paper addresses the problem of estimating the perceptual effects of small modifications to the spectral envelope of a harmonic sound. A recently proposed auditory model that transforms the physical spectrum into a pattern of specific loudness as a function of critical band rate is investigated. A distance measure based on the concept of partial loudness is presented, which treats detectability in terms of a partial loudness threshold. This approach is adapted to the problem of estimating discrimination thresholds for modifications of the spectral envelope of synthetic vowels. Data obtained from subjective listening tests, using a representative set of stimuli in a three-interval forced-choice (3IFC) adaptive procedure, show that the model makes reasonably good predictions of the discrimination threshold. Systematic deviations from the predicted thresholds may be related to individual differences in auditory filter selectivity. The partial loudness measure is compared with previously proposed distance measures, namely the Euclidean distance between excitation patterns and the Euclidean distance between specific loudness patterns, applied to the same experimental data. An objective test measure shows that the partial loudness measure and the Euclidean distance between excitation patterns are equally appropriate as distance measures for predicting audibility thresholds. The Euclidean distance between specific loudness patterns performs worse than the other two.
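As an illustration of the simplest of the measures compared above, the Euclidean distance between two excitation patterns can be sketched in a few lines of Python. This is a minimal sketch, not the authors' implementation. It assumes both patterns are already available as level values sampled on the same uniform grid of critical band rate. The function name `euclidean_pattern_distance` and the grid spacing parameter `dz` are hypothetical names introduced here. The abstract does not state the normalization used in the paper, so the scaling by `dz` is also an assumption.

```python
import math


def euclidean_pattern_distance(ref, test, dz=0.1):
    """Euclidean distance between two patterns sampled on a uniform grid.

    ref, test: sequences of pattern values (for example excitation levels)
               sampled at the same critical-band-rate points.
    dz:        assumed grid spacing along the critical-band-rate axis;
               hypothetical parameter, not taken from the paper.
    """
    if len(ref) != len(test):
        raise ValueError("patterns must be sampled on the same grid")
    # Sum of squared pointwise differences, weighted by the grid spacing
    # so the result approximates the square root of an integral.
    squared = sum((r - t) ** 2 for r, t in zip(ref, test))
    return math.sqrt(squared * dz)
```

The same function could be applied to specific loudness patterns by passing specific loudness values instead of excitation levels. The partial loudness measure depends on the full auditory model and is not sketched here.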
Rao, P., van Dinther, R., Veldhuis, R. N. J., & Kohlrausch, A. (2001). A measure for predicting audibility discrimination thresholds for spectral envelope distortions in vowel sounds. Journal of the Acoustical Society of America, 109(5), 2085-2097. https://doi.org/10.1121/1.1354986