Speaker Counting Model based on Transfer Learning from SincNet Bottleneck Layer

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

Abstract

People-counting techniques have been widely researched recently, and many different types of sensors can be used in this context. In this paper, we propose a system based on a deep-learning model that identifies the number of people in crowded scenarios from speech sound. In a nutshell, the system relies on two components: directly counting concurrent speakers in overlapping speech, and clustering single-speaker sound by speaker identity over time. Compared to previously proposed speaker-counting models that only cluster single-speaker sound, this system is more accurate and less vulnerable to overlapping sound in crowded environments. In addition, counting speakers in overlapping sound gives a lower bound on the number of speakers, which also improves counting accuracy in quiet environments. Our methodology is inspired by the recently proposed SincNet deep neural network framework, which has proved to be outstanding and highly efficient at sound processing with raw signals. By transferring the bottleneck layer of the SincNet model as features fed to our speaker-clustering model, we reached noticeably better performance than previous models that rely on MFCCs and other engineered features.
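The abstract does not say how the two components' outputs are combined. As a minimal sketch only, assuming the clustering stage yields one identity label per single-speaker segment and the overlap stage yields one concurrent-speaker count per overlapping segment, the lower-bound idea could be expressed as follows. The function name, its parameters, and the use of a simple maximum are illustrative assumptions, not the authors' method.

```python
from typing import Sequence


def estimate_speaker_count(
    cluster_labels: Sequence[int],
    concurrent_counts: Sequence[int],
) -> int:
    """Combine two hypothetical component outputs into one estimate.

    cluster_labels: assumed identity label for each single-speaker segment.
    concurrent_counts: assumed number of simultaneous speakers detected in
        each overlapping segment.
    """
    # Distinct identities found by clustering single-speaker segments.
    clustered = len(set(cluster_labels))
    # The largest number of voices heard at once is a lower bound on how
    # many people are present (0 if no overlapping segments were seen).
    lower_bound = max(concurrent_counts, default=0)
    # Assumed fusion rule: never report fewer speakers than the lower bound.
    return max(clustered, lower_bound)
```

For instance, if clustering finds two identities but some overlapping segment contains three simultaneous voices, this sketch reports three; if clustering already finds more identities than were ever heard at once, the clustering count is kept.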
Original language: English
Title of host publication: 2020 IEEE International Conference on Pervasive Computing and Communications (PerCom)
Publisher: IEEE
Pages: 1
Number of pages: 8
ISBN (Electronic): 978-1-7281-4657-7
ISBN (Print): 978-1-7281-4658-4
Publication status: Published - 29 Jun 2020
Event: 2020 IEEE International Conference on Pervasive Computing and Communications Workshops, PerCom 2020 - University of Austin, Austin, United States
Duration: 23 Mar 2020 → 27 Mar 2020
Conference number: 18
http://percom.org/Previous/ST2020/

Conference

Conference: 2020 IEEE International Conference on Pervasive Computing and Communications Workshops, PerCom 2020
Abbreviated title: PerCom
Country: United States
City: Austin
Period: 23/03/20 → 27/03/20
