A sound-based crowd activity recognition with neural network based regression models

Wei Wang, Fatjon Seraj, Paul J.M. Havinga

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review


Abstract

Human activities can be recognized by the sounds they produce. Researchers have therefore proposed methods that recognize human activities from sound by detecting the presence of sound events in short time frames. In crowded environments, however, many sound events overlap, which makes it impossible to distinguish the individual events and causes these detection methods to fail.

To address this issue and make sound-based models suitable for crowd activities, this paper proposes predicting the proportion of activities happening in a specific place with two neural network-based regression models: a CNN model and a concatenate model. The CNN model takes Mel-bands as input, a design that is very popular in single-activity recognition problems. Building on the CNN model, we also designed a concatenate model that additionally takes the global FFT feature as input to further improve performance.
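As a rough illustration of what "predicting the proportion of activities" could mean as a regression target, the sketch below turns the list of activities present in one generated audio mix into a vector of per-class fractions. The class names, the `activity_proportions` helper, and the choice to count events rather than, say, measure their duration are assumptions made for this example; the abstract does not specify how the targets are built.

```python
from collections import Counter


def activity_proportions(event_labels, classes):
    """Return the fraction of events belonging to each class, in `classes` order.

    Hypothetical helper: counting events is an assumption, not the paper's
    stated labelling procedure.
    """
    counts = Counter(event_labels)
    total = sum(counts[c] for c in classes)
    if total == 0:
        # No known activity in the mix: return an all-zero target.
        return [0.0 for _ in classes]
    return [counts[c] / total for c in classes]


# Hypothetical activity classes and one hypothetical crowd mix.
classes = ["talking", "walking", "clapping"]
mix = ["talking", "talking", "walking", "clapping"]

target = activity_proportions(mix, classes)
print(target)  # [0.5, 0.25, 0.25]
```

A regression model would then be trained to output a vector like `target` for each audio sample, instead of a single class label.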

The approach is evaluated on three generated groups of audio samples, each with a different crowdedness level. Both RMSE and the coefficient of determination (R2 score) are used as evaluation metrics. The experiments show that the concatenate model performs statistically better across the dataset, with an R2 score of 0.7377. The results show that the concatenate model, which combines short-frame and holistic features, outperforms any single-feature model.
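For reference, the two metrics have the standard definitions RMSE = sqrt(mean((y - ŷ)²)) and R2 = 1 - SS_res / SS_tot, where SS_res is the sum of squared residuals and SS_tot is the sum of squared deviations of the true values from their mean. The sketch below computes both in plain Python on made-up numbers. How the paper aggregates the scores across activity classes and sample groups is not stated in the abstract, so treating the values as one flat list is an assumption.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        # R2 is undefined when every true value is identical.
        raise ValueError("R2 is undefined for constant targets")
    return 1.0 - ss_res / ss_tot


# Made-up true and predicted proportions, for illustration only.
y_true = [0.5, 0.25, 0.25, 0.0]
y_pred = [0.4, 0.3, 0.2, 0.1]

print(round(rmse(y_true, y_pred), 4))      # 0.0791
print(round(r2_score(y_true, y_pred), 4))  # 0.8
```

Lower RMSE and higher R2 both indicate better predictions. An R2 of 1 means a perfect fit, and 0 means the model does no better than always predicting the mean.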
Original language: English
Title of host publication: PETRA '20: The 13th PErvasive Technologies Related to Assistive Environments Conference
Publisher: ACM SigCHI
Pages: 126-133
Number of pages: 8
ISBN (Electronic): 9781450377737
ISBN (Print): 978-1-4503-7773-7
Publication status: Published - 30 Jun 2020
Event: 13th ACM International Conference on PErvasive Technologies Related to Assistive Environments, PETRA 2020 - Corfu Holiday Palace, Virtual, Online, Greece
Duration: 30 Jun 2020 to 3 Jul 2020
Conference number: 13

Conference

Conference: 13th ACM International Conference on PErvasive Technologies Related to Assistive Environments, PETRA 2020
Abbreviated title: PETRA 2020
Country/Territory: Greece
City: Virtual, Online
Period: 30/06/20 to 3/07/20

Keywords

  • ambient intelligence
  • automatic sound event recognition
  • concatenate neural network
  • convolutional neural network
  • crowd activity monitoring
  • machine learning
  • mel-bands spectrogram
  • r-squared score
