Eating sound dataset for 20 food types and sound classification using convolutional neural networks

  • Jeannette Shijie Ma
  • Marcello A. Gómez Maureira
  • Jan N. Van Rijn

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

Abstract

Food identification technology could benefit both the food and media industries, and can enrich human-computer interaction. We assembled a food classification dataset of 11,141 clips, based on YouTube videos of 20 food types. The dataset is freely available on Kaggle. We suggest the grouped holdout evaluation protocol as the method for assessing model performance. As a first approach, we applied Convolutional Neural Networks to this dataset. Under an evaluation protocol based on grouped holdout, the model obtained an accuracy of 18.5%, whereas under an evaluation protocol based on uniform holdout it obtained an accuracy of 37.58%. When the problem was approached as a binary classification task, the model performed well for most pairs. In both settings, the method clearly outperformed reasonable baselines. We found that, besides texture properties, differences in eating action are an important consideration for data-driven eating sound research. Protocols based on biting sound are limited to textural classification and are less heuristic while assembling food differences.
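The gap between the two reported accuracies is easier to read with the two protocols side by side. The following is a minimal sketch, not the authors' code: it assumes that the "group" in grouped holdout is the source video of each clip, and the clip records, field names, and function names are all hypothetical. It uses only the Python standard library.

```python
import random
from collections import defaultdict


def uniform_holdout(clips, test_fraction=0.2, seed=0):
    """Shuffle individual clips and split them, ignoring where they came from."""
    rng = random.Random(seed)
    shuffled = list(clips)
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def grouped_holdout(clips, test_fraction=0.2, seed=0):
    """Split by group, so every clip of a given group lands on one side only."""
    by_group = defaultdict(list)
    for clip in clips:
        by_group[clip["group"]].append(clip)
    groups = sorted(by_group)
    rng = random.Random(seed)
    rng.shuffle(groups)
    n_test = max(1, int(round(len(groups) * test_fraction)))
    test_groups = set(groups[:n_test])
    train = [c for c in clips if c["group"] not in test_groups]
    test = [c for c in clips if c["group"] in test_groups]
    return train, test


def shared_groups(train, test):
    """Groups that contribute clips to both the training and the test set."""
    return {c["group"] for c in train} & {c["group"] for c in test}


# Toy data (hypothetical): 10 videos with 8 clips each, two food labels.
# Assumption: the group of a clip is the video it was cut from.
clips = [
    {"clip": f"v{v}_c{c}", "group": f"video_{v}", "label": v % 2}
    for v in range(10)
    for c in range(8)
]

train_g, test_g = grouped_holdout(clips)
train_u, test_u = uniform_holdout(clips)
print("groups shared under grouped holdout:", len(shared_groups(train_g, test_g)))
print("groups shared under uniform holdout:", len(shared_groups(train_u, test_u)))
```

Under the grouped split no group appears on both sides, so a model is always tested on sources it has not seen. A uniform split typically places clips from the same source in both sets, which is one plausible reason for the higher accuracy reported under that protocol.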

Original language: English
Title of host publication: ICMI 2020 Companion
Subtitle of host publication: Companion Publication of the 2020 International Conference on Multimodal Interaction
Editors: Khiet Truong, Dirk Heylen, Mary Czerwinski
Place of Publication: New York, NY
Publisher: Association for Computing Machinery
Pages: 348-351
Number of pages: 4
ISBN (Electronic): 978-1-4503-8002-7
Publication status: Published - 25 Oct 2020
Externally published: Yes
Event: 22nd ACM International Conference on Multimodal Interaction, ICMI 2020 - Online, Virtual, Online, Netherlands
Duration: 25 Oct 2020 → 29 Oct 2020
Conference number: 22
http://icmi.acm.org/2020/

Conference

Conference: 22nd ACM International Conference on Multimodal Interaction, ICMI 2020
Abbreviated title: ICMI
Country/Territory: Netherlands
City: Virtual, Online
Period: 25/10/20 → 29/10/20

Keywords

  • Eating sound
  • Food classification
  • Neural networks
  • Sound classification
  • Sound dataset
