Spatial-Temporal Transformer for Crime Recognition in Surveillance Videos

Kayleigh Boekhoudt, Estefania Talavera

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

Abstract

Human-related crime recognition from surveillance videos becomes an even more challenging task when dealing with relatively similar human actions. We propose a transformer-based model that relies on the spatial-temporal representation of extracted skeletal trajectories for fine-grained classification. We validate the effectiveness of our model on the complex HR-Crime dataset consisting of videos representing 13 categories of human-related crimes. Quantitative and qualitative results suggest that building a transformer architecture with coupled spatial and temporal modules enables the model to compete in performance while improving intrinsic interpretability.
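As a rough illustration of what "coupled spatial and temporal modules" over skeletal trajectories could look like, the NumPy sketch below applies self-attention across joints within each frame and then across frames for each joint, before pooling into 13 class scores. It is not the authors' implementation: the single-head attention, the mean pooling, the random weights, and the sizes (12 frames, 17 joints, 2-D coordinates, embedding width 16) are all assumptions chosen for brevity, and the names `self_attention`, `spatial_temporal_block`, and `init_params` are hypothetical. Only the 13 crime categories come from the abstract.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention.

    x has shape (..., n_tokens, d_in); tokens attend to each other along
    the second-to-last axis. Returns (output, attention_weights).
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ np.swapaxes(k, -1, -2) / np.sqrt(k.shape[-1])
    weights = softmax(scores, axis=-1)
    return weights @ v, weights


def spatial_temporal_block(traj, params):
    """traj: (T frames, J joints, C coordinates) skeletal trajectory."""
    # Spatial module: within each frame, joints attend to other joints.
    s_out, s_attn = self_attention(traj, *params["spatial"])    # (T, J, D), (T, J, J)
    # Temporal module: for each joint, frames attend to other frames.
    t_in = np.swapaxes(s_out, 0, 1)                             # (J, T, D)
    t_out, t_attn = self_attention(t_in, *params["temporal"])   # (J, T, D), (J, T, T)
    # Assumed head: mean-pool over joints and frames, then a linear classifier.
    pooled = t_out.mean(axis=(0, 1))                            # (D,)
    probs = softmax(pooled @ params["w_cls"])                   # (num_classes,)
    return probs, s_attn, t_attn


def init_params(rng, c=2, d=16, num_classes=13):
    """Random, untrained weights; sizes other than num_classes are assumptions."""
    def proj(n_in, n_out):
        return rng.normal(scale=n_in ** -0.5, size=(n_in, n_out))
    return {
        "spatial": [proj(c, d) for _ in range(3)],    # W_q, W_k, W_v
        "temporal": [proj(d, d) for _ in range(3)],   # W_q, W_k, W_v
        "w_cls": proj(d, num_classes),
    }


rng = np.random.default_rng(0)
traj = rng.normal(size=(12, 17, 2))   # synthetic stand-in for extracted skeletons
probs, s_attn, t_attn = spatial_temporal_block(traj, init_params(rng))
```

The returned `s_attn` and `t_attn` arrays are the per-frame joint-to-joint and per-joint frame-to-frame attention weights. Inspecting weights of this kind is one common way to read interpretability out of an attention model, though the abstract does not say which mechanism the authors rely on.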

Original language: English
Title of host publication: AVSS 2022 - 18th IEEE International Conference on Advanced Video and Signal-Based Surveillance
Publisher: IEEE
Number of pages: 8
ISBN (Electronic): 978-1-6654-6382-9
Publication status: Published - 24 Nov 2022
Event: 18th IEEE International Conference on Advanced Video and Signal-Based Surveillance, AVSS 2022 - Virtual, Online, Spain
Duration: 29 Nov 2022 - 2 Dec 2022
Conference number: 18

Publication series

Name: AVSS 2022 - 18th IEEE International Conference on Advanced Video and Signal-Based Surveillance

Conference

Conference: 18th IEEE International Conference on Advanced Video and Signal-Based Surveillance, AVSS 2022
Abbreviated title: AVSS 2022
Country/Territory: Spain
City: Virtual, Online
Period: 29/11/22 - 2/12/22

Keywords

  • 2023 OA procedure
