Body-part Tubelet Transformer for Human-Related Crime Classification

Ajay Mathew Joseph*, Fath U.Min Ullah, Estefania Talavera

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > Academic > peer-review

Abstract

Detecting human-related crimes in surveillance videos is an increasingly difficult challenge, especially when the human actions involved are closely similar. In this work, we propose a transformer-based model that introduces an inductive bias by incorporating a Tubelet embedder module, a 3D convolutional layer. The aim is to capture spatiotemporal embeddings from skeletal trajectories extracted from videos using 3D convolutional operations. Our experiments on the Human-Related Crime dataset reveal that tubelet embeddings maintain performance competitive with the state of the art (49% accuracy) while considerably reducing the computational complexity of the model.
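As a rough illustration of what a tubelet embedder does, the sketch below applies a non-overlapping 3D convolution (stride equal to kernel size) to a small volume and returns one token per tubelet. It is not the authors' implementation: the function name `tubelet_embed`, the `(C, T, H, W)` input layout, the tubelet size, and the embedding width are all assumptions made for the example, and the convolution is written with NumPy reshapes rather than a deep-learning framework.

```python
import numpy as np

def tubelet_embed(x, weight, bias):
    """Non-overlapping 3D convolution written as reshape + einsum.

    x:      (C, T, H, W) input volume (hypothetical layout)
    weight: (D, C, t, h, w) one 3D kernel per embedding dimension
    bias:   (D,)
    returns (N, D) tokens, N = (T // t) * (H // h) * (W // w)
    """
    C, T, H, W = x.shape
    D, Cw, t, h, w = weight.shape
    assert C == Cw, "channel mismatch"
    assert T % t == 0 and H % h == 0 and W % w == 0, "dims must divide evenly"
    # Split every axis into (tubelet index, offset inside the tubelet).
    blocks = x.reshape(C, T // t, t, H // h, h, W // w, w)
    # Contract channels and in-tubelet offsets against each kernel.
    out = np.einsum("caibjek,dcijk->abed", blocks, weight)
    # Flatten the tubelet grid into a token sequence and add the bias.
    return out.reshape(-1, D) + bias

# Assumed sizes for illustration only.
rng = np.random.default_rng(0)
x = rng.standard_normal((3, 16, 4, 8))               # C=3, T=16, H=4, W=8
weight = rng.standard_normal((64, 3, 4, 2, 2)) * 0.02  # D=64, tubelet 4x2x2
bias = np.zeros(64)

tokens = tubelet_embed(x, weight, bias)
print(tokens.shape)  # (32, 64)
```

With these assumed sizes, 512 spatiotemporal positions collapse into 32 tokens. Because self-attention cost grows quadratically with sequence length, a shorter token sequence is one plausible source of the reduced computational complexity the abstract reports.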

Original language: English
Title of host publication: 2024 IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS)
Publisher: IEEE
ISBN (Electronic): 9798350374285
Publication status: Published - 18 Sept 2024
Event: 20th IEEE International Conference on Advanced Video and Signal-Based Surveillance, AVSS 2024 - Niagara Falls, Canada
Duration: 15 Jul 2024 to 16 Jul 2024
Conference number: 20

Conference

Conference: 20th IEEE International Conference on Advanced Video and Signal-Based Surveillance, AVSS 2024
Abbreviated title: AVSS 2024
Country/Territory: Canada
City: Niagara Falls
Period: 15/07/24 to 16/07/24

Keywords

  • 2024 OA procedure
