Abstract
Human-related crime recognition from surveillance videos becomes an even more challenging task when dealing with relatively similar human actions. We propose a transformer-based model that relies on the spatial-temporal representation of extracted skeletal trajectories for fine-grained classification. We validate the effectiveness of our model on the complex HR-Crime dataset consisting of videos representing 13 categories of human-related crimes. Quantitative and qualitative results suggest that building a transformer architecture with coupled spatial and temporal modules enables the model to compete in performance while improving intrinsic interpretability.
Original language | English |
---|---|
Title of host publication | AVSS 2022 - 18th IEEE International Conference on Advanced Video and Signal-Based Surveillance |
Publisher | IEEE |
Number of pages | 8 |
ISBN (Electronic) | 978-1-6654-6382-9 |
Publication status | Published - 24 Nov 2022 |
Event | 18th IEEE International Conference on Advanced Video and Signal-Based Surveillance, AVSS 2022 - Virtual, Online, Spain; Duration: 29 Nov 2022 → 2 Dec 2022; Conference number: 18 |
Publication series
Name | AVSS 2022 - 18th IEEE International Conference on Advanced Video and Signal-Based Surveillance |
---|
Conference
Conference | 18th IEEE International Conference on Advanced Video and Signal-Based Surveillance, AVSS 2022 |
---|---|
Abbreviated title | AVSS 2022 |
Country/Territory | Spain |
City | Virtual, Online |
Period | 29/11/22 → 2/12/22 |