Abstract
Recognizing human-related crime in surveillance videos becomes even more challenging when the actions involved are relatively similar. We propose a transformer-based model that relies on the spatial-temporal representation of extracted skeletal trajectories for fine-grained classification. We validate the effectiveness of our model on the complex HR-Crime dataset, which consists of videos representing 13 categories of human-related crimes. Quantitative and qualitative results suggest that building a transformer architecture with coupled spatial and temporal modules enables the model to remain competitive in performance while improving intrinsic interpretability.
| Field | Value |
|---|---|
| Original language | English |
| Title of host publication | AVSS 2022 - 18th IEEE International Conference on Advanced Video and Signal-Based Surveillance |
| Publisher | IEEE |
| Number of pages | 8 |
| ISBN (Electronic) | 978-1-6654-6382-9 |
| Publication status | Published - 24 Nov 2022 |
| Event | 18th IEEE International Conference on Advanced Video and Signal-Based Surveillance, AVSS 2022 (Virtual, Online, Spain). Duration: 29 Nov 2022 → 2 Dec 2022. Conference number: 18 |
Publication series
| Field | Value |
|---|---|
| Name | AVSS 2022 - 18th IEEE International Conference on Advanced Video and Signal-Based Surveillance |
Conference
| Field | Value |
|---|---|
| Conference | 18th IEEE International Conference on Advanced Video and Signal-Based Surveillance, AVSS 2022 |
| Abbreviated title | AVSS 2022 |
| Country/Territory | Spain |
| City | Virtual, Online |
| Period | 29/11/22 → 2/12/22 |
Title
Spatial-Temporal Transformer for Crime Recognition in Surveillance Videos