Abstract
Neural networks have demonstrated remarkable effectiveness on real-world vision problems such as activity recognition and violence detection in surveillance scenarios. However, the widespread reliance on a single network to capture both spatial and motion information makes such approaches less effective at analysing long-term dependencies in video snippets. Our work addresses this issue with a multi-network fusion strategy suited to real-world surveillance. First, spatial information is extracted with a convolutional neural network (ConvNet) built on a compound-coefficient scaling strategy. Next, pyramidal convolutional features from two consecutive frames are obtained through LiteFlowNet. The output of each network (ConvNet and LiteFlowNet) is passed separately into a deep gated recurrent unit (GRU) equipped with a skip connection. The representations obtained from the two GRUs are fused and propagated to a dense layer for the final decision. Results on benchmark datasets and an ablation study confirm our method's efficiency, outperforming state-of-the-art methods.
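The two-stream pipeline described in the abstract can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: all dimensions, the two-layer GRU with an additive skip connection, and the random weights are assumptions for demonstration; in practice the spatial and motion sequences would come from the ConvNet and LiteFlowNet, and the weights would be learned.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class GRUCell:
    """Minimal GRU cell (update/reset gates) with random weights."""
    def __init__(self, in_dim, hid_dim, rng):
        s = 1.0 / np.sqrt(hid_dim)
        self.Wz = rng.uniform(-s, s, (hid_dim, in_dim + hid_dim))
        self.Wr = rng.uniform(-s, s, (hid_dim, in_dim + hid_dim))
        self.Wh = rng.uniform(-s, s, (hid_dim, in_dim + hid_dim))

    def step(self, x, h):
        xh = np.concatenate([x, h])
        z = sigmoid(self.Wz @ xh)                       # update gate
        r = sigmoid(self.Wr @ xh)                       # reset gate
        h_tilde = np.tanh(self.Wh @ np.concatenate([x, r * h]))
        return (1 - z) * h + z * h_tilde

def encode(seq, cell1, cell2, hid_dim):
    """Two stacked GRU layers over a feature sequence; the skip
    connection (adding layer-1 output to layer-2 output) is one
    plausible reading of the paper's 'GRU with skip connection'."""
    h1 = np.zeros(hid_dim)
    h2 = np.zeros(hid_dim)
    for x in seq:
        h1 = cell1.step(x, h1)
        h2 = cell2.step(h1, h2)
    return h1 + h2  # skip connection

# Hypothetical feature sequences standing in for the two streams:
# spatial (ConvNet) features and motion (LiteFlowNet) features.
T, spat_dim, mot_dim, hid = 8, 16, 12, 10
spatial_seq = rng.normal(size=(T, spat_dim))
motion_seq = rng.normal(size=(T, mot_dim))

spat_gru = (GRUCell(spat_dim, hid, rng), GRUCell(hid, hid, rng))
mot_gru = (GRUCell(mot_dim, hid, rng), GRUCell(hid, hid, rng))

h_spatial = encode(spatial_seq, *spat_gru, hid)
h_motion = encode(motion_seq, *mot_gru, hid)

# Fuse the two stream representations, then classify with a dense layer
# (two classes here, e.g. violence / non-violence).
fused = np.concatenate([h_spatial, h_motion])
W_dense = rng.normal(size=(2, fused.size))
logits = W_dense @ fused
probs = np.exp(logits) / np.exp(logits).sum()  # softmax decision
```

The key design point the sketch mirrors is late fusion: each modality keeps its own temporal model, and only the final GRU representations are concatenated before the dense decision layer.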
Original language | English |
---|---|
Title of host publication | 2024 IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS) |
Publisher | IEEE |
Edition | 2024 |
ISBN (Electronic) | 9798350374285 |
DOIs | |
Publication status | Published - 18 Sept 2024 |
Event | 20th IEEE International Conference on Advanced Video and Signal-Based Surveillance, AVSS 2024 - Niagara Falls, Canada<br>Duration: 15 Jul 2024 → 16 Jul 2024<br>Conference number: 20 |
Conference
Conference | 20th IEEE International Conference on Advanced Video and Signal-Based Surveillance, AVSS 2024 |
---|---|
Abbreviated title | AVSS 2024 |
Country/Territory | Canada |
City | Niagara Falls |
Period | 15/07/24 → 16/07/24 |