The UAVid Dataset for Video Semantic Segmentation

Ye Lyu, G. Vosselman, Guisong Xia, Alper Yilmaz, Michael Ying Yang

Research output: Working paper (Professional)

Abstract

Video semantic segmentation has recently been one of the research focuses in computer vision. It serves as a perception foundation for many fields, such as robotics and autonomous driving. The fast development of semantic segmentation is largely attributable to large-scale datasets, especially for deep-learning-based methods. Several semantic segmentation datasets already exist for complex urban scenes, such as the Cityscapes and CamVid datasets, and they have become the standard datasets for comparing semantic segmentation methods. In this paper, we introduce UAVid, a new high-resolution UAV video semantic segmentation dataset, as a complement. The dataset consists of 30 video sequences capturing high-resolution images. In total, 300 images have been densely labelled with 8 classes for the urban scene understanding task. Our dataset brings new challenges. We provide several deep learning baseline methods, among which the proposed novel Multi-Scale-Dilation net performs best via multi-scale feature extraction. We have also explored the usability of sequence data by leveraging a CRF model in both the spatial and temporal domains.
Original language: English
Publisher: arXiv.org
Number of pages: 9
Publication status: Published - 24 Oct 2018

Fingerprint

Semantics
Unmanned aerial vehicles (UAV)
Image resolution
Computer vision
Feature extraction
Robotics
Deep learning

Keywords

  • cs.CV
  • ITC-GOLD

Cite this

@techreport{30a0828bcdfb438799a560be8bdbda90,
title = "The UAVid Dataset for Video Semantic Segmentation",
keywords = "cs.CV, ITC-GOLD",
author = "Ye Lyu and G. Vosselman and Guisong Xia and Alper Yilmaz and Yang, {Michael Ying}",
year = "2018",
month = "10",
day = "24",
language = "English",
publisher = "arXiv.org",
type = "WorkingPaper",
institution = "arXiv.org",

}

The UAVid Dataset for Video Semantic Segmentation. / Lyu, Ye; Vosselman, G.; Xia, Guisong; Yilmaz, Alper; Yang, Michael Ying.

arXiv.org, 2018.

TY - UNPB

T1 - The UAVid Dataset for Video Semantic Segmentation

AU - Lyu, Ye

AU - Vosselman, G.

AU - Xia, Guisong

AU - Yilmaz, Alper

AU - Yang, Michael Ying

PY - 2018/10/24

Y1 - 2018/10/24

KW - cs.CV

KW - ITC-GOLD

UR - https://ezproxy2.utwente.nl/login?url=https://webapps.itc.utwente.nl/library/2018/scie/yang_uav.pdf

M3 - Working paper

BT - The UAVid Dataset for Video Semantic Segmentation

PB - arXiv.org

ER -