A taxonomy framework for unsupervised outlier detection techniques for multi-type data sets

Research output: Book/Report › Report › Professional

Abstract

The term "outlier" can generally be defined as an observation that is significantly different from the other values in a data set. Outliers may be instances of error or may indicate events. Outlier detection aims to identify such outliers in order to improve the analysis of data and to discover interesting and useful knowledge about unusual events in numerous application domains. In this paper, we report on contemporary unsupervised outlier detection techniques for multiple types of data sets and provide a comprehensive taxonomy framework and two decision trees for selecting the most suitable technique based on the data set. Furthermore, we highlight the advantages, disadvantages and performance issues of each class of outlier detection techniques under this taxonomy framework.
Original language: Undefined
Place of Publication: Enschede
Publisher: Centre for Telematics and Information Technology (CTIT)
Number of pages: 40
Publication status: Published - 14 Nov 2007

Publication series

Name: CTIT Technical Report Series
Publisher: Centre for Telematics and Information Technology, University of Twente
No.: Paper P-NS/TR-CTIT-07-79
ISSN (Print): 1381-3625

Keywords

  • METIS-245767
  • IR-64450
  • CAES-PS: Pervasive Systems
  • EWI-11366

Cite this

Zhang, Y., Meratnia, N., & Havinga, P. J. M. (2007). A taxonomy framework for unsupervised outlier detection techniques for multi-type data sets. (CTIT Technical Report Series; No. Paper P-NS/TR-CTIT-07-79). Enschede: Centre for Telematics and Information Technology (CTIT).
Zhang, Y. ; Meratnia, Nirvana ; Havinga, Paul J.M. / A taxonomy framework for unsupervised outlier detection techniques for multi-type data sets. Enschede : Centre for Telematics and Information Technology (CTIT), 2007. 40 p. (CTIT Technical Report Series; Paper P-NS/TR-CTIT-07-79).
@book{ca8af56262084f5b9a3b59c92c5b9c72,
title = "A taxonomy framework for unsupervised outlier detection techniques for multi-type data sets",
abstract = "The term {"}outlier{"} can generally be defined as an observation that is significantly different from the other values in a data set. Outliers may be instances of error or may indicate events. Outlier detection aims to identify such outliers in order to improve the analysis of data and to discover interesting and useful knowledge about unusual events in numerous application domains. In this paper, we report on contemporary unsupervised outlier detection techniques for multiple types of data sets and provide a comprehensive taxonomy framework and two decision trees for selecting the most suitable technique based on the data set. Furthermore, we highlight the advantages, disadvantages and performance issues of each class of outlier detection techniques under this taxonomy framework.",
keywords = "METIS-245767, IR-64450, CAES-PS: Pervasive Systems, EWI-11366",
author = "Y. Zhang and Nirvana Meratnia and Havinga, {Paul J.M.}",
year = "2007",
month = "11",
day = "14",
language = "Undefined",
series = "CTIT Technical Report Series",
publisher = "Centre for Telematics and Information Technology (CTIT)",
number = "Paper P-NS/TR-CTIT-07-79",
address = "Netherlands",

}

Zhang, Y, Meratnia, N & Havinga, PJM 2007, A taxonomy framework for unsupervised outlier detection techniques for multi-type data sets. CTIT Technical Report Series, no. Paper P-NS/TR-CTIT-07-79, Centre for Telematics and Information Technology (CTIT), Enschede.

A taxonomy framework for unsupervised outlier detection techniques for multi-type data sets. / Zhang, Y.; Meratnia, Nirvana; Havinga, Paul J.M.

Enschede : Centre for Telematics and Information Technology (CTIT), 2007. 40 p. (CTIT Technical Report Series; No. Paper P-NS/TR-CTIT-07-79).

TY - BOOK
T1 - A taxonomy framework for unsupervised outlier detection techniques for multi-type data sets
AU - Zhang, Y.
AU - Meratnia, Nirvana
AU - Havinga, Paul J.M.
PY - 2007/11/14
Y1 - 2007/11/14
AB - The term "outlier" can generally be defined as an observation that is significantly different from the other values in a data set. Outliers may be instances of error or may indicate events. Outlier detection aims to identify such outliers in order to improve the analysis of data and to discover interesting and useful knowledge about unusual events in numerous application domains. In this paper, we report on contemporary unsupervised outlier detection techniques for multiple types of data sets and provide a comprehensive taxonomy framework and two decision trees for selecting the most suitable technique based on the data set. Furthermore, we highlight the advantages, disadvantages and performance issues of each class of outlier detection techniques under this taxonomy framework.
KW - METIS-245767
KW - IR-64450
KW - CAES-PS: Pervasive Systems
KW - EWI-11366
M3 - Report
T3 - CTIT Technical Report Series
BT - A taxonomy framework for unsupervised outlier detection techniques for multi-type data sets
PB - Centre for Telematics and Information Technology (CTIT)
CY - Enschede
ER -

Zhang Y, Meratnia N, Havinga PJM. A taxonomy framework for unsupervised outlier detection techniques for multi-type data sets. Enschede: Centre for Telematics and Information Technology (CTIT), 2007. 40 p. (CTIT Technical Report Series; Paper P-NS/TR-CTIT-07-79).