The emergence of wireless sensor networks (WSNs) enables people to observe and reason about the physical environment better, more easily, and faster. Wireless sensor nodes equipped with sensing, processing, wireless communication, and actuation capabilities can be densely deployed over a wide geographical area to measure various parameters of the physical world continuously. Compared with traditional environmental sensing technologies, such densely deployed WSNs enable the collection of fine-grained data with high spatial and temporal resolution at lower installation, maintenance, and operation costs. However, raw sensor observations often have low quality and reliability due to both internal and external factors, including the low quality of cheap sensors, the dynamicity of network conditions, and the harshness of the deployment environment. Using low-quality sensor data in any data analysis or decision-making process not only degrades the analysis results and the decisions made but also wastes a large amount of valuable and limited network resources such as energy, because many incorrect values are transmitted. Low-quality sensor data also prevents WSNs from fulfilling their promise of reliable real-time situation awareness, as it may generate a large number of false alarms. Motivated by the need to improve the quality of data analysis and decision making, to use WSN resources more efficiently by preventing unnecessary transmission of erroneous sensor observations, and to increase the effectiveness of the monitoring and situation-awareness capabilities of WSNs, in this thesis we focus on online identification of outliers whenever and wherever they occur. Outliers in WSNs are those observations that represent erroneous values (errors) or indicate particular changes in the observed phenomenon (events).
Our outlier detection techniques, which are based on distributed in-network data processing, identify sensor observations that do not conform to the normal behavior of sensor data, without using pre-defined thresholds or triggering conditions. Our main research objective is to design and implement effective and efficient outlier detection techniques for WSNs that identify outliers in an online and distributed manner and distinguish between errors and events with high accuracy and a low false alarm rate, while keeping communication, computation, and memory complexity low. The main contributions of this thesis can be summarized as follows:

1. Taxonomy of and guidelines for outlier detection techniques for WSNs. We present the shortcomings of existing outlier detection techniques and a set of important issues for outlier detection in WSNs. We further provide a technique-based taxonomy to categorize current outlier detection techniques developed for WSNs, and guidelines on the requirements that suitable outlier detection techniques for WSNs should meet.

2. Design and comparison of data labelling techniques for performance evaluation of outlier detection techniques. Many WSN applications suffer from a lack of labelled data. To address this problem, various labelling techniques are used offline to give semantics to the data collected by WSNs and to distinguish between normal data and outliers. We investigate the impact of data distribution and data dependencies on four of these labelling techniques and evaluate their performance for the outlier detection process.

3. Statistical-based outlier detection techniques for WSNs. We take two approaches in designing our outlier detection techniques. One approach originates from the field of statistics, while the other comes from the field of data mining and machine learning. Considering that spatio-temporal correlation exists between sensor observations, we use statistical approaches to quantify this correlation, to identify outliers in an online and distributed manner, and to distinguish between errors and events in real time.

4. Spherical support vector machine (SVM)-based outlier detection techniques for WSNs. From the data mining and machine learning perspective, we propose distributed and online outlier detection techniques based on the quarter-sphere one-class SVM. These techniques do not take into account the correlation that may exist between data attributes. We simplify the process of modelling the quarter-sphere SVM to fit the limited resources of WSNs and present three strategies to update the SVM-based model that represents the normal behavior of sensor data.

5. Ellipsoidal support vector machine (SVM)-based outlier detection techniques for WSNs. We extend our quarter-sphere one-class SVM by taking into account the correlation between different attributes in order to identify multivariate outliers. This results in our ellipsoidal SVM-based outlier detection techniques. To cope with the dynamic nature of sensor data, we propose an efficient strategy to update the SVM normal model.
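
As an illustration of the quarter-sphere idea named in contribution 4, the following is a minimal sketch and not the thesis's implementation. It assumes a linear kernel, data centred on the attribute-wise mean, and a hypothetical parameter `nu` that bounds the fraction of training readings allowed to fall outside the sphere. Under these assumptions the radius can be read off the sorted squared distances, with no general-purpose optimizer needed. The function names and the sample (temperature, humidity) readings are invented for the example.

```python
import math


def squared_distance(point, center):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((p - c) ** 2 for p, c in zip(point, center))


def fit_quarter_sphere(samples, nu):
    """Fit a linear-kernel quarter-sphere model; return (center, squared_radius).

    nu (0 < nu <= 1) is an assumed upper bound on the outlier fraction.
    """
    n = len(samples)
    dims = len(samples[0])
    # Centre the data on the attribute-wise mean so the sphere sits at the origin.
    center = [sum(s[d] for s in samples) / n for d in range(dims)]
    distances = sorted((squared_distance(s, center) for s in samples), reverse=True)
    # Minimising R^2 + (1 / (nu * n)) * sum(max(0, d_i - R^2)) over R^2 is
    # achieved at the ceil(nu * n)-th largest squared distance.
    k = min(n, max(1, math.ceil(nu * n)))
    return center, distances[k - 1]


def is_outlier(reading, center, squared_radius):
    """A reading is flagged when it lies strictly outside the fitted sphere."""
    return squared_distance(reading, center) > squared_radius


# Hypothetical (temperature, humidity) readings from one node; the last is anomalous.
readings = [
    (20.0, 50.0), (20.5, 51.0), (19.5, 49.0), (20.2, 50.5),
    (19.8, 49.5), (20.1, 50.2), (19.9, 49.8), (35.0, 80.0),
]
center, r2 = fit_quarter_sphere(readings, nu=0.25)
for reading in [(20.0, 50.0), (35.0, 80.0)]:
    print(reading, is_outlier(reading, center, r2))
```

Because the sphere uses plain Euclidean distance, every attribute is treated independently, which matches the statement that the spherical techniques ignore correlation between attributes. An ellipsoidal variant would replace this distance with one that accounts for the covariance between attributes.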
| Qualification | Doctor of Philosophy |
| Award date | 23 Jun 2010 |
| Place of Publication | Enschede |
| Publication status | Published - 23 Jun 2010 |