The anticipated 'sensing environments' of the near future place new requirements on the data management systems that mediate between the supply and demand sides of sensor data. We identify and investigate one such requirement: the need to deal with the uncertainty inherent in sensor data, which arises from measurement noise, missing data, the semantic gap between the measured data and the relevant information, and the integration of data from different sensors. Probabilistic models of sensor data can be used to deal with these uncertainties within the well-understood and fruitful framework of probability theory. In particular, the Bayesian network formalism proves useful for modeling sensor data in a flexible environment because of its comprehensiveness and modularity. We provide extensive technical argumentation for this claim. As a demonstration case, we define a discrete Bayesian network for location tracking using Bluetooth transceivers. To scale up sensor models, efficient probabilistic inference on the Bayesian network is crucial. However, we observe that conventional inference methods do not scale well for our demonstration case. We propose several optimizations that make it possible to jointly scale up the number of locations and sensors in sublinear time, and to scale up the time resolution in linear time. Moreover, we define a theoretical framework in which these optimizations are derived by translating an inference query into relational algebra. This allows the query to be analyzed and optimized using insights and techniques from the database community, for example cost metrics based on cardinality rather than dimensionality. An orthogonal research question concerns the possibility of collecting transition statistics in a local, clustered fashion, in which transitions between states of different clusters cannot be observed directly. We show that this problem can be written as a constrained system of linear equations, for which we describe a specialized solution method.
| Qualification | Doctor of Philosophy |
| Award date | 25 Sept 2009 |
| Place of Publication | Enschede |
| Publication status | Published - 25 Sept 2009 |
- Sensor data
- Probabilistic models
- Dynamic Bayesian networks