Besides the scientific paradigms of empiricism, mathematical modelling, and simulation, the method of combining and analysing data in novel ways has become a main research paradigm, capable of tackling research questions that could not be answered before. To speed up research in this new paradigm, scientists are reusing and integrating data originally gathered for different purposes. This repurposing of data requires a thorough understanding of the data sources used. Data understanding is an ongoing process in which the scientist gains insight into the semantics and quality of the data through exploration and use. In this book we propose a flexible method to guide this exploration and to highlight the places where automated assistance can be used to the greatest effect. The method is based on the principles of 'good is good enough' and 'pay as you go', meaning that the scientist puts in only as much effort as is necessary to bring the integrated data to the level of quality that he needs to continue his research. This book pursues two directions of research. The first is an investigation of note taking. By documenting his exploration efforts, the scientist can share his understanding of the data sources with others. To support the scientist in this, a prototype note-taking system is created. This system offers a compromise between the exploratory workflow of the scientist and the rigid procedures of the research institute. The second direction is the use of probabilistic data to support the 'pay as you go' principle. A formal framework for the creation of probabilistic data models is introduced. By keeping data accessible even if there are contradictions or multiple alternatives, the scientist can postpone data integration choices that would otherwise have prevented him from continuing with his work.
|Award date|16 Jun 2016|
|Place of Publication|Enschede|
|Publication status|Published - 16 Jun 2016|