Abstract
Classification is a common task in data mining and knowledge discovery. Usually, classifiers have to be built by machine learning experts, so the user who applies a classifier has no idea whether, how, and why it works. This lack of understanding results in a lack of trust in the algorithms. Furthermore, excluding domain experts from the classifier construction and adaptation process prevents their domain knowledge from being fully exploited.
In this thesis, the concept of Visually Supported Supervised Learning is introduced. It is investigated whether a tighter coupling of the data mining process with the user, by means of interactive visualizations, can improve the construction, understanding, assessment, and adaptation of supervised learning algorithms. Several classifier-agnostic visualizations are designed and implemented, and the concept of Visual Active Learning is derived from them. Various experiments evaluate the suitability of these visualizations with respect to the assessment, understanding, creation, and adaptation of classifiers.
The experiments show that, first, classifier-agnostic visualizations can help users assess and understand arbitrary classification models in various classification tasks. Second, a specific (classifier-dependent) visualization for text classifiers can be used to assess certain aspects of the internal classification model in more detail. Third, a combination of data visualization and classifier visualization enables domain users to create classifiers from scratch. Fourth, Visual Active Learning outperforms classical active learning in classifier-agnostic settings. Fifth, automatically extracted key phrases are a fast and accurate representation for document labeling and thus allow for fast and efficient training data generation.
The results show that the combination of classification algorithms and information visualization, Visually Supported Classification, is a reasonable approach. Tighter integration of domain users into classification applications can be beneficial for both the users and the algorithms.
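To make the baseline in the fourth finding concrete, the following is a minimal sketch of a classical pool-based active learning loop with uncertainty sampling, the kind of automatic query strategy Visual Active Learning is compared against. The dataset, the scikit-learn model, and the query budget are illustrative assumptions, not the thesis's actual setup; in Visual Active Learning, the automatic query-selection step would instead be supported by the user choosing instances in an interactive visualization.

```python
# Classical pool-based active learning with uncertainty sampling.
# All concrete choices (data, model, budget) are assumptions for illustration.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
labeled = list(range(10))                      # small initial labeled set
pool = [i for i in range(len(X)) if i not in labeled]

model = LogisticRegression(max_iter=1000)
for _ in range(20):                            # query budget
    model.fit(X[labeled], y[labeled])
    proba = model.predict_proba(X[pool])
    # Query the pool instance the current model is least confident about.
    query = pool[int(np.argmin(proba.max(axis=1)))]
    labeled.append(query)                      # oracle supplies the label y[query]
    pool.remove(query)
```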
| Original language | English |
| --- | --- |
| Awarding Institution | |
| Supervisors/Advisors | |
| Award date | 4 May 2012 |
| Place of Publication | Graz |
| Publisher | |
| Publication status | Published - 4 May 2012 |