Visually Supported Supervised Machine Learning

Research output: Thesis › PhD Thesis - Research external, graduation external › Academic

Abstract

Classification is a common task in data mining and knowledge discovery. Classifiers usually have to be built by machine learning experts, so the user who applies a classifier has no idea whether, how, and why it works. This lack of understanding results in a lack of trust in the algorithms. Further, excluding domain experts from the classifier construction and adaptation process means that their domain knowledge cannot be fully exploited.

This thesis introduces the concept of Visually Supported Supervised Learning. It investigates whether a tighter coupling of the data mining process with the user by means of interactive visualizations can improve the construction, understanding, assessment, and adaptation of supervised learning algorithms. Different classifier-agnostic visualizations are designed and implemented, and the concept of Visual Active Learning is derived. Various experiments evaluate the suitability of these visualizations for the assessment, understanding, creation, and adaptation of classifiers.

The experiments show that, first, classifier-agnostic visualizations can help users assess and understand arbitrary classification models in various classification tasks. Second, a specific (classifier-dependent) visualization for text classifiers can be used to assess certain aspects of the internal classification model in more detail. Third, a combination of data visualization and classifier visualization enables domain users to create classifiers from scratch. Fourth, Visual Active Learning outperforms classical active learning in classifier-agnostic settings. Fifth, automatically extracted key phrases are a fast and accurate representation for document labeling and thus allow for fast and efficient training data generation.

The results show that the combination of classification algorithms and information visualization, Visually Supported Classification, is a reasonable approach. Tighter integration of domain users in classification applications can benefit both the users and the algorithms.

Original language: English
Awarding Institution
  • Graz University of Technology
Supervisors/Advisors
  • Lindstaedt, Stefanie N., Supervisor, External person
  • Granitzer, Michael, Co-Supervisor, External person
Award date: 4 May 2012
Place of Publication: Graz
Publisher: Graz University of Technology
Publication status: Published - 4 May 2012

Fingerprint

Learning systems
Classifiers
Visualization
Data mining
Supervised learning
Data visualization
Labeling
Learning algorithms
Experiments

Cite this

Seifert, C. (2012). Visually Supported Supervised Machine Learning. Graz: Graz University of Technology.
Seifert, Christin. / Visually Supported Supervised Machine Learning. Graz : Graz University of Technology, 2012. 177 p.
@phdthesis{3f101133c0384b40b44e6c97d5660d56,
title = "Visually Supported Supervised Machine Learning",
author = "Christin Seifert",
year = "2012",
month = "5",
day = "4",
language = "English",
publisher = "Graz University of Technology",
school = "Graz University of Technology",
}

Seifert, C 2012, 'Visually Supported Supervised Machine Learning', Graz University of Technology, Graz.

Visually Supported Supervised Machine Learning. / Seifert, Christin.

Graz : Graz University of Technology, 2012. 177 p.

TY - THES

T1 - Visually Supported Supervised Machine Learning

AU - Seifert, Christin

PY - 2012/5/4

Y1 - 2012/5/4

M3 - PhD Thesis - Research external, graduation external

PB - Graz University of Technology

CY - Graz

ER -

Seifert C. Visually Supported Supervised Machine Learning. Graz: Graz University of Technology, 2012. 177 p.