Since the early development of artificial neural networks, researchers have tried to analyze trained networks in order to gain insight into their behavior. For certain applications and problem domains this has been successful, for example through the development of rule extraction and other methods. For many other problem domains, however, the existing knowledge extraction methods do not offer satisfactory solutions. Another key factor in knowledge extractability is the network architecture, as some architectures (e.g., self-organizing maps) offer more direct insight into the stored knowledge than others (e.g., feedforward nets). This thesis therefore presents a generic neural network analysis method that uses domain-specific basic functions, which are easy for the user to interpret and which can furthermore be used to optimize neural network systems. In general, the analysis consists of describing the internal functionality of the neural network in terms of functions that can be considered basic in the network's application domain. This means that users who may not be familiar with artificial neural networks, but who are familiar with the basic functions commonly used in their problem domain, can gain insight into the way the neural network solves their problem. For such users, this is often an important factor in deciding whether to apply artificial neural networks to a problem that may be difficult to solve otherwise.

Traditionally, artificial neural network systems are monolithic, single-tier systems. In such systems, the process knowledge is distributed among all elements of the system, which makes it very difficult to extract. It is much easier to locate knowledge in a system that is based on a 3-tier model. Basic domain knowledge is stored in the elementary functions found in the bottom tier, where it can be easily identified or analyzed. The middle tier describes how this knowledge is used in the system, and the application results, which are obtained on the basis of the system knowledge, are presented in the top tier. It would therefore be beneficial in many respects if systems based on a single-tier model could be transformed into equivalent systems based on the 3-tier model. This thesis proposes a generic method that can be used as a first step towards achieving this for monolithic neural network systems. Whereas the system knowledge is in general stored in the neural network in a distributed manner, this thesis shows that it is possible to create a foundation tier on which the whole system appears to be based. The method breaks the internal system knowledge into identifiable foundation blocks. These foundation blocks, or basic functions, depend on the application domain in which the system operates, and so do the methods to extract them. However, the domain-dependent methods are all based on a single generic, domain-independent idea: the analysis of the neural network in terms of (generic) domain-dependent basic functions.

This thesis first gives a brief overview of the enormous variety of neural networks and applications developed since the first mathematical model of human nerve cells was described. This is followed by a review of existing methods for the analysis of neural networks, including a discussion of the extent to which these methods succeed in retrieving complete and comprehensive knowledge from the network. A study is presented that identifies those application domains for which existing knowledge extraction techniques produce insufficient results. This study serves as a starting point for exploring the suitability of existing and new methods for neural network analysis in those domains. The exploration is then expanded by outlining the new theory of the analysis of trained neural networks in terms of domain-dependent basic functions, the main topic of this thesis.

Based on this theory, suggestions for basic functions are given for a range of application domains. Two of these suggestions are worked out in more detail and applied to typical applications in the respective problem areas. The first example is edge detection, from the digital image processing domain: the stored domain knowledge is translated into sets of gradient filters, which are commonly used in image processing. The second is a feedforward network for the classification of character images: the class knowledge stored in this network is extracted in the form of class prototypes. These examples are expected to illustrate the benefits of the new method for neural network analysis. The method offers much wider applicability than existing methods, some of which can in fact be defined as specific cases of the generic method that forms the central theme of this thesis.
|Award date|17 Dec 2003|
|Place of Publication|Enschede|
|Publication status|Published - 17 Dec 2003|