Hierarchical Neural Networks for Monitoring Complex Dynamic Systems.
Mavrovouniotis, Michael L.
With the common three-layer neural network architectures, the processing of a large number of signals requires an enormously large neural network. Such a network is very difficult to train and may not have sufficient speed for real-time applications. The computational complexity also prevents the use of additional hidden layers, potentially leaving the network entirely unable to capture essential complex patterns in the signals. Furthermore, the lack of internal structure in a large network makes it very difficult to discern characteristics of the knowledge acquired by the network, in order to evaluate its reliability and applicability. We will investigate alternative neural network structures that contain far fewer connections and are organized in a hierarchical fashion. Our hierarchical networks consist of a number of loosely-coupled subnets, arranged in layers; each subnet is intended to capture specific aspects of the input data. At the bottom layer, each subnet operates directly on a particular subset of the input variables. In the intermediate layers, each subnet receives its inputs from subnets of the previous layer and sends its outputs to subnets in the next higher layer. Each subnet is expected to model and summarize in its output the important characteristics of a particular set of related input variables. In order to construct the subnets, we start from the set of inputs and identify all its subsets which, based on our a priori knowledge of the structure and behavior of the system being modelled, consist of related inputs. We call these subsets input clusters. In general, the clusters will overlap, and there will even be clusters that are fully contained in (i.e., are subsets of) other clusters.
This defines the hierarchy that will be used in the construction of the network. Whenever a larger cluster is equal to the union of smaller clusters, the subnet that corresponds to the larger cluster will not receive its inputs directly from the input set, but rather from the outputs of the subnets that correspond to the smaller clusters. This should only take place if the rationale for establishing the larger cluster can be viewed as a combination of the rationales of the smaller clusters, i.e., if the set-subset relation between the clusters is not coincidental. The complexity of each subnet (i.e., its number of hidden and output nodes) can be adjusted based on the complexity of the relationship that led to the establishment of the cluster.
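The wiring rule described above can be illustrated with a minimal Python sketch. It is not taken from the report: the function name build_hierarchy, the coincidental parameter, and the example clusters and signal names are all hypothetical. It assumes that each cluster is given as a set of input names and that the modeller lists by hand any set-subset relations that are merely coincidental, since that judgment rests on a priori knowledge and cannot be derived from the sets themselves.

```python
def build_hierarchy(clusters, coincidental=()):
    """Decide, for each cluster, where its subnet takes its inputs from.

    clusters:     dict mapping a cluster name to a set of input names.
    coincidental: (larger, smaller) name pairs whose subset relation
                  carries no meaning and must be ignored (an assumption
                  of this sketch; the modeller supplies it by hand).
    Returns a dict: name -> ("subnets", [names]) or ("inputs", [names]).
    """
    ignored = set(coincidental)
    sets = {name: frozenset(members) for name, members in clusters.items()}
    wiring = {}
    for name, members in sets.items():
        # Smaller clusters strictly contained in this one, minus the
        # relations flagged as coincidental.
        subs = {
            other: s for other, s in sets.items()
            if other != name and s < members and (name, other) not in ignored
        }
        # Keep only the maximal sub-clusters, so the subnet connects to
        # the layer directly below it rather than to every descendant.
        maximal = [a for a in subs if not any(subs[a] < subs[b] for b in subs)]
        covered = frozenset().union(*(subs[a] for a in maximal))
        if maximal and covered == members:
            # The cluster equals the union of smaller clusters: feed it
            # from the outputs of their subnets.
            wiring[name] = ("subnets", sorted(maximal))
        else:
            # Otherwise the subnet reads the raw input variables.
            wiring[name] = ("inputs", sorted(members))
    return wiring


# Hypothetical clusters for a small process unit.
example_clusters = {
    "feed": {"T_feed", "F_feed"},
    "reactor": {"T_reactor", "P_reactor"},
    "thermal": {"T_feed", "T_reactor"},
    "unit": {"T_feed", "F_feed", "T_reactor", "P_reactor"},
}
example_wiring = build_hierarchy(
    example_clusters, coincidental={("unit", "thermal")}
)
```

In this example the "unit" subnet would be fed by the "feed" and "reactor" subnets, while "feed", "reactor" and "thermal" read the raw inputs directly. Sizing each subnet (its number of hidden and output nodes) is left out of the sketch, since the abstract ties that choice to the complexity of the relationship behind each cluster.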