Browsing by Author "Lawrence, Steve"
Now showing 1 - 4 of 4
Item
Face Recognition: A Hybrid Neural Network Approach (1998-10-15)
Lawrence, Steve; Giles, C. Lee; Tsoi, Ah Chung; Back, Andrew D.
Faces represent complex, multidimensional, meaningful visual stimuli, and developing a computational model for face recognition is difficult. We present a hybrid neural network solution which compares favorably with other methods. The system combines local image sampling, a self-organizing map neural network, and a convolutional neural network. The self-organizing map provides a quantization of the image samples into a topological space where inputs that are nearby in the original space are also nearby in the output space, thereby providing dimensionality reduction and invariance to minor changes in the image sample. The convolutional neural network provides partial invariance to translation, rotation, scale, and deformation, and extracts successively larger features in a hierarchical set of layers. We present results using the Karhunen-Loeve transform in place of the self-organizing map, and a multi-layer perceptron in place of the convolutional network. The Karhunen-Loeve transform performs almost as well (5.3% error versus 3.8%). The multi-layer perceptron performs very poorly (40% error versus 3.8%). The method is capable of rapid classification, requires only fast, approximate normalization and preprocessing, and consistently exhibits better classification performance than the eigenfaces approach on the database considered as the number of images per person in the training database is varied from 1 to 5. With 5 images per person, the proposed method and eigenfaces result in 3.8% and 10.5% error, respectively. The recognizer provides a measure of confidence in its output, and classification error approaches zero when rejecting as few as 10% of the examples. We use a database of 400 images of 40 individuals which contains a high degree of variability in expression, pose, and facial details. We analyze computational complexity and discuss how new classes could be added to the trained recognizer. (Also cross-referenced as UMIACS-TR-96-16)

Item
Noisy Time Series Prediction using Symbolic Representation and Recurrent Neural Network Grammatical Inference (1998-10-15)
Lawrence, Steve; Tsoi, Ah Chung; Giles, C. Lee
Financial forecasting is an example of a signal processing problem which is challenging due to small sample sizes, high noise, non-stationarity, and non-linearity. Neural networks have been very successful in a number of signal processing applications. We discuss fundamental limitations and inherent difficulties when using neural networks for the processing of high noise, small sample size signals. We introduce a new intelligent signal processing method which addresses these difficulties. The method uses conversion into a symbolic representation with a self-organizing map, and grammatical inference with recurrent neural networks. We apply the method to the prediction of daily foreign exchange rates, addressing difficulties with non-stationarity, overfitting, and unequal a priori class probabilities, and we find significant predictability in comprehensive experiments covering 5 different foreign exchange rates. The method correctly predicts the direction of change for the next day with an error rate of 47.1%. The error rate reduces to around 40% when rejecting examples where the system has low confidence in its prediction. The symbolic representation aids the extraction of symbolic knowledge from the recurrent neural networks in the form of deterministic finite state automata. These automata explain the operation of the system and are often relatively simple. Rules related to well-known behavior such as trend following and mean reversal are extracted. (Also cross-referenced as UMIACS-TR-96-27)
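As a rough illustration of the local image sampling and self-organizing map stage described in the face recognition item above, the following Python sketch extracts overlapping windows from an image, trains a small two-dimensional map on them, and replaces each window with the map coordinates of its best-matching unit. The window size, map size, and training schedule are illustrative choices, not values from the paper, and the convolutional network that would consume this representation is not shown.

import numpy as np

# Sketch of a local-sampling + self-organizing-map stage in the spirit of the
# face recognition abstract above. Window size, map size, and the training
# schedule are illustrative choices, not the values used in the paper.

rng = np.random.default_rng(0)

def sample_windows(image, win=5, step=4):
    """Extract overlapping win x win windows from a 2-D image."""
    h, w = image.shape
    return np.array([image[r:r + win, c:c + win].ravel()
                     for r in range(0, h - win + 1, step)
                     for c in range(0, w - win + 1, step)])

def train_som(samples, rows=5, cols=5, iters=2000, lr0=0.5, sigma0=2.0):
    """Train a small 2-D self-organizing map on the window samples."""
    weights = rng.normal(size=(rows * cols, samples.shape[1]))
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], float)
    for t in range(iters):
        x = samples[rng.integers(len(samples))]
        bmu = np.argmin(((weights - x) ** 2).sum(axis=1))      # best-matching unit
        frac = t / iters
        lr, sigma = lr0 * (1 - frac), sigma0 * (1 - frac) + 0.5
        dist2 = ((grid - grid[bmu]) ** 2).sum(axis=1)
        h = np.exp(-dist2 / (2 * sigma ** 2))                  # neighbourhood function
        weights += lr * h[:, None] * (x - weights)
    return weights, grid

def quantize(image, weights, grid, win=5, step=4):
    """Map each window to the 2-D coordinates of its best-matching SOM unit,
    giving the low-dimensional representation that a convolutional network
    could then be trained on."""
    windows = sample_windows(image, win, step)
    bmus = np.argmin(((windows[:, None, :] - weights[None]) ** 2).sum(-1), axis=1)
    return grid[bmus]

image = rng.random((56, 46))            # stand-in for a face image
weights, grid = train_som(sample_windows(image))
print(quantize(image, weights, grid).shape)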
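In the spirit of the noisy time series item above, the short sketch below converts a synthetic return series into a small symbol alphabet and tabulates first-order symbol transitions. Quantile binning stands in for the paper's self-organizing map symbolization, and the transition table is only a crude summary of the kind of structure the recurrent network and automaton extraction stages operate on; neither of those stages is reproduced here.

import numpy as np

# Illustrative sketch only: quantile binning stands in for the SOM
# symbolization described in the abstract above, and the transition table is
# a crude summary, not the recurrent-network grammatical inference itself.

rng = np.random.default_rng(1)

prices = 1.0 + np.cumsum(rng.normal(0, 0.005, size=500))    # synthetic FX-like series
returns = np.diff(prices) / prices[:-1]

n_symbols = 3
edges = np.quantile(returns, np.linspace(0, 1, n_symbols + 1)[1:-1])
symbols = np.digitize(returns, edges)       # 0, 1, 2 = lowest, middle, highest return tercile

# First-order transition counts between consecutive symbols.
counts = np.zeros((n_symbols, n_symbols))
for a, b in zip(symbols[:-1], symbols[1:]):
    counts[a, b] += 1
transition_probs = counts / counts.sum(axis=1, keepdims=True)
print(np.round(transition_probs, 3))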
Item
On the Applicability of Neural Network and Machine Learning Methodologies to Natural Language Processing (1998-10-15)
Lawrence, Steve; Giles, C. Lee; Fong, Sandiway
We examine the inductive inference of a complex grammar: specifically, we consider the task of training a model to classify natural language sentences as grammatical or ungrammatical, thereby exhibiting the same kind of discriminatory power provided by the Principles and Parameters linguistic framework, or Government-and-Binding theory. We investigate the following models: feed-forward neural networks, Frasconi-Gori-Soda and Back-Tsoi locally recurrent networks, Elman, Narendra & Parthasarathy, and Williams & Zipser recurrent networks, Euclidean and edit-distance nearest-neighbors, simulated annealing, and decision trees. The feed-forward neural networks and non-neural-network machine learning models are included primarily for comparison. We address the question: How can a neural network, with its distributed nature and gradient descent based iterative calculations, possess linguistic capability which is traditionally handled with symbolic computation and recursive processes? Initial simulations with all models were only partially successful, and only when using a large temporal window as input. Models trained in this fashion did not learn the grammar to a significant degree. Attempts at training recurrent networks with small temporal input windows failed until we implemented several techniques aimed at improving the convergence of the gradient descent training algorithms. We discuss the theory and present an empirical study of a variety of models and learning algorithms which highlights behavior not present when attempting to learn a simpler grammar. (Also cross-referenced as UMIACS-TR-95-64)

Item
What Size Neural Network Gives Optimal Generalization? Convergence Properties of Backpropagation (1998-10-15)
Lawrence, Steve; Giles, C. Lee; Tsoi, Ah Chung
One of the most important aspects of any machine learning paradigm is how it scales according to problem size and complexity. Using a task with known optimal training error and a pre-specified maximum number of training updates, we investigate the convergence of the backpropagation algorithm with respect to a) the complexity of the required function approximation, b) the size of the network in relation to the size required for an optimal solution, and c) the degree of noise in the training data. In general, for a) the solution found is worse when the function to be approximated is more complex, for b) oversize networks can result in lower training and generalization error, and for c) the use of committee or ensemble techniques can be more beneficial as the amount of noise in the training data is increased. We did not obtain the optimal solution in any of the experiments we performed. We further support the observation that larger networks can produce better training and generalization error using a face recognition example, where a network with many more parameters than training points generalizes better than smaller networks. (Also cross-referenced as UMIACS-TR-96-22)
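As a toy analogue of the grammar classification task in the natural language item above, the sketch below trains a small Elman-style recurrent network, written directly in numpy, to accept strings over a two-letter alphabet that contain no repeated 'a'. The grammar, network size, learning rate, and number of updates are illustrative choices and are far simpler than the natural language setting studied in the paper.

import numpy as np

# Toy Elman-style recurrent network trained by backpropagation through time
# to accept strings over {a, b} containing no "aa" substring. The grammar and
# all hyperparameters are illustrative, not taken from the paper.

rng = np.random.default_rng(2)
n_hidden, n_inputs, lr = 8, 2, 0.1
Wxh = rng.normal(0, 0.3, (n_hidden, n_inputs))
Whh = rng.normal(0, 0.3, (n_hidden, n_hidden))
Why = rng.normal(0, 0.3, n_hidden)
bh, by = np.zeros(n_hidden), 0.0

def random_example():
    s = rng.integers(0, 2, size=rng.integers(2, 9))          # 0 = 'a', 1 = 'b'
    label = 0.0 if any(s[i] == 0 and s[i + 1] == 0 for i in range(len(s) - 1)) else 1.0
    return s, label

def forward(s):
    hs = [np.zeros(n_hidden)]
    for c in s:
        x = np.eye(n_inputs)[c]
        hs.append(np.tanh(Wxh @ x + Whh @ hs[-1] + bh))
    p = 1.0 / (1.0 + np.exp(-(Why @ hs[-1] + by)))            # accept probability
    return hs, p

for step in range(20000):                                     # plain SGD with BPTT
    s, y = random_example()
    hs, p = forward(s)
    do = p - y                                                # d loss / d output logit
    dWhy, dby = do * hs[-1], do
    dh = do * Why
    dWxh, dWhh, dbh = np.zeros_like(Wxh), np.zeros_like(Whh), np.zeros_like(bh)
    for t in range(len(s), 0, -1):
        dz = dh * (1 - hs[t] ** 2)
        dWxh += np.outer(dz, np.eye(n_inputs)[s[t - 1]])
        dWhh += np.outer(dz, hs[t - 1])
        dbh += dz
        dh = Whh.T @ dz
    for param, grad in ((Wxh, dWxh), (Whh, dWhh), (Why, dWhy), (bh, dbh)):
        param -= lr * grad
    by -= lr * dby

tests = [random_example() for _ in range(500)]
acc = np.mean([(forward(s)[1] > 0.5) == (y == 1.0) for s, y in tests])
print(f"accuracy on random strings: {acc:.2f}")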
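In the spirit of the network size item above, the following sketch fits the same noisy one-dimensional target with multi-layer perceptrons of increasing hidden-layer size (using scikit-learn) and prints training and test error for each. The target function, noise level, sample sizes, and network sizes are arbitrary choices for illustration, not those used in the paper.

import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_squared_error

# Small experiment in the spirit of the abstract above: compare training and
# test error across network sizes on a noisy target. All settings here are
# illustrative choices, not the task or networks studied in the paper.

rng = np.random.default_rng(3)

def make_data(n):
    x = rng.uniform(-3, 3, size=(n, 1))
    y = np.sin(2 * x[:, 0]) + 0.3 * rng.normal(size=n)        # noisy 1-D target
    return x, y

X_train, y_train = make_data(60)
X_test, y_test = make_data(1000)

for hidden in (2, 5, 20, 100):
    net = MLPRegressor(hidden_layer_sizes=(hidden,), activation="tanh",
                       max_iter=5000, random_state=0)
    net.fit(X_train, y_train)
    train_err = mean_squared_error(y_train, net.predict(X_train))
    test_err = mean_squared_error(y_test, net.predict(X_test))
    print(f"hidden units: {hidden:3d}  train MSE: {train_err:.3f}  test MSE: {test_err:.3f}")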