On the Applicability of Neural Network and Machine Learning
             Methodologies to Natural Language Processing

Lawrence, Steve; Giles, C. Lee; Fong, Sandiway

Files

CS-TR-3479.ps (739.01 KB)

CS-TR-3479.pdf (453.86 KB)

Date

1998-10-15

Authors

Lawrence, Steve

Giles, C. Lee

Fong, Sandiway

Abstract

We examine the inductive inference of a complex grammar. Specifically, we consider the task of training a model to classify natural language sentences as grammatical or ungrammatical, thereby exhibiting the same kind of discriminatory power provided by the Principles and Parameters linguistic framework, or Government-and-Binding theory. We investigate the following models: feed-forward neural networks, Frasconi-Gori-Soda and Back-Tsoi locally recurrent networks, Elman, Narendra & Parthasarathy, and Williams & Zipser recurrent networks, Euclidean and edit-distance nearest-neighbors, simulated annealing, and decision trees. The feed-forward neural networks and non-neural network machine learning models are included primarily for comparison. We address the question: How can a neural network, with its distributed nature and gradient descent based iterative calculations, possess linguistic capability which is traditionally handled with symbolic computation and recursive processes? Initial simulations with all models, which used a large temporal window as input, were only partially successful: models trained in this fashion did not learn the grammar to a significant degree. Attempts at training recurrent networks with small temporal input windows failed until we implemented several techniques aimed at improving the convergence of the gradient descent training algorithms. We discuss the theory and present an empirical study of a variety of models and learning algorithms which highlights behaviour not present when attempting to learn a simpler grammar. (Also cross-referenced as UMIACS-TR-95-64)
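
As an illustration of one of the comparison models named in the abstract, the following is a minimal sketch of an edit-distance nearest-neighbor classifier. The abstract does not give the report's input encoding, distance costs, or data, so everything here is an assumption: sentences are represented as sequences of hypothetical part-of-speech tags, the distance is plain Levenshtein distance with unit costs, and the function names and the tiny `TRAINING` set are invented for the example.

```python
from typing import List, Sequence, Tuple


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences (unit costs, assumed)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, tok_a in enumerate(a, start=1):
        curr = [i]  # deleting the first i tokens of a
        for j, tok_b in enumerate(b, start=1):
            cost = 0 if tok_a == tok_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def nearest_neighbor_label(query: Sequence[str],
                           training: List[Tuple[Sequence[str], bool]]) -> bool:
    """Return the label of the training sequence closest to the query."""
    if not training:
        raise ValueError("training set must not be empty")
    best_label, best_dist = training[0][1], edit_distance(query, training[0][0])
    for seq, label in training[1:]:
        d = edit_distance(query, seq)
        if d < best_dist:  # ties keep the earliest training example
            best_label, best_dist = label, d
    return best_label


# Hypothetical toy data: tag sequences labeled grammatical (True) or not (False).
TRAINING = [
    (["det", "noun", "verb"], True),
    (["det", "noun", "verb", "det", "noun"], True),
    (["verb", "det", "det"], False),
    (["noun", "det", "verb", "verb"], False),
]

if __name__ == "__main__":
    print(nearest_neighbor_label(["det", "noun", "verb", "noun"], TRAINING))
```

A classifier of this kind only memorizes labeled examples and compares surface sequences, which is consistent with the abstract's statement that the non-neural models are included primarily for comparison.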

URI (handle)

http://hdl.handle.net/1903/733

Collections

Technical Reports from UMIACS
Technical Reports of the Computer Science Department
