## Learning a Class of Large Finite State Machines with a Recurrent Neural Network

##### Abstract

One of the issues in any learning model is how it scales with problem
size. Neural networks have not been immune to scaling issues. We show that a
dynamically driven discrete-time recurrent network (DRNN) can learn rather
large grammatical inference problems when the strings of a finite memory
machine (FMM) are encoded as temporal sequences. FMMs are a subclass of finite
state machines which have a finite memory or a finite order of inputs and
outputs. The DRNN that learns the FMM is a neural network that maps directly
from the sequential machine implementation of the FMM. It has feedback only
from the output and not from any hidden units; an example is the recurrent
network of Narendra and Parthasarathy. (FMMs that have zero order in the
feedback of outputs are called definite memory machines and are analogous to
time-delay or finite impulse response neural networks.) Due to their
topology, these DRNNs are at least as powerful as any sequential machine
implementation of an FMM and should be capable of representing any FMM. We
choose to learn "particular FMMs." Specifically, these FMMs have a large
number of states (simulations are for $256$ and $512$ state FMMs) but have
minimal order, relatively small depth and little logic when the FMM is
implemented as a sequential machine. Simulations for the number of training
examples versus generalization performance and FMM extraction size show that
the number of training samples necessary for perfect generalization
is less than that necessary to completely characterize the FMM to be learned.
This is in a sense a best-case learning problem, since any arbitrarily chosen
FMM with a minimal number of states would have much greater order and string
depth and would most likely require more logic in its sequential machine
implementation.
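
The class of machines involved can be illustrated concretely. A finite memory machine of input order $n$ and output order $m$ produces its next output as a fixed combinational function of the last $n$ inputs and the last $m$ outputs. The following Python class is an illustrative sketch of this definition, not code from the report; the class name, the example logic function, and the window initializations are assumptions chosen for clarity.

```python
from collections import deque

class FiniteMemoryMachine:
    """Illustrative finite memory machine: the next output is a fixed
    combinational function of the last n inputs and last m outputs."""

    def __init__(self, n, m, logic, init_input=0, init_output=0):
        # Bounded windows of past symbols; maxlen enforces finite memory.
        self.inputs = deque([init_input] * n, maxlen=n)
        self.outputs = deque([init_output] * m, maxlen=m)
        self.logic = logic  # combinational logic over the two windows

    def step(self, x):
        self.inputs.append(x)
        y = self.logic(tuple(self.inputs), tuple(self.outputs))
        self.outputs.append(y)
        return y

# Hypothetical example of order n=2, m=1: output 1 iff the current and
# previous inputs together with the previous output hold an odd number of 1s.
fmm = FiniteMemoryMachine(n=2, m=1,
                          logic=lambda xs, ys: (sum(xs) + sum(ys)) % 2)
trace = [fmm.step(b) for b in [1, 0, 1, 1, 0]]  # -> [1, 0, 1, 1, 0]
```

Because the machine's state is entirely determined by the bounded input and output windows, the number of distinct states is finite, which is what makes a DRNN with feedback only from the output a natural sequential machine implementation.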
(Also cross-referenced as UMIACS-TR-94-94)

University of Maryland, College Park, MD 20742-7011 (301)314-1328.
