Browsing by Author "Lin, Tsungnan"
Now showing 1 - 3 of 3
Item
A Delay Damage Model Selection Algorithm for NARX Neural Networks (1998-10-15)
Lin, Tsungnan; Giles, C. Lee; Horne, Bill G.; Kung, Sun-Yang
Recurrent neural networks have become popular models for system identification and time series prediction. NARX (Nonlinear AutoRegressive models with eXogenous inputs) neural network models are a popular subclass of recurrent networks and have been used in many applications. Though embedded memory can be found in all recurrent network models, it is particularly prominent in NARX models. We show that using intelligent memory order selection through pruning and good initial heuristics significantly improves the generalization and predictive performance of these nonlinear systems on problems as diverse as grammatical inference and time series prediction. (Also cross-referenced as UMIACS-TR-96-77)

Item
How Embedded Memory in Recurrent Neural Network Architectures Helps Learning Long-term Dependencies (1998-10-15)
Lin, Tsungnan; Horne, Bill G.; Giles, C. Lee
Learning long-term temporal dependencies with recurrent neural networks can be a difficult problem. It has recently been shown that a class of recurrent neural networks called NARX networks performs much better than conventional recurrent neural networks at learning certain simple long-term dependency problems. The intuitive explanation for this behavior is that the output memories of a NARX network can be manifested as jump-ahead connections in the time-unfolded network. These jump-ahead connections can propagate gradient information more efficiently, thus reducing the sensitivity of the network to long-term dependencies. This work gives empirical justification to our hypothesis that similar improvements in learning long-term dependencies can be achieved with other classes of recurrent neural network architectures simply by increasing the order of the embedded memory. In particular, we explore the impact of learning simple long-term dependency problems on three classes of recurrent neural network architectures: globally recurrent networks, locally recurrent networks, and NARX (output feedback) networks. Comparing the performance of these architectures with different orders of embedded memory on two simple long-term dependency problems shows that all of these classes of network architectures demonstrate significant improvement in learning long-term dependencies when the order of embedded memory is increased. These results can be important to a user comfortable with a specific recurrent neural network architecture, because simply increasing the embedded memory order will make the architecture more robust to the problem of long-term dependency learning. (Also cross-referenced as UMIACS-TR-96-28)
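As a rough illustration of the NARX architecture and the embedded memory order discussed in the two abstracts above, the following minimal Python/NumPy sketch (our own construction; the weights, sizes, and function names are assumptions, not the authors' implementation) computes the network output from tapped delay lines on the exogenous input and on the fed-back output. Increasing n_u or n_y increases the embedded memory order.

    import numpy as np

    def narx_step(u_hist, y_hist, W_h, b_h, w_o, b_o=0.0):
        # One NARX prediction step: the next output is a static nonlinear
        # function of tapped delay lines on the exogenous input (length n_u)
        # and on the network's own past outputs (length n_y).  The pair
        # (n_u, n_y) is the embedded memory order.
        x = np.concatenate([u_hist, y_hist])   # regressor: delayed inputs + delayed outputs
        h = np.tanh(W_h @ x + b_h)             # single hidden layer
        return float(w_o @ h + b_o)            # predicted y(t)

    def simulate(u, n_u=3, n_y=3, n_hidden=8, seed=0):
        # Run the sketch over an input sequence, feeding outputs back in.
        rng = np.random.default_rng(seed)
        W_h = rng.normal(scale=0.3, size=(n_hidden, n_u + n_y))
        b_h = np.zeros(n_hidden)
        w_o = rng.normal(scale=0.3, size=n_hidden)
        y = np.zeros(len(u))
        for t in range(len(u)):
            u_hist = np.array([u[t - k] if t - k >= 0 else 0.0 for k in range(n_u)])
            y_hist = np.array([y[t - k] if t - k >= 0 else 0.0 for k in range(1, n_y + 1)])
            y[t] = narx_step(u_hist, y_hist, W_h, b_h, w_o)
        return y

    print(simulate(np.sin(0.2 * np.arange(50)))[:5])

Because the delayed outputs enter the regressor directly, each output tap becomes a jump-ahead connection of length up to n_y when the network is unfolded in time.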
Item
Learning Long-Term Dependencies is Not as Difficult With NARX Recurrent Neural Networks (1998-10-15)
Lin, Tsungnan; Horne, Bill G.; Tino, Peter; Giles, C. Lee
It has recently been shown that gradient descent learning algorithms for recurrent neural networks can perform poorly on tasks that involve long-term dependencies, i.e. those problems for which the desired output depends on inputs presented at times far in the past. In this paper we explore the long-term dependencies problem for a class of architectures called NARX recurrent neural networks, which have powerful representational capabilities. We have previously reported that gradient descent learning is more effective in NARX networks than in recurrent neural network architectures that have "hidden states" on problems including grammatical inference and nonlinear system identification. Typically, the network converges much faster and generalizes better than other networks. The results in this paper are an attempt to explain this phenomenon. We present some experimental results which show that NARX networks can often retain information for two to three times as long as conventional recurrent neural networks. We show that although NARX networks do not circumvent the problem of long-term dependencies, they can greatly improve performance on long-term dependency problems. We also describe in detail some of the assumptions regarding what it means to latch information robustly and suggest possible ways to loosen these assumptions. (Also cross-referenced as UMIACS-TR-95-78)
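The claim that NARX networks improve on, but do not circumvent, the long-term dependency problem can be seen in a small back-of-the-envelope calculation. The sketch below is a simplified scalar toy model of our own (the attenuation factor and memory orders are assumptions, not figures from the paper): each pass through the recurrent nonlinearity shrinks the gradient by a factor |a| < 1, and output memory of order D lets the shortest gradient path in the unfolded network skip ahead D steps at a time.

    a, k = 0.8, 30                      # per-hop attenuation; temporal span of the dependency
    for D in (1, 3, 6):                 # D = 1 behaves like a conventional recurrent network
        hops = -(-k // D)               # ceil(k / D): shortest path in the unfolded graph
        print(f"output memory order {D}: ~{hops} hops, gradient ~ {a ** hops:.2e}")

The surviving gradient is still exponentially small in the span of the dependency, only with a smaller effective exponent, which matches the abstract's conclusion that NARX networks mitigate rather than eliminate the difficulty.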