An Information Geometric Treatment of Maximum Likelihood Criteria and Generalization in Hidden Markov Modeling

Byrne, William

dc.contributor.author	Byrne, William	en_US
dc.contributor.department	ISR	en_US
dc.date.accessioned	2007-05-23T09:54:06Z
dc.date.available	2007-05-23T09:54:06Z
dc.date.issued	1993	en_US
dc.description.abstract	It is shown here that several techniques for maximum likelihood training of Hidden Markov Models are instances of the EM algorithm and have very similar descriptions when formulated as instances of the Alternating Minimization procedure. The N-Best and Segmental K-Means algorithms are derived under a minimum discrimination information criterion and are shown to result from an additional restriction placed on the minimum discrimination information formulation that yields the Baum-Welch algorithm. This uniform formulation is employed in an exploration of generalization by the EM algorithm. It has been noted that the EM algorithm can introduce artifacts as training progresses. A related phenomenon is that over-training can occur: although performance as measured on the training set continues to improve as the algorithm progresses, performance on related data sets may eventually begin to deteriorate. This is inherent in the maximum likelihood criterion, and its cause can be seen when the training problem is stated in the Alternating Minimization framework. A modification of the maximum likelihood training criterion is suggested to counter this behavior and is applied to the broader problem of maximum likelihood training of exponential models from incomplete data. It leads to a simple modification of the learning algorithms that relates generalization to learning speed. Relationships to other techniques that encourage generalization, particularly methods of incorporating prior information, are discussed.	en_US
dc.format.extent	1867077 bytes
dc.format.mimetype	application/pdf
dc.identifier.uri	http://hdl.handle.net/1903/5395
dc.language.iso	en_US	en_US
dc.relation.ispartofseries	ISR; TR 1993-50	en_US
dc.subject	information theory	en_US
dc.subject	neural systems	en_US
dc.subject	speech processing	en_US
dc.subject	Communication	en_US
dc.subject	Signal Processing Systems	en_US
dc.title	An Information Geometric Treatment of Maximum Likelihood Criteria and Generalization in Hidden Markov Modeling	en_US
dc.type	Technical Report	en_US

Files

Original bundle

Name: TR_93-50.pdf
Size: 1.78 MB
Format: Adobe Portable Document Format

Collections

Institute for Systems Research Technical Reports