Finding Genes in DNA with a Hidden Markov Model
Date
1997
Authors
Henderson, John
Salzberg, Steven
Fasman, Kenneth H
Citation
Finding Genes in DNA with a Hidden Markov Model. J. Henderson, S.L. Salzberg, and K. Fasman. Journal of Computational Biology 4:2 (1997), 127-141. This paper describes the VEIL system for finding genes.
Abstract
This study describes a new Hidden Markov Model (HMM) system for segmenting uncharacterized genomic DNA sequences into exons, introns, and intergenic
regions. Separate HMM modules were designed and trained for specific regions of
DNA: exons, introns, intergenic regions, and splice sites. The models were then
tied together to form a biologically feasible topology. The integrated HMM was
trained further on a set of eukaryotic DNA sequences, and tested by using it to
segment a separate set of sequences. The resulting HMM system, which is called
VEIL (Viterbi Exon-Intron Locator), obtains an overall accuracy on test data of
92% of total bases correctly labelled, with a correlation coefficient of 0.73. Using the more stringent test of exact exon prediction, VEIL correctly located both ends
of 53% of the coding exons, and 49% of the exons it predicts are exactly correct.
These results compare favorably to the best previous results for gene structure
prediction, and demonstrate the benefits of using HMMs for this problem.
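
The segmentation step named in the abstract (Viterbi decoding over an HMM) can be illustrated with a minimal sketch. This is not the VEIL model: the three states, the start, transition, and emission probabilities, and the function name `viterbi` are all hypothetical values chosen for illustration, whereas the system described above combines separately trained modules for exons, introns, intergenic regions, and splice sites. The sketch only shows how a most-probable state path assigns one label to each base.

```python
import math

# Hypothetical toy model: three labels with made-up probabilities.
# These numbers are illustrative assumptions, not VEIL's trained parameters.
STATES = ("intergenic", "exon", "intron")

START = {"intergenic": 0.8, "exon": 0.1, "intron": 0.1}

# A zero entry forbids that transition (e.g. intergenic -> intron),
# which is a simple way to encode a biologically feasible topology.
TRANS = {
    "intergenic": {"intergenic": 0.90, "exon": 0.10, "intron": 0.00},
    "exon": {"intergenic": 0.05, "exon": 0.85, "intron": 0.10},
    "intron": {"intergenic": 0.00, "exon": 0.10, "intron": 0.90},
}

EMIT = {
    "intergenic": {"A": 0.30, "C": 0.20, "G": 0.20, "T": 0.30},
    "exon": {"A": 0.20, "C": 0.30, "G": 0.30, "T": 0.20},
    "intron": {"A": 0.35, "C": 0.15, "G": 0.15, "T": 0.35},
}


def _log(p):
    """Log-probability, mapping zero probability to negative infinity."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi(seq):
    """Return the most probable state label for each base in seq."""
    if not seq:
        return []

    # Best log-score of any path ending in each state after the first base.
    score = {s: _log(START[s]) + _log(EMIT[s][seq[0]]) for s in STATES}
    back = []  # back[i][state] = best predecessor of state at position i + 1

    for base in seq[1:]:
        new_score, pointers = {}, {}
        for cur in STATES:
            # Pick the predecessor that maximizes the path score into cur.
            prev = max(STATES, key=lambda p: score[p] + _log(TRANS[p][cur]))
            new_score[cur] = (
                score[prev] + _log(TRANS[prev][cur]) + _log(EMIT[cur][base])
            )
            pointers[cur] = prev
        score = new_score
        back.append(pointers)

    # Trace back from the best final state to recover the full path.
    path = [max(STATES, key=lambda s: score[s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path
```

Calling `viterbi("ATATGCGCGCATAT")` returns a list with one label per base. Because forbidden transitions carry zero probability, the decoded path never moves directly between the intergenic and intron states, which mirrors the idea of tying modules together into a feasible topology.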