Improving pan-genome annotation using whole genome multiple alignment
Date
2011-06-30
Authors
Angiuoli, Samuel V
Dunning Hotopp, Julie C
Salzberg, Steven L
Tettelin, Herve
Citation
Angiuoli, S.V., Dunning Hotopp, J.C., Salzberg, S.L. et al. Improving pan-genome annotation using whole genome multiple alignment. BMC Bioinformatics 12, 272 (2011).
Abstract
Background: Rapid annotation and comparison of genomes from multiple isolates (pan-genomes) are becoming
commonplace due to advances in sequencing technology. Genome annotations can contain inconsistencies and
errors that hinder comparative analysis even within a single species. Tools are needed to compare and improve
annotation quality across sets of closely related genomes.
Results: We introduce a new tool, Mugsy-Annotator, that identifies orthologs and evaluates annotation quality in
prokaryotic genomes using whole genome multiple alignment. Mugsy-Annotator identifies anomalies in annotated
gene structures, including inconsistently located translation initiation sites and disrupted genes due to draft
genome sequencing or pseudogenes. An evaluation of species pan-genomes using the tool indicates that such
anomalies are common, especially at translation initiation sites. Mugsy-Annotator reports alternate annotations that
improve consistency and are candidates for further review.
Conclusions: Whole genome multiple alignment can be used to efficiently identify orthologs and annotation
problem areas in a bacterial pan-genome. Comparisons of annotated gene structures within a species may show
more variation than is actually present in the genome, indicating errors in genome annotation. Our new tool,
Mugsy-Annotator, assists re-annotation efforts by highlighting edits that improve annotation consistency.