Computer Science Research Works
Permanent URI for this collectionhttp://hdl.handle.net/1903/1593
Browse
2 results
Search Results
Item Parsimonious reconstruction of network evolution(2012-09-19) Patro, Rob; Sefer, Emre; Malin, Justin; Marcais, Guillaume; Navlakha, Saket; Kingsford, CarlBackground: Understanding the evolution of biological networks can provide insight into how their modular structure arises and how they are affected by environmental changes. One approach to studying the evolution of these networks is to reconstruct plausible common ancestors of present-day networks, allowing us to analyze how the topological properties change over time and to posit mechanisms that drive the networks’ evolution. Further, putative ancestral networks can be used to help solve other difficult problems in computational biology, such as network alignment. Results: We introduce a combinatorial framework for encoding network histories, and we give a fast procedure that, given a set of gene duplication histories, in practice finds network histories with close to the minimum number of interaction gain or loss events to explain the observed present-day networks. In contrast to previous studies, our method does not require knowing the relative ordering of unrelated duplication events. Results on simulated histories and real biological networks both suggest that common ancestral networks can be accurately reconstructed using this parsimony approach. A software package implementing our method is available under the Apache 2.0 license at http://cbcb.umd.edu/kingsford-group/parana. Conclusions: Our parsimony-based approach to ancestral network reconstruction is both efficient and accurate. We show that considering a larger set of potential ancestral interactions by not assuming a relative ordering of unrelated duplication events can lead to improved ancestral network inference.Item Alignment and clustering of phylogenetic markers - implications for microbial diversity studies(2010-03-24) White, James R; Navlakha, Saket; Nagarajan, Niranjan; Ghodsi, Mohammad-Reza; Kingsford, Carl; Pop, MihaiBackground: Molecular studies of microbial diversity have provided many insights into the bacterial communities inhabiting the human body and the environment. A common first step in such studies is a survey of conserved marker genes (primarily 16S rRNA) to characterize the taxonomic composition and diversity of these communities. To date, however, there exists significant variability in analysis methods employed in these studies. Results: Here we provide a critical assessment of current analysis methodologies that cluster sequences into operational taxonomic units (OTUs) and demonstrate that small changes in algorithm parameters can lead to significantly varying results. Our analysis provides strong evidence that the species-level diversity estimates produced using common OTU methodologies are inflated due to overly stringent parameter choices. We further describe an example of how semi-supervised clustering can produce OTUs that are more robust to changes in algorithm parameters. Conclusions: Our results highlight the need for systematic and open evaluation of data analysis methodologies,especially as targeted 16S rRNA diversity studies are increasingly relying on high-throughput sequencing technologies. All data and results from our study are available through the JGI FAMeS website http://fames.jgi-psf. org/.