<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <title>DRUM Collection: Fischell Department of Bioengineering Research Works</title>
  <link rel="alternate" href="http://hdl.handle.net/1903/6627" />
  <subtitle />
  <id>http://hdl.handle.net/1903/6627</id>
  <updated>2013-05-26T01:24:29Z</updated>
  <dc:date>2013-05-26T01:24:29Z</dc:date>
  <entry>
    <title>Complete genomic sequence analysis of infectious bronchitis virus Ark DPI strain and its evolution by recombination</title>
    <link rel="alternate" href="http://hdl.handle.net/1903/13387" />
    <author>
      <name>Ammayappan, Arun</name>
    </author>
    <author>
      <name>Upadhyay, Chitra</name>
    </author>
    <author>
      <name>Gelb, Jack Jr.</name>
    </author>
    <author>
      <name>Vakharia, Vikram N</name>
    </author>
    <id>http://hdl.handle.net/1903/13387</id>
    <updated>2013-01-11T03:48:32Z</updated>
    <published>2008-12-22T00:00:00Z</published>
    <summary type="text">Title: Complete genomic sequence analysis of infectious bronchitis virus Ark DPI strain and its evolution by recombination
Authors: Ammayappan, Arun; Upadhyay, Chitra; Gelb, Jack Jr.; Vakharia, Vikram N
Abstract: An infectious bronchitis virus Arkansas DPI (Ark DPI) virulent strain was sequenced, analyzed and compared with many different IBV strains and coronaviruses. The genome of Ark DPI consists of 27,620 nucleotides, excluding the poly(A) tail, and comprises ten open reading frames. Comparative sequence analysis of Ark DPI with other IBV strains shows striking similarity to the Conn, Gray, JMK, and Ark 99, which were circulating during that time period. Furthermore, comparison of the Ark genome with other coronaviruses demonstrates a close relationship to turkey coronavirus. Among non-structural genes, the 5' untranslated region (UTR), 3C-like proteinase (3CLpro) and the polymerase (RdRp) sequences are 100% identical to the Gray strain. Among structural genes, S1 has 97% identity with Ark 99; S2 has 100% identity with JMK and 96% to Conn; 3b 99%, and 3C to N is 100% identical to the Conn strain. Possible recombination sites were found at the intergenic region of spike gene, 3' end of S1 and 3a gene. Independent recombination events may have occurred in the entire genome of Ark DPI, involving four different IBV strains, suggesting that genomic RNA recombination may occur in any part of the genome at a number of sites. Hence, we speculate that the Ark DPI strain originated from the Conn strain, but diverged and evolved independently by point mutations and recombination between field strains.</summary>
    <dc:date>2008-12-22T00:00:00Z</dc:date>
  </entry>
  <entry>
    <title>Assembly complexity of prokaryotic genomes using short reads</title>
    <link rel="alternate" href="http://hdl.handle.net/1903/13377" />
    <author>
      <name>Kingsford, Carl</name>
    </author>
    <author>
      <name>Schatz, Michael C</name>
    </author>
    <author>
      <name>Pop, Mihai</name>
    </author>
    <id>http://hdl.handle.net/1903/13377</id>
    <updated>2013-01-11T03:40:23Z</updated>
    <published>2010-01-12T00:00:00Z</published>
    <summary type="text">Title: Assembly complexity of prokaryotic genomes using short reads
Authors: Kingsford, Carl; Schatz, Michael C; Pop, Mihai
Abstract: Background: De Bruijn graphs are a theoretical framework underlying several modern genome assembly programs, especially those that deal with very short reads. We describe an application of de Bruijn graphs to analyze the global repeat structure of prokaryotic genomes.
Results: We provide the first survey of the repeat structure of a large number of genomes. The analysis gives an upper-bound on the performance of genome assemblers for de novo reconstruction of genomes across a wide range of read lengths. Further, we demonstrate that the majority of genes in prokaryotic genomes can be reconstructed uniquely using very short reads even if the genomes themselves cannot. The non-reconstructible genes are overwhelmingly related to mobile elements (transposons, IS elements, and prophages).
Conclusions: Our results improve upon previous studies on the feasibility of assembly with short reads and provide a comprehensive benchmark against which to compare the performance of the short-read assemblers currently being developed.</summary>
    <dc:date>2010-01-12T00:00:00Z</dc:date>
  </entry>
  <entry>
    <title>Assessing the benefits of using mate-pairs to resolve repeats in de novo short-read prokaryotic assemblies</title>
    <link rel="alternate" href="http://hdl.handle.net/1903/13358" />
    <author>
      <name>Wetzel, Joshua</name>
    </author>
    <author>
      <name>Kingsford, Carl</name>
    </author>
    <author>
      <name>Pop, Mihai</name>
    </author>
    <id>http://hdl.handle.net/1903/13358</id>
    <updated>2013-01-11T03:47:01Z</updated>
    <published>2011-04-13T00:00:00Z</published>
    <summary type="text">Title: Assessing the benefits of using mate-pairs to resolve repeats in de novo short-read prokaryotic assemblies
Authors: Wetzel, Joshua; Kingsford, Carl; Pop, Mihai
Abstract: Background: Next-generation sequencing technologies allow genomes to be sequenced more quickly and less expensively than ever before. However, as sequencing technology has improved, the difficulty of de novo genome assembly has increased, due in large part to the shorter reads generated by the new technologies. The use of mated sequences (referred to as mate-pairs) is a standard means of disambiguating assemblies to obtain a more complete picture of the genome without resorting to manual finishing. Here, we examine the effectiveness of mate-pair information in resolving repeated sequences in the DNA (a paramount issue to overcome). While it has been empirically accepted that mate-pairs improve assemblies, and a variety of assemblers use mate-pairs in the context of repeat resolution, the effectiveness of mate-pairs in this context has not been systematically evaluated in previous literature.
Results: We show that, in high-coverage prokaryotic assemblies, libraries of short mate-pairs (about 4-6 times the read-length) more effectively disambiguate repeat regions than the libraries that are commonly constructed in current genome projects. We also demonstrate that the best assemblies can be obtained by ‘tuning’ mate-pair libraries to accommodate the specific repeat structure of the genome being assembled - information that can be obtained through an initial assembly using unpaired reads. These results are shown across 360 simulations on ‘ideal’ prokaryotic data as well as assembly of 8 bacterial genomes using SOAPdenovo. The simulation results provide an upper-bound on the potential value of mate-pairs for resolving repeated sequences in real prokaryotic data sets. The assembly results show that our method of tuning mate-pairs exploits fundamental properties of these genomes, leading to better assemblies even when using an off-the-shelf assembler in the presence of basecall errors.
Conclusions: Our results demonstrate that dramatic improvements in prokaryotic genome assembly quality can be achieved by tuning mate-pair sizes to the actual repeat structure of a genome, suggesting the possible need to change the way sequencing projects are designed. We propose that a two-tiered approach - first generate an assembly of the genome with unpaired reads in order to evaluate the repeat structure of the genome; then generate the mate-pair libraries that provide most information towards the resolution of repeats in the genome being assembled - is not only possible, but likely also more cost-effective as it will significantly reduce downstream manual finishing costs. In future work we intend to address the question of whether this result can be extended to larger eukaryotic genomes, where repeat structure can be quite different.</summary>
    <dc:date>2011-04-13T00:00:00Z</dc:date>
  </entry>
</feed>

