Decoding the massive genome of loblolly pine using haploid DNA and novel assembly strategies

dc.contributor.author: Neale, David B
dc.contributor.author: Wegrzyn, Jill L
dc.contributor.author: Stevens, Kristian A
dc.contributor.author: Zimin, Aleksey V
dc.contributor.author: Puiu, Daniela
dc.contributor.author: Crepeau, Marc W
dc.contributor.author: Cardeno, Charis
dc.contributor.author: Koriabine, Maxim
dc.contributor.author: Holtz-Morris, Ann E
dc.contributor.author: Liechty, John D
dc.contributor.author: Martínez-García, Pedro J
dc.contributor.author: Vasquez-Gross, Hans A
dc.contributor.author: Lin, Brian Y
dc.contributor.author: Zieve, Jacob J
dc.contributor.author: Dougherty, William M
dc.contributor.author: Fuentes-Soriano, Sara
dc.contributor.author: Wu, Le-Shin
dc.contributor.author: Gilbert, Don
dc.contributor.author: Marçais, Guillaume
dc.contributor.author: Roberts, Michael
dc.contributor.author: Holt, Carson
dc.contributor.author: Yandell, Mark
dc.contributor.author: Davis, John M
dc.contributor.author: Smith, Katherine E
dc.contributor.author: Dean, Jeffrey FD
dc.contributor.author: Lorenz, W Walter
dc.contributor.author: Whetten, Ross W
dc.contributor.author: Sederoff, Ronald
dc.contributor.author: Wheeler, Nicholas
dc.contributor.author: McGuire, Patrick E
dc.contributor.author: Main, Doreen
dc.contributor.author: Loopstra, Carol A
dc.contributor.author: Mockaitis, Keithanne
dc.contributor.author: deJong, Pieter J
dc.contributor.author: Yorke, James A
dc.contributor.author: Salzberg, Steven L
dc.contributor.author: Langley, Charles H
dc.description.abstract: The size and complexity of conifer genomes have, until now, prevented full genome sequencing and assembly. The large research community and economic importance of loblolly pine, Pinus taeda L., made it an early candidate for reference sequence determination. We develop a novel strategy to sequence the genome of loblolly pine that combines unique aspects of pine reproductive biology and genome assembly methodology. We use a whole genome shotgun approach relying primarily on next-generation sequence generated from a single haploid seed megagametophyte from a loblolly pine tree, 20-1010, that has been used in industrial forest tree breeding. The resulting sequence and assembly were used to generate a draft genome spanning 23.2 Gbp and containing 20.1 Gbp with an N50 scaffold size of 66.9 kbp, making it a significant improvement over available conifer genomes. The long scaffold lengths allow the annotation of 50,172 gene models with intron lengths averaging over 2.7 kbp and sometimes exceeding 100 kbp. Analysis of orthologous gene sets identifies gene families that may be unique to conifers. We further characterize and expand the existing repeat library based on the de novo analysis of the repetitive content, estimated to encompass 82% of the genome. In addition to its value as a resource for researchers and breeders, the loblolly pine genome sequence and assembly reported here demonstrate a novel approach to sequencing the large and complex genomes of this important group of plants that can now be widely applied. [en_US]
dc.identifier.citation: Neale, D.B., Wegrzyn, J.L., Stevens, K.A. et al. Decoding the massive genome of loblolly pine using haploid DNA and novel assembly strategies. Genome Biol 15, R59 (2014). [en_US]
dc.publisher: Springer Nature [en_US]
dc.relation.isAvailableAt: College of Computer, Mathematical & Natural Sciences [en_us]
dc.relation.isAvailableAt: Digital Repository at the University of Maryland [en_us]
dc.relation.isAvailableAt: University of Maryland (College Park, MD) [en_us]
dc.subject: Rust Resistance [en_US]
dc.subject: Pinus Taeda [en_US]
dc.subject: Whole Genome Shotgun [en_US]
dc.subject: Whole Genome Shotgun Sequence [en_US]
dc.subject: Conifer Genome [en_US]
dc.title: Decoding the massive genome of loblolly pine using haploid DNA and novel assembly strategies [en_US]
Original bundle
1.69 MB
Adobe Portable Document Format
License bundle
1.57 KB
Item-specific license agreed upon to submission