Scaffolding of long read assemblies using long range contact information

dc.contributor.author: Ghurye, Jay
dc.contributor.author: Pop, Mihai
dc.contributor.author: Koren, Sergey
dc.contributor.author: Bickhart, Derek
dc.contributor.author: Chin, Chen-Shan
dc.date.accessioned: 2021-07-21T17:17:22Z
dc.date.available: 2021-07-21T17:17:22Z
dc.date.issued: 2017-07-12
dc.description.abstract: Long read technologies have revolutionized de novo genome assembly by generating contigs orders of magnitude longer than that of short read assemblies. Although assembly contiguity has increased, it usually does not reconstruct a full chromosome or an arm of the chromosome, resulting in an unfinished chromosome level assembly. To increase the contiguity of the assembly to the chromosome level, different strategies are used which exploit long range contact information between chromosomes in the genome. We develop a scalable and computationally efficient scaffolding method that can boost the assembly contiguity to a large extent using genome-wide chromatin interaction data such as Hi-C. We demonstrate an algorithm that uses Hi-C data for longer-range scaffolding of de novo long read genome assemblies. We tested our methods on the human and goat genome assemblies. We compare our scaffolds with the scaffolds generated by LACHESIS based on various metrics. Our new algorithm SALSA produces more accurate scaffolds compared to the existing state of the art method LACHESIS. [en_US]
dc.description.uri: https://doi.org/10.1186/s12864-017-3879-z
dc.identifier: https://doi.org/10.13016/tpkf-eb9r
dc.identifier.citation: Ghurye, J., Pop, M., Koren, S. et al. Scaffolding of long read assemblies using long range contact information. BMC Genomics 18, 527 (2017). [en_US]
dc.identifier.uri: http://hdl.handle.net/1903/27552
dc.language.iso: en_US [en_US]
dc.publisher: Springer Nature [en_US]
dc.relation.isAvailableAt: College of Computer, Mathematical & Natural Sciences [en_us]
dc.relation.isAvailableAt: Computer Science [en_us]
dc.relation.isAvailableAt: Digital Repository at the University of Maryland [en_us]
dc.relation.isAvailableAt: University of Maryland (College Park, MD) [en_us]
dc.subject: Assembly [en_US]
dc.subject: Scaffolding [en_US]
dc.subject: Hi-C [en_US]
dc.subject: Long reads [en_US]
dc.title: Scaffolding of long read assemblies using long range contact information [en_US]
dc.type: Article [en_US]

Files

Original bundle

Name: s12864-017-3879-z.pdf
Size: 1.24 MB
Format: Adobe Portable Document Format