Computational methods for the identification of mutation signatures and intracellular microbes in cancer

dc.contributor.advisor: Leiserson, Mark D.M. (en_US)
dc.contributor.advisor: Ruppin, Eytan (en_US)
dc.contributor.author: Robinson, Wells (en_US)
dc.contributor.department: Computer Science (en_US)
dc.contributor.publisher: Digital Repository at the University of Maryland (en_US)
dc.contributor.publisher: University of Maryland (College Park, Md.) (en_US)
dc.date.accessioned: 2021-07-14T05:33:03Z
dc.date.available: 2021-07-14T05:33:03Z
dc.date.issued: 2021 (en_US)
dc.description.abstract: Cancer is the second leading cause of death in the United States behind heart disease, killing ~600,000 Americans per year. Technological advances have lowered the cost of sequencing a tumor genome even faster than would have been predicted by Moore’s law. However, specialized computational techniques are required to effectively analyze this genomic data. In this dissertation, we present two such computational approaches to address key challenges in the field of computational cancer biology. Given the importance of reproducibility in biomedical research, we provide publicly available workflows for reproducing the results from our computational approaches. In the first part of this thesis, we present a novel approach for the extraction of mutation signatures from matrices of somatic mutations. One computational challenge for extracting mutation signatures is the relatively small number of mutations in each tumor compared to the relatively large number of distinct signatures, which can be mathematically similar to each other. To help address this computational challenge, we apply ideas from the field of topic modeling to develop the first mutation signature model, the Tumor Covariate Signature Model (TCSM), that can incorporate known tumor covariates. We focus on two mathematically similar signatures associated with distinct covariates to evaluate TCSM and show that by leveraging these covariates, TCSM can more accurately distinguish between mutations attributed to these two signatures than existing NMF-based approaches. The second part focuses on the microbes in the tumor microenvironment. It is not currently known whether microbial reads identified from tumor sequencing datasets result from contamination or represent either extracellular or intracellular microbial residents of the tumor microenvironment. We develop a computational approach named CSI-Microbes (computational identification of Cell type Specific Intracellular Microbes) that mines single-cell RNA sequencing (scRNA-seq) datasets to distinguish cell-type specific intracellular microbes from other microbes. We show that CSI-Microbes can identify previously reported intracellular microbes from both human-designed and cancer scRNA-seq datasets. Finally, we apply CSI-Microbes to a large scRNA-seq lung cancer dataset and identify microbial taxa in tumor cells with a transcriptomic signature of decreased immune activity. (en_US)
dc.identifier: https://doi.org/10.13016/nptj-eaa3
dc.identifier.uri: http://hdl.handle.net/1903/27439
dc.language.iso: en (en_US)
dc.subject.pqcontrolled: Computer science (en_US)
dc.subject.pqcontrolled: Bioinformatics (en_US)
dc.subject.pquncontrolled: computational biology (en_US)
dc.subject.pquncontrolled: intracellular bacteria (en_US)
dc.subject.pquncontrolled: intracellular microbes (en_US)
dc.subject.pquncontrolled: mutation signatures (en_US)
dc.subject.pquncontrolled: scRNA-seq (en_US)
dc.title: Computational methods for the identification of mutation signatures and intracellular microbes in cancer (en_US)
dc.type: Dissertation (en_US)

Files

Original bundle
Name: Robinson_umd_0117E_21585.pdf
Size: 3.15 MB
Format: Adobe Portable Document Format