EVOLUTION OF THE CRISPR IMMUNE SYSTEM FROM ECOLOGICAL TO MOLECULAR SCALES

dc.contributor.advisor: Johnson, Philip LF [en_US]
dc.contributor.author: Xiao, Wei [en_US]
dc.contributor.department: Biology [en_US]
dc.contributor.publisher: Digital Repository at the University of Maryland [en_US]
dc.contributor.publisher: University of Maryland (College Park, Md.) [en_US]
dc.date.accessioned: 2024-06-29T05:42:43Z
dc.date.available: 2024-06-29T05:42:43Z
dc.date.issued: 2024 [en_US]
dc.description.abstract: Bacteria and archaea inhabit environments in which they constantly face viral infections and other external genetic threats. They have evolved an arsenal of defense strategies to protect themselves. My research delves into the CRISPR immune system, the only known adaptive immune system of prokaryotes. My work explores three different dimensions of the CRISPR immune system, ranging from ecological to molecular scales. From an evolutionary perspective, CRISPR is widely distributed across the prokaryotic tree, underscoring its immune effectiveness. However, the CRISPR distribution is uneven, and some lineages are devoid of CRISPR. Here, I identify two ecological drivers of the CRISPR immune system. By analyzing both 16S rRNA data and metagenomic data, I find that the CRISPR system is favored in less abundant prokaryotes in the saltwater environment and in more diverse prokaryotic communities in the human oral environment. On the molecular level, the CRISPR system selects and cleaves its “favorite” DNA segments (also known as “spacers”) from invading viral genomes to form immune memories. I explore how spacer sequence composition affects the rate at which the CRISPR system acquires a spacer. I develop a convolutional neural network model to predict the spacer acquisition rate from spacer sequence composition in natural microbial communities. Interpretation of the model reveals that the PAM-proximal end of the spacer is more important in predicting spacer abundance, which is consistent with previous findings from controlled experimental studies. Combining these scales, CRISPR repeat sequences coevolve with the rest of the genome. Thus, I explore the potential of using CRISPR repeat sequences for taxonomic profiling. I find a strong relationship between unique repeat sequences and taxonomy in both the RefSeq database and a human metagenomic dataset. I then show high accuracy when using repeat sequences for taxonomic annotation of human metagenomic contigs. This method not only aids in annotating CRISPR arrays but also provides a new tool for metagenomic sequence annotation. [en_US]
dc.identifier: https://doi.org/10.13016/hbs4-bxob
dc.identifier.uri: http://hdl.handle.net/1903/32879
dc.language.iso: en [en_US]
dc.subject.pqcontrolled: Biology [en_US]
dc.subject.pqcontrolled: Bioinformatics [en_US]
dc.subject.pqcontrolled: Microbiology [en_US]
dc.subject.pquncontrolled: CRISPR [en_US]
dc.subject.pquncontrolled: Ecology [en_US]
dc.subject.pquncontrolled: Immune system [en_US]
dc.subject.pquncontrolled: Machine Learning [en_US]
dc.subject.pquncontrolled: Prokaryotes [en_US]
dc.title: EVOLUTION OF THE CRISPR IMMUNE SYSTEM FROM ECOLOGICAL TO MOLECULAR SCALES [en_US]
dc.type: Dissertation [en_US]

Files

Original bundle

Name: Xiao_umd_0117E_24149.pdf
Size: 4.32 MB
Format: Adobe Portable Document Format
(RESTRICTED ACCESS)