University of Maryland Libraries: Digital Repository at the University of Maryland

    Gene annotations for the Horvath37K_DNAMethylation array for 10 bat genomes

    View/Open
    Desmodus_rotundus.GCF 002940915.1ASM294091v2.AH4.csv (13.28Mb)
    No. of downloads: 16

    Eptesicus_fuscus.GCF 000308155.1EptFus1.0.AH4.csv (14.59Mb)
    No. of downloads: 14

    Homo_sapiens.hg19.AH4.csv (15.21Mb)
    No. of downloads: 10

    Molossus_molossus.HLmolMol2.AH4.csv (12.23Mb)
    No. of downloads: 9

    Myotis_lucifugus.GCF 000147115.1_Myoluc2.0.AH4.csv (12.93Mb)
    No. of downloads: 9

    Myotis_myotis.HLmyoMyo6.AH4.csv (11.69Mb)
    No. of downloads: 13

    Phyllostomus_discolor. HLphyDis3.AH4.csv (12.46Mb)
    No. of downloads: 7

    Pipistrellus_kuhlii.HLpipKuh2.AH4.csv (11.40Mb)
    No. of downloads: 7

    Pteropus_vampyrus.pteVam1.100.AH4.csv (12.02Mb)
    No. of downloads: 11

    Rhinolophus_ferrumequinum.HLrhiFer5.AH4.csv (15.72Mb)
    No. of downloads: 37

    Rousettus_aegyptiacus.HLrouAeg4.AH4.csv (12.79Mb)
    No. of downloads: 10

    External Link(s)
    https://doi.org/10.1038/s41467-021-21900-2
    Date
    2020-09
    Author
    Wilkinson, Gerald
    Haghani, Amin
    Horvath, Steve
    DRUM DOI
    https://doi.org/10.13016/gf6o-3wby
    Metadata
    Show full item record
    Abstract
    This submission contains gene annotations for an Illumina microarray (HorvathMammalMethylChip40) for 10 species of bats. The array design is available from the Gene Expression Omnibus (GEO) at NCBI as platform GPL28271. This array was used to generate DNA methylation data for nearly 700 known-aged individuals representing 26 species of bats. The resulting data were then used to predict age and species lifespan, and to identify genomic regions that influence both of those traits.
    Notes
    We used sequences and annotations for ten bat genomes (see Table 1 below), which include six recently published reference assemblies, to locate each 50 bp probe on the array. The alignment was done using the QuasR package (Gaidatzis et al., 2015), assuming bisulfite conversion treatment of the genomic DNA. For each species’ genome sequence, QuasR creates an in silico bisulfite-treated version of the genome. The set of nucleotide sequences of the designed probes, which includes degenerate base positions due to the bisulfite conversion, was expanded into a larger set of nucleotide sequences representing every possible combination of degenerate bases. We then ran QuasR (a wrapper for Bowtie2) with parameters -k 2 --strata --best -v 3 and bisulfite = "undir" to align the enlarged set of probe sequences to each prepared genome. From the resulting alignment files, we collected only alignments where the entire length of the probe perfectly matched the genome sequence (i.e. the CIGAR string 50M and flag XM=0).

    Following the alignment, the CpGs were annotated based on the distance to the closest transcriptional start site (TSS) using the ChIPseeker package (Yu et al., 2015). A GFF file with these annotations was created using these positions, sorted by scaffold and position, and compared to the location of each probe in BAM format. We report probes whose variants mapped to only one unique locus in a particular genome. The genomic location of each CpG is categorized as intergenic, 3’ UTR, 5’ UTR, promoter region (minus 10 kb to plus 1000 bp from the nearest TSS), exon, or intron.

    Gaidatzis, D., Lerch, A., Hahne, F., and Stadler, M.B. (2015). QuasR: quantification and annotation of short reads in R. Bioinformatics 31, 1130-1132.
    Yu, G., Wang, L.G., and He, Q.Y. (2015). ChIPseeker: an R/Bioconductor package for ChIP peak annotation, comparison and visualization. Bioinformatics 31, 2382-2383.

    Table 1. Bat genome assemblies and sources used for identifying location of CpG sites and number of sites mapped per genome.

    Species                      Assembly and annotation         Source    CpGs mapped
    Molossus molossus            HLmolMol2                       MPI*      33557
    Myotis myotis                HLmyoMyo6                       MPI*      32687
    Phyllostomus discolor        HLphyDis3                       MPI*      33615
    Rhinolophus ferrumequinum    HLrhiFer5                       MPI*      34411
    Pipistrellus kuhlii          HLpipKuh2                       MPI*      31074
    Rousettus aegyptiacus        HLrouAeg4                       MPI*      34308
    Desmodus rotundus            GCF 002940915.1, ASM294091v2    NCBI      32930
    Eptesicus fuscus             GCF 000308155.1, EptFus1.0      NCBI      32218
    Myotis lucifugus             GCF 000147115.1, Myoluc2.0      NCBI      29810
    Pteropus vampyrus            pteVam1.100                     ENSEMBL   24681

    * MPI: downloaded from https://bds.mpi-cbg.de/hillerlab/Bat1KPilotProject/
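
    The probe-handling logic described in the Notes (expanding degenerate probes, keeping only perfect full-length alignments, reporting probes that map to a single locus, and flagging the promoter window) can be illustrated with a minimal Python sketch. This is not the authors' pipeline, which used QuasR and ChIPseeker in R. All function names, the tuple layout for alignments, the restriction to the R and Y degenerate codes, the strand-aware sign convention, and the inclusive window boundaries are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import product

# Subset of IUPAC codes (assumption: probes only use A, C, G, T, R, Y).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",  # purine
    "Y": "CT",  # pyrimidine
}


def expand_probe(seq):
    """Return every concrete sequence encoded by a degenerate probe."""
    choices = [IUPAC[base] for base in seq.upper()]
    return ["".join(combo) for combo in product(*choices)]


def is_perfect_match(cigar, xm, probe_len=50):
    """True for a full-length alignment (e.g. 50M) with XM=0."""
    return cigar == f"{probe_len}M" and xm == 0


def uniquely_mapped(alignments, probe_len=50):
    """Keep probes whose perfectly matching variants hit exactly one locus.

    alignments: iterable of (probe_id, scaffold, position, cigar, xm)
    tuples (hypothetical layout, one tuple per aligned probe variant).
    Returns {probe_id: (scaffold, position)}.
    """
    loci = defaultdict(set)
    for probe_id, scaffold, pos, cigar, xm in alignments:
        if is_perfect_match(cigar, xm, probe_len):
            loci[probe_id].add((scaffold, pos))
    return {pid: next(iter(hits)) for pid, hits in loci.items() if len(hits) == 1}


def in_promoter(cpg_pos, tss_pos, strand="+", upstream=10_000, downstream=1_000):
    """True if the CpG lies from 10 kb upstream to 1000 bp downstream of the TSS.

    Assumption: offsets are measured in the direction of transcription,
    and both window boundaries are inclusive.
    """
    offset = cpg_pos - tss_pos if strand == "+" else tss_pos - cpg_pos
    return -upstream <= offset <= downstream
```

    For example, expand_probe("ACGY") yields the two sequences ACGC and ACGT, and a probe with perfect matches on two different scaffolds is dropped by uniquely_mapped.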
    URI
    http://hdl.handle.net/1903/26373
    Collections
    • Biology Research Works
    • UMD Data Collection

    DRUM is brought to you by the University of Maryland Libraries
    University of Maryland, College Park, MD 20742-7011 (301)314-1328.
    Please send us your comments.
    Web Accessibility
     

     
