Computer Science Research Works
Permanent URI for this collection: http://hdl.handle.net/1903/1593
89 results
Item: Exploring the Computational Explanatory Gap (MDPI, 2017-01-16)
Reggia, James A.; Huang, Di-Wei; Katz, Garrett
While substantial progress has been made in the field known as artificial consciousness, at the present time there is no generally accepted phenomenally conscious machine, nor even a clear route to how one might be produced should we decide to try. Here, we take the position that, from our computer science perspective, a major reason for this is a computational explanatory gap: our inability to understand/explain the implementation of high-level cognitive algorithms in terms of neurocomputational processing. We explain how addressing the computational explanatory gap can identify computational correlates of consciousness. We suggest that bridging this gap is not only critical to further progress in the area of machine consciousness, but would also inform the search for neurobiological correlates of consciousness and would, with high probability, contribute to demystifying the “hard problem” of understanding the mind–brain relationship. We compile a listing of previously proposed computational correlates of consciousness and, based on the results of recent computational modeling, suggest that the gating mechanisms associated with top-down cognitive control of working memory should be added to this list. We conclude that developing neurocognitive architectures that contribute to bridging the computational explanatory gap provides a credible and achievable roadmap to understanding the ultimate prospects for a conscious machine, and to a better understanding of the mind–brain problem in general.

Item: Few amino acid positions in rpoB are associated with most of the rifampin resistance in Mycobacterium tuberculosis (Springer Nature, 2004-09-28)
Cummings, Michael P; Segal, Mark R
Mutations in rpoB, the gene encoding the β subunit of DNA-dependent RNA polymerase, are associated with rifampin resistance in Mycobacterium tuberculosis.
Several studies have been conducted where minimum inhibitory concentration (MIC, which is defined as the minimum concentration of the antibiotic in a given culture medium below which bacterial growth is not inhibited) of rifampin has been measured and partial DNA sequences have been determined for rpoB in different isolates of M. tuberculosis. However, no model has been constructed to predict rifampin resistance based on sequence information alone. Such a model might provide the basis for quantifying rifampin resistance status based exclusively on DNA sequence data and thus eliminate the requirements for time consuming culturing and antibiotic testing of clinical isolates. Sequence data for amino acid positions 511–533 of rpoB and associated MIC of rifampin for different isolates of M. tuberculosis were taken from studies examining rifampin resistance in clinical samples from New York City and throughout Japan. We used tree-based statistical methods and random forests to generate models of the relationships between rpoB amino acid sequence and rifampin resistance. The proportion of variance explained by a relatively simple tree-based cross-validated regression model involving two amino acid positions (526 and 531) is 0.679. The first partition in the data, based on position 531, results in groups that differ one hundredfold in mean MIC (1.596 μg/ml and 159.676 μg/ml). The subsequent partition based on position 526, the most variable in this region, results in a > 354-fold difference in MIC. When considered as a classification problem (susceptible or resistant), a cross-validated tree-based model correctly classified most (0.884) of the observations and was very similar to the regression model. Random forest analysis of the MIC data as a continuous variable, a regression problem, produced a model that explained 0.861 of the variance. 
The random forest analysis of the MIC data as discrete classes produced a model that correctly classified 0.942 of the observations with sensitivity of 0.958 and specificity of 0.885. Highly accurate regression and classification models of rifampin resistance can be made based on this short sequence region. Models may be better with improved (and consistent) measurements of MIC and more sequence data.

Item: Genome re-annotation: a wiki solution? (Springer Nature, 2007-02-01)
Salzberg, Steven L
The annotation of most genomes becomes outdated over time, owing in part to our ever-improving knowledge of genomes and in part to improvements in bioinformatics software. Unfortunately, annotation is rarely if ever updated and resources to support routine reannotation are scarce. Wiki software, which would allow many scientists to edit each genome's annotation, offers one possible solution.

Item: A finite element model for protein transport in vivo (Springer Nature, 2007-06-28)
Sadegh Zadeh, Kouroush; Elman, Howard C; Montas, Hubert J; Shirmohammadi, Adel
Biological mass transport processes determine the behavior and function of cells, regulate interactions between synthetic agents and recipient targets, and are key elements in the design and use of biosensors. Accurately predicting the outcomes of such processes is crucial to enhancing our understanding of how these systems function, enabling the design of effective strategies to control their function, and verifying that engineered solutions perform according to plan. A Galerkin-based finite element model was developed and implemented to solve a system of two coupled partial differential equations governing biomolecule transport and reaction in live cells.
The simulator was coupled, in the framework of an inverse modeling strategy, with an optimization algorithm and an experimental time series, obtained by the Fluorescence Recovery after Photobleaching (FRAP) technique, to estimate biomolecule mass transport and reaction rate parameters. In the inverse algorithm, an adaptive method was implemented to calculate the sensitivity matrix. A multi-criteria termination rule was developed to stop the inverse code at the solution. The applicability of the model was illustrated by simulating the mobility and binding of GFP-tagged glucocorticoid receptor in the nucleoplasm of mouse adenocarcinoma. The numerical simulator shows excellent agreement with the analytic solutions and experimental FRAP data. Detailed residual analysis indicates that residuals have zero mean and constant variance and are normally distributed and uncorrelated. Therefore, the necessary and sufficient criteria for least square parameter optimization, which was used in this study, were met. The developed strategy is an efficient approach to extract as much physiochemical information from the FRAP protocol as possible. Well-posedness analysis of the inverse problem, however, indicates that the FRAP protocol provides insufficient information for unique simultaneous estimation of diffusion coefficient and binding rate parameters.
Care should be exercised in drawing inferences, from FRAP data, regarding concentrations of free and bound proteins, average binding and diffusion times, and protein mobility unless they are confirmed by long-range Markov Chain-Monte Carlo (MCMC) methods and experimental observations.

Item: Comparative study of meningitis dynamics across nine African countries: a global perspective (Springer Nature, 2007-07-10)
Broutin, Hélène; Philippon, Solenne; de Magny, Guillaume Constantin; Courel, Marie-Françoise; Sultan, Benjamin; Guégan, Jean-François
Meningococcal meningitis (MM) represents an important public health problem, especially in the "meningitis belt" in Africa. Although the seasonality of epidemics is well known, with outbreaks usually starting in the dry season, pluri-annual cycles are still less understood and less studied. In this context, we aimed to study MM case time series across 9 Sahelo-Sudanian countries to detect pluri-annual periodicity and to determine whether their dynamics are synchronous. This global and comparative approach allows a better understanding of MM evolution in time and space in the long term. We used the mathematical tool best suited to time series analysis, the wavelet method. We showed that, despite a strong consensus on the existence of a global pluri-annual cycle of MM epidemics, no such global cycle exists. Indeed, even though a clear cycle is detected in all countries, these cycles are not as permanent and regular as has generally been accepted for many years. Moreover, no global synchrony was detected, although many countries seemed correlated. These results of the first large-scale study of MM dynamics highlight the strong interest in, and the necessity of, a global survey of MM in order to predict and prevent large epidemics through an adapted vaccination strategy.
International cooperation in public health and cross-disciplinary studies are highly recommended in order to control this infectious disease.

Item: Features generated for computational splice-site prediction correspond to functional elements (Springer Nature, 2007-10-24)
Dogan, Rezarta Islamaj; Getoor, Lise; Wilbur, W John; Mount, Stephen M
Accurate selection of splice sites during the splicing of precursors to messenger RNA requires both relatively well-characterized signals at the splice sites and auxiliary signals in the adjacent exons and introns. We previously described a feature generation algorithm (FGA) that is capable of achieving high classification accuracy on human 3' splice sites. In this paper, we extend the splice-site prediction to 5' splice sites and explore the generated features for biologically meaningful splicing signals. We present examples from the observed features that correspond to known signals, both core signals (including the branch site and pyrimidine tract) and auxiliary signals (including GGG triplets and exon splicing enhancers). We present evidence that features identified by FGA include splicing signals not found by other methods. Our generated features capture known biological signals in the expected sequence interval flanking splice sites. The method can be easily applied to other species and to similar classification problems, such as tissue-specific regulatory elements, polyadenylation sites, promoters, etc.

Item: High-throughput sequence alignment using Graphics Processing Units (Springer Nature, 2007-12-10)
Schatz, Michael C; Trapnell, Cole; Delcher, Arthur L; Varshney, Amitabh
The recent availability of new, less expensive high-throughput DNA sequencing technologies has yielded a dramatic increase in the volume of sequence data that must be analyzed. These data are being generated for several purposes, including genotyping, genome resequencing, metagenomics, and de novo genome assembly projects.
Sequence alignment programs such as MUMmer have proven essential for analysis of these data, but researchers will need ever faster, high-throughput alignment tools running on inexpensive hardware to keep up with new sequence technologies. This paper describes MUMmerGPU, an open-source high-throughput parallel pairwise local sequence alignment program that runs on commodity Graphics Processing Units (GPUs) in common workstations. MUMmerGPU uses the new Compute Unified Device Architecture (CUDA) from nVidia to align multiple query sequences against a single reference sequence stored as a suffix tree. By processing the queries in parallel on the highly parallel graphics card, MUMmerGPU achieves more than a 10-fold speedup over a serial CPU version of the sequence alignment kernel, and outperforms the exact alignment component of MUMmer on a high end CPU by 3.5-fold in total application time when aligning reads from recent sequencing projects using Solexa/Illumina, 454, and Sanger sequencing technologies. MUMmerGPU is a low cost, ultra-fast sequence alignment program designed to handle the increasing volume of data produced by new, high-throughput sequencing technologies. MUMmerGPU demonstrates that even memory-intensive applications can run significantly faster on the relatively low-cost GPU than on the CPU.

Item: Genome assembly forensics: finding the elusive mis-assembly (Springer Nature, 2008-03-14)
Phillippy, Adam M; Schatz, Michael C; Pop, Mihai
We present the first collection of tools aimed at automated genome assembly validation. This work formalizes several mechanisms for detecting mis-assemblies, and describes their implementation in our automated validation pipeline, called amosvalidate. We demonstrate the application of our pipeline in both bacterial and eukaryotic genome assemblies, and highlight several assembly errors in both draft and finished genomes.
The software described is compatible with common assembly formats and is released, open-source, at http://amos.sourceforge.net.

Item: Genome sequence and rapid evolution of the rice pathogen Xanthomonas oryzae pv. oryzae PXO99A (Springer Nature, 2008-05-01)
Salzberg, Steven L; Sommer, Daniel D; Schatz, Michael C; Phillippy, Adam M; Rabinowicz, Pablo D; Tsuge, Seiji; Furutani, Ayako; Ochiai, Hirokazu; Delcher, Arthur L; Kelley, David; Madupu, Ramana; Puiu, Daniela; Radune, Diana; Shumway, Martin; Trapnell, Cole; Aparna, Gudlur; Jha, Gopaljee; Pandey, Alok; Patil, Prabhu B; Ishihara, Hiromichi; Meyer, Damien F; Szurek, Boris; Verdier, Valerie; Koebnik, Ralf; Dow, J Maxwell; Ryan, Robert P; Hirata, Hisae; Tsuyumu, Shinji; Lee, Sang Won; Ronald, Pamela C; Sonti, Ramesh V; Van Sluys, Marie-Anne; Leach, Jan E; White, Frank F; Bogdanove, Adam J
Xanthomonas oryzae pv. oryzae causes bacterial blight of rice (Oryza sativa L.), a major disease that constrains production of this staple crop in many parts of the world. We report here on the complete genome sequence of strain PXO99A and its comparison to two previously sequenced strains, KACC10331 and MAFF311018, which are highly similar to one another. The PXO99A genome is a single circular chromosome of 5,240,075 bp, considerably longer than the genomes of the other strains (4,941,439 bp and 4,940,217 bp, respectively), and it contains 5083 protein-coding genes, including 87 not found in KACC10331 or MAFF311018. PXO99A contains a greater number of virulence-associated transcription activator-like effector genes and has at least ten major chromosomal rearrangements relative to KACC10331 and MAFF311018. PXO99A contains numerous copies of diverse insertion sequence elements, members of which are associated with 7 out of 10 of the major rearrangements. A rapidly-evolving CRISPR (clustered regularly interspersed short palindromic repeats) region contains evidence of dozens of phage infections unique to the PXO99A lineage.
PXO99A also contains a unique, near-perfect tandem repeat of 212 kilobases close to the replication terminus. Our results provide striking evidence of genome plasticity and rapid evolution within Xanthomonas oryzae pv. oryzae. The comparisons point to sources of genomic variation and candidates for strain-specific adaptations of this pathogen that help to explain the extraordinary diversity of Xanthomonas oryzae pv. oryzae genotypes and races that have been isolated from around the world.

Item: Ultrafast and memory-efficient alignment of short DNA sequences to the human genome (Springer Nature, 2009-03-04)
Langmead, Ben; Trapnell, Cole; Pop, Mihai; Salzberg, Steven L
Bowtie is an ultrafast, memory-efficient alignment program for aligning short DNA sequence reads to large genomes. For the human genome, Burrows-Wheeler indexing allows Bowtie to align more than 25 million reads per CPU hour with a memory footprint of approximately 1.3 gigabytes. Bowtie extends previous Burrows-Wheeler techniques with a novel quality-aware backtracking algorithm that permits mismatches. Multiple processor cores can be used simultaneously to achieve even greater alignment speeds. Bowtie is open source (http://bowtie.cbcb.umd.edu).
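
To illustrate the general idea of Burrows-Wheeler indexing mentioned in the last abstract, here is a minimal Python sketch of exact pattern matching by backward search over a toy index. It is not Bowtie's implementation: the function names `bwt_index` and `find` are hypothetical, the suffix array is built by naive sorting (impractical for a real genome), and the sketch handles exact matches only, with none of the quality-aware backtracking for mismatches that the abstract describes.

```python
from collections import Counter


def bwt_index(text):
    """Build a toy Burrows-Wheeler index for `text` (assumed not to contain '$')."""
    text += "$"  # sentinel that sorts before A, C, G, T
    # Naive suffix array: sort suffix start positions by the suffix itself.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # The BWT is the character preceding each sorted suffix (wraps around at i = 0).
    bwt = "".join(text[i - 1] for i in sa)
    # C[c]: number of characters in the text that are strictly smaller than c.
    counts = Counter(text)
    C, total = {}, 0
    for c in sorted(counts):
        C[c] = total
        total += counts[c]
    # occ[c][i]: number of occurrences of c in bwt[:i].
    occ = {c: [0] * (len(bwt) + 1) for c in counts}
    for i, ch in enumerate(bwt):
        for c in occ:
            occ[c][i + 1] = occ[c][i] + (ch == c)
    return sa, C, occ


def find(pattern, index):
    """Return the sorted start positions of exact matches of `pattern`."""
    sa, C, occ = index
    lo, hi = 0, len(sa)  # current half-open range of suffix-array rows
    for ch in reversed(pattern):  # extend the match one character to the left
        if ch not in C:
            return []
        lo = C[ch] + occ[ch][lo]
        hi = C[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(sa[i] for i in range(lo, hi))


index = bwt_index("GATTACATTAG")
print(find("TTA", index))
```

Each step of the search loop narrows the range of suffix-array rows using only the `C` and `occ` tables, so the cost of a lookup grows with the length of the read rather than the length of the reference. The full `occ` table used here trades memory for simplicity and is only suitable for small examples.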