Mathematics Research Works
Permanent URI for this collection: http://hdl.handle.net/1903/1595
9 results
Search Results
Item Better Metrics to Automatically Predict the Quality of a Text Summary (MDPI, 2012-09-26) Rankel, Peter A.; Conroy, John M.; Schlesinger, Judith D.
In this paper we demonstrate a family of metrics for estimating the quality of a text summary relative to one or more human-generated summaries. The improved metrics are based on features automatically computed from the summaries to measure content and linguistic quality. The features are combined using one of three methods: robust regression, non-negative least squares, or canonical correlation, an eigenvalue method. The new metrics significantly outperform the previous standard for automatic text summarization evaluation, ROUGE.

Item Complexity-Regularized Regression for Serially-Correlated Residuals with Applications to Stock Market Data (MDPI, 2014-12-23) Darmon, David; Girvan, Michelle
A popular approach in the investigation of the short-term behavior of a non-stationary time series is to assume that the time series decomposes additively into a long-term trend and short-term fluctuations. A first step towards investigating the short-term behavior requires estimation of the trend, typically via smoothing in the time domain. We propose a method for time-domain smoothing, called complexity-regularized regression (CRR). This method extends recent work, which infers a regression function that makes residuals from a model “look random”. Our approach operationalizes non-randomness in the residuals by applying ideas from computational mechanics, in particular the statistical complexity of the residual process. The method is compared to generalized cross-validation (GCV), a standard approach for inferring regression functions, and shown to outperform GCV when the error terms are serially correlated. Regression under serially-correlated residuals has applications to time series analysis, where the residuals may represent short timescale activity.
We apply CRR to a time series drawn from the Dow Jones Industrial Average and examine how both the long-term and short-term behavior of the market have changed over time.

Item Simultaneous transcriptional profiling of Leishmania major and its murine macrophage host cell reveals insights into host-pathogen interactions (Springer Nature, 2015-12-29) Dillon, Laura A. L.; Suresh, Rahul; Okrah, Kwame; Corrada Bravo, Hector; Mosser, David M.; El-Sayed, Najib M.
Parasites of the genus Leishmania are the causative agents of leishmaniasis, a group of diseases that range in manifestations from skin lesions to fatal visceral disease. The life cycle of Leishmania parasites is split between its insect vector and its mammalian host, where it resides primarily inside of macrophages. Once intracellular, Leishmania parasites must evade or deactivate the host's innate and adaptive immune responses in order to survive and replicate. We performed transcriptome profiling using RNA-seq to simultaneously identify global changes in murine macrophage and L. major gene expression as the parasite entered and persisted within murine macrophages during the first 72 h of an infection. Differential gene expression, pathway, and gene ontology analyses enabled us to identify modulations in host and parasite responses during an infection. The most substantial and dynamic gene expression responses by both macrophage and parasite were observed during early infection. Murine genes related to both pro- and anti-inflammatory immune responses and glycolysis were substantially upregulated and genes related to lipid metabolism, biogenesis, and Fc gamma receptor-mediated phagocytosis were downregulated. Upregulated parasite genes included those aimed at mitigating the effects of an oxidative response by the host immune system while downregulated genes were related to translation, cell signaling, fatty acid biosynthesis, and flagellum structure.
The gene expression patterns identified in this work yield signatures that characterize multiple developmental stages of L. major parasites and the coordinated response of Leishmania-infected macrophages in the real-time setting of a dual biological system. This comprehensive dataset offers a clearer and more sensitive picture of the interplay between host and parasite during intracellular infection, providing additional insights into how pathogens are able to evade host defenses and modulate the biological functions of the cell in order to survive in the mammalian environment.

Item Evolution of transcriptional networks in yeast: alternative teams of transcriptional factors for different species (Springer Nature, 2016-11-11) Muñoz, Adriana; Santos Muñoz, Daniella; Zimin, Aleksey; Yorke, James A.
The diversity in eukaryotic life reflects a diversity in regulatory pathways. Nocedal and Johnson argue that the rewiring of gene regulatory networks is a major force for the diversity of life, and that changes in regulation can create new species. We have created a method (based on our new “ping-pong” algorithm) for detecting more complicated rewirings, where several transcription factors can substitute for one or more transcription factors in the regulation of a family of co-regulated genes. An example is illustrative. A rewiring has been reported by Hogues et al. in which RAP1 in Saccharomyces cerevisiae substitutes for TBF1/CBF1 in Candida albicans for ribosomal RP genes. There, one transcription factor substitutes for another on some collection of genes. Such a substitution is referred to as a “rewiring”. We agree with this finding of rewiring as far as it goes, but the situation is more complicated. Many transcription factors can regulate a gene, and our algorithm finds that in this example a “team” (or collection) of three transcription factors including RAP1 substitutes for TBF1 for 19 genes.
The switch occurs for a branch of the phylogenetic tree containing 10 species (including Saccharomyces cerevisiae), while the remaining 13 species (including Candida albicans) are regulated by TBF1. To gain insight into more general evolutionary mechanisms, we have created a mathematical algorithm that finds such general switching events and we prove that it converges. Of course, any such computational discovery should be validated by biological tests. For each branch of the phylogenetic tree and each gene module, our algorithm finds a sub-group of co-regulated genes and a team of transcription factors that substitutes for another team of transcription factors. In most cases the signal will be small, but in some cases we find a strong signal of switching. We report our findings for 23 Ascomycota fungi species.

Item Analysis and correction of compositional bias in sparse sequencing count data (Springer Nature, 2018-11-06) Kumar, M. Senthil; Slud, Eric V.; Okrah, Kwame; Hicks, Stephanie C.; Hannenhalli, Sridhar; Bravo, Héctor Corrada
Count data derived from high-throughput deoxyribonucleic acid (DNA) sequencing is frequently used in quantitative molecular assays. Due to properties inherent to the sequencing process, unnormalized count data is compositional, measuring relative and not absolute abundances of the assayed features. This compositional bias confounds inference of absolute abundances. Commonly used count data normalization approaches like library size scaling/rarefaction/subsampling cannot correct for compositional or any other relevant technical bias that is uncorrelated with library size. We demonstrate that existing techniques for estimating compositional bias fail with sparse metagenomic 16S count data and propose an empirical Bayes normalization approach to overcome this problem.
In addition, we clarify the assumptions underlying frequently used scaling normalization methods in light of compositional bias, including scaling methods that were not designed directly to address it.

Item A statistical analysis of vaccine-adverse event data (Springer Nature, 2019-05-28) Ren, Jian-Jian; Sun, Tingni; He, Yongqun; Zhang, Yuji
Vaccination has been one of the most successful public health interventions to date, and the U.S. FDA/CDC Vaccine Adverse Event Reporting System (VAERS) currently contains more than 500,000 reports for post-vaccination adverse events that occur after the administration of vaccines licensed in the United States. The VAERS dataset is huge, contains nominal variables of very high dimension, and is complex due to multiple listing of vaccines and adverse symptoms in a single report. So far, no statistical analysis has been conducted that attempts to identify across-the-board patterns in how all reported adverse symptoms are related to the vaccines.

Item A deficiency in SUMOylation activity disrupts multiple pathways leading to neural tube and heart defects in Xenopus embryos (Springer Nature, 2019-05-17) Bertke, Michelle M.; Dubiak, Kyle M.; Cronin, Laura; Zeng, Erliang; Huber, Paul W.
The adenovirus protein Gam1 triggers the proteolytic destruction of the E1 SUMO-activating enzyme. Microinjection of an empirically determined amount of Gam1 mRNA into one-cell Xenopus embryos can reduce SUMOylation activity to undetectable, but nonlethal, levels, enabling an examination of the role of this post-translational modification during early vertebrate development.

Item On The Number of Unlabeled Bipartite Graphs (2016) Atmaca, Abdullah; Oruc, Yavuz A
Let $I$ and $O$ denote two sets of vertices, where $I\cap O =\emptyset$, $|I| = n$, $|O| = r$, and let $B_u(n,r)$ denote the set of unlabeled graphs whose edges connect vertices in $I$ and $O$. It is shown that the following two-sided inequality holds:
$\displaystyle \frac{\binom{r+2^{n}-1}{r}}{n!} \le |B_u(n,r)| \le 2\frac{\binom{r+2^{n}-1}{r}}{n!}$

Item On Number Of Partitions Of An Integer Into A Fixed Number Of Positive Integers (2015-04) Oruc, A. Yavuz
This paper focuses on the number of partitions of a positive integer $n$ into $k$ positive summands, where $k$ is an integer between $1$ and $n$. Recently some upper bounds were reported for this number in [Merca14]. Here, it is shown that these bounds are not as tight as an earlier upper bound proved in [Andrews76-1] for $k\le 0.42n$. A new upper bound for the number of partitions of $n$ into $k$ summands is given, and shown to be tighter than the upper bound in [Merca14] when $k$ is between $O(\frac{\sqrt{n}}{\ln n})$ and $n-O(\frac{\sqrt{n}}{\ln n})$. It is further shown that the new upper bound is also tighter than two other upper bounds previously reported in [Andrews76-1] and [Colman82]. A generalization of this upper bound to the number of partitions of $n$ into at most $k$ summands is also presented.
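
As an illustration of the quantity this last abstract bounds, the following is a minimal sketch that counts partitions of $n$ into exactly $k$ positive summands using the standard textbook recurrence $p(n,k) = p(n-1,k-1) + p(n-k,k)$. It does not implement any of the upper bounds from the paper or from the works it cites; the function names `partitions_exact` and `partitions_at_most` are hypothetical and chosen only for this example.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def partitions_exact(n: int, k: int) -> int:
    """Count partitions of n into exactly k positive summands.

    Uses the standard recurrence (an assumption of this sketch, not a
    method taken from the paper): either the smallest summand is 1
    (remove it), or every summand is at least 2 (subtract 1 from each).
    """
    if n == 0 and k == 0:
        return 1  # the empty partition
    if k <= 0 or n < k:
        return 0  # no way to split n into k positive parts
    return partitions_exact(n - 1, k - 1) + partitions_exact(n - k, k)


def partitions_at_most(n: int, k: int) -> int:
    """Count partitions of n into at most k positive summands."""
    return sum(partitions_exact(n, j) for j in range(k + 1))


if __name__ == "__main__":
    # 7 = 5+1+1 = 4+2+1 = 3+3+1 = 3+2+2
    print(partitions_exact(7, 3))
    # 5 = 5 = 4+1 = 3+2
    print(partitions_at_most(5, 2))
```

Exact counts like these are useful mainly as a reference for small $n$ when comparing how tight a given upper bound is; for large $n$ the closed-form bounds discussed in the abstract are the practical tool.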