Mathematics
Permanent URI for this community: http://hdl.handle.net/1903/2261
Search Results (21 items; page 1 of 3 shown)
Item: Novel Methods for Metagenomic Analysis (2010)
White, James Robert; Pop, Mihai; Applied Mathematics and Scientific Computation; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
By sampling the genetic content of microbes at the nucleotide level, metagenomics has rapidly established itself as the standard in characterizing the taxonomic diversity and functional capacity of microbial populations throughout nature. The decreasing cost of sequencing technologies and the simultaneous increase of throughput per run have given scientists the ability to deeply sample highly diverse communities on a reasonable budget. The Human Microbiome Project is representative of the flood of sequence data that will arrive in the coming years. Despite these advancements, there remains the significant challenge of analyzing massive metagenomic datasets to make appropriate biological conclusions. This dissertation is a collection of novel methods developed for improved analysis of metagenomic data: (1) We begin with Figaro, a statistical algorithm that quickly and accurately infers and trims vector sequence from large Sanger-based read sets without prior knowledge of the vector used in library construction. (2) Next, we perform a rigorous evaluation of methodologies used to cluster environmental 16S rRNA sequences into species-level operational taxonomic units, and discover that many published studies utilize highly stringent parameters, resulting in overestimation of microbial diversity. (3) To assist in comparative metagenomics studies, we have created Metastats, a robust statistical methodology for comparing large-scale clinical datasets with up to thousands of subjects. Given a collection of annotated metagenomic features (e.g. taxa, COGs, or pathways), Metastats determines which features are differentially abundant between two populations. (4) Finally, we report on a new methodology that employs the generalized Lotka-Volterra model to infer microbe-microbe interactions from longitudinal 16S rRNA data. It is our hope that these methods will enhance standard metagenomic analysis techniques to provide better insight into the human microbiome and microbial communities throughout our world. To assist metagenomics researchers and those developing methods, all software described in this thesis is open-source and available online.
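As context for point (4) above, the generalized Lotka-Volterra (gLV) model describes each taxon's abundance through a growth rate plus pairwise interaction terms. The sketch below is a minimal, hypothetical illustration of how gLV coefficients might be estimated by least squares from discretized longitudinal abundance data; it is not the thesis's actual inference procedure, and the function name and data layout are placeholders.

```python
import numpy as np

# Generalized Lotka-Volterra: dx_i/dt = x_i * (mu_i + sum_j beta_ij * x_j).
# Dividing by x_i gives d(log x_i)/dt = mu_i + sum_j beta_ij * x_j, which is
# linear in (mu_i, beta_i*), so each taxon can be fit by ordinary least squares.

def fit_glv(abundances, times):
    """abundances: (T, n) array of strictly positive taxon abundances at T time points."""
    T, n = abundances.shape
    dlogx = np.diff(np.log(abundances), axis=0) / np.diff(times)[:, None]  # (T-1, n)
    X = np.hstack([np.ones((T - 1, 1)), abundances[:-1]])                  # [1, x_1..x_n]
    coeffs, *_ = np.linalg.lstsq(X, dlogx, rcond=None)
    mu, beta = coeffs[0], coeffs[1:].T   # growth rates and interaction matrix beta[i, j]
    return mu, beta
```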
Item: Meshless Collocation Methods for the Numerical Solution of Elliptic Boundary Valued Problems and the Rotational Shallow Water Equations on the Sphere (2009)
Blakely, Christopher Dallas; Osborn, John E; Baer, Ferdinand; Mathematics; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
This dissertation has three main goals: 1) to explore the anatomy of meshless collocation approximation methods that have recently gained attention in the numerical analysis community; 2) to demonstrate numerically why the meshless collocation method should become an attractive alternative to standard finite-element methods, owing to the simplicity of its implementation and its high-order convergence properties; 3) to propose a meshless collocation method for large-scale computational geophysical fluid dynamics models. We provide numerical verification and validation of the meshless collocation scheme applied to the rotational shallow-water equations on the sphere and demonstrate computationally that the proposed model can compete with existing high-performance methods for approximating the shallow-water equations, such as the SEAM (spectral-element atmospheric model) developed at NCAR. A detailed analysis of the parallel implementation of the model is given, along with the introduction of parallel algorithmic routines for the high-performance simulation of the model. We analyze the programming and computational aspects of the model using Fortran 90 and the Message Passing Interface (MPI) library, along with software and hardware specifications and performance tests. Details on many aspects of the implementation with regard to performance, optimization, and stabilization are given. To verify the mathematical correctness of the algorithms presented and to validate the performance of the meshless collocation shallow-water model, we conclude the thesis with numerical experiments on standardized test cases for the shallow-water equations on the sphere using the proposed method.

Item: Fast Solvers for Models of Fluid Flow with Spectral Elements (2008-09-02)
Lott, Paul Aaron; Elman, Howard; Deane, Anil; Applied Mathematics and Scientific Computation; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
We introduce a preconditioning technique based on Domain Decomposition and the Fast Diagonalization Method that can be applied to tensor-product-based discretizations of the steady convection-diffusion and the linearized Navier-Stokes equations. The method is based on iterative substructuring, where fast diagonalization is used to efficiently eliminate the interior degrees of freedom and subsidiary subdomain solves. We demonstrate the effectiveness of this preconditioner in numerical simulations using a spectral element discretization. This work extends the use of Fast Diagonalization to steady convection-diffusion systems. We also extend the "least-squares commutator" preconditioner, originally developed for the finite element method, to a matrix-free spectral element framework. We show that these two advances, when used together, allow for efficient computation of steady-state solutions to the incompressible Navier-Stokes equations using high-order spectral element discretizations.
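For readers unfamiliar with the "least-squares commutator" preconditioner mentioned in the preceding abstract, the math block below records the standard form of the LSC approximation to the Schur complement of the discrete linearized Navier-Stokes system. It is a general statement of the technique (up to sign conventions), not a transcription of the thesis's matrix-free spectral element variant.

```latex
% Discrete linearized Navier-Stokes (Oseen) saddle-point system:
%   [ F  B^T ] [u]   [f]
%   [ B   0  ] [p] = [g]
% The least-squares commutator (LSC) preconditioner approximates the inverse
% of the Schur complement S = B F^{-1} B^T (up to sign) by
\[
  S^{-1} \;\approx\;
  \bigl(B\hat{Q}^{-1}B^{T}\bigr)^{-1}
  \bigl(B\hat{Q}^{-1}F\hat{Q}^{-1}B^{T}\bigr)
  \bigl(B\hat{Q}^{-1}B^{T}\bigr)^{-1},
\]
% where \hat{Q} is (the diagonal of) the velocity mass matrix, so each
% application costs two pressure Poisson-like solves plus matrix-vector products.
```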
Item: Pricing Volatility Derivatives Using Space Scaled Levy Processes (2008-09-02)
Prakash, Samvit; Madan, Dilip B; von Petersdorff, Tobias; Applied Mathematics and Scientific Computation; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
The VIX index measures the one-month risk-neutral forward volatility of the S&P500 (SPX) index. While Lévy processes such as the CGMY process can price options on the underlying stock or index, they implicitly assume a constant forward volatility, which makes them unsuitable for pricing options on the VIX. We propose a model within the one-dimensional Markovian framework for pricing VIX and SPX options simultaneously. We introduce space dependence of volatility by scaling the CGMY process with a leverage function. The resulting process can consistently price options on the SPX and the VIX of a given maturity. We also perform surface calibrations of options on the two indices separately. We explore the properties of the implied distribution of the SPX from both indices and conclude that the VIX index under-weights small jumps as compared to large jumps, as well as the skewness of the SPX index.

Item: Dimensionality reduction for hyperspectral data (2008-05-09)
Widemann, David P; Benedetto, John J; Czaja, Wojciech; Mathematics; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
This thesis is about dimensionality reduction for hyperspectral data. Special emphasis is given to dimensionality reduction techniques known as kernel eigenmap methods and manifold learning algorithms. Kernel eigenmap methods require that a nearest-neighbor or a radius parameter be set. A new algorithm that does not require these neighborhood parameters is given. Most kernel eigenmap methods use the eigenvectors of the kernel as coordinates for the data. An algorithm that uses the frame potential along with subspace frames to create nonorthogonal coordinates is given. The algorithms are demonstrated on hyperspectral data. The last two chapters include analysis of representation systems for LIDAR data and motion blur estimation, respectively.

Item: Definable families of finite Vapnik Chervonenkis dimension (2008-04-25)
Johnson, Hunter R; Laskowski, Michael C; Mathematics; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Vapnik-Chervonenkis dimension is a basic combinatorial notion with applications in machine learning, stability theory, and statistics. We explore what effect model-theoretic structure has on the VC dimension of formulas, considered as parameterized families of sets, with respect to long disjunctions and conjunctions. If the growth in VC dimension is linear in the number of disjunctions, then the theory under consideration has a certain kind of good structure. We have found a general class of theories in which this structure obtains, as well as situations where it fails. We relate "compression schemes" of computational learning theory to model-theoretic type definitions, and explore the model-theoretic implications. All stable definable families are shown to have finite compression schemes, with specific bounds in the case of NFCP theories. Notions of maximality in VC classes are discussed, and classified according to their first-order properties. While maximum classes can be characterized in first-order logic, maximal classes cannot.
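As a concrete illustration of the VC dimension notion discussed in the preceding abstract (a standard textbook example, not taken from the thesis), the family of open intervals on the real line has VC dimension 2:

```latex
% Worked example: the class C = { (a,b) : a < b } of open intervals on R.
% Two points x_1 < x_2 are shattered: intersections with intervals realize
% all four subsets {}, {x_1}, {x_2}, {x_1, x_2}.
% Three points x_1 < x_2 < x_3 cannot be shattered: no interval contains
% x_1 and x_3 while omitting x_2. Hence
\[
  \mathrm{VCdim}\bigl(\{(a,b)\subseteq\mathbb{R} : a<b\}\bigr) = 2 .
\]
% In the model-theoretic setting, the formula \varphi(x; a, b) := a < x < b
% parameterizes this family, so the VC dimension of \varphi is 2.
```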
Item: An Investigation of the Relationship Between Automated Machine Translation Evaluation Metrics and User Performance on an Information Extraction Task (2007-12-04)
Tate, Calandra Rilette; Slud, Eric V; Dorr, Bonnie J; Applied Mathematics and Scientific Computation; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
This dissertation applies nonparametric statistical techniques to Machine Translation (MT) evaluation, using data from an MT evaluation experiment conducted through a joint Army Research Laboratory (ARL) and Center for the Advanced Study of Language (CASL) project. In particular, the relationship between human task performance on an information extraction task with translated documents and well-known automated translation evaluation metric scores for those documents is studied. Findings from a correlation analysis of the connection between autometrics and task-based metrics are presented and contrasted with current strategies for evaluating translations. A novel idea for assessing partial rank correlation in the presence of grouping factors is also introduced. Lastly, this dissertation presents a framework for task-based MT evaluation and predictive modeling of task responses that gives new information about the relative predictive strengths of the different autometrics (and re-coded variants of them) within the statistical generalized linear models developed in analyses of the information extraction task data. This work shows that current autometrics are inadequate for predicting task performance, but that near adequacy can be achieved through the use of re-coded autometrics in a logistic regression setting. As a result, a class of automated metrics best suited to predicting performance is established, and suggestions are offered about how to use metrics to supplement expensive and time-consuming experiments with human participants. Users can now begin to tie the intrinsic automated metrics to the extrinsic metrics for the tasks they perform. The bottom line is that there is a need to average away MT dependence (averaged metrics perform better in overall predictions than the original autometrics). Moreover, combinations of re-coded metrics performed better than any individual metric. Ultimately, MT evaluation methodology is extended to create new metrics specifically relevant to task-based comparisons. A formal method to establish that differences among metrics as predictors are strong enough not to be due to chance remains future work. Given the lack of connection in the field of MT evaluation between task utility and the interpretation of automated evaluation metrics, as well as the absence of solid statistical reasoning in evaluating MT, there is a need to bring innovative and interdisciplinary analytical techniques to this problem. Because no papers in the MT evaluation literature have previously done statistical modeling or linked automated metrics with how well MT supports human tasks, this work is unique and has high potential to benefit the machine translation research community.
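The logistic regression setting described above can be pictured with a short, purely illustrative sketch: predicting whether a user answers an extraction question correctly from re-coded automated metric scores of the document's translation. The feature names and data below are synthetic placeholders, not the dissertation's variables or results.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical data: one row per (document, question) trial.
# Columns: a re-coded (binned) BLEU-like score, a re-coded TER-like score,
# and a system-averaged metric score; y = 1 if the task answer was correct.
rng = np.random.default_rng(0)
X = np.column_stack([
    rng.integers(0, 4, size=200),   # re-coded autometric A (quartile bin)
    rng.integers(0, 4, size=200),   # re-coded autometric B (quartile bin)
    rng.random(200),                # system-averaged metric score
])
y = (rng.random(200) < 0.3 + 0.15 * X[:, 0] / 3).astype(int)  # synthetic outcomes

# Binary GLM with a logit link: P(correct) = sigmoid(w . x + b).
model = LogisticRegression().fit(X, y)
print("coefficients:", model.coef_, "intercept:", model.intercept_)
```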
Item: Global Phenomena from Local Rules: Peer-to-Peer Networks and Crystal Steps (2007-12-04)
Finkbiner, Amy; Yorke, James A; Margetis, Dionisios; Mathematics; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Even simple, deterministic rules can generate interesting behavior in dynamical systems. This dissertation examines some real-world systems for which fairly simple, locally defined rules yield useful or interesting properties in the system as a whole. In particular, we study routing in peer-to-peer networks and the motion of crystal steps. Peers can vary by three orders of magnitude in their capacities to process network traffic. This heterogeneity inspires our use of "proportionate load balancing," where each peer provides resources in proportion to its individual capacity. We provide an implementation that employs small, local adjustments to bring the entire network into a global balance. Analytically and through simulations, we demonstrate the effectiveness of proportionate load balancing on two routing methods for de Bruijn graphs, introducing a new "reversed" routing method which performs better than standard forward routing in some cases. The prevalence of peer-to-peer applications prompts companies to locate the hosts participating in these networks. We explore the use of supervised machine learning to identify peer-to-peer hosts without using application-specific information. We introduce a model for "triples," which exploits information about nearly contemporaneous flows to give a statistical picture of a host's activities. We find that triples, together with measurements of inbound vs. outbound traffic, can capture most of the behavior of peer-to-peer hosts. An understanding of crystal surface evolution is important for the development of modern nanoscale electronic devices. The most commonly studied surface features are steps, which form at low temperatures when the crystal is cut close to a plane of symmetry. Step bunching, in which steps arrange into widely separated clusters of tightly packed steps, is one important step phenomenon. We analyze a discrete model for crystal steps, in which the motion of each step depends on the two steps on either side of it. We find a time-dependence term for the motion that does not appear in continuum models, and we determine an explicit dependence on step number.

Item: Entropy Stable Approximations of Nonlinear Conservation Laws and Related Fluid Equations (2007-08-01)
Zhong, Weigang; Tadmor, Eitan; Applied Mathematics and Scientific Computation; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
We present a systematic study of novel entropy stable approximations for a variety of nonlinear conservation laws, from the scalar Burgers equation to the one-dimensional Navier-Stokes and two-dimensional shallow water equations. To this end, we construct a new family of second-order entropy stable difference schemes which retain the precise entropy decay of the original partial differential equations. Here we employ the entropy conservative differences of Tadmor's 2004 paper to discretize the convective fluxes, and centered differences to discretize the dissipative fluxes. The resulting family of difference schemes is free of artificial numerical viscosity in the sense that its entropy dissipation is dictated solely by the physical dissipation terms. The numerical results for the 1D compressible Navier-Stokes equations provide remarkable evidence for the different roles of viscosity and heat conduction in forming sharp monotone profiles in the immediate neighborhoods of shocks and contacts. Further implementation for the 2D shallow water equations is realized dimension by dimension.
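To make the entropy-conservative building block mentioned in the preceding abstract concrete, the math block below records Tadmor's entropy conservation condition and the resulting two-point flux for the Burgers equation with quadratic entropy. This is the standard scalar instance, included only as an illustration of the kind of flux the abstract refers to, not a reproduction of the thesis's schemes for the fluid systems.

```latex
% Tadmor's condition for an entropy conservative two-point flux F*(u_l, u_r):
%   (v_r - v_l) F*(u_l, u_r) = \psi(v_r) - \psi(v_l),
% where v = \eta'(u) is the entropy variable and \psi the entropy potential.
% For Burgers' equation u_t + (u^2/2)_x = 0 with entropy \eta(u) = u^2/2,
% one has v = u and \psi(v) = v^3/6, which gives
\[
  F^{*}(u_l, u_r) \;=\; \frac{u_l^{2} + u_l u_r + u_r^{2}}{6}.
\]
% Discretizing the convective flux with F* and the physical dissipation terms
% with centered differences yields schemes whose entropy dissipation comes
% only from the physical terms.
```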
Item: Algorithms for Solving Linear and Polynomial Systems of Equations over Finite Fields with Applications to Cryptanalysis (2007-06-07)
Bard, Gregory Van; Washington, Lawrence C; Applied Mathematics and Scientific Computation; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
This dissertation contains algorithms for solving linear and polynomial systems of equations over GF(2). The objective is to provide fast and exact tools for algebraic cryptanalysis and other applications. Accordingly, it is divided into two parts. The first part deals with polynomial systems. Chapter 2 contains a successful cryptanalysis of Keeloq, the block cipher used in nearly all luxury automobiles. The attack is more than 16,000 times faster than brute force, but queries 0.62 × 2^32 plaintexts. The polynomial systems of equations arising from that cryptanalysis were solved via SAT-solvers. Therefore, Chapter 3 introduces a new method of solving polynomial systems of equations by converting them into CNF-SAT problems and using a SAT-solver. Finally, Chapter 4 contains a discussion of how SAT-solvers work internally. The second part deals with linear systems over GF(2) and other small fields (and rings). These occur in cryptanalysis when using the XL algorithm, which converts polynomial systems into larger linear systems. We introduce a new complexity model and data structures for GF(2)-matrix operations; this is discussed in Appendix B but applies to all of Part II. Chapter 5 contains an analysis of "the Method of Four Russians" for multiplication and a variant for matrix inversion, which is log n faster than Gaussian elimination and can be combined with Strassen-like algorithms. Chapter 6 contains an algorithm for accelerating matrix multiplication over small finite fields; it is feasible, but the memory cost is so high that it is mostly of theoretical interest. Appendix A contains some discussion of GF(2)-linear algebra and how it differs from linear algebra in R and C. Appendix C discusses algorithms faster than Strassen's algorithm, and contains proofs that matrix multiplication, matrix squaring, triangular matrix inversion, LUP-factorization, general matrix inversion, and the taking of determinants are equicomplex. These proofs are already known, but are gathered here into one place in the same notation.
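As a rough illustration of the table-lookup idea behind "the Method of Four Russians" analyzed in Chapter 5, the following is a simplified sketch of GF(2) matrix multiplication with rows stored as bit-packed Python integers. It is not the thesis's optimized implementation (which adds Gray-code table construction, cache-aware blocking, and the data structures of Appendix B); the function name and block size are illustrative choices.

```python
def gf2_matmul_m4rm(A, B, n, k=8):
    """Multiply two n x n GF(2) matrices given as lists of n Python ints,
    where bit j of row i encodes entry (i, j). Returns C = A * B over GF(2)."""
    C = [0] * n
    for start in range(0, n, k):
        width = min(k, n - start)
        # Precompute the XOR of every subset of the rows B[start : start+width]:
        # table[s] = XOR of B[start + t] over all bits t set in s.
        table = [0] * (1 << width)
        for s in range(1, 1 << width):
            low = s & -s                                  # lowest set bit of s
            table[s] = table[s ^ low] ^ B[start + low.bit_length() - 1]
        mask = (1 << width) - 1
        for i in range(n):
            # The block of k bits of row i of A selects one precomputed
            # combination of rows of B, replacing k row additions by one lookup.
            C[i] ^= table[(A[i] >> start) & mask]
    return C

# Example: the 4 x 4 identity times an arbitrary matrix returns the matrix itself.
I4 = [0b0001, 0b0010, 0b0100, 0b1000]
M  = [0b1011, 0b0110, 0b1110, 0b0001]
assert gf2_matmul_m4rm(I4, M, 4, k=2) == M
```

Choosing k on the order of log2(n) balances table construction against row lookups, which is the source of the roughly logarithmic speedup over the schoolbook cubic algorithm noted in the abstract.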