Mathematics

Permanent URI for this community: http://hdl.handle.net/1903/2261

  • STATISTICAL DATA FUSION WITH DENSITY RATIO MODEL AND EXTENSION TO RESIDUAL COHERENCE
    (2024) Zhang, Xuze; Kedem, Benjamin; Mathematical Statistics; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    The statistical analysis of data from diverse sources has become increasingly prevalent. The Density Ratio Model (DRM) is one method for fusing and analyzing such data. The population distributions of the different samples can be estimated based on the fused data, which leads to more precise estimates of the probability distributions. These distributions are related by assuming that the ratios of their probability density functions (PDFs) follow a parametric form. In previous work, this parametric form is assumed to be the same for all ratios. In Chapter 1, an extension is made to allow the parametric form to vary across ratios. Two methods for determining the parametric form of each ratio are developed, based on an asymptotic test and the Akaike Information Criterion (AIC). The extended DRM is applied to radon concentrations and pertussis rates to demonstrate its use in the univariate and multivariate cases, respectively. This analysis is possible when the data in each sample are independent and identically distributed (IID). In many cases, however, statistical analysis is required for time series, in which the data are sequentially dependent. In Chapter 2, the DRM is extended to account for weakly dependent data, which allows us to investigate the structure of multiple time series on the strength of each other. It is shown that the IID assumption can be replaced by suitable stationarity, mixing, and moment conditions. This extended DRM is applied to the analysis of air quality data recorded in chronological order. As mentioned above, the DRM is suitable when we investigate a single time series based on multiple alternative ones, and these time series are assumed to be mutually independent. In time series analysis, however, it is often of interest to detect linear and nonlinear dependence between different time series. In such a dependent scenario, coherence is a common tool for measuring the linear dependence between two time series, and residual coherence is used to detect a possible quadratic relationship. In Chapter 3, we extend the notion of residual coherence and develop statistical tests for detecting linear and nonlinear associations between time series. These tests are applied to the analysis of brain functional connectivity data.
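
For orientation, the density ratio assumption described in this abstract is commonly written in the form below. The notation is an assumption and does not appear in the abstract: g_0 denotes the reference density, g_k the density of the k-th additional sample, and h_k the tilt function for the k-th ratio.

```latex
\frac{g_k(x)}{g_0(x)} = \exp\left\{ \alpha_k + \boldsymbol{\beta}_k^{\top} \mathbf{h}_k(x) \right\},
\qquad k = 1, \dots, m.
```

Under this notation, the earlier setting corresponds to a common tilt h_1 = ... = h_m, while the Chapter 1 extension would let h_k differ across k.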
  • Multivariate Tail Probabilities: Predicting Regional Pertussis Cases in Washington State
    (MDPI, 2021-05-27) Zhang, Xuze; Pyne, Saumyadipta; Kedem, Benjamin
    In disease modeling, a key statistical problem is the estimation of lower and upper tail probabilities of health events from data sets of small size and limited range. Under such constraints, we describe a computational framework for the systematic fusion of observations from multiple sources to compute tail probabilities that could not otherwise be obtained due to a lack of lower or upper tail data. The estimation of multivariate lower and upper tail probabilities from a small reference data set that lacks complete information about such tails is addressed in terms of pertussis case count data. Fusion of data from multiple sources, in conjunction with the density ratio model, is used to give probability estimates that cannot be obtained from the empirical distribution. Based on a density ratio model with variable tilts, we first present a univariate fit and subsequently improve it with a multivariate extension. In the multivariate analysis, we select the best model in terms of the Akaike Information Criterion (AIC). Regional prediction of the number of pertussis cases in Washington State is approached by providing joint probabilities using fused data from several relatively small samples following the selected density ratio model. The model is validated by a graphical goodness-of-fit plot comparing the estimated reference distribution obtained from the fused data with the empirical distribution obtained from the reference sample only.
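
The AIC-based model selection mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the authors' procedure: the candidate names, log-likelihood values, and parameter counts are hypothetical placeholders, and only the standard definition AIC = 2k - 2 log L is assumed.

```python
def aic(log_likelihood: float, n_params: int) -> float:
    """Standard Akaike Information Criterion: 2k - 2 log L."""
    return 2 * n_params - 2 * log_likelihood


def select_by_aic(candidates):
    """Return the name of the candidate with the smallest AIC, plus all scores.

    `candidates` maps a model name to a (log_likelihood, n_params) pair.
    """
    scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical fitted models, one per candidate tilt function.
# The numbers are made up for illustration only.
candidates = {
    "tilt_x": (-120.4, 2),
    "tilt_x_x2": (-115.1, 3),
    "tilt_log_x": (-119.8, 2),
}

best, scores = select_by_aic(candidates)
print(best, scores[best])
```

In practice the log-likelihoods would come from fitting the density ratio model under each candidate tilt; the selection step itself is just the comparison shown here.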