Semiparametric Cluster Detection
In this dissertation, a semiparametric density ratio testing method, which borrows strength from two or more samples, is applied to moving windows of variable size for cluster detection. This semiparametric cluster detection method requires neither prior knowledge of the underlying distribution nor the number of cases before scanning. To account for the multiple testing problem induced by numerous overlapping windows, Storey's q-value method, a false discovery rate (FDR) methodology, is used in conjunction with the semiparametric testing procedure. Monte Carlo power studies show that for binary data, the semiparametric cluster detection method and its competitor, Kulldorff's scan statistics method, achieve similarly high power in detecting unknown hot-spot clusters. When the data are not binary, the semiparametric methodology is still applicable, but Kulldorff's method may not be, because it requires the choice of a correct probability model, namely the correct scan statistic, to achieve power comparable to that of the semiparametric method. Kulldorff's method with an inappropriate probability model may lose power. Moreover, when the data are binary, the semiparametric density ratio model reduces to the same scan statistic as Kulldorff's Bernoulli model. If a cluster candidate is known, then under certain conditions the semiparametric method achieves higher power than a certain focused test in testing the hypothesis of no cluster. The potential of the semiparametric method in cluster detection is illustrated using a North Humberside childhood leukemia data set and a Maryland-DC-Virginia crime data set.
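
To make the multiple testing step concrete, the following is a minimal sketch of how q-values in the style of Storey's method might be computed from the p-values of many scanning windows. It is an illustration only, not the dissertation's implementation: the function name `storey_qvalues`, the fixed tuning parameter `lam = 0.5`, and the example p-values are all assumptions. Storey's method is often applied with a data-driven choice of the tuning parameter, which is not reproduced here.

```python
def storey_qvalues(pvalues, lam=0.5):
    """Return q-values for a list of p-values (illustrative sketch).

    Assumption: a single fixed tuning parameter `lam` is used to estimate
    pi0, the proportion of true null hypotheses. This is a simplification.
    """
    m = len(pvalues)
    if m == 0:
        return []

    # Estimate pi0 from the p-values above lam, which are assumed to come
    # mostly from true nulls; cap the estimate at 1.
    # Note: if no p-value exceeds lam, pi0 is 0 and every q-value is 0.
    pi0 = sum(1 for p in pvalues if p > lam) / (m * (1.0 - lam))
    pi0 = min(1.0, pi0)

    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: pvalues[i])

    # Walk from the largest p-value to the smallest, keeping a running
    # minimum so that q-values are monotone in the p-values.
    qvalues = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        q = pi0 * m * pvalues[i] / rank
        running_min = min(running_min, q)
        qvalues[i] = running_min
    return qvalues


# Hypothetical p-values, one per scanning window.
window_pvalues = [0.001, 0.01, 0.04, 0.6, 0.9]
window_qvalues = storey_qvalues(window_pvalues, lam=0.5)

# Flag windows whose q-value falls below a chosen FDR level (assumed 0.05).
flagged = [k for k, q in enumerate(window_qvalues) if q <= 0.05]
```

In this sketch a window is reported as a cluster candidate when its q-value is at or below the chosen FDR level, so the decision accounts for all windows scanned rather than for each test in isolation.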