Mathematics
Permanent URI for this community: http://hdl.handle.net/1903/2261
2 results
Search Results
Analysis and correction of compositional bias in sparse sequencing count data
(Springer Nature, 2018-11-06) Kumar, M. Senthil; Slud, Eric V.; Okrah, Kwame; Hicks, Stephanie C.; Hannenhalli, Sridhar; Bravo, Héctor Corrada
Count data derived from high-throughput deoxyribonucleic acid (DNA) sequencing is frequently used in quantitative molecular assays. Due to properties inherent to the sequencing process, unnormalized count data is compositional, measuring relative and not absolute abundances of the assayed features. This compositional bias confounds inference of absolute abundances. Commonly used count data normalization approaches like library size scaling/rarefaction/subsampling cannot correct for compositional or any other relevant technical bias that is uncorrelated with library size. We demonstrate that existing techniques for estimating compositional bias fail with sparse metagenomic 16S count data and propose an empirical Bayes normalization approach to overcome this problem. In addition, we clarify the assumptions underlying frequently used scaling normalization methods in light of compositional bias, including scaling methods that were not designed directly to address it.

Multiple Testing Procedures for the Analysis of Microarray Data
(2013) Nuriely, Ayala; Smith, Paul J; Mathematics; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
We reviewed literature about various multiple testing techniques, especially addressing microarray analyses and small sample sizes, and reanalyzed data from Yuen et al. (Physiological Genomics, 2006), which compared the effects of HgCl2 and ischemia/reperfusion injuries on rat kidney tissues. Our analysis uses only 22 rats, with small numbers of rats in each treatment group, and 9,501 genes under study. We used empirical Bayes (EB) and permutation testing (implemented in Bioconductor) in an effort to identify differentially expressed genes.
EB identified a large number of genes as differentially expressed, including both previously identified and newly identified genes. The newly identified genes appear to have biological functions similar to those previously identified. We also recognized power differences between EB and permutation tests, possibly due to nonnormality of the data but also because permutation tests do not make use of all available information in the data.