Building a Coherent Data Pipeline in Microarray Data Analyses: Optimization of Signal/Noise Ratios Using an Interactive Visualization Tool and a Novel Noise Filtering Method (2003)

dc.contributor.author: Seo, Jinwook [en_US]
dc.contributor.author: Bakay, Marina [en_US]
dc.contributor.author: Chen, Yi-Wen [en_US]
dc.contributor.author: Hilmer, Sara [en_US]
dc.contributor.author: Shneiderman, Ben [en_US]
dc.contributor.author: Hoffman, Eric P. [en_US]
dc.contributor.department: ISR [en_US]
dc.date.accessioned: 2007-05-23T10:16:55Z
dc.date.available: 2007-05-23T10:16:55Z
dc.date.issued: 2005 [en_US]
dc.description.abstract: Motivation: Sources of uncontrolled noise strongly influence data analysis in microarray studies, yet signal/noise ratios are rarely considered in microarray data analyses. We hypothesized that different research projects would have different sources and levels of confounding noise, and built an interactive visual analysis tool to test and define parameters in Affymetrix analyses that optimize the ratio of signal (desired biological variable) versus noise (confounding uncontrolled variables). Results: Five probe set algorithms were studied with and without statistical weighting of probe sets using Microarray Suite (MAS) 5.0 probe set detection p-values. The signal/noise optimization method was tested in two large novel microarray data sets with different levels of confounding noise: a 105-sample U133A human muscle biopsy data set (11 groups; extensive noise) and a 40-sample U74A inbred mouse lung data set (8 groups; little noise). Success was measured by the F-measure of unsupervised clustering into the appropriate biological groups (signal). We show that both the probe set signal algorithm and probe set detection p-value weighting have a strong effect on signal/noise ratios, and that the different methods performed quite differently in the two data sets. Among the signal algorithms tested, the dChip difference model with p-value weighting was the most consistent at maximizing the effect of the target biological variables on data interpretation in the two data sets. Availability: The Hierarchical Clustering Explorer 2.0 is available online at http://www.cs.umd.edu/hcil/hce/, and the improved version of the Hierarchical Clustering Explorer 2.0 with p-value weighting and F-measure is available upon request from the first author. Murine arrays (40 samples) are publicly available at the PEPR resource, http://microarray.cnmcresearch.org/pgadatatable.asp (Chen et al., 2004). [en_US]
dc.format.extent: 1387055 bytes
dc.format.mimetype: application/pdf
dc.identifier.uri: http://hdl.handle.net/1903/6511
dc.language.iso: en_US [en_US]
dc.relation.ispartofseries: ISR; TR 2005-49 [en_US]
dc.title: Building a Coherent Data Pipeline in Microarray Data Analyses: Optimization of Signal/Noise Ratios Using an Interactive Visualization Tool and a Novel Noise Filtering Method (2003) [en_US]
dc.type: Technical Report [en_US]
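
The abstract scores each analysis setting by the F-measure of unsupervised clustering against the known biological groups. The abstract does not give the exact formula used in the report, so the sketch below assumes one common definition of the clustering F-measure: for each known group, take the best harmonic mean of precision and recall over all clusters, then average over groups weighted by group size. The function name clustering_f_measure and its arguments are hypothetical and are not taken from the Hierarchical Clustering Explorer.

```python
from collections import Counter


def clustering_f_measure(true_groups, cluster_ids):
    """Size-weighted best-match F-measure of a clustering (assumed definition).

    true_groups: known biological group label of each sample.
    cluster_ids: cluster assigned to each sample by the unsupervised method.
    """
    if len(true_groups) != len(cluster_ids):
        raise ValueError("both label lists must have the same length")
    n = len(true_groups)
    if n == 0:
        raise ValueError("at least one sample is required")

    group_sizes = Counter(true_groups)                # n_i for each known group
    cluster_sizes = Counter(cluster_ids)              # n_j for each cluster
    overlap = Counter(zip(true_groups, cluster_ids))  # n_ij for each pair

    total = 0.0
    for group, n_i in group_sizes.items():
        best = 0.0
        for cluster, n_j in cluster_sizes.items():
            n_ij = overlap[(group, cluster)]
            if n_ij == 0:
                continue
            precision = n_ij / n_j  # share of the cluster that is this group
            recall = n_ij / n_i     # share of the group found in this cluster
            best = max(best, 2 * precision * recall / (precision + recall))
        total += (n_i / n) * best   # weight each group by its size
    return total
```

Under this definition a clustering that reproduces the known groups exactly scores 1.0, and the score drops as samples from different groups are merged into one cluster or a group is split across clusters.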

Files

Original bundle

Name: TR_2005-49.pdf
Size: 1.32 MB
Format: Adobe Portable Document Format