Principal component tests: applied to temporal gene expression data

dc.contributor.author: Zhang, Wensheng
dc.contributor.author: Fang, Hong-Bin
dc.contributor.author: Song, Jiuzhou
dc.date.accessioned: 2021-11-30T15:52:02Z
dc.date.available: 2021-11-30T15:52:02Z
dc.date.issued: 2009-01-30
dc.description.abstract: Clustering analysis is a common statistical tool for knowledge discovery. It is mainly conducted when a project is still in the exploratory phase, without any a priori hypotheses. However, statistical significance testing between clusters can help researchers assess whether the classification produced by a clustering algorithm needs to be improved, even after the cluster number has been determined by a well-established criterion. This is important when we want to identify highly specific patterns through classification. We proposed a principal component (PC) test, which is an implementation of an exact F statistic for measures at multiple endpoints based on elliptical distribution theory, to assess the statistical significance between clusters. A challenge in the implementation is the choice of the number (q) of principal components to be considered, which can severely influence the statistical power of the method. We optimized this choice via validation with a permutation test based on the clustering to be evaluated. The method was applied to a public dataset to classify genes according to their temporal gene expression profiles. The results demonstrated that the PC test was useful for determining the optimal number of clusters. [en_US]
dc.description.uri: https://doi.org/10.1186/1471-2105-10-S1-S26
dc.identifier: https://doi.org/10.13016/lapf-hwop
dc.identifier.citation: Zhang, W., Fang, HB. & Song, J. Principal component tests: applied to temporal gene expression data. BMC Bioinformatics 10, S26 (2009). [en_US]
dc.identifier.uri: http://hdl.handle.net/1903/28176
dc.language.iso: en_US [en_US]
dc.publisher: Springer Nature [en_US]
dc.relation.isAvailableAt: College of Agriculture & Natural Resources [en_us]
dc.relation.isAvailableAt: Animal & Avian Sciences [en_us]
dc.relation.isAvailableAt: Digital Repository at the University of Maryland [en_us]
dc.relation.isAvailableAt: University of Maryland (College Park, MD) [en_us]
dc.subject: Cluster Algorithm [en_US]
dc.subject: Cluster Number [en_US]
dc.subject: Agglomerative Hierarchical Cluster [en_US]
dc.subject: Functional Enrichment Analysis [en_US]
dc.subject: Silhouette Width [en_US]
dc.title: Principal component tests: applied to temporal gene expression data [en_US]
dc.type: Article [en_US]

Files

Original bundle

Name: 1471-2105-10-S1-S26.pdf
Size: 361.99 KB
Format: Adobe Portable Document Format