Coral: an integrated suite of visualizations for comparing clusterings

dc.contributor.author: Filippova, Darya
dc.contributor.author: Gadani, Aashish
dc.contributor.author: Kingsford, Carl
dc.date.accessioned: 2021-09-28T19:31:36Z
dc.date.available: 2021-09-28T19:31:36Z
dc.date.issued: 2012-10-29
dc.description.abstract: Clustering has become a standard analysis for many types of biological data (e.g., interaction networks, gene expression, metagenomic abundance). In practice, it is possible to obtain a large number of contradictory clusterings by varying which clustering algorithm is used, which data attributes are considered, how algorithmic parameters are set, and which near-optimal clusterings are chosen. It is a difficult task to sift through such a large collection of varied clusterings to determine which clustering features are affected by parameter settings or are artifacts of particular algorithms and which represent meaningful patterns. Knowing which items are often clustered together helps to improve our understanding of the underlying data and to increase our confidence about generated modules. We present Coral, an application for interactive exploration of large ensembles of clusterings. Coral makes all-to-all clustering comparison easy, supports exploration of individual clusterings, allows tracking modules across clusterings, and supports identification of core and peripheral items in modules. We discuss how each visual component in Coral tackles a specific question related to clustering comparison and provide examples of their use. We also show how Coral could be used to visually and quantitatively compare clusterings with a ground truth clustering. As a case study, we compare clusterings of a recently published protein interaction network of Arabidopsis thaliana. We use several popular algorithms to generate the network’s clusterings. We find that the clusterings vary significantly and that few proteins are consistently co-clustered in all clusterings. This is evidence that several clusterings should typically be considered when evaluating modules of genes, proteins, or sequences, and Coral can be used to perform a comprehensive analysis of these clustering ensembles. [en_US]
dc.description.uri: https://doi.org/10.1186/1471-2105-13-276
dc.identifier: https://doi.org/10.13016/a6ws-lpxp
dc.identifier.citation: Filippova, D., Gadani, A. & Kingsford, C. Coral: an integrated suite of visualizations for comparing clusterings. BMC Bioinformatics 13, 276 (2012). [en_US]
dc.identifier.uri: http://hdl.handle.net/1903/28040
dc.language.iso: en_US [en_US]
dc.publisher: Springer Nature [en_US]
dc.relation.isAvailableAt: College of Computer, Mathematical & Natural Sciences [en_us]
dc.relation.isAvailableAt: Computer Science [en_us]
dc.relation.isAvailableAt: Digital Repository at the University of Maryland [en_us]
dc.relation.isAvailableAt: University of Maryland (College Park, MD) [en_us]
dc.subject: Cluster Algorithm [en_US]
dc.subject: Data Item [en_US]
dc.subject: Jaccard Similarity [en_US]
dc.subject: Module Pair [en_US]
dc.subject: Item Pair [en_US]
dc.title: Coral: an integrated suite of visualizations for comparing clusterings [en_US]
dc.type: Article [en_US]

Files

Original bundle

Name: 1471-2105-13-276.pdf
Size: 2.39 MB
Format: Adobe Portable Document Format