## Information Theoretic Measures for Clusterings Comparison: Is a Correction for Chance Necessary?

Citations: 101 (5 self)

### Citations

12423 |
Elements of Information Theory
- Cover, Thomas
- 1991
Citation Context: ...sure, information theoretic based measures have received increasing attention for their strong theoretical background. Let us first review some of the very fundamental concepts of information theory (Cover & Thomas, 1991) and then see how those concepts might be used toward assessing clusterings agreement. Definition 2.1 The information entropy of a discrete random variable X, that can take on possible values in its ...
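The entropy definition excerpted above can be illustrated with a short sketch (our own minimal Python, not the cited book's code; the helper name `entropy` is an assumption):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy H(X) = -sum_x p(x) log p(x), estimated from a sample.

    Probabilities are the empirical frequencies of each distinct value;
    the logarithm is natural, so the result is in nats.
    """
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())

# A fair two-valued variable has entropy log 2 nats.
h = entropy([0, 1, 0, 1])  # → log(2) ≈ 0.693
```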

987 |
Comparing partitions
- Hubert, Arabie
- 1985
Citation Context: ...area which has also received much attention. Various clustering comparison measures have been proposed: besides the class of pair-counting based measures including the well-known Adjusted Rand Index (Hubert & Arabie, 1985), and set-matching based measures, such as the H criterion (Meilǎ, 2005), information theoretic based measures, such as the Mutual Information (Strehl & Ghosh, 2002) and the Variation of Information ...

813 |
Objective criteria for the evaluation of clustering methods
- Rand
- 1971
Citation Context: ...1). Intuitively, N11 and N00 can be used as indicators of agreement between U and V, while N01 and N10 can be used as disagreement indicators. A well known index of this class is the Rand Index (Rand, 1971), defined straightforwardly as:

RI(U, V) = (N00 + N11) / C(N, 2)    (1)

The Rand Index lies between 0 and 1. It takes the value of 1 when the two clusterings are identical, and 0 when no pair of points a...
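The Rand Index in this excerpt can be computed directly from the pair counts it defines (a minimal Python sketch; the function name `rand_index` is our own):

```python
from itertools import combinations

def rand_index(u, v):
    """Rand Index: fraction of point pairs on which two clusterings agree.

    N11 counts pairs placed together in both clusterings, N00 counts pairs
    placed apart in both; RI = (N00 + N11) / C(N, 2).
    """
    n11 = n00 = 0
    pairs = list(combinations(range(len(u)), 2))
    for i, j in pairs:
        same_u, same_v = u[i] == u[j], v[i] == v[j]
        if same_u and same_v:
            n11 += 1
        elif not same_u and not same_v:
            n00 += 1
    return (n11 + n00) / len(pairs)

# Identical partitions (up to label permutation) give RI = 1.
ri = rand_index([0, 0, 1, 1], [1, 1, 0, 0])  # → 1.0
```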

603 | Cluster ensembles – a knowledge reuse framework for combining multiple partitions.
- Strehl, Ghosh
- 2002
Citation Context: ...e well-known Adjusted Rand Index (Hubert & Arabie, 1985), and set-matching based measures, such as the H criterion (Meilǎ, 2005), information theoretic based measures, such as the Mutual Information (Strehl & Ghosh, 2002) and the Variation of Information (Meilǎ, 2005), form another fundamental class of clustering comparison measures. In this paper, we aim to improve the usability of the class of information theoretic...
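The Mutual Information between two clusterings, as used in this line of work, can be estimated from the empirical joint distribution of cluster labels (a hedged sketch; the function name and the natural-log convention are our choices, not Strehl & Ghosh's):

```python
import math
from collections import Counter

def mutual_information(u, v):
    """Mutual information I(U;V) between two clusterings of the same points.

    Uses the empirical joint distribution of label pairs:
    I(U;V) = sum_{a,b} p(a,b) * log( p(a,b) / (p(a) p(b)) ), in nats.
    """
    n = len(u)
    pu, pv = Counter(u), Counter(v)
    joint = Counter(zip(u, v))
    mi = 0.0
    for (a, b), c in joint.items():
        p_ab = c / n
        mi += p_ab * math.log(p_ab * n * n / (pu[a] * pv[b]))
    return mi

# For identical clusterings, I(U;U) = H(U) = log 2 here.
mi = mutual_information([0, 0, 1, 1], [0, 0, 1, 1])
```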

255 | Consensus clustering: a resampling-based method for class discovery and visualization of gene-expression microarray data.
- Monti, Tamayo, et al.
- 2003
Citation Context: ...of clusters via Consensus Clustering: We start by first providing some background on Consensus Clustering. In an era where a huge number of clustering algorithms exist, the Consensus Clustering idea (Monti et al., 2003; Strehl & Ghosh, 2002; Yu et al., 2007) has recently received increasing interest. Consensus Clustering is not just another clustering algorithm: it rather provides a framework for unifying the knowl...

164 | Clustering on the unit hypersphere using von mises-fisher distributions.
- Banerjee, Dhillon, et al.
- 2005
Citation Context: ...h are information theoretic based, have also been employed more recently in the clustering literature (Banerjee et al., 2005; Strehl & Ghosh, 2002; Meilǎ, 2005). Although there is currently no consensus on which is the best measure, information theoretic based measures have received increasing attention for their strong th...

82 |
The Chi-squared Distribution
- Lancaster
- 1969

21 |
On similarity indices and correction for chance agreement
- Albatineh, Niewiadomska-Bugaj, et al.
- 2006
Citation Context: ...ts expected value (under the generalized hypergeometric distribution assumption for randomness). Besides the Adjusted Rand Index, there are many other, possibly less popular, measures in this class. Albatineh et al. (2006) discussed correction for chance for a comprehensive list of 28 different indices in this class, a number which is large enough to make the task of choosing an appropriate measure difficult and confu...
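The correction-for-chance scheme analyzed in this line of work follows the familiar form adjusted = (index - E[index]) / (max - E[index]), so that a random clustering scores near 0 and perfect agreement scores 1. A one-line sketch (the function and argument names are ours):

```python
def corrected_for_chance(index, expected, maximum):
    """Generic correction for chance applied to a similarity index:

        adjusted = (index - E[index]) / (max(index) - E[index])

    where E[index] is the index's expected value under a null model of
    random clusterings (e.g. the generalized hypergeometric model).
    """
    return (index - expected) / (maximum - expected)

# A raw index exactly at chance level adjusts to 0; a perfect score to 1.
a = corrected_for_chance(0.75, 0.5, 1.0)  # → 0.5
```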

7 |
A novel approach for automatic number of clusters detection in microarray data based on consensus clustering
- Vinh, Epps
- 2009
Citation Context: ...f the obtained cluster structure. To quantify this diversity we have recently developed a novel index (Vinh & Epps, 2009), namely the Consensus Index (CI), which is built upon a suitable clustering similarity measure. Given a value of K, suppose we have generated a set of B clustering solutions UK = {U1, U2, . . . , UB...
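The truncated context suggests the Consensus Index aggregates pairwise similarity over the B generated solutions. One plausible sketch, averaging over all pairs (our reading, since the excerpt is cut off; the averaging choice is an assumption, and `similarity` stands for any clustering agreement measure):

```python
from itertools import combinations

def consensus_index(clusterings, similarity):
    """Average pairwise similarity over all B*(B-1)/2 pairs of clustering
    solutions generated for one value of K; higher values suggest a more
    stable (less diverse) set of solutions.
    """
    pairs = list(combinations(clusterings, 2))
    return sum(similarity(u, v) for u, v in pairs) / len(pairs)

# Toy usage with exact-match agreement as the similarity measure.
ci = consensus_index([[0, 1], [0, 1], [1, 0]], lambda u, v: float(u == v))
```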

5 |
On similarity coefficients for 2x2 tables and correction for chance
- Warrens
- 2008
(Show Context)
Citation Context ...ensive list of 28 different indices in this class, a number which is large enough to make the task of choosing an appropriate measure difficult and confusing. Their work, and subsequent extension of (=-=Warrens, 2008-=-), however, showed that after correction for chance, many of these measures become equivalent, facilitating the task of choosing a measure. 2.2. Information Theoretic based Indices Another class of cl... |

1 |
Comparing clusterings: an axiomatic view. ICML ’05
- Meilǎ
- 2005
Citation Context: ...s have been proposed: besides the class of pair-counting based measures including the well-known Adjusted Rand Index (Hubert & Arabie, 1985), and set-matching based measures, such as the H criterion (Meilǎ, 2005), information theoretic based measures, such as the Mutual Information (Strehl & Ghosh, 2002) and the Variation of Information (Meilǎ, 2005), form another fundamental class of clustering comparison m...