Bootstrapping for Significance of Compact Clusters in Multi-dimensional Datasets

Volodymyr Melnykov, Assistant Professor, Dept. of Statistics, North Dakota Univ

Wednesday, February 1, 2012 12:00 PM - 1:00 PM

Quantifying the significance of identified clusters has been a long-standing goal in a wide range of applications. In this talk, I present a bootstrap approach for this purpose that compares the clustering solutions obtained by fitting K and K* (K > K*) components. The significance of the more complex model relative to the simpler one is assessed using a nonparametric bootstrap test designed to account for the compactness of the clustering model. The resulting p-values are used to construct a quantitation map, which graphically displays the significance between pairs of models. The procedure is illustrated on simulated and classification datasets and is also applied to the problem of detecting the number of clusters in a dataset, with excellent results. This is joint work with Ranjan Maitra of Iowa State University and Soumendra N. Lahiri of Texas A&M University.
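
The general idea of testing a K-cluster solution against a simpler K*-cluster solution can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure presented in the talk: it assumes k-means clustering, uses the relative reduction in within-cluster sum of squares as the test statistic, and builds bootstrap datasets under the simpler model by resampling residual distances and giving them uniformly random directions (which presumes compact, spherically symmetric clusters). All function names here (kmeans, wss_reduction, bootstrap_pvalue) are hypothetical.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, rng, n_init=5, n_iter=50):
    """Lloyd's algorithm with random restarts; returns (centers, labels, wss)."""
    best = None
    for _ in range(n_init):
        centers = [list(c) for c in rng.sample(points, k)]
        labels = []
        for _ in range(n_iter):
            labels = [min(range(k), key=lambda j: sq_dist(pt, centers[j]))
                      for pt in points]
            updated = []
            for j in range(k):
                members = [pt for pt, lab in zip(points, labels) if lab == j]
                # Keep the old center if a cluster empties out.
                updated.append([sum(col) / len(members) for col in zip(*members)]
                               if members else centers[j])
            if updated == centers:
                break
            centers = updated
        wss = sum(sq_dist(pt, centers[lab]) for pt, lab in zip(points, labels))
        if best is None or wss < best[2]:
            best = (centers, labels, wss)
    return best


def wss_reduction(points, k_small, k_large, rng):
    """Assumed test statistic: relative drop in within-cluster sum of squares."""
    w_small = kmeans(points, k_small, rng)[2]
    w_large = kmeans(points, k_large, rng)[2]
    return 1.0 - w_large / w_small


def random_direction(dim, rng):
    """Uniformly distributed unit vector in `dim` dimensions."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def bootstrap_pvalue(points, k_small, k_large, n_boot=99, seed=0):
    """Bootstrap p-value for k_large clusters against the null of k_small."""
    rng = random.Random(seed)
    dim = len(points[0])
    centers, labels, _ = kmeans(points, k_small, rng)
    observed = wss_reduction(points, k_small, k_large, rng)
    # Residual distances under the simpler (null) model.
    radii = [math.sqrt(sq_dist(pt, centers[lab]))
             for pt, lab in zip(points, labels)]
    exceed = 0
    for _ in range(n_boot):
        sample = []
        for lab in labels:
            # Assumption: spherical clusters, so only the residual norm is
            # resampled and the direction is drawn uniformly at random.
            r = rng.choice(radii)
            u = random_direction(dim, rng)
            sample.append([c + r * x for c, x in zip(centers[lab], u)])
        if wss_reduction(sample, k_small, k_large, rng) >= observed:
            exceed += 1
    return (exceed + 1) / (n_boot + 1)
```

Calling bootstrap_pvalue(points, k_small, k_large) for every pair k_small < k_large yields a triangular table of p-values, which is the kind of pairwise comparison a quantitation map would display. Resampling only the residual norms is a deliberate choice in this sketch: resampling the raw residuals from a one-cluster fit would simply reproduce any multi-cluster structure in the data and leave the test with no power.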


