Filed under: Papers
[author]Jeff M. Phillips, Parasaran Raman and Suresh Venkatasubramanian[/author]
Proc. 2011 SIAM Conference on Data Mining, Apr 2011.
This paper proposes a new distance metric between clusterings that incorporates information about the spatial distribution of points and clusters. Our approach yields not only a distance function, but a Hilbert space-based representation of clusters as a combination of the representations of their constituent points. We use this representation and the underlying metric to design a spatially-aware consensus clustering procedure, the first of its kind. This consensus procedure also introduces a novel reduction to Euclidean clustering, and is very simple to implement. All of our results apply to comparing both soft and hard clusterings. We accompany these algorithms with a detailed experimental evaluation that demonstrates the efficiency and quality of our techniques.
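To make the idea of a Hilbert space representation concrete, here is a minimal sketch of one standard construction that fits the description above: each point is mapped into a reproducing kernel Hilbert space, a cluster is represented by the average of its points' images, and the distance between two clusters is the Hilbert-space norm of the difference. The Gaussian kernel, the bandwidth `sigma`, the uniform weighting, and the function names are all assumptions made for illustration; the abstract does not specify them, and this should not be read as the paper's exact formulation.

```python
import math

def gaussian_kernel(x, y, sigma=1.0):
    # Assumed kernel choice: Gaussian similarity between two points.
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def kernel_similarity(P, Q, sigma=1.0):
    # Inner product of the two cluster representations, where each
    # cluster is the uniform average of its points' feature maps.
    total = sum(gaussian_kernel(p, q, sigma) for p in P for q in Q)
    return total / (len(P) * len(Q))

def cluster_distance(P, Q, sigma=1.0):
    # Norm of the difference of the two representations:
    # ||mu_P - mu_Q||^2 = <mu_P, mu_P> + <mu_Q, mu_Q> - 2 <mu_P, mu_Q>
    sq = (kernel_similarity(P, P, sigma)
          + kernel_similarity(Q, Q, sigma)
          - 2.0 * kernel_similarity(P, Q, sigma))
    return math.sqrt(max(sq, 0.0))  # clamp tiny negative rounding error

# Hypothetical toy clusters in the plane.
A = [(0.0, 0.0), (0.0, 1.0)]
B = [(0.1, 0.0), (0.1, 1.0)]   # a slightly shifted copy of A
C = [(5.0, 5.0), (5.0, 6.0)]   # a far-away cluster

near = cluster_distance(A, B)
far = cluster_distance(A, C)
```

In this sketch, `near` comes out much smaller than `far`, which is the spatial sensitivity that purely membership-based comparisons lack. Soft clusterings could be accommodated under the same assumptions by replacing the uniform average with membership-weighted sums.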
Tags: CCF 0953066