Clustering the Google Distance Using Graph Cuts
Speaker: Thomas Zeugmann, Hokkaido University, Japan
Clustering algorithms working with a matrix of pairwise similarities
(kernel matrix) for the data are widely known and used, a particularly
popular class being spectral clustering algorithms. In contrast,
algorithms working with the pairwise distance matrix have been studied
rarely for clustering. This is surprising, as in many applications,
distances are directly given, and computing similarities involves
another step that is error-prone, since the kernel has to be chosen
appropriately, albeit computationally cheap.
In the talk we propose a clustering algorithm based on the SDP
relaxation of the max-k-cut of the graph of pairwise distances, based
on the work of Frieze and Jerrum. Experimental results are presented
indicating that the more direct approach yields better results.
Finally, we shortly propose a simple heuristic for dealing with
missing data, i.e., the case where some of the pairwise distances or
similarities are not known.