Talk:Algorithms Seminar/Spring09

From ResearchWiki

(Difference between revisions)
Jump to: navigation, search
(23 Jan)
(23 Jan)
Line 4: Line 4:
: What does it mean for an algorithm to be <math>O(X)</math>-''competitive''?
: What does it mean for an algorithm to be <math>O(X)</math>-''competitive''?
: What are some good methods to help make Kmeans robust regarding outliers?
: What are some good methods to help make Kmeans robust regarding outliers?
 +
 +
'''Piyush'''
 +
: My feeling is that the k-means algorithm might fare poorly if there are substantial differences between the original cluster sizes (or in general between the cluster distributions). Say, for example, the data has one very large and two reasonably small clusters (and assume they all are spherical-shaped clusters). The way k-means partitioning works (i.e. minimizing sum of square distances from the cluster centers) might make the algorithm "split" the large cluster and assign some of its points to the smaller clusters. If this does happen, this would clearly not be the correct clustering. Are there ways to circumvent such a problem in k-means?

Revision as of 14:03, 23 January 2009

23 Jan

Nathan

What does it mean for an algorithm to be O(X)-competitive?
What are some good methods to help make Kmeans robust regarding outliers?

Piyush

My feeling is that the k-means algorithm might fare poorly if there are substantial differences between the original cluster sizes (or in general between the cluster distributions). Say, for example, the data has one very large and two reasonably small clusters (and assume they all are spherical-shaped clusters). The way k-means partitioning works (i.e. minimizing sum of square distances from the cluster centers) might make the algorithm "split" the large cluster and assign some of its points to the smaller clusters. If this does happen, this would clearly not be the correct clustering. Are there ways to circumvent such a problem in k-means?
Personal tools