Tweet Notes (CSE 494/598 F11): 10/20/2011

Sunday, October 23, 2011

K-means represents each cluster by its mean (the centroid) and scores a clustering by the total dissimilarity between the points and the centroid of the cluster they are assigned to. One problem it faces is that this dissimilarity keeps falling as we increase K, so minimizing it alone would push K all the way up to the number of points/nodes in the entire problem, with every point in its own cluster. One solution is to work on a sample instead of the whole collection (for example, instead of all 3 trillion books): run K-means on the sample with different values of K and use the best one. Another solution is to penalize the score for each additional cluster used.
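The effect of K on the dissimilarity, and the penalty idea, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method from the lecture: the data is a made-up set of nine 1-D points, the dissimilarity is taken to be the sum of squared distances to the nearest centroid, the centroids are seeded at evenly spaced positions in the sorted data so the run is deterministic, and the penalty is assumed to be a fixed cost per cluster (`penalty * k`). The names `kmeans_1d` and `best_k` are hypothetical helpers.

```python
def kmeans_1d(points, k, iters=20):
    """Lloyd's algorithm on 1-D data. Returns (centroids, sse)."""
    pts = sorted(points)
    n = len(pts)
    # Assumption: deterministic seeding at evenly spaced ranks of the sorted data.
    centroids = [pts[int((i + 0.5) * n / k)] for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in pts:
            # Assign each point to its nearest centroid.
            j = min(range(k), key=lambda i: (p - centroids[i]) ** 2)
            clusters[j].append(p)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    # Dissimilarity: sum of squared distances to the nearest centroid.
    sse = sum(min((p - c) ** 2 for c in centroids) for p in pts)
    return centroids, sse


def best_k(points, ks, penalty):
    """Pick the K that minimizes sse + penalty * K (assumed penalty form)."""
    scores = {k: kmeans_1d(points, k)[1] + penalty * k for k in ks}
    return min(scores, key=scores.get), scores


# Made-up data: three well-separated groups of three points each.
points = [1.0, 1.2, 0.8, 5.0, 5.3, 4.7, 9.0, 9.1, 8.9]

for k in (1, 2, 3, len(points)):
    print(k, round(kmeans_1d(points, k)[1], 4))

choice, scores = best_k(points, range(1, len(points) + 1), penalty=1.0)
print("K chosen with penalty:", choice)
```

Without the penalty, the dissimilarity reaches zero once K equals the number of points, so the unpenalized minimum is the degenerate clustering. With a per-cluster cost, the extra clusters stop paying for themselves once the real groups are separated. On a very large collection, `points` would be a random sample of the data (for instance drawn with `random.sample`) rather than the full set.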

--
Ivan Zhou

Graduate Student

Graduate Professional Student Association (GPSA) Assembly Member

School of Computing, Informatics and Decision Systems Engineering

Ira A. Fulton School of Engineering

Arizona State University