K-Means Clustering

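K-Means Clustering: k-means clustering is one of the simplest and most commonly used clustering algorithms. It tries to find cluster centers that are representative of certain regions of the data.
  The algorithm alternates between two steps: assigning each data point to the closest cluster center, and then setting each cluster center to the mean of the data points that are assigned to it. The algorithm finishes when the assignment of points to clusters no longer changes. Neither step can increase the sum of squared distances between the points and their cluster centers, so the algorithm reduces the intra-cluster distance; for a fixed data set, this also increases the spread between the cluster centers. The result is a local optimum that depends on the initial centers, so it is not guaranteed to be the best possible clustering.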
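The quantity being reduced can be written down explicitly. The notation here is an assumption, not taken from the post: x_i are the data points, C_j is the set of points assigned to cluster j, and \mu_j is the center of that cluster.

```latex
J = \sum_{j=1}^{k} \sum_{x_i \in C_j} \lVert x_i - \mu_j \rVert^2,
\qquad
\mu_j = \frac{1}{|C_j|} \sum_{x_i \in C_j} x_i
```

The assignment step lowers J by choosing the best C_j for fixed centers, and the update step lowers J by choosing the best \mu_j (the mean) for fixed assignments.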

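Training method:
(i) Initialize the k centroids randomly, for example by picking k of the data points
(ii) Calculate the distance of each point from every centroid
(iii) Assign each data point to the closest centroid
(iv) Compute the sum of squared errors, i.e. the squared distance from each point to its assigned centroid, summed over all points
(v) Compute new centroids as the mean of the points assigned to each one; this is the choice that minimizes the squared error for the current assignment
(vi) Repeat steps (ii) to (v) until the assignments no longer change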
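The training loop can be sketched in plain Python as follows. This is a minimal illustration, not a tuned implementation: the function names, the `max_iter` and `seed` parameters, the choice of initial centroids (k randomly picked data points), and the sample points are all assumptions made for the example.

```python
import random

def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, max_iter=100, seed=0):
    """Basic k-means; returns (centroids, labels, sse)."""
    rng = random.Random(seed)
    # Initialization: use k randomly chosen data points as starting centroids.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment: each point goes to the centroid it is closest to.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Stop once the assignment no longer changes.
        if new_labels == labels:
            break
        labels = new_labels
        # Update: move each centroid to the mean of its assigned points.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    # Sum of squared errors for the final assignment.
    sse = sum(squared_distance(p, centroids[lab]) for p, lab in zip(points, labels))
    return centroids, labels, sse

# Made-up 2-D points forming two well-separated groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, labels, sse = kmeans(points, k=2)
print(labels, centroids, sse)
```

Because the starting centroids are random, the cluster numbering (which group is 0 and which is 1) can differ between seeds even when the grouping itself is the same.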

Accuracy:
(i) Compare the clusters against the ground truth labels, if available
(ii) Otherwise, measure the average distance between data points within the same cluster (a lower value means tighter clusters)
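Both checks can be sketched in a few lines. The choices here are assumptions for illustration: purity (the fraction of points that belong to the majority true class of their cluster) is only one of several ways to compare against ground truth, and the points, cluster labels, and true classes below are made up.

```python
from collections import Counter
from itertools import combinations
from math import dist  # available from Python 3.8

def purity(labels, truth):
    """Fraction of points that carry the majority true class of their cluster."""
    majority_total = 0
    for cluster in set(labels):
        classes = [t for lab, t in zip(labels, truth) if lab == cluster]
        majority_total += Counter(classes).most_common(1)[0][1]
    return majority_total / len(labels)

def mean_intra_cluster_distance(points, labels):
    """Average Euclidean distance over all pairs of points that share a cluster."""
    distances = [
        dist(p, q)
        for (p, lp), (q, lq) in combinations(zip(points, labels), 2)
        if lp == lq
    ]
    return sum(distances) / len(distances) if distances else 0.0

points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 4.0)]
labels = [0, 0, 1, 1]          # hypothetical cluster assignment
truth = ["a", "a", "b", "a"]   # hypothetical ground-truth classes

print(purity(labels, truth))                       # 0.75
print(mean_intra_cluster_distance(points, labels)) # 3.0
```

Purity does not care how the clusters are numbered, which matters because k-means cluster ids are arbitrary and cannot be compared to class names directly.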

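The choice of k depends on the distribution of the data points. The error keeps going down as k increases and reaches zero once every point has its own cluster, so a low error by itself does not mean that k is well chosen.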
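The effect of k on the error can be seen on a tiny example. To keep the result deterministic, this sketch does not run the iterative algorithm; it brute-forces the best possible split of a small 1-D data set for each k, relying on the fact that in one dimension the optimal clusters are contiguous ranges of the sorted values. The function names and the data are made up for illustration.

```python
from itertools import combinations

def sse(cluster):
    """Sum of squared distances from each value to the cluster mean."""
    mean = sum(cluster) / len(cluster)
    return sum((x - mean) ** 2 for x in cluster)

def best_sse(values, k):
    """Lowest total SSE over all splits of the sorted values into k contiguous groups."""
    values = sorted(values)
    n = len(values)
    best = float("inf")
    # Choose k - 1 cut positions between neighbouring values.
    for cuts in combinations(range(1, n), k - 1):
        bounds = (0, *cuts, n)
        total = sum(sse(values[a:b]) for a, b in zip(bounds, bounds[1:]))
        best = min(best, total)
    return best

# Made-up values with three visible groups.
data = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9, 9.0, 9.4]
errors = {k: best_sse(data, k) for k in range(1, len(data) + 1)}
for k, err in errors.items():
    print(k, round(err, 3))
```

The printed error shrinks with every additional cluster and hits zero at k = 8, where each value is its own cluster. What is informative is where the drop flattens out, not the raw value.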
