Hierarchical clustering


Hierarchical clustering is a method of cluster analysis that seeks to build a hierarchy of clusters. Strategies for hierarchical clustering generally fall into two types:
Agglomerative: This is a bottom-up approach. Each observation starts in its own cluster, and pairs of clusters are merged as one moves up the hierarchy.
Divisive: This is a top-down approach. All observations start in one cluster, and splits are performed recursively as one moves down the hierarchy.

Algorithm for agglomerative clustering:
(i) Create n clusters, one for each data point.
(ii) Compute the proximity matrix.
(iii) Merge the two closest clusters and update the proximity matrix.
(iv) Repeat step (iii) until only a single cluster is left.
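
The steps above can be sketched in a few lines of Python. This is only a minimal illustration, not a reference implementation: it assumes Euclidean distance between points and single linkage (minimum distance) when updating the proximity matrix, and the function name agglomerative_single_linkage and the sample points are made up for this example.

```python
import math
from itertools import combinations


def agglomerative_single_linkage(points):
    """Return the merge history as (members_a, members_b, distance) tuples."""
    # (i) Create n clusters, one for each data point.
    #     Keys are cluster ids, values are lists of point indices.
    clusters = {i: [i] for i in range(len(points))}

    # (ii) Compute the proximity matrix, stored as a dict keyed by id pairs.
    #      Assumption: Euclidean distance between points.
    prox = {
        frozenset((i, j)): math.dist(points[i], points[j])
        for i, j in combinations(clusters, 2)
    }

    merges = []
    next_id = len(points)

    # (iv) Repeat until only a single cluster is left.
    while len(clusters) > 1:
        # (iii) Merge the two closest clusters...
        pair = min(prox, key=prox.get)
        a, b = sorted(pair)
        merges.append((sorted(clusters[a]), sorted(clusters[b]), prox[pair]))

        # ...and update the proximity matrix. With single linkage, the
        # distance from the merged cluster to any other cluster k is the
        # smaller of the two old distances.
        for k in [c for c in clusters if c not in (a, b)]:
            d_ak = prox.pop(frozenset((a, k)))
            d_bk = prox.pop(frozenset((b, k)))
            prox[frozenset((next_id, k))] = min(d_ak, d_bk)
        del prox[pair]

        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        next_id += 1

    return merges


# Hypothetical sample data: two tight pairs and one far-away point.
points = [(0, 0), (0, 1), (5, 5), (5, 7), (20, 20)]
merges = agglomerative_single_linkage(points)
for members_a, members_b, distance in merges:
    print(members_a, "+", members_b, "at distance", round(distance, 3))
```

With n points the loop always performs n - 1 merges. The merge history is the information a dendrogram is drawn from.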

Measuring distance between two clusters: The choice of distance measure depends on the data type, the dimensionality of the data, and knowledge of the domain.
Single linkage: The distance between two clusters is the minimum distance between a point in one cluster and a point in the other.
Complete linkage: The distance between two clusters is the maximum distance between a point in one cluster and a point in the other.
Average linkage: The distance between two clusters is the mean distance over all pairs of points, one taken from each cluster.
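
To make the three linkage definitions concrete, here is a small Python sketch. It assumes Euclidean distance between individual points, which is only one possible choice. The helper names and the two sample clusters are hypothetical and exist only for this illustration.

```python
import math
from itertools import product


def pairwise_distances(cluster_a, cluster_b):
    # All distances between a point in cluster_a and a point in cluster_b.
    # Assumption: Euclidean distance.
    return [math.dist(p, q) for p, q in product(cluster_a, cluster_b)]


def single_linkage(cluster_a, cluster_b):
    # Minimum distance over all cross-cluster pairs.
    return min(pairwise_distances(cluster_a, cluster_b))


def complete_linkage(cluster_a, cluster_b):
    # Maximum distance over all cross-cluster pairs.
    return max(pairwise_distances(cluster_a, cluster_b))


def average_linkage(cluster_a, cluster_b):
    # Mean distance over all cross-cluster pairs.
    distances = pairwise_distances(cluster_a, cluster_b)
    return sum(distances) / len(distances)


# Hypothetical sample clusters.
cluster_a = [(0.0, 0.0), (0.0, 1.0)]
cluster_b = [(3.0, 0.0), (4.0, 0.0)]

single = single_linkage(cluster_a, cluster_b)
complete = complete_linkage(cluster_a, cluster_b)
average = average_linkage(cluster_a, cluster_b)
print(single, complete, average)
```

For any pair of clusters, the single linkage value is never larger than the average, and the average is never larger than the complete linkage value.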
