Hierarchical clustering is a method of cluster analysis that seeks to build a hierarchy of
clusters. Strategies for hierarchical clustering generally fall into two types:
Agglomerative:
This is a bottom-up approach: each observation starts in its own cluster, and
pairs of clusters are merged as one moves up the hierarchy.
Divisive:
This is a top-down approach: all observations start in one cluster, and
splits are performed recursively as one moves down the hierarchy.
Algorithm for agglomerative clustering:
(i) Create n clusters, one for each data point.
(ii) Compute the proximity matrix.
(iii) Merge the two closest clusters and update the proximity matrix.
(iv) Repeat step (iii) until only a single cluster is left.
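The agglomerative steps above can be sketched in a few lines of Python. This is only a minimal illustration, not a reference implementation: the function name agglomerative, the use of Euclidean distance (math.dist, Python 3.8+), and the choice of single linkage for the update step are all assumptions made for the example.

```python
import math

def agglomerative(points):
    """Naive single-linkage agglomerative clustering (illustrative sketch).

    Returns the merge history as (cluster_a, cluster_b, distance) tuples,
    where each cluster is a sorted list of point indices.
    """
    # (i) One cluster per data point, keyed by an integer id.
    clusters = {i: [i] for i in range(len(points))}
    # (ii) Proximity matrix, stored as a dict keyed by (smaller id, larger id).
    prox = {
        (i, j): math.dist(points[i], points[j])
        for i in clusters for j in clusters if i < j
    }
    merges = []
    # (iv) Keep merging until a single cluster is left.
    while len(clusters) > 1:
        # (iii) Find the two closest clusters and merge b into a.
        (a, b), d = min(prox.items(), key=lambda kv: kv[1])
        merges.append((sorted(clusters[a]), sorted(clusters[b]), d))
        clusters[a] = clusters[a] + clusters.pop(b)
        # Update the proximity matrix. Assumption: single linkage, so the
        # distance to the merged cluster is the smaller of the two old ones.
        for c in clusters:
            if c == a:
                continue
            key_a = (min(a, c), max(a, c))
            key_b = (min(b, c), max(b, c))
            prox[key_a] = min(prox[key_a], prox.pop(key_b))
        del prox[(a, b)]
    return merges

# Made-up sample points, purely for illustration.
for left, right, dist in agglomerative([(0, 0), (0, 1), (4, 0)]):
    print(left, "+", right, "at distance", round(dist, 3))
```

The merge history is what a dendrogram would be drawn from. Each pass scans the whole proximity matrix for its minimum, so this sketch is only practical for small data sets.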
Measuring distance between two clusters:
The choice of distance measure depends on the data type, the dimensionality of the data, and knowledge of the domain.
Single linkage: The distance between two clusters is the minimum distance between any pair of points, one taken from each cluster.
Complete linkage: The distance between two clusters is the maximum distance between any pair of points, one taken from each cluster.
Average linkage: The distance between two clusters is the mean distance over all pairs of points, one taken from each cluster.
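To make the three linkage definitions concrete, here is a small Python sketch that computes each of them for two clusters. The helper names (pairwise, single_linkage, complete_linkage, average_linkage), the Euclidean metric, and the sample points are assumptions chosen for illustration only.

```python
import math
from itertools import product

def pairwise(cluster_a, cluster_b):
    """All Euclidean distances between one point of each cluster."""
    return [math.dist(p, q) for p, q in product(cluster_a, cluster_b)]

def single_linkage(cluster_a, cluster_b):
    # Minimum distance over all cross-cluster pairs.
    return min(pairwise(cluster_a, cluster_b))

def complete_linkage(cluster_a, cluster_b):
    # Maximum distance over all cross-cluster pairs.
    return max(pairwise(cluster_a, cluster_b))

def average_linkage(cluster_a, cluster_b):
    # Mean distance over all cross-cluster pairs.
    dists = pairwise(cluster_a, cluster_b)
    return sum(dists) / len(dists)

# Made-up clusters, purely for illustration.
a = [(0, 0), (0, 1)]
b = [(3, 0), (4, 0)]
print("single:  ", single_linkage(a, b))
print("complete:", complete_linkage(a, b))
print("average: ", average_linkage(a, b))
```

For any pair of clusters, the single-linkage value is never larger than the average-linkage value, which in turn is never larger than the complete-linkage value.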