Classification is a supervised learning approach. In classification, data points are assigned to classes
based on their features. Common classification algorithms include
Decision Tree, Naïve Bayes, Linear Discriminant Analysis, K-Nearest Neighbour (K-NN),
and Support Vector Machines (SVM).
K-NN:
It is one of the simplest and most widely used methods of classification. We calculate the Euclidean distance
between the given point and every labelled data point, find the K nearest points, and assign the given point to
the class that is most common among them. K determines how many nearest data points
are considered.
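The procedure above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `knn_classify`, the toy points, and the labels "a" and "b" are all made up for the example, and a tied vote is reported as `None` instead of being broken.

```python
import math
from collections import Counter

def knn_classify(query, points, labels, k):
    """Return the majority label among the k points nearest to `query`.

    Returns None when the two highest vote counts are equal.
    Points exactly equidistant at the cut-off are ordered by label here,
    which is a simplification.
    """
    # Pair each labelled point's Euclidean distance to the query with its label.
    ranked = sorted((math.dist(query, p), label) for p, label in zip(points, labels))
    # Count the labels of the k nearest points.
    votes = Counter(label for _, label in ranked[:k]).most_common()
    if len(votes) > 1 and votes[0][1] == votes[1][1]:
        return None  # tie: no single majority class
    return votes[0][0]

# Hypothetical toy data: two points near the origin ("a"), two far away ("b").
points = [(1, 1), (2, 1), (8, 8), (9, 8)]
labels = ["a", "a", "b", "b"]

print(knn_classify((2, 2), points, labels, k=3))  # a
print(knn_classify((2, 2), points, labels, k=4))  # None (2 votes each)
```

An odd K is a common way to reduce ties when there are only two classes.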
Let’s take an example. We are given data points of class 1 and class 2, and we are asked to
find the class of the points A (6,11) and B (14,3).
Distances are Euclidean, truncated to two decimal places.

| Class 1 point | Class 2 point | A (6,11) to Class 1 point | A (6,11) to Class 2 point | B (14,3) to Class 1 point | B (14,3) to Class 2 point |
|---|---|---|---|---|---|
| (11,11) | (7,11) | 5 | 1 | 8.54 | 10.63 |
| (13,11) | (15,9) | 7 | 9.21 | 8.06 | 6.08 |
| (8,10) | (15,7) | 2.23 | 9.84 | 9.21 | 4.12 |
| (9,9) | (13,5) | 3.60 | 9.21 | 7.81 | 2.23 |
| (7,7) | (14,4) | 4.12 | 10.63 | 8.06 | 1 |
| (7,5) | (9,3) | 6.08 | 8.54 | 7.28 | 5 |
| (15,3) | (11,3) | 12.04 | 9.43 | 1 | 3 |
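The distances can be recomputed with a short script. This is only a sketch: the variable names and the `distances` helper are made up for illustration, and the values are rounded to two decimals, so a few may differ in the last digit from hand-calculated figures.

```python
import math

# Labelled points from the example.
class_1 = [(11, 11), (13, 11), (8, 10), (9, 9), (7, 7), (7, 5), (15, 3)]
class_2 = [(7, 11), (15, 9), (15, 7), (13, 5), (14, 4), (9, 3), (11, 3)]
A, B = (6, 11), (14, 3)

def distances(query, points):
    """Euclidean distance from `query` to each point, rounded to 2 decimals."""
    return [round(math.dist(query, p), 2) for p in points]

for name, query in (("A", A), ("B", B)):
    print(name, "to class 1:", distances(query, class_1))
    print(name, "to class 2:", distances(query, class_2))
```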
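If we assume K = 1 (the single nearest data point), A is classified into class 2, since its shortest distance is 1, to the class 2 point (7,11). B cannot be decided: it is at distance 1 from both (15,3) in class 1 and (14,4) in class 2, so the two classes tie.
For K = 2 (the two nearest data points), a majority vote ties for both A and B, because each has one nearest neighbour from each class. For K = 3, A is classified into class 1, and B is classified into class 2.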
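The votes for A and B can be checked with a small script. The sketch below rests on a few assumptions: `knn_vote` is a hypothetical helper, it uses a plain majority vote, and it counts every point lying exactly at the cut-off distance, so equidistant neighbours show up as a tie (`None`) instead of being dropped by sort order.

```python
import math
from collections import Counter

# Labelled points from the example.
class_1 = [(11, 11), (13, 11), (8, 10), (9, 9), (7, 7), (7, 5), (15, 3)]
class_2 = [(7, 11), (15, 9), (15, 7), (13, 5), (14, 4), (9, 3), (11, 3)]
labelled = [(p, 1) for p in class_1] + [(p, 2) for p in class_2]

def knn_vote(query, k):
    """Return the winning class among the k nearest points, or None on a tie."""
    ranked = sorted((math.dist(query, p), label) for p, label in labelled)
    cutoff = ranked[k - 1][0]  # distance of the k-th nearest point
    # Keep every point at or inside the cut-off distance.
    neighbours = [label for d, label in ranked if d <= cutoff]
    counts = Counter(neighbours).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None  # tie between the two leading classes
    return counts[0][0]

A, B = (6, 11), (14, 3)
for k in (1, 2, 3):
    print(f"K={k}: A -> {knn_vote(A, k)}, B -> {knn_vote(B, k)}")
# K=1: A -> 2, B -> None
# K=2: A -> None, B -> None
# K=3: A -> 1, B -> 2
```

With only two classes, an even K can leave the vote split, which is why both points tie at K = 2 here.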