The support vector machine (SVM) is a particularly powerful and flexible class of supervised learning algorithms for both classification and regression.
The main aim of an SVM is to find a separating hyperplane that maximizes the margin on both sides of the hyperplane.
Let us consider that we have data points belonging to two classes, say $w_1$ and $w_2$.
Let $a^T x + b = 0$ be the equation of the hyperplane. Then for the data points of one class $a^T x + b > 0$, and for the data points of the other class $a^T x + b < 0$, where $b$ is the bias (the position of the plane) and $a$ gives the orientation of the plane.
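As a quick illustration, here is a minimal sketch of that decision rule in Python; the hyperplane parameters $a$ and $b$ and the two points are made-up values, not fitted from data:

```python
import numpy as np

# Hypothetical separating hyperplane (assumed values for illustration only).
a = np.array([1.0, -2.0])   # orientation of the plane
b = 0.5                     # bias (position of the plane)

points = np.array([[3.0, 1.0],    # should fall on the positive side
                   [-1.0, 2.0]])  # should fall on the negative side

# The sign of a^T x + b tells us which side of the hyperplane x lies on.
for x in points:
    side = np.sign(a @ x + b)
    print(x, "-> class w1" if side > 0 else "-> class w2")
```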
Let $M$ be the margin; we want to maximize it subject to some constraints, i.e. $y_i(x_i^T\beta + \beta_0) \ge M$ for $i = 1, 2, \dots, N$. We also impose the constraint that $\|\beta\| = 1$, because we do not want the solution to blow up arbitrarily.
Every data point must be at least a distance $M$ away from the hyperplane.
So the condition becomes $\frac{y_i(x_i^T\beta + \beta_0)}{\|\beta\|} \ge M$, and here I can remove the condition that $\|\beta\| = 1$.
So here I can arbitrarily set $\|\beta\| = \frac{1}{M}$; then I can say that $y_i(x_i^T\beta + \beta_0) \ge 1$ and the margin will be $M = \frac{1}{\|\beta\|}$.
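Here is a small numerical sketch of that rescaling (the toy data points and the candidate $\beta$, $\beta_0$ are made-up values): scaling $\beta$ and $\beta_0$ so that the smallest value of $y_i(x_i^T\beta + \beta_0)$ equals 1 leaves the geometric margin unchanged and makes it equal to $1/\|\beta\|$.

```python
import numpy as np

# Toy linearly separable data and a candidate separating hyperplane
# (all made-up values for illustration, not fitted).
X = np.array([[2.0, 2.0], [3.0, 3.0], [-1.0, -1.0], [-2.0, -2.0]])
y = np.array([1, 1, -1, -1])
beta, beta_0 = np.array([0.8, 0.8]), 0.0

# Functional margins y_i (x_i^T beta + beta_0) and the geometric margin.
func = y * (X @ beta + beta_0)
geo_margin = func.min() / np.linalg.norm(beta)

# Rescale beta and beta_0 so that the smallest functional margin equals 1.
scale = func.min()
beta_s, beta_0_s = beta / scale, beta_0 / scale

# The geometric margin is unchanged and now equals 1 / ||beta||, as claimed above.
print(geo_margin, 1.0 / np.linalg.norm(beta_s))
```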
Then I am left with the problem of minimizing $\frac{1}{2}\|\beta\|^2$ subject to the constraints $y_i(x_i^T\beta + \beta_0) \ge 1$.
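Below is a minimal sketch of solving this primal problem numerically on the same toy data; the use of scipy.optimize with SLSQP is my own choice of solver, not something prescribed by the derivation:

```python
import numpy as np
from scipy.optimize import minimize

# Toy linearly separable data (assumed for illustration).
X = np.array([[2.0, 2.0], [3.0, 3.0], [-1.0, -1.0], [-2.0, -2.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])

# Variables packed as w = [beta_1, beta_2, beta_0].
objective = lambda w: 0.5 * np.dot(w[:2], w[:2])           # (1/2)||beta||^2
constraints = [{"type": "ineq",
                "fun": lambda w, xi=xi, yi=yi: yi * (xi @ w[:2] + w[2]) - 1.0}
               for xi, yi in zip(X, y)]                     # y_i(x_i^T beta + beta_0) - 1 >= 0

res = minimize(objective, x0=np.zeros(3), constraints=constraints, method="SLSQP")
beta, beta_0 = res.x[:2], res.x[2]
print("beta =", beta, "beta_0 =", beta_0, "margin =", 1.0 / np.linalg.norm(beta))
```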
The constraints define a margin around the linear decision boundary of thickness $\frac{1}{\|\beta\|}$.
Hence we choose $\beta$ and $\beta_0$ to maximize this thickness. The Lagrangian function that is to be minimized with respect to $\beta$ and $\beta_0$ is
$$L_P = \frac{1}{2}\|\beta\|^2 - \sum_{i=1}^{N}\alpha_i\left[y_i(x_i^T\beta + \beta_0) - 1\right] \quad (1)$$
Setting the derivatives to zero, we obtain
$$\beta = \sum_{i=1}^{N}\alpha_i y_i x_i, \qquad 0 = \sum_{i=1}^{N}\alpha_i y_i.$$
Substituting these values into equation (1), we obtain the so-called Wolfe dual
$$L_D = \sum_{i=1}^{N}\alpha_i - \frac{1}{2}\sum_{i=1}^{N}\sum_{k=1}^{N}\alpha_i\alpha_k y_i y_k\, x_i^T x_k,$$
subject to the constraints $\alpha_i \ge 0$.
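For reference, the algebra behind that substitution is just plugging $\beta = \sum_i \alpha_i y_i x_i$ into equation (1) and using $0 = \sum_i \alpha_i y_i$ to drop the $\beta_0$ term:
$$L_D = \frac{1}{2}\Big\|\sum_{i}\alpha_i y_i x_i\Big\|^2 - \sum_{i}\alpha_i\Big[y_i\Big(x_i^T \sum_{k}\alpha_k y_k x_k + \beta_0\Big) - 1\Big] = \sum_{i=1}^{N}\alpha_i - \frac{1}{2}\sum_{i=1}^{N}\sum_{k=1}^{N}\alpha_i\alpha_k y_i y_k\, x_i^T x_k.$$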
The solution is obtained by maximizing $L_D$ in the positive orthant.
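As a sketch of what that maximization looks like in practice, the snippet below solves the Wolfe dual on the same toy data with scipy.optimize (again my own choice of solver; I also include the equality constraint $\sum_i \alpha_i y_i = 0$ from the stationarity condition above) and then recovers $\beta$ and $\beta_0$ from the $\alpha_i$:

```python
import numpy as np
from scipy.optimize import minimize

# Same toy data as above (assumed for illustration).
X = np.array([[2.0, 2.0], [3.0, 3.0], [-1.0, -1.0], [-2.0, -2.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
N = len(y)

# Q[i, k] = y_i y_k x_i^T x_k  (Gram matrix of the linear kernel, scaled by the labels).
Q = (y[:, None] * X) @ (y[:, None] * X).T

# Maximize L_D  <=>  minimize -L_D, subject to alpha_i >= 0 and sum_i alpha_i y_i = 0.
neg_LD = lambda a: 0.5 * a @ Q @ a - a.sum()
cons = [{"type": "eq", "fun": lambda a: a @ y}]
bounds = [(0.0, None)] * N

res = minimize(neg_LD, x0=np.ones(N) / N, bounds=bounds, constraints=cons, method="SLSQP")
alpha = res.x

# Recover beta from the stationarity condition, and beta_0 from a support vector
# (a point with alpha_i > 0, for which y_i (x_i^T beta + beta_0) = 1).
beta = (alpha * y) @ X
sv = np.argmax(alpha)
beta_0 = y[sv] - X[sv] @ beta
print("alpha =", alpha.round(4), "beta =", beta, "beta_0 =", beta_0)
```

The points with $\alpha_i > 0$ are the support vectors; since $\beta = \sum_i \alpha_i y_i x_i$, only they determine the separating hyperplane.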