Logistic regression is used for classification. It is a statistical and machine learning technique that classifies the records of a dataset based on the values of the input fields. It uses one or more independent variables to predict a categorical (discrete) output field, the dependent variable. It can be used for both binary and multiclass classification.
The decision boundary may be linear or, when polynomial terms of the features are included, polynomial. Logistic regression is analogous to linear regression, but it takes a categorical (discrete) target field instead of a numeric one. Its output is the probability of a case belonging to a specific class. It can also be used to understand the impact of a feature on the dependent variable.
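One common way to read the impact of a feature is through odds ratios. The sketch below assumes the standard logistic model, in which the log-odds of class 1 are a linear function of the features, so raising feature j by one unit multiplies the odds by exp(θ_j). The function name `odds_ratios` and the weight values are hypothetical and chosen only for illustration.

```python
import math


def odds_ratios(theta):
    """Return exp(theta_j) for every weight.

    Under the logistic model, exp(theta_j) is the factor by which the odds
    of class 1 are multiplied when feature j increases by one unit and the
    other features stay fixed.
    """
    return [math.exp(t) for t in theta]


# Hypothetical weights (illustrative values, not fitted to any data):
# the first feature has no effect, the second doubles the odds per unit.
theta = [0.0, math.log(2.0)]
ratios = odds_ratios(theta)
print(ratios)
```

A ratio above 1 means the feature pushes a record towards class 1, a ratio below 1 pushes it towards class 0, and a ratio of 1 means no effect.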
Let us assume that the input data X is taken from $\mathbb{R}^{m \times n}$ and each output $y \in \{0, 1\}$, where m is the number of feature dimensions and n is the number of data points (records).
Since a data point belongs either to class 0 or to class 1, it follows from probability theory that the probability of class 0 is $P(y = 0 \mid x) = 1 - P(y = 1 \mid x)$.
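The problem with linear regression is that it returns an unbounded numerical value rather than a probability. Logistic regression therefore passes the linear output $\theta^T x$ through the sigmoid function $\sigma(z) = 1 / (1 + e^{-z})$, which is generally used as the activation function. The sigmoid maps any real number into the interval (0, 1), so its output can be read as the probability of class 1.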
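As a minimal sketch of this idea, the code below implements the sigmoid σ(z) = 1 / (1 + e^(-z)) and uses it to turn a linear score into the two class probabilities. The helper name `predict_proba` and the example weights are assumptions made for illustration, and each record is represented as a plain list of feature values.

```python
import math


def sigmoid(z):
    """Squash a real number z into the range 0 to 1."""
    # Branch on the sign of z so math.exp never gets a large positive argument.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(theta, x):
    """Return (P(y=0|x), P(y=1|x)) for weights theta and features x."""
    z = sum(t * xi for t, xi in zip(theta, x))  # linear score theta^T x
    p1 = sigmoid(z)
    return 1.0 - p1, p1


# Hypothetical weights and a single record; the linear score here is 0.
p0, p1 = predict_proba([0.5, -0.25], [1.0, 2.0])
print(p0, p1)
```

A linear score of 0 lands exactly on the decision boundary, where both classes are equally likely. Large positive scores approach probability 1 for class 1 and large negative scores approach 0.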
Training Process:
(i) Initialize θ
(ii) Calculate the predicted output $\hat{y} = \sigma(\theta^T X)$
(iii) Compare the predicted output with the actual output of the data points and find the error
(iv) Calculate the error for all data points, i.e. the total error
(v) Change θ to reduce the total error
(vi) Go back to step (ii) and repeat until the total error stops decreasing
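The steps above can be sketched in plain Python. The text does not say how the error is measured or how θ is changed, so this sketch assumes the usual choices: log loss (cross-entropy) as the total error and batch gradient descent as the update rule. The learning rate, the iteration count, the toy data, and the names `train`, `predict` and `log_loss` are all illustrative assumptions. Each record is stored as a row here (the transpose of the m x n layout described earlier), with a constant 1.0 as the first feature to act as a bias term.

```python
import math


def sigmoid(z):
    """Numerically safe sigmoid."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict(theta, x):
    """Predicted probability of class 1 for one record x."""
    return sigmoid(sum(t * v for t, v in zip(theta, x)))


def log_loss(theta, X, y, eps=1e-12):
    """Average cross-entropy over all records (the assumed total error)."""
    total = 0.0
    for xi, yi in zip(X, y):
        p = min(max(predict(theta, xi), eps), 1.0 - eps)  # avoid log(0)
        total -= yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total / len(X)


def train(X, y, lr=0.1, n_iters=1000):
    """Fit theta with batch gradient descent (assumed update rule)."""
    n_features = len(X[0])
    theta = [0.0] * n_features                  # (i) initialize theta
    for _ in range(n_iters):                    # (vi) repeat from step (ii)
        grad = [0.0] * n_features
        for xi, yi in zip(X, y):
            y_hat = predict(theta, xi)          # (ii) y_hat = sigma(theta^T x)
            err = y_hat - yi                    # (iii) error for this record
            for j, v in enumerate(xi):          # (iv) accumulate over all records
                grad[j] += err * v
        # (v) move theta against the averaged gradient of the log loss
        theta = [t - lr * g / len(X) for t, g in zip(theta, grad)]
    return theta


# Toy data: first column is the constant bias feature, second is the input.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 3.0], [1.0, 4.0]]
y = [0, 0, 1, 1]

initial_loss = log_loss([0.0, 0.0], X, y)
theta = train(X, y)
final_loss = log_loss(theta, X, y)
print(theta, initial_loss, final_loss)
```

This version runs for a fixed number of iterations for simplicity. A stopping rule based on the change in total error between iterations would match step (vi) more closely.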