Linear Regression Error Analysis:
As mentioned earlier, the solution obtained by the linear regression method carries an error.
We assume the input comes from a p-dimensional
space (p is the number of features) and is mapped to R (the real numbers).
We can write the linear model simply as\[f(X) = \beta_0 + \sum_{j=1}^{p} X_j\beta_j\]and we will show that the least-squares estimate of its coefficients is\[\hat{\beta} = (X^TX)^{-1}X^Ty\]
Now the residual sum of squares is given by \[RSS(\beta) = \sum_{i=1}^{N} (y_i - f(x_i))^2\]\[= \sum_{i=1}^{N}\Big(y_i-\beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j\Big)^2\]
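As a quick illustration, here is a minimal sketch of computing RSS for a fixed coefficient vector. It assumes NumPy, and the data values, beta0, and beta are made up for the example, not taken from the text above.

import numpy as np

# Synthetic data: N = 4 points, p = 2 features (values are illustrative only)
x = np.array([[1.0, 2.0],
              [2.0, 0.5],
              [3.0, 1.5],
              [4.0, 3.0]])
y = np.array([3.0, 2.5, 5.0, 7.5])

beta0 = 0.5                    # intercept
beta = np.array([1.0, 0.8])    # one coefficient per feature

# f(x_i) = beta_0 + sum_j x_ij * beta_j, evaluated for every data point at once
predictions = beta0 + x @ beta

# RSS(beta) = sum_i (y_i - f(x_i))^2
rss = np.sum((y - predictions) ** 2)
print(rss)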
Here it is also assumed that the data points
are independent of each other.
Now, the aim is to minimize this error. Let X denote the matrix of order N×(p+1), where N is the number of data
points and p is the number of features; the extra column (of ones) accounts for the intercept. Let y
be the N-vector of outputs.
Hence we can write the residual sum of squares
in matrix form as \[RSS(\beta) = (y-X\beta)^T(y-X\beta)\]
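Continuing the sketch above (x, y, beta0, and beta are the illustrative variables defined earlier), the matrix form gives exactly the same number once a column of ones is prepended to absorb the intercept:

# Build X of order N x (p+1): a leading column of ones handles the intercept
X = np.hstack([np.ones((x.shape[0], 1)), x])
b = np.concatenate([[beta0], beta])

# RSS(beta) = (y - X beta)^T (y - X beta)
residual = y - X @ b
rss_matrix = residual @ residual
print(np.isclose(rss, rss_matrix))   # True: the sum form and matrix form agree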
Differentiating with respect to β, we obtain\[\frac{\partial RSS}{\partial \beta} = -2X^T(y-X\beta)\]\[\frac{\partial^2 RSS}{\partial \beta \partial \beta^T} = 2X^TX\]
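For completeness, expanding the quadratic makes the differentiation step explicit:\[RSS(\beta) = y^Ty - 2\beta^TX^Ty + \beta^TX^TX\beta\]\[\frac{\partial RSS}{\partial \beta} = -2X^Ty + 2X^TX\beta = -2X^T(y-X\beta)\]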
Assuming X has full column rank (so that X^TX is invertible), a unique
minimizer of the error exists and is found by setting the first
derivative equal to zero. So the solution is given by \[\hat{\beta} = (X^TX)^{-1}X^Ty\]
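Finishing the sketch, this is one way to compute the estimate numerically with the X and y constructed above. Note it solves the normal equations rather than forming the explicit inverse, which is the usual numerical practice; this is a sketch, not a production recipe.

# beta_hat = (X^T X)^{-1} X^T y, computed by solving the normal equations
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Sanity checks: the gradient -2 X^T (y - X beta_hat) should vanish,
# and the estimate should match NumPy's least-squares solver
gradient = -2 * X.T @ (y - X @ beta_hat)
print(np.allclose(gradient, 0))
print(np.allclose(beta_hat, np.linalg.lstsq(X, y, rcond=None)[0]))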