
Linear Regression Error Analysis


As mentioned earlier, the solution obtained by the linear regression method carries an error.
We assume the input comes from a p-dimensional space (p is the number of features) and is mapped to R (the real numbers).
The linear regression model can be written as \[f(X) = \beta_0 + \sum_{j=1}^{p} X_j\beta_j\] and its least-squares solution, derived below, is \[\hat{\beta} = (X^TX)^{-1}X^Ty\]
The residual sum of squares is given by \[RSS(\beta) = \sum_{i=1}^N (y_i - f(x_i))^2\]\[= \sum_{i=1}^N \left(y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j\right)^2\]
Here it is also assumed that the input data points are independent of each other.
Now the aim is to minimize this error. Let X denote the matrix of order N×(p+1), where N is the number of data points, p is the number of features, and the extra column of ones accounts for the intercept; let y be the N-vector of outputs.
Hence we can write the residual sum of squares as \[RSS(\beta) = (y-X\beta)^T(y-X\beta)\]
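As a quick sanity check, the element-wise sum and the matrix form give the same value. Here is a minimal NumPy sketch on synthetic data (the array names and sizes are illustrative, not from the derivation above):

```python
import numpy as np

rng = np.random.default_rng(0)
N, p = 50, 3
X = np.hstack([np.ones((N, 1)), rng.normal(size=(N, p))])  # leading column of ones for the intercept
y = rng.normal(size=N)
beta = rng.normal(size=p + 1)

rss_sum = np.sum((y - X @ beta) ** 2)   # sum_i (y_i - f(x_i))^2
resid = y - X @ beta
rss_matrix = resid @ resid              # (y - X beta)^T (y - X beta)

print(np.isclose(rss_sum, rss_matrix))  # True
```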

Differentiating with respect to β, we obtain \[\frac{\partial RSS}{\partial \beta} = -2X^T(y-X\beta)\]\[\frac{\partial^2 RSS}{\partial \beta \partial \beta^T} = 2X^TX\]
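The first derivative can be verified numerically by comparing it against central finite differences. This is an illustrative NumPy sketch on random data, not part of the derivation itself:

```python
import numpy as np

def rss(beta, X, y):
    r = y - X @ beta
    return r @ r  # (y - X beta)^T (y - X beta)

rng = np.random.default_rng(1)
N, p = 40, 2
X = np.hstack([np.ones((N, 1)), rng.normal(size=(N, p))])
y = rng.normal(size=N)
beta = rng.normal(size=p + 1)

analytic = -2 * X.T @ (y - X @ beta)  # dRSS/dbeta from the derivation above

# Central finite differences, one coordinate at a time
eps = 1e-6
numeric = np.zeros_like(beta)
for j in range(beta.size):
    e = np.zeros_like(beta)
    e[j] = eps
    numeric[j] = (rss(beta + e, X, y) - rss(beta - e, X, y)) / (2 * eps)

print(np.allclose(analytic, numeric))  # True
```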

Assuming X has full column rank, so that X^TX is positive definite, the unique minimizer of the error is obtained by setting the first derivative to zero. The solution is \[\hat{\beta} = (X^TX)^{-1}X^Ty\]
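In code, one typically solves the normal equations rather than forming the inverse explicitly, which is numerically more stable. A minimal NumPy sketch (the data and noise level are made up for illustration), cross-checked against NumPy's built-in least-squares routine:

```python
import numpy as np

rng = np.random.default_rng(2)
N, p = 100, 3
X = np.hstack([np.ones((N, 1)), rng.normal(size=(N, p))])
true_beta = np.array([1.5, -2.0, 0.5, 3.0])
y = X @ true_beta + 0.1 * rng.normal(size=N)  # outputs with a little noise

# Normal equations: solve (X^T X) beta = X^T y instead of inverting X^T X
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Cross-check against NumPy's least-squares solver
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.allclose(beta_hat, beta_lstsq))  # True
```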




