Subset Selection


Before we talk about subset selection, let’s see why we are often not satisfied with the least squares estimates.
Prediction Accuracy: Least squares estimates often have low bias but high variance. Prediction accuracy can sometimes be improved by shrinking some coefficients or setting them to zero. We accept a little more bias in exchange for lower variance, which can improve overall prediction accuracy.
Interpretation: With a large number of predictors, we would often like to identify a smaller subset that exhibits the strongest effects.
Here are some of the approaches to variable subset selection with linear regression.
1. Forward Stepwise Selection: Forward stepwise selection starts with the intercept, then sequentially adds to the model the predictor that most improves the fit. It is a greedy algorithm and produces a nested sequence of models. It is used for two reasons, computational and statistical. Computational: when p (the number of features) is large, the best subset sequence cannot be computed, but the forward stepwise sequence always can, even when p >> N. Statistical: since forward stepwise is a more constrained search, it will have lower variance but perhaps more bias.
2. Backward Stepwise Selection: It starts with the full model and sequentially deletes the predictor that has the least impact on the fit. The candidate for dropping is the variable with the smallest Z-score (in absolute value). It can only be used when N > p, while forward stepwise selection can always be used.
3. Forward Stagewise Regression: It is even more constrained than forward stepwise regression. It starts like forward stepwise regression, with an intercept and centered predictors whose coefficients are all initially 0. At each step, the algorithm identifies the variable most correlated with the current residual. It then computes the simple linear regression coefficient of the residual on this chosen variable and adds it to the current coefficient for that variable. This continues until none of the variables is correlated with the residual, i.e. until the least squares fit is reached when N > p.
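
To make the forward stepwise and forward stagewise procedures above concrete, here is a minimal sketch in plain Python. The function names (forward_stepwise, forward_stagewise), the helpers, the tolerances and the step cap are all assumptions made for this illustration, not part of any library. Predictors are passed as a list of columns. For the stepwise search, "improves the fit the most" is taken to mean the largest drop in the residual sum of squares, which is computed by orthogonalizing the remaining predictors against the ones already chosen.

```python
def mean(v):
    return sum(v) / len(v)

def center(v):
    m = mean(v)
    return [a - m for a in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def forward_stepwise(columns, y, k):
    """Greedily pick up to k predictors; returns their indices in order of entry."""
    Z = [center(c) for c in columns]  # working copies, orthogonalized as we go
    r = center(y)                     # residual of the intercept-only model
    chosen = []
    for _ in range(k):
        best, best_gain = None, 0.0
        for j, z in enumerate(Z):
            zz = dot(z, z)
            if j in chosen or zz < 1e-12:  # skip used or (near-)redundant columns
                continue
            gain = dot(z, r) ** 2 / zz     # drop in RSS if predictor j is added
            if gain > best_gain:
                best, best_gain = j, gain
        if best is None:
            break
        zb = Z[best]
        zz = dot(zb, zb)
        g = dot(zb, r) / zz
        r = [ri - g * zi for ri, zi in zip(r, zb)]
        # Remove the chosen direction from every predictor still available.
        for j in range(len(Z)):
            if j != best and j not in chosen:
                c = dot(Z[j], zb) / zz
                Z[j] = [a - c * b for a, b in zip(Z[j], zb)]
        chosen.append(best)
    return chosen

def forward_stagewise(columns, y, max_steps=10000, tol=1e-10):
    """Returns (coefficients, intercept) on the original, uncentered scale."""
    X = [center(c) for c in columns]
    norms = [dot(x, x) ** 0.5 for x in X]
    r = center(y)
    beta = [0.0] * len(X)             # all coefficients start at 0
    for _ in range(max_steps):
        # |x_j . r| / ||x_j|| is the absolute correlation with the residual,
        # up to the common factor ||r||, so it gives the same ranking.
        scores = [abs(dot(x, r)) / n if n > 0 else 0.0
                  for x, n in zip(X, norms)]
        j = max(range(len(X)), key=scores.__getitem__)
        if scores[j] < tol:           # nothing is correlated with the residual
            break
        # Simple regression coefficient of the residual on predictor j.
        delta = dot(X[j], r) / norms[j] ** 2
        beta[j] += delta
        r = [ri - delta * xi for ri, xi in zip(r, X[j])]
    intercept = mean(y) - sum(b * mean(c) for b, c in zip(beta, columns))
    return beta, intercept

# Toy data (made up for illustration): y = 1 + 2*x1 - 3*x2 exactly.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [1, -1, 1, -1, 1, -1]
y = [0, 8, 4, 12, 8, 16]
order = forward_stepwise([x1, x2], y, k=2)
beta, intercept = forward_stagewise([x1, x2], y)
```

Note that the stagewise loop only ever updates one coefficient at a time and never refits the others, which is why it can take many more steps than forward stepwise to reach the least squares fit.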

