The main drawback of decision trees is that they tend to overfit the training data, and random forests are one way to address this problem. A random forest is essentially a collection of decision trees, where each tree is slightly different from the others. The idea behind random forests is that each tree may do a relatively good job of predicting but will likely overfit on part of the data; if we build many trees that all work well yet overfit in different ways, we can reduce the amount of overfitting by averaging their results. To implement this strategy, we need to build many decision trees. Each tree should do an acceptable job of predicting the target and should also be different from the other trees. Random forests get their name from injecting randomness into the tree-building process to ensure each tree is different. The trees in a random forest are randomized in two ways: by selecting the data points used to build each tree (bootstrap sampling) and by selecting the features considered in each split test.
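As a minimal sketch of this idea in practice, the code below fits a random forest with scikit-learn. The synthetic make_moons dataset and the parameter values (n_estimators, max_features, random_state) are assumptions chosen only for illustration, not details from the original post.

# A minimal sketch, assuming scikit-learn is installed and a small
# synthetic dataset (make_moons) is acceptable for illustration.
from sklearn.datasets import make_moons
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Two-class toy dataset (assumption: chosen only to keep the example small).
X, y = make_moons(n_samples=100, noise=0.25, random_state=3)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# n_estimators sets how many randomized trees are built and averaged.
# bootstrap=True (the default) resamples the data points for each tree,
# and max_features limits how many features each split test may consider;
# together these are the two sources of randomness described above.
forest = RandomForestClassifier(n_estimators=100, max_features="sqrt", random_state=0)
forest.fit(X_train, y_train)

print("Training accuracy: {:.3f}".format(forest.score(X_train, y_train)))
print("Test accuracy: {:.3f}".format(forest.score(X_test, y_test)))

Because the individual trees overfit in different ways, the averaged forest typically generalizes better to the test set than any single deep tree would.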