Stochastic Gradient Descent: Instead of considering all the samples together, we take the samples one by one. For each sample, check whether it is classified correctly. If it is not, compute the loss for that sample and use it to update the weight matrix. Repeat these steps over the dataset until the error is acceptable.
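
To make the per-sample update loop concrete, here is a minimal sketch in Python. It assumes a simple perceptron-style binary classifier with labels in {-1, +1}; the names sgd_train, learning_rate, max_epochs, and tol are illustrative choices, not part of any particular library.

import numpy as np

def sgd_train(X, y, learning_rate=0.01, max_epochs=100, tol=0.0):
    n_samples, n_features = X.shape
    w = np.zeros(n_features)          # weight vector
    b = 0.0                           # bias term
    for epoch in range(max_epochs):
        errors = 0
        for xi, yi in zip(X, y):      # take the samples one by one
            # check whether this sample is classified correctly
            if yi * (np.dot(w, xi) + b) <= 0:
                # misclassified: update the weights using this single sample
                w += learning_rate * yi * xi
                b += learning_rate * yi
                errors += 1
        # stop once the error is acceptable
        if errors / n_samples <= tol:
            break
    return w, b

# Example usage on a tiny linearly separable dataset
X = np.array([[2.0, 1.0], [1.5, 2.0], [-1.0, -1.5], [-2.0, -1.0]])
y = np.array([1, 1, -1, -1])
w, b = sgd_train(X, y)
print("weights:", w, "bias:", b)

Note that the weights change after every misclassified sample rather than once per pass over the data, which is exactly what gives stochastic gradient descent its frequent, noisy updates.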
Upsides: The frequent updates immediately give insight into the model's performance and its rate of improvement. This variant of gradient descent is perhaps the simplest to understand and implement. The increased update frequency can result in faster learning on some problems. The noisy update process can also help the model escape shallow local minima and avoid premature convergence.
Downsides: Updating the model so frequently is more computationally expensive than other configurations of gradient descent, so training on large datasets can take significantly longer. The frequent updates also produce a noisy gradient signal, which can cause the model parameters, and in turn the model error, to jump around (higher variance across training epochs). This noisy descent down the error gradient can make it hard for the algorithm to settle on a minimum.