Mini-Batch Gradient Descent: This technique is similar to batch gradient descent, but instead of computing the gradient over the entire training set, it splits the data into small batches of a fixed size and performs one parameter update per batch, applying the same update rule used in batch optimization.
Upsides: The model is updated more frequently than in batch gradient descent, which allows for more robust convergence and helps avoid local minima. The batched updates are computationally more efficient than stochastic gradient descent. Batching also means the entire training set does not have to fit in memory, and it lends itself to efficient, vectorized algorithm implementations.
Downsides: Mini-batch gradient descent requires configuring an additional “mini-batch size” hyperparameter for the learning algorithm. As with batch gradient descent, error information must be accumulated across the training examples within each mini-batch.
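To make the idea concrete, here is a minimal sketch (not from the original post) of mini-batch gradient descent for linear regression with a squared-error loss, written with NumPy. The function name, the synthetic data, and hyperparameter values such as batch_size and lr are illustrative assumptions, not a prescribed implementation.

```python
import numpy as np

def minibatch_gd(X, y, batch_size=32, lr=0.01, epochs=100):
    """Mini-batch gradient descent for linear regression (MSE loss)."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(epochs):
        # Shuffle once per epoch so each mini-batch is a fresh random subset
        idx = np.random.permutation(n_samples)
        for start in range(0, n_samples, batch_size):
            batch = idx[start:start + batch_size]
            Xb, yb = X[batch], y[batch]
            # Gradient of the MSE computed over this mini-batch only
            error = Xb @ w + b - yb
            grad_w = 2 * Xb.T @ error / len(batch)
            grad_b = 2 * error.mean()
            # One parameter update per mini-batch (more frequent than batch GD)
            w -= lr * grad_w
            b -= lr * grad_b
    return w, b

# Illustrative usage on synthetic data (assumed values, for demonstration only)
X = np.random.randn(1000, 3)
y = X @ np.array([2.0, -1.0, 0.5]) + 0.3
w, b = minibatch_gd(X, y, batch_size=64, lr=0.05, epochs=50)
```

Note how the update happens inside the inner loop: with a batch size of 64 and 1000 samples, the parameters are updated 16 times per epoch instead of once, which is what gives mini-batch its higher update frequency compared to batch gradient descent.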