
Which is better mini-batch or batch gradient descent?


Asked by Miranda Case on Nov 30, 2021



Implementations may choose to sum or, more commonly, average the gradient over the mini-batch, which further reduces the variance of the gradient estimate. Mini-batch gradient descent seeks to find a balance between the robustness of stochastic gradient descent and the efficiency of batch gradient descent.
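To make that trade-off concrete, here is a minimal sketch of mini-batch gradient descent for a plain linear-regression (MSE) objective. The function name, learning rate, batch size, and epoch count are illustrative assumptions, not part of the original answer.

```python
import numpy as np

def minibatch_gd(X, y, lr=0.01, batch_size=32, n_epochs=50, seed=0):
    """Mini-batch gradient descent for a linear-regression (MSE) objective."""
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    for _ in range(n_epochs):
        # Shuffle once per epoch, then walk through the data in mini-batches.
        order = rng.permutation(n_samples)
        for start in range(0, n_samples, batch_size):
            idx = order[start:start + batch_size]
            X_b, y_b = X[idx], y[idx]
            # Average the gradient over the mini-batch: the step size stays
            # independent of batch_size and the gradient estimate has lower
            # variance than a single-example update would.
            grad = 2.0 / len(idx) * X_b.T @ (X_b @ w - y_b)
            w -= lr * grad
    return w
```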
Also,
Mini-Batch Gradient Descent. The approach described above is Batch Gradient Descent. As you might have noticed, calculating the gradient vector ∇w at each step involves a computation over the full training set X. Since this algorithm uses the whole training set as a single batch, it is called Batch Gradient Descent.
Just so, if the number of training examples is large, batch gradient descent is not preferred; instead, we use stochastic gradient descent or mini-batch gradient descent. Stochastic Gradient Descent is a variant of gradient descent that processes one training example per iteration.
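The difference between the two extremes shows up directly in how much data each update touches. The following is a hedged sketch using the same linear-regression objective as above; the function names are illustrative.

```python
import numpy as np

def batch_gd_step(w, X, y, lr=0.01):
    # Batch GD: one update uses the gradient over the *full* training set X.
    grad = 2.0 / len(X) * X.T @ (X @ w - y)
    return w - lr * grad

def sgd_step(w, x_i, y_i, lr=0.01):
    # Stochastic GD: one update uses a single training example (x_i, y_i).
    grad = 2.0 * x_i * (x_i @ w - y_i)
    return w - lr * grad
```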
In addition,
Since this algorithm uses the whole training set as a single batch, it is called Batch Gradient Descent. When the number of features is large, Batch Gradient Descent performs much better than the Normal Equation method or the SVD method, but on very large training sets it is still quite slow.
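For reference, the Normal Equation mentioned above solves the same linear-regression problem in closed form; its cost grows quickly with the number of features because it solves an n_features × n_features linear system, which is why gradient descent scales better when features are numerous. A minimal sketch:

```python
import numpy as np

def normal_equation(X, y):
    # Closed-form least-squares solution: w = (X^T X)^(-1) X^T y.
    # Solving the linear system directly is preferred over an explicit inverse.
    return np.linalg.solve(X.T @ X, X.T @ y)
```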
Likewise,
Stochastic gradient descent is just mini-batch gradient descent with batch_size equal to 1. In that case, the gradient changes direction even more often than a mini-batch gradient does.
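In terms of the minibatch_gd sketch from earlier, the three variants fall out of the same code simply by changing batch_size; this hypothetical usage example assumes that function and some synthetic data.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0])

w_sgd   = minibatch_gd(X, y, batch_size=1)       # stochastic gradient descent
w_mini  = minibatch_gd(X, y, batch_size=32)      # mini-batch gradient descent
w_batch = minibatch_gd(X, y, batch_size=len(X))  # batch gradient descent
```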