
Showing posts from January, 2020

Optimizations of Gradient Descent

Introduction

Gradient Descent is one of the most popular techniques for optimizing machine learning algorithms. We've already discussed Gradient Descent in the Gradient descent with Python article, where we gave some intuition about its behaviour. We've also given an overview of choosing the learning rate hyper-parameter for the algorithm in the hyperparameter optimization article. So by now, you should have a fair understanding of how it works. Today we'll discuss different ways to optimize the performance of the algorithm itself.

Gradient Descent Variants

We've already covered three variants of Gradient Descent in the Gradient Descent with Python article: Batch Gradient Descent, Stochastic Gradient Descent and Mini-Batch Gradient Descent. What we haven't discussed are the problems that arise when using these techniques. Choosing a proper learning rate is difficult. A learning rate that is too small leads to tremendously slow convergence, while a very large learning rate that ca
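To make the slow-convergence point concrete, here is a minimal sketch rather than code from the original articles. It assumes a toy objective f(x) = x**2, and the function name `gradient_descent_steps`, the starting point, the tolerance, and the two learning rates are all hypothetical choices made for illustration. It counts how many updates plain gradient descent needs before x gets close to the minimum at 0.

```python
def gradient_descent_steps(learning_rate, start=5.0, tol=1e-6, max_steps=100_000):
    """Minimize the toy objective f(x) = x**2 (an assumption for illustration)
    and return how many updates were applied before |x| dropped below tol."""
    x = start
    for step in range(max_steps):
        if abs(x) < tol:
            return step
        grad = 2.0 * x             # derivative of x**2
        x -= learning_rate * grad  # standard gradient descent update
    return max_steps               # did not converge within the step budget


# Hypothetical learning rates chosen only to contrast the two regimes.
slow = gradient_descent_steps(0.001)
fast = gradient_descent_steps(0.1)
print(f"steps with learning rate 0.001: {slow}")
print(f"steps with learning rate 0.1:   {fast}")
```

On this toy problem the smaller learning rate needs far more updates to reach the same tolerance. Real loss surfaces are not this well behaved, so the numbers only indicate the trend.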