Supervisor: Prof. Ketan Rajawat, IIT Kanpur.
In this term report we reproduced and extended the empirical results of "On the Insufficiency of Existing Momentum Schemes for Stochastic Optimization" by Kidambi et al. and "Accelerating Stochastic Gradient Descent for Least Squares Regression" by Jain et al. We showed experimentally that there exist simple stochastic problem instances on which momentum-based methods are sub-optimal, and that the practical gains these methods enjoy over SGD in deep learning applications are due to minibatching rather than true acceleration. We also established that ASGD and Adam can converge faster than all the other methods irrespective of batch size.
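For reference, the stochastic heavy-ball (HB) update studied in these papers is w_{t+1} = w_t - eta * g_t + beta * (w_t - w_{t-1}), where g_t is a stochastic gradient. The sketch below contrasts it with plain SGD on a toy least-squares instance with batch size 1, the regime in which momentum yields no real acceleration; this is an illustrative setup, not the exact problem instance from the paper, and the values of eta, beta, and the data model are assumptions.

```python
# Minimal sketch: plain SGD vs. heavy-ball momentum on a toy stochastic
# least-squares problem with single-sample (batch size 1) gradients.
# Illustrative only; not the problem instance used in Kidambi et al.
import numpy as np

rng = np.random.default_rng(0)
d, n_steps = 2, 5000
w_star = np.array([1.0, -2.0])   # ground-truth regressor (assumed)
eta, beta = 0.01, 0.9            # step size and momentum (assumed values)

def stochastic_grad(w):
    """Single-sample gradient of the squared loss 0.5 * (x @ w - y)**2."""
    x = rng.standard_normal(d)
    y = x @ w_star + 0.1 * rng.standard_normal()
    return (x @ w - y) * x

w_sgd = np.zeros(d)
w_hb, v = np.zeros(d), np.zeros(d)
for _ in range(n_steps):
    # SGD: w <- w - eta * g
    w_sgd -= eta * stochastic_grad(w_sgd)
    # Heavy ball: v <- beta * v - eta * g;  w <- w + v
    v = beta * v - eta * stochastic_grad(w_hb)
    w_hb += v

print("SGD distance to w*:", np.linalg.norm(w_sgd - w_star))
print("HB  distance to w*:", np.linalg.norm(w_hb - w_star))
```

With batch size 1, both methods reach a comparable noise floor; the papers' point is that HB's apparent advantage in deep learning arises once minibatching reduces gradient noise.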