Comparing different SGD variants for online optimization

Supervisor: Prof. Ketan Rajawat, IIT Kanpur.

In this term report we reproduced and extended the empirical results of “On the Insufficiency of Existing Momentum Schemes for Stochastic Optimization” and “Accelerating Stochastic Gradient Descent for Least Squares Regression” by Kidambi et al. and Jain et al., respectively. We showed experimentally that there exist simple stochastic problem instances on which momentum-based methods are sub-optimal, and that their practical gains over SGD in deep learning applications are largely due to minibatching. We also established that ASGD and Adam can converge faster than the other methods irrespective of batch size.
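Below is a minimal, self-contained sketch of the kind of optimizer comparison described above: plain SGD, heavy-ball momentum, and Adam on a synthetic streaming least-squares problem, evaluated at two batch sizes. The problem setup, step sizes, and momentum/Adam hyperparameters are illustrative assumptions and not the settings used in the report; ASGD is omitted to keep the sketch short.

```python
import numpy as np

# Illustrative comparison of SGD, heavy-ball momentum, and Adam on a
# synthetic streaming least-squares problem (fresh samples every step).
# Dimensions, step sizes, and momentum/Adam parameters are assumptions
# made for this sketch, not the settings used in the report.

rng = np.random.default_rng(0)
d = 10                        # problem dimension
w_star = rng.normal(size=d)   # ground-truth regression vector
noise_std = 0.1               # label noise level

def sample_batch(batch_size):
    """Draw a fresh minibatch with y = <w_star, x> + Gaussian noise."""
    X = rng.normal(size=(batch_size, d))
    y = X @ w_star + noise_std * rng.normal(size=batch_size)
    return X, y

def grad(w, X, y):
    """Gradient of the average squared loss 0.5 * mean((Xw - y)^2)."""
    return X.T @ (X @ w - y) / len(y)

def run(update, steps=5000, batch_size=1):
    """Iterate an optimizer given as update(state, w, g) -> (state, w)."""
    w, state = np.zeros(d), None
    for _ in range(steps):
        X, y = sample_batch(batch_size)
        state, w = update(state, w, grad(w, X, y))
    return np.linalg.norm(w - w_star)

def sgd(lr=0.02):
    def update(state, w, g):
        return state, w - lr * g
    return update

def heavy_ball(lr=0.005, beta=0.9):
    def update(state, w, g):
        v = beta * (state if state is not None else 0.0) - lr * g
        return v, w + v
    return update

def adam(lr=0.02, b1=0.9, b2=0.999, eps=1e-8):
    def update(state, w, g):
        m, v, t = state if state is not None else (0.0, 0.0, 0)
        t += 1
        m = b1 * m + (1 - b1) * g
        v = b2 * v + (1 - b2) * g ** 2
        step = lr * (m / (1 - b1 ** t)) / (np.sqrt(v / (1 - b2 ** t)) + eps)
        return (m, v, t), w - step
    return update

for name, opt in [("SGD", sgd()), ("Heavy-ball", heavy_ball()), ("Adam", adam())]:
    for bs in (1, 8):
        print(f"{name:10s} batch={bs}:  ||w - w*|| = {run(opt, batch_size=bs):.4f}")
```

The distance to the true parameter vector printed at the end gives a crude, single-run proxy for final accuracy; a fuller study would average over seeds and sweep step sizes per method, as the report does.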

Prateek Varshney
Research Associate
