Continuous Heavy-Tailed Theory of SGD

Jul 4, 2022
seminars

Event

Weekly DL Theory & Stat Phy Seminar

Short summary

In this seminar, I will talk about a recent line of work that analyzes SGD under heavy-tailed noise assumptions, using techniques from Lévy-driven SDE theory and metastability analysis from statistical physics.
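
As a rough illustration of the setting, the sketch below simulates a one-dimensional Euler discretization of a Lévy-driven SDE of the form dθ_t = -∇f(θ_t) dt + ε dL_t^α, where L^α is a symmetric α-stable Lévy process. Everything here is an assumption made for illustration, not the formulation used in the papers discussed: the function names `sample_symmetric_alpha_stable` and `simulate_heavy_tailed_sgd` are hypothetical, the objective is a toy quadratic f(θ) = θ²/2, and the noise is drawn with the Chambers-Mallows-Stuck method.

```python
import math
import random


def sample_symmetric_alpha_stable(alpha, rng):
    """Draw one symmetric alpha-stable sample (Chambers-Mallows-Stuck method).

    alpha is the tail index in (0, 2]; alpha = 2 gives a Gaussian with
    variance 2, and alpha = 1 gives a standard Cauchy sample.
    """
    u = rng.uniform(-math.pi / 2, math.pi / 2)  # uniform angle
    w = rng.expovariate(1.0)                    # unit-rate exponential
    if alpha == 1.0:
        return math.tan(u)
    return (math.sin(alpha * u) / math.cos(u) ** (1.0 / alpha)
            * (math.cos((1.0 - alpha) * u) / w) ** ((1.0 - alpha) / alpha))


def simulate_heavy_tailed_sgd(grad, theta0, alpha, eta, eps, n_steps, rng):
    """Euler scheme: theta <- theta - eta * grad(theta) + eps * eta^(1/alpha) * S.

    S is a symmetric alpha-stable sample; the eta^(1/alpha) factor is the
    self-similar scaling of an alpha-stable increment over a step of size eta.
    """
    path = [theta0]
    for _ in range(n_steps):
        theta = path[-1]
        noise = sample_symmetric_alpha_stable(alpha, rng)
        path.append(theta - eta * grad(theta) + eps * eta ** (1.0 / alpha) * noise)
    return path


# Toy quadratic objective f(theta) = theta^2 / 2, so grad f(theta) = theta.
rng = random.Random(0)
path = simulate_heavy_tailed_sgd(lambda t: t, theta0=1.0, alpha=1.5,
                                 eta=0.01, eps=0.1, n_steps=1000, rng=rng)
print(max(abs(t) for t in path))  # largest excursion along the trajectory
```

With α < 2, the trajectory shows occasional large jumps that a Gaussian-noise simulation (α = 2) of the same length would essentially never produce; these jumps are what the metastability analysis is concerned with.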

Papers

Papers discussed in the seminar:

  • Mert Gürbüzbalaban, Umut Şimşekli, and Lingjiong Zhu. The Heavy-Tail Phenomenon in SGD. arXiv preprint, 2020.
  • Umut Şimşekli, Levent Sagun, and Mert Gürbüzbalaban. A Tail-Index Analysis of Stochastic Gradient Noise in Deep Neural Networks. In ICML 2019.

Authors

Junghyun Lee
PhD Candidate in Artificial Intelligence
PhD candidate at KAIST AI, jointly advised by Se-Young Yun and Chulhee Yun. I work on interactive machine learning, theoretical aspects of LLMs, learning/optimization theory, and statistical analysis of large networks.