TY - JOUR
T1 - A Local Convergence Theory for the Stochastic Gradient Descent Method in Non-Convex Optimization with Non-Isolated Local Minima
AU - Ko, Taehee
AU - Li, Xiantao
JO - Journal of Machine Learning
VL - 2
SP - 138
EP - 160
PY - 2023
DA - 2023/06
SN - 2
DO - http://doi.org/10.4208/jml.230106
UR - https://global-sci.org/intro/article_detail/jml/21759.html
KW - Stochastic Gradient Descent, Stochastic Stability, Non-Convex Optimization, Local Convergence, Non-Isolated Minima
AB - Loss functions with non-isolated minima have emerged in several machine-learning problems, creating a gap between theoretical predictions and observations in practice. In this paper, we formulate a new type of local convexity condition that is suitable for describing the behavior of loss functions near non-isolated minima. We show that this condition is general enough to encompass many existing conditions. In addition, we study the local convergence of the stochastic gradient descent (SGD) method under this mild condition by adopting the notion of stochastic stability. In the convergence analysis, we establish concentration inequalities for the SGD iterates, which can be used to interpret empirical observations from some practical training results.