TY - JOUR
T1 - Projection-Based Regularized Dual Averaging for Stochastic Optimization
AU - Ushio, Asahi
AU - Yukawa, Masahiro
N1 - Funding Information:
Manuscript received October 31, 2018; revised February 15, 2019; accepted March 11, 2019. Date of publication April 2, 2019; date of current version April 22, 2019. The associate editor coordinating the review of this manuscript and approving it for publication was Lei Huang. This work was supported in part by JSPS Grants-in-Aid under Grants 18H01446 and 15H02757. A preliminary version of this paper was presented at the 42nd IEEE International Conference on Acoustics, Speech, and Signal Processing, New Orleans, LA, USA, March 2017. (Corresponding author: Masahiro Yukawa.) A. Ushio was with the Department of Electronics and Electrical Engineering, Keio University, Yokohama 223-8522, Japan. He is now with Cogent Labs, Tokyo 150-0034, Japan (e-mail: ushio@elec.keio.ac.jp).
Publisher Copyright:
© 2019 IEEE.
PY - 2019/5/15
Y1 - 2019/5/15
AB - We propose a novel stochastic-optimization framework based on the regularized dual averaging (RDA) method. The proposed approach differs from the previous studies of RDA in three major aspects. First, the squared-distance loss function to a 'random' closed convex set is employed for stability. Second, a sparsity-promoting metric (used implicitly by a certain proportionate-type adaptive filtering algorithm) and a quadratically-weighted \ell_1 regularizer are used simultaneously. Third, the step size and regularization parameters are both constant due to the smoothness of the loss function. These three differences yield an excellent sparsity-seeking property, high estimation accuracy, and insensitivity to the choice of the regularization parameter. Numerical examples show the remarkable advantages of the proposed method over the existing methods (including AdaGrad and the adaptive proximal forward-backward splitting method) in applications to regression and classification with real/synthetic data.
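N1 - Editor's note (not part of the record): the following Python sketch illustrates the kind of update the abstract describes, combining RDA-style gradient averaging with the squared-distance loss to a hyperslab, a typical choice of 'random' closed convex set in regression. It is a minimal sketch under stated assumptions, not the authors' algorithm: the paper's sparsity-promoting (proportionate-type) metric and the quadratic weighting of the \ell_1 term are omitted, and all names and parameter values (eps, lam, beta) are illustrative.

      import numpy as np

      def project_hyperslab(w, x, y, eps):
          # Metric projection of w onto C = {v : |y - x^T v| <= eps},
          # a hyperslab built from one data pair (x, y); assumes x != 0.
          r = y - x @ w
          if abs(r) <= eps:
              return w                      # w already lies in C
          return w + ((r - np.sign(r) * eps) / (x @ x)) * x

      def prda_sketch(X, y, eps=0.1, lam=0.01, beta=1.0, sweeps=1):
          # RDA-style loop: average the gradients w - P_C(w) of the smooth
          # loss (1/2) d^2(w, C_t), then apply the closed-form RDA step for
          # an l1 regularizer (soft-thresholding of the averaged gradient).
          # Step size and regularization are held constant, per the abstract.
          n, d = X.shape
          w = np.zeros(d)
          g_bar = np.zeros(d)               # running average of gradients
          t = 0
          for _ in range(sweeps):
              for i in range(n):
                  t += 1
                  g = w - project_hyperslab(w, X[i], y[i], eps)
                  g_bar += (g - g_bar) / t  # incremental mean update
                  w = -(t / beta) * np.sign(g_bar) * np.maximum(np.abs(g_bar) - lam, 0.0)
          return w

      Because the loss (1/2) d^2(w, C_t) is smooth, its gradient w - P_C(w) vanishes once w enters the hyperslab, which is what permits the constant parameters used above.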
KW - Online learning
KW - orthogonal projection
KW - proximity operator
KW - regularized stochastic optimization
UR - http://www.scopus.com/inward/record.url?scp=85065095967&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85065095967&partnerID=8YFLogxK
U2 - 10.1109/TSP.2019.2908901
DO - 10.1109/TSP.2019.2908901
M3 - Article
AN - SCOPUS:85065095967
SN - 1053-587X
VL - 67
SP - 2720
EP - 2733
JO - IEEE Transactions on Signal Processing
JF - IEEE Transactions on Signal Processing
IS - 10
M1 - 8680689
ER -