We propose a novel stochastic-optimization framework based on the regularized dual averaging (RDA) method. The proposed approach differs from the previous studies of RDA in three major aspects. First, the squared-distance loss function to a 'random' closed convex set is employed for stability. Second, a sparsity-promoting metric (used implicitly by a certain proportionate-type adaptive filtering algorithm) and a quadratically-weighted \ell -1 regularizer are used simultaneously. Third, the step size and regularization parameters are both constant due to the smoothness of the loss function. These three differences yield an excellent sparsity-seeking property, high estimation accuracy, and insensitivity to the choice of the regularization parameter. Numerical examples show the remarkable advantages of the proposed method over the existing methods (including AdaGrad and the adaptive proximal forward-backward splitting method) in applications to regression and classification with real/synthetic data.
- Online learning
- orthogonal projection
- proximity operator
- regularized stochastic optimization
ASJC Scopus subject areas
- Signal Processing
- Electrical and Electronic Engineering