Projection-Based Regularized Dual Averaging for Stochastic Optimization

Asahi Ushio, Masahiro Yukawa

Research output: Contribution to journal › Article

Abstract

We propose a novel stochastic-optimization framework based on the regularized dual averaging (RDA) method. The proposed approach differs from the previous studies of RDA in three major aspects. First, the squared-distance loss function to a 'random' closed convex set is employed for stability. Second, a sparsity-promoting metric (used implicitly by a certain proportionate-type adaptive filtering algorithm) and a quadratically-weighted ℓ1 regularizer are used simultaneously. Third, the step size and regularization parameters are both constant due to the smoothness of the loss function. These three differences yield an excellent sparsity-seeking property, high estimation accuracy, and insensitivity to the choice of the regularization parameter. Numerical examples show the remarkable advantages of the proposed method over the existing methods (including AdaGrad and the adaptive proximal forward-backward splitting method) in applications to regression and classification with real/synthetic data.
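The ingredients named in the abstract can be illustrated with a minimal sketch: a dual-averaging loop driven by the gradient of the squared distance to a random hyperplane (the gradient is simply x minus its orthogonal projection), with an ℓ1 proximity step. This is an illustrative simplification under assumed names and parameters, not the paper's exact algorithm: it omits the sparsity-promoting metric and uses a plain (rather than quadratically-weighted) ℓ1 regularizer.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximity operator of tau * ||.||_1 (componentwise soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def project_hyperplane(x, a, b):
    """Orthogonal projection of x onto the hyperplane {z : <a, z> = b}."""
    return x - ((a @ x - b) / (a @ a)) * a

def projection_rda(A, b, step=0.5, lam=1e-2, iters=500, seed=0):
    """Sketch of RDA with a squared-distance loss: each sample (a_t, b_t)
    defines a random convex set C_t = {z : <a_t, z> = b_t}, and the
    gradient of 0.5 * d^2(x, C_t) is x - P_{C_t}(x)."""
    rng = np.random.default_rng(seed)
    x = np.zeros(A.shape[1])
    g_sum = np.zeros_like(x)         # running sum of stochastic gradients
    for t in range(1, iters + 1):
        i = rng.integers(len(b))     # draw a random sample -> a random convex set
        g_sum += x - project_hyperplane(x, A[i], b[i])
        # dual-averaging step: argmin_z <g_sum, z> + lam*t*||z||_1 + ||z||^2/(2*step),
        # with constant step size and regularization parameter
        x = soft_threshold(-step * g_sum, step * lam * t)
    return x

# toy sparse regression: only the first coefficient is nonzero
rng = np.random.default_rng(1)
A = rng.standard_normal((40, 5))
x_true = np.array([1.0, 0.0, 0.0, 0.0, 0.0])
b = A @ x_true
x_hat = projection_rda(A, b)
```

Because the squared-distance loss is smooth, the averaged gradients settle down and the constant step/regularization pair suffices here; the soft-thresholding of the accumulated gradient is what produces exact zeros in the inactive coefficients.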

Original language: English
Article number: 8680689
Pages (from-to): 2720-2733
Number of pages: 14
Journal: IEEE Transactions on Signal Processing
Volume: 67
Issue number: 10
DOIs: 10.1109/TSP.2019.2908901
Publication status: Published - 2019 May 15

Keywords

  • Online learning
  • orthogonal projection
  • proximity operator
  • regularized stochastic optimization

ASJC Scopus subject areas

  • Signal Processing
  • Electrical and Electronic Engineering

Cite this

Projection-Based Regularized Dual Averaging for Stochastic Optimization. / Ushio, Asahi; Yukawa, Masahiro.

In: IEEE Transactions on Signal Processing, Vol. 67, No. 10, 8680689, 15.05.2019, p. 2720-2733.

Research output: Contribution to journal › Article

@article{9e35cd62522e4db1b36c6bc5ed5d03c7,
title = "Projection-Based Regularized Dual Averaging for Stochastic Optimization",
abstract = "We propose a novel stochastic-optimization framework based on the regularized dual averaging (RDA) method. The proposed approach differs from the previous studies of RDA in three major aspects. First, the squared-distance loss function to a 'random' closed convex set is employed for stability. Second, a sparsity-promoting metric (used implicitly by a certain proportionate-type adaptive filtering algorithm) and a quadratically-weighted $\ell_1$ regularizer are used simultaneously. Third, the step size and regularization parameters are both constant due to the smoothness of the loss function. These three differences yield an excellent sparsity-seeking property, high estimation accuracy, and insensitivity to the choice of the regularization parameter. Numerical examples show the remarkable advantages of the proposed method over the existing methods (including AdaGrad and the adaptive proximal forward-backward splitting method) in applications to regression and classification with real/synthetic data.",
keywords = "Online learning, orthogonal projection, proximity operator, regularized stochastic optimization",
author = "Asahi Ushio and Masahiro Yukawa",
year = "2019",
month = "5",
day = "15",
doi = "10.1109/TSP.2019.2908901",
language = "English",
volume = "67",
pages = "2720--2733",
journal = "IEEE Transactions on Signal Processing",
issn = "1053-587X",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
number = "10",

}

TY - JOUR

T1 - Projection-Based Regularized Dual Averaging for Stochastic Optimization

AU - Ushio, Asahi

AU - Yukawa, Masahiro

PY - 2019/5/15

Y1 - 2019/5/15

N2 - We propose a novel stochastic-optimization framework based on the regularized dual averaging (RDA) method. The proposed approach differs from the previous studies of RDA in three major aspects. First, the squared-distance loss function to a 'random' closed convex set is employed for stability. Second, a sparsity-promoting metric (used implicitly by a certain proportionate-type adaptive filtering algorithm) and a quadratically-weighted ℓ1 regularizer are used simultaneously. Third, the step size and regularization parameters are both constant due to the smoothness of the loss function. These three differences yield an excellent sparsity-seeking property, high estimation accuracy, and insensitivity to the choice of the regularization parameter. Numerical examples show the remarkable advantages of the proposed method over the existing methods (including AdaGrad and the adaptive proximal forward-backward splitting method) in applications to regression and classification with real/synthetic data.

AB - We propose a novel stochastic-optimization framework based on the regularized dual averaging (RDA) method. The proposed approach differs from the previous studies of RDA in three major aspects. First, the squared-distance loss function to a 'random' closed convex set is employed for stability. Second, a sparsity-promoting metric (used implicitly by a certain proportionate-type adaptive filtering algorithm) and a quadratically-weighted ℓ1 regularizer are used simultaneously. Third, the step size and regularization parameters are both constant due to the smoothness of the loss function. These three differences yield an excellent sparsity-seeking property, high estimation accuracy, and insensitivity to the choice of the regularization parameter. Numerical examples show the remarkable advantages of the proposed method over the existing methods (including AdaGrad and the adaptive proximal forward-backward splitting method) in applications to regression and classification with real/synthetic data.

KW - Online learning

KW - orthogonal projection

KW - proximity operator

KW - regularized stochastic optimization

UR - http://www.scopus.com/inward/record.url?scp=85065095967&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85065095967&partnerID=8YFLogxK

U2 - 10.1109/TSP.2019.2908901

DO - 10.1109/TSP.2019.2908901

M3 - Article

AN - SCOPUS:85065095967

VL - 67

SP - 2720

EP - 2733

JO - IEEE Transactions on Signal Processing

JF - IEEE Transactions on Signal Processing

SN - 1053-587X

IS - 10

M1 - 8680689

ER -