We propose a novel online learning paradigm for nonlinear-function estimation tasks based on iterative projections in the $L^2$ space with a probability measure reflecting the stochastic properties of the input signals. The proposed learning algorithm exploits the reproducing kernel of the so-called dictionary subspace, based on the fact that any finite-dimensional space of functions has a reproducing kernel characterized by the Gram matrix. The $L^2$-space geometry provides the best decorrelation property in principle. The proposed learning paradigm differs significantly from the conventional kernel-based learning paradigm in two respects: first, the whole space is not a reproducing kernel Hilbert space; and second, the minimum mean squared error estimator gives the best approximation of the desired nonlinear function in the dictionary subspace. The proposed paradigm preserves efficiency in computing the inner product as well as in updating the Gram matrix when the dictionary grows. Monotone approximation, asymptotic optimality, and convergence of the proposed algorithm are analyzed based on the variable-metric version of the adaptive projected subgradient method. Numerical examples on real data show the efficacy of the proposed algorithm compared with a variety of methods, including the extended Kalman filter and many batch machine-learning methods such as the multilayer perceptron.
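The fact that a finite-dimensional function space has a reproducing kernel characterized by its Gram matrix can be checked numerically. The sketch below is only an illustration under stated assumptions, not the algorithm of the paper: the dictionary atoms are Gaussians (a hypothetical choice), the $L^2$ inner product is approximated by a sample average over draws from an assumed standard normal input distribution, and the names `gaussian_dictionary`, `kernel`, and `f` are introduced here for illustration. With Gram matrix entries $G_{ij} = \langle \varphi_i, \varphi_j \rangle$, the kernel $\kappa(x, y) = \varphi(x)^\top G^{-1} \varphi(y)$ should satisfy $\langle f, \kappa(\cdot, y) \rangle = f(y)$ for every $f$ in the span of the atoms.

```python
import numpy as np

def gaussian_dictionary(centers, width):
    """Return a map x -> [phi_1(x), ..., phi_r(x)] of Gaussian atoms (assumed choice)."""
    centers = np.asarray(centers, dtype=float)

    def phi(x):
        x = np.atleast_1d(np.asarray(x, dtype=float))
        # Row n holds all atoms evaluated at x[n]; result has shape (len(x), r).
        return np.exp(-(x[:, None] - centers[None, :]) ** 2 / (2.0 * width ** 2))

    return phi

rng = np.random.default_rng(0)
samples = rng.normal(size=2000)      # draws from the assumed input distribution
phi = gaussian_dictionary(centers=[-1.0, 0.0, 1.0], width=0.7)

Phi = phi(samples)                   # shape (N, r)
G = Phi.T @ Phi / len(samples)       # Gram matrix under the empirical L^2 inner product

def kernel(x, y):
    """kappa(x, y) = phi(x)^T G^{-1} phi(y), evaluated for all pairs of inputs."""
    return phi(x) @ np.linalg.solve(G, phi(y).T)

# An arbitrary function in the dictionary subspace: f = sum_i c_i * phi_i.
c = np.array([0.5, -1.0, 2.0])
f = lambda x: phi(x) @ c

# Reproducing property: the inner product of f with kappa(., y) recovers f(y).
y = 0.3
inner = np.mean(f(samples) * kernel(samples, y)[:, 0])
f_at_y = f(y)[0]
print(np.isclose(inner, f_at_y))
```

Because the same sample average defines both the Gram matrix and the inner product, the identity holds up to floating-point error rather than only approximately. Replacing the sample average with the true input distribution would change `G` but not the form of the kernel.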