TY - JOUR
T1 - Penalized optimal scoring for the classification of multi-dimensional functional data
AU - Ando, Tomohiro
N1 - Copyright 2009 Elsevier B.V. All rights reserved.
PY - 2009/11
Y1 - 2009/11
AB - Many fields of research need to classify individual systems based on one or more data series, each obtained by sampling an unknown continuous curve with noise. In other words, the underlying process is an unknown function that the observed variables represent only imperfectly. Although functional logistic regression has many attractive features for this classification problem, it is applicable only when the number of individuals to be classified (or available to estimate the model) is large compared to the number of curves sampled per individual. To overcome this limitation, we use penalized optimal scoring to construct a new method for the classification of multi-dimensional functional data. The proposed method consists of two stages. First, the series of discrete values observed for each individual are expressed as a set of continuous curves. Next, the penalized optimal scoring model is estimated on the basis of these curves. A similar penalized optimal scoring method was described in my previous work, but that model is not suitable for the analysis of continuous functions. In this paper we adopt a Gaussian kernel approach to extend the previous model. The high accuracy of the new method is demonstrated in Monte Carlo simulations, and the method is then used to predict defaulting firms on the Japanese Stock Exchange.
KW - Functional data analysis
KW - Kernel methods
KW - Optimal scoring
KW - Penalized least squares
UR - http://www.scopus.com/inward/record.url?scp=70350774370&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=70350774370&partnerID=8YFLogxK
U2 - 10.1016/j.stamet.2009.06.003
DO - 10.1016/j.stamet.2009.06.003
M3 - Article
AN - SCOPUS:70350774370
SN - 1572-3127
VL - 6
SP - 565
EP - 576
JO - Statistical Methodology
JF - Statistical Methodology
IS - 6
ER -