Penalized optimal scoring for the classification of multi-dimensional functional data

Tomohiro Ando

Research output: Contribution to journal › Article

8 Citations (Scopus)

Abstract

Many fields of research need to classify individual systems based on one or more data series, which are obtained by sampling an unknown continuous curve with noise. In other words, the underlying process is an unknown function which the observed variables represent only imperfectly. Although functional logistic regression has many attractive features for this classification problem, this method is applicable only when the number of individuals to be classified (or available to estimate the model) is large compared to the number of curves sampled per individual. To overcome this limitation, we use penalized optimal scoring to construct a new method for the classification of multi-dimensional functional data. The proposed method consists of two stages. First, the series of observed discrete values available for each individual are expressed as a set of continuous curves. Next, the penalized optimal scoring model is estimated on the basis of these curves. A similar penalized optimal scoring method was described in my previous work, but that model is not suitable for the analysis of continuous functions. In this paper we adopt a Gaussian kernel approach to extend the previous model. The high accuracy of the new method is demonstrated in Monte Carlo simulations, and the method is then used to predict defaulting firms on the Japanese Stock Exchange.
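
The two stages described in the abstract can be illustrated with a small sketch. This is not the estimator from the paper: it assumes a single curve per individual observed on a common grid, plain ridge penalties in both stages, and only two classes, and it runs on synthetic data. The function names and all parameter values (`width`, `gamma`, `lam`, the number of basis functions) are hypothetical choices made for illustration only.

```python
import numpy as np

def gaussian_basis(t, centers, width):
    """Evaluate Gaussian basis functions at points t (rows: points, columns: basis functions)."""
    return np.exp(-((t[:, None] - centers[None, :]) ** 2) / (2.0 * width ** 2))

def smooth_curves(obs, t, centers, width, gamma):
    """Stage 1: ridge-penalized least-squares fit of each noisy series.
    obs has shape (n_individuals, n_timepoints); returns one coefficient vector per individual."""
    Phi = gaussian_basis(t, centers, width)
    A = Phi.T @ Phi + gamma * np.eye(len(centers))
    return np.linalg.solve(A, Phi.T @ obs.T).T

def fit_two_class_scoring(W, labels, lam):
    """Stage 2: penalized optimal scoring for two classes (labels 0/1) on the curve coefficients."""
    p1 = labels.mean()
    p0 = 1.0 - p1
    # Class scores with zero mean and unit variance under the class proportions.
    theta = np.array([np.sqrt(p1 / p0), -np.sqrt(p0 / p1)])
    z = theta[labels]
    mean = W.mean(axis=0)
    Wc = W - mean
    # Ridge regression of the scored response on the centered coefficients.
    beta = np.linalg.solve(Wc.T @ Wc + lam * np.eye(W.shape[1]), Wc.T @ z)
    fitted = Wc @ beta
    centroids = np.array([fitted[labels == k].mean() for k in (0, 1)])
    return mean, beta, centroids

def predict(W, model):
    """Assign each individual to the class whose fitted-score centroid is nearest."""
    mean, beta, centroids = model
    scores = (W - mean) @ beta
    return np.argmin(np.abs(scores[:, None] - centroids[None, :]), axis=1)

# Synthetic example: two groups of noisy sine curves that differ by a phase shift.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 50)
n = 100
labels = np.repeat([0, 1], n // 2)
signal = np.where(labels[:, None] == 0,
                  np.sin(2 * np.pi * t),
                  np.sin(2 * np.pi * t + 0.6))
obs = signal + rng.normal(scale=0.3, size=(n, t.size))

centers = np.linspace(0.0, 1.0, 8)
W = smooth_curves(obs, t, centers, width=0.15, gamma=1e-2)

train = np.arange(n) % 2 == 0          # every other individual is used for fitting
model = fit_two_class_scoring(W[train], labels[train], lam=1e-1)
accuracy = (predict(W[~train], model) == labels[~train]).mean()
print(f"hold-out accuracy: {accuracy:.2f}")
```

Working on the basis coefficients rather than the raw samples is what lets the second stage treat each individual as a smooth curve; in practice the penalty parameters would be chosen by a data-driven criterion rather than fixed by hand as they are here.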

Original language: English
Pages (from-to): 565-576
Number of pages: 12
Journal: Statistical Methodology
Volume: 6
Issue number: 6
DOI: 10.1016/j.stamet.2009.06.003
Publication status: Published - 2009 Nov

Keywords

  • Functional data analysis
  • Kernel methods
  • Optimal scoring
  • Penalized least squares

ASJC Scopus subject areas

  • Statistics and Probability

Cite this

Penalized optimal scoring for the classification of multi-dimensional functional data. / Ando, Tomohiro.

In: Statistical Methodology, Vol. 6, No. 6, 11.2009, p. 565-576.

@article{94d0d2824ddb45449f8ce0100c1f5cfb,
title = "Penalized optimal scoring for the classification of multi-dimensional functional data",
abstract = "Many fields of research need to classify individual systems based on one or more data series, which are obtained by sampling an unknown continuous curve with noise. In other words, the underlying process is an unknown function which the observed variables represent only imperfectly. Although functional logistic regression has many attractive features for this classification problem, this method is applicable only when the number of individuals to be classified (or available to estimate the model) is large compared to the number of curves sampled per individual. To overcome this limitation, we use penalized optimal scoring to construct a new method for the classification of multi-dimensional functional data. The proposed method consists of two stages. First, the series of observed discrete values available for each individual are expressed as a set of continuous curves. Next, the penalized optimal scoring model is estimated on the basis of these curves. A similar penalized optimal scoring method was described in my previous work, but this model is not suitable for the analysis of continuous functions. In this paper we adopt a Gaussian kernel approach to extend the previous model. The high accuracy of the new method is demonstrated on Monte Carlo simulations, and used to predict defaulting firms on the Japanese Stock Exchange.",
keywords = "Functional data analysis, Kernel methods, Optimal scoring, Penalized least squares",
author = "Tomohiro Ando",
year = "2009",
month = "11",
doi = "10.1016/j.stamet.2009.06.003",
language = "English",
volume = "6",
pages = "565--576",
journal = "Statistical Methodology",
issn = "1572-3127",
publisher = "Elsevier",
number = "6",

}

TY - JOUR

T1 - Penalized optimal scoring for the classification of multi-dimensional functional data

AU - Ando, Tomohiro

PY - 2009/11

Y1 - 2009/11

AB - Many fields of research need to classify individual systems based on one or more data series, which are obtained by sampling an unknown continuous curve with noise. In other words, the underlying process is an unknown function which the observed variables represent only imperfectly. Although functional logistic regression has many attractive features for this classification problem, this method is applicable only when the number of individuals to be classified (or available to estimate the model) is large compared to the number of curves sampled per individual. To overcome this limitation, we use penalized optimal scoring to construct a new method for the classification of multi-dimensional functional data. The proposed method consists of two stages. First, the series of observed discrete values available for each individual are expressed as a set of continuous curves. Next, the penalized optimal scoring model is estimated on the basis of these curves. A similar penalized optimal scoring method was described in my previous work, but this model is not suitable for the analysis of continuous functions. In this paper we adopt a Gaussian kernel approach to extend the previous model. The high accuracy of the new method is demonstrated on Monte Carlo simulations, and used to predict defaulting firms on the Japanese Stock Exchange.

KW - Functional data analysis

KW - Kernel methods

KW - Optimal scoring

KW - Penalized least squares

UR - http://www.scopus.com/inward/record.url?scp=70350774370&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=70350774370&partnerID=8YFLogxK

U2 - 10.1016/j.stamet.2009.06.003

DO - 10.1016/j.stamet.2009.06.003

M3 - Article

AN - SCOPUS:70350774370

VL - 6

SP - 565

EP - 576

JO - Statistical Methodology

JF - Statistical Methodology

SN - 1572-3127

IS - 6

ER -