Optimal estimators of principal points for minimizing expected mean squared distance

Shun Matsuura, Hiroshi Kurata, Thaddeus Tarpey

Research output: Contribution to journal › Article

3 Citations (Scopus)

Abstract

k-Principal points of a random variable are k points that minimize the mean squared distance (MSD) between the random variable and the nearest of the k points. This paper focuses on finding optimal estimators of principal points in terms of the expected mean squared distance (EMSD) between the random variable and the nearest principal point estimator. These estimators are compared with nonparametric and maximum likelihood estimators. It turns out that a minimum EMSD estimator of k-principal points of univariate normal distributions is determined by the k-principal points of the t-distribution with n+1 degrees of freedom, where n is the sample size. Extensions of the results to location-scale families, multivariate distributions, and principal surfaces are also discussed.
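To make the definition concrete, the following is a minimal numerical sketch, not the paper's method. It approximates the k-principal points of a univariate density with a Lloyd-type (k-means style) iteration on a fine grid and evaluates the resulting MSD. The function names (`principal_points`, `msd`, `normal_pdf`, `t_pdf`), the grid settings, and the sample size `n = 10` are all illustrative assumptions. For the standard normal distribution the 2-principal points are known to be ±sqrt(2/π), which gives a check on the sketch. The last lines compute the 2-principal points of the t-distribution with n+1 degrees of freedom; how these determine the minimum EMSD estimator is given in the paper and is not reproduced here.

```python
import math
import numpy as np


def principal_points(pdf, k, lo=-40.0, hi=40.0, grid_size=400_001, max_iter=500):
    """Approximate the k-principal points of a univariate density on a grid.

    Lloyd-type iteration: assign each grid point to its nearest point, then
    replace every point by the conditional mean of its cell.
    """
    x = np.linspace(lo, hi, grid_size)
    w = pdf(x)
    w = w / w.sum()  # discretised probability weights
    # Start from the mid-quantiles of the discretised distribution.
    cdf = np.cumsum(w)
    points = np.interp((np.arange(k) + 0.5) / k, cdf, x)
    for _ in range(max_iter):
        edges = (points[:-1] + points[1:]) / 2.0  # cell boundaries
        cell = np.searchsorted(edges, x)          # index of nearest point
        mass = np.bincount(cell, weights=w, minlength=k)
        moment = np.bincount(cell, weights=w * x, minlength=k)
        updated = moment / mass                   # conditional means
        if np.max(np.abs(updated - points)) < 1e-12:
            return updated
        points = updated
    return points


def msd(pdf, points, lo=-40.0, hi=40.0, grid_size=400_001):
    """Mean squared distance between X ~ pdf and the nearest of the points."""
    x = np.linspace(lo, hi, grid_size)
    w = pdf(x)
    w = w / w.sum()
    d2 = np.min((x[:, None] - np.asarray(points)[None, :]) ** 2, axis=1)
    return float(np.sum(w * d2))


def normal_pdf(x):
    """Standard normal density."""
    return np.exp(-0.5 * x ** 2) / math.sqrt(2.0 * math.pi)


def t_pdf(df):
    """Return the density of Student's t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2.0) - math.lgamma(df / 2.0)
             - 0.5 * math.log(df * math.pi))
    return lambda x: np.exp(log_c - (df + 1) / 2.0 * np.log1p(x ** 2 / df))


normal_pts = principal_points(normal_pdf, 2)
print("2-principal points of N(0, 1):", normal_pts)
print("MSD at those points:", msd(normal_pdf, normal_pts))

n = 10  # hypothetical sample size, for illustration only
t_pts = principal_points(t_pdf(n + 1), 2)
print("2-principal points of t with n+1 df:", t_pts)
```

A Lloyd-type iteration only guarantees a self-consistent set of points, so for other densities or larger k it may be worth trying several starting values and keeping the set with the smallest MSD.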

Original language: English
Pages (from-to): 102-122
Number of pages: 21
Journal: Journal of Statistical Planning and Inference
Volume: 167
DOI: 10.1016/j.jspi.2015.05.005
Publication status: Published - 2015 Dec 1

Fingerprint

Principal Points
Random variables
Estimator
Random variable
Normal distribution
Maximum likelihood
Nonparametric Likelihood
Location-scale Family
t-distribution
Multivariate Distribution
Maximum Likelihood Estimator
Univariate
Gaussian distribution
Sample Size
Degree of freedom
Minimise

Keywords

  • Elliptical distributions
  • K-means clustering
  • Location-scale family
  • Normal distribution
  • Principal curves and surfaces
  • Self-consistency
  • T-distribution

ASJC Scopus subject areas

  • Statistics, Probability and Uncertainty
  • Applied Mathematics
  • Statistics and Probability

Cite this

Optimal estimators of principal points for minimizing expected mean squared distance. / Matsuura, Shun; Kurata, Hiroshi; Tarpey, Thaddeus.

In: Journal of Statistical Planning and Inference, Vol. 167, 01.12.2015, p. 102-122.

@article{c8ae55f229534ca29b8929c6ded1f9bc,
title = "Optimal estimators of principal points for minimizing expected mean squared distance",
abstract = "k-Principal points of a random variable are k points that minimize the mean squared distance (MSD) between the random variable and the nearest of the k points. This paper focuses on finding optimal estimators of principal points in terms of the expected mean squared distance (EMSD) between the random variable and the nearest principal point estimator. These estimators are compared with nonparametric and maximum likelihood estimators. It turns out that a minimum EMSD estimator of k-principal points of univariate normal distributions is determined by the k-principal points of the t-distribution with n+1 degrees of freedom, where n is the sample size. Extensions of the results to location-scale families, multivariate distributions, and principal surfaces are also discussed.",
keywords = "Elliptical distributions, K-means clustering, Location-scale family, Normal distribution, Principal curves and surfaces, Self-consistency, T-distribution",
author = "Shun Matsuura and Hiroshi Kurata and Thaddeus Tarpey",
year = "2015",
month = "12",
day = "1",
doi = "10.1016/j.jspi.2015.05.005",
language = "English",
volume = "167",
pages = "102--122",
journal = "Journal of Statistical Planning and Inference",
issn = "0378-3758",
publisher = "Elsevier",

}

TY - JOUR

T1 - Optimal estimators of principal points for minimizing expected mean squared distance

AU - Matsuura, Shun

AU - Kurata, Hiroshi

AU - Tarpey, Thaddeus

PY - 2015/12/1

Y1 - 2015/12/1

AB - k-Principal points of a random variable are k points that minimize the mean squared distance (MSD) between the random variable and the nearest of the k points. This paper focuses on finding optimal estimators of principal points in terms of the expected mean squared distance (EMSD) between the random variable and the nearest principal point estimator. These estimators are compared with nonparametric and maximum likelihood estimators. It turns out that a minimum EMSD estimator of k-principal points of univariate normal distributions is determined by the k-principal points of the t-distribution with n+1 degrees of freedom, where n is the sample size. Extensions of the results to location-scale families, multivariate distributions, and principal surfaces are also discussed.

KW - Elliptical distributions

KW - K-means clustering

KW - Location-scale family

KW - Normal distribution

KW - Principal curves and surfaces

KW - Self-consistency

KW - T-distribution

UR - http://www.scopus.com/inward/record.url?scp=84945438926&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84945438926&partnerID=8YFLogxK

U2 - 10.1016/j.jspi.2015.05.005

DO - 10.1016/j.jspi.2015.05.005

M3 - Article

AN - SCOPUS:84945438926

VL - 167

SP - 102

EP - 122

JO - Journal of Statistical Planning and Inference

JF - Journal of Statistical Planning and Inference

SN - 0378-3758

ER -