Abstract
k-Principal points of a random variable are k points that minimize the mean squared distance (MSD) between the random variable and the nearest of the k points. This paper focuses on finding optimal estimators of principal points in terms of the expected mean squared distance (EMSD) between the random variable and the nearest principal point estimator. These estimators are compared with nonparametric and maximum likelihood estimators. It turns out that a minimum EMSD estimator of k-principal points of univariate normal distributions is determined by the k-principal points of the t-distribution with n + 1 degrees of freedom, where n is the sample size. Extensions of the results to location-scale families, multivariate distributions, and principal surfaces are also discussed.
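To make the definition of k-principal points concrete, the following minimal sketch computes them for the standard normal distribution by iterating the self-consistency condition (each point equals the conditional mean of its nearest-point cell). This is an illustration of the population quantity only, not the minimum EMSD estimator studied in the paper. The function names, the starting points, and the iteration count are assumptions made for this example, and it relies on the assumption that for the normal distribution the self-consistent set reached by this iteration is the set of principal points.

```python
import math

def pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def cdf(x):
    """Standard normal distribution function (handles +/- infinity)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def truncated_mean(a, b):
    """E[X | a < X < b] for X ~ N(0, 1); a or b may be infinite."""
    pa = 0.0 if math.isinf(a) else pdf(a)
    pb = 0.0 if math.isinf(b) else pdf(b)
    return (pa - pb) / (cdf(b) - cdf(a))

def principal_points_std_normal(k, iters=500):
    """Hypothetical helper: self-consistency (Lloyd-type) iteration for N(0, 1)."""
    # Arbitrary symmetric starting points spread over [-2, 2].
    pts = [-2.0 + 4.0 * (i + 0.5) / k for i in range(k)]
    for _ in range(iters):
        # Cell boundaries are the midpoints between neighbouring points.
        cuts = [-math.inf]
        cuts += [(pts[i] + pts[i + 1]) / 2.0 for i in range(k - 1)]
        cuts += [math.inf]
        # Replace each point by the conditional mean of its cell.
        pts = [truncated_mean(cuts[i], cuts[i + 1]) for i in range(k)]
    return pts

if __name__ == "__main__":
    for k in (1, 2, 3):
        print(k, principal_points_std_normal(k))
```

For k = 2 the iteration returns the symmetric pair at plus and minus sqrt(2/pi), which is the conditional mean of a half-normal and a convenient check on the code.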
Original language | English |
---|---|
Pages (from-to) | 102-122 |
Number of pages | 21 |
Journal | Journal of Statistical Planning and Inference |
Volume | 167 |
Publication status | Published - 2015 Dec 1 |
Keywords
- Elliptical distributions
- k-means clustering
- Location-scale family
- Normal distribution
- Principal curves and surfaces
- Self-consistency
- t-distribution
ASJC Scopus subject areas
- Statistics and Probability
- Statistics, Probability and Uncertainty
- Applied Mathematics