Abstract
Spatial centrality, whereby samples closer to the center of a dataset tend to be closer to all other samples, is regarded as one source of hubness. Hubness is well known to degrade k-nearest-neighbor (k-NN) classification. When inner product similarity is used, spatial centrality can be removed by centering, i.e., shifting the origin to the global center of the dataset. However, when Euclidean distance is used, centering has no effect on spatial centrality because the distances between samples are the same before and after centering. In this paper, we propose a solution to the hubness problem for the case where Euclidean distance is used. We provide a theoretical explanation of how the solution eliminates spatial centrality and reduces hubness. We then discuss why the proposed solution works from the viewpoint of the density gradient, which is regarded as the origin of spatial centrality and hubness, and show that the solution corresponds to flattening the density gradient. Using real-world datasets, we demonstrate that the proposed method improves k-NN classification performance and outperforms an existing hub-reduction method.
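The abstract's observation about centering can be checked numerically. The following is a minimal sketch, not the paper's proposed method: it uses synthetic Gaussian data, and the helper names `euclidean_distances` and `k_occurrence`, the neighborhood size `k = 5`, and the data shape are all assumptions chosen for illustration. It compares pairwise Euclidean distances and inner products before and after centering, and counts how often each sample appears in other samples' k-NN lists (the k-occurrence count commonly used to quantify hubness).

```python
import numpy as np


def euclidean_distances(X):
    """Pairwise Euclidean distances between the rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def k_occurrence(score, k, larger_is_closer):
    """Count how often each sample appears in other samples' k-NN lists.

    `score` is an (n, n) matrix of similarities (larger_is_closer=True)
    or distances (larger_is_closer=False). Self-matches are excluded.
    """
    S = score.astype(float).copy()
    # Push each sample to the end of its own ranking.
    np.fill_diagonal(S, -np.inf if larger_is_closer else np.inf)
    ranking = np.argsort(-S if larger_is_closer else S, axis=1)[:, :k]
    return np.bincount(ranking.ravel(), minlength=S.shape[0])


# Synthetic data whose global center is far from the origin (an assumption
# made so that centering visibly changes the inner products).
rng = np.random.default_rng(0)
X = rng.normal(loc=5.0, scale=1.0, size=(200, 10))
Xc = X - X.mean(axis=0)  # centering: shift the origin to the global center

k = 5  # hypothetical neighborhood size

# Euclidean distance is translation invariant, so centering changes nothing.
dist_change = np.abs(euclidean_distances(X) - euclidean_distances(Xc)).max()
nk_dist_before = k_occurrence(euclidean_distances(X), k, larger_is_closer=False)
nk_dist_after = k_occurrence(euclidean_distances(Xc), k, larger_is_closer=False)

# Inner products depend on the origin, so centering does change them.
ip_change = np.abs(X @ X.T - Xc @ Xc.T).max()
nk_ip_before = k_occurrence(X @ X.T, k, larger_is_closer=True)
nk_ip_after = k_occurrence(Xc @ Xc.T, k, larger_is_closer=True)

print("max change in Euclidean distance:", dist_change)
print("max change in inner product:", ip_change)
print("max k-occurrence (Euclidean) before/after:",
      nk_dist_before.max(), nk_dist_after.max())
print("max k-occurrence (inner product) before/after:",
      nk_ip_before.max(), nk_ip_after.max())
```

Under these assumptions, the Euclidean k-NN lists are the same before and after centering, which is why a different remedy is needed in the Euclidean case; the sketch does not implement that remedy, since the abstract does not specify it.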
Original language | English |
---|---|
Title of host publication | 30th AAAI Conference on Artificial Intelligence, AAAI 2016 |
Publisher | AAAI Press |
Pages | 1659-1665 |
Number of pages | 7 |
ISBN (Electronic) | 9781577357605 |
Publication status | Published - 2016 |
Externally published | Yes |
Event | 30th AAAI Conference on Artificial Intelligence, AAAI 2016 - Phoenix, United States; Duration: 2016 Feb 12 → 2016 Feb 17 |
Other
Other | 30th AAAI Conference on Artificial Intelligence, AAAI 2016 |
---|---|
Country | United States |
City | Phoenix |
Period | 2016 Feb 12 → 2016 Feb 17 |
ASJC Scopus subject areas
- Artificial Intelligence