On the selection of the globally optimal prototype subset for nearest-neighbor classification

E. Carrizosa, B. Martín-Barragán, F. Plastria, D.R. Morales

Research output: Contribution to journalArticlepeer-review

Abstract

The nearest-neighbor classifier has been shown to be a powerful tool for multiclass classification. We explore both theoretical properties and empirical behavior of a variant method, in which the nearest-neighbor rule is applied to a reduced set of prototypes. This set is selected a priori by fixing its cardinality and minimizing the empirical misclassification cost. In this way we alleviate the two serious drawbacks of the nearest- neighbor method: high storage requirements and time-consuming queries. Finding this reduced set is shown to be NP-hard. We provide mixed integer programming (MIP) formulations, which are theoretically compared and solved by a standard MIP solver for small problem instances. We show that the classifiers derived from these formulations are comparable to benchmark procedures. We solve large problem instances by a metaheuristic that yields good classification rules in reasonable time. Additional experiments indicate that prototype-based nearest-neighbor classifiers remain quite stable in the presence of missing values.
Original languageEnglish
Pages (from-to)470-479
Number of pages10
JournalINFORMS Journal on Computing
Volume19
Issue number3
DOIs
Publication statusPublished - 1 Jun 2007

Fingerprint Dive into the research topics of 'On the selection of the globally optimal prototype subset for nearest-neighbor classification'. Together they form a unique fingerprint.

Cite this