Abstract
In many situations, the choice of an adequate similarity measure or metric on the feature space largely determines the performance of machine learning methods. Building such measures automatically is the specific purpose of metric/similarity learning. In [21], similarity learning is formulated as a pairwise bipartite ranking problem: ideally, the larger the probability that two observations in the feature space belong to the same class (or share the same label), the higher the similarity measure between them. From this perspective, the ROC curve is an appropriate performance criterion, and the goal of this article is to extend recursive tree-based ROC optimization techniques in order to propose efficient similarity learning algorithms. The validity of such iterative partitioning procedures in the pairwise setting is established by means of results from the theory of U-processes. From a practical angle, it is discussed at length how to implement them by means of splitting rules specifically tailored to the similarity learning task. Beyond these theoretical/methodological contributions, numerical experiments are presented and provide strong empirical evidence of the performance of the proposed algorithmic approaches.
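The pairwise bipartite ranking view described in the abstract can be illustrated with a minimal sketch. It is not the tree-based procedure of the paper: it only evaluates a given similarity function by the area under its ROC curve, treating a pair as positive when both observations share a label. The names `pairwise_auc` and `neg_sq_dist`, and the toy data, are assumptions made for illustration.

```python
from itertools import combinations


def pairwise_auc(points, labels, similarity):
    """Empirical AUC of a similarity function over all pairs of observations.

    A pair is positive when both observations share the same label and
    negative otherwise. The AUC is the fraction of (positive, negative)
    pair combinations ranked correctly, with ties counted as 1/2.
    """
    pos, neg = [], []
    for i, j in combinations(range(len(points)), 2):
        score = similarity(points[i], points[j])
        (pos if labels[i] == labels[j] else neg).append(score)
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative pair")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def neg_sq_dist(x, y):
    # Hypothetical similarity: negative squared Euclidean distance.
    return -sum((a - b) ** 2 for a, b in zip(x, y))


# Toy data (assumed): two well-separated classes in the plane.
points = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.1, 1.0)]
labels = [0, 0, 1, 1]
auc = pairwise_auc(points, labels, neg_sq_dist)
constant_auc = pairwise_auc(points, labels, lambda x, y: 0.0)
```

On this toy sample every same-class pair is closer than every cross-class pair, so the distance-based similarity ranks all pairs correctly, while an uninformative constant similarity scores no better than chance.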
Original language | English |
---|---|
Title of host publication | Machine Learning, Optimization, and Data Science |
Editors | Giuseppe Nicosia, Panos Pardalos, Renato Umeton, Giovanni Giuffrida, Vincenzo Sciacca |
Place of Publication | Cham |
Publisher | Springer International Publishing |
Pages | 676-688 |
Number of pages | 13 |
ISBN (Electronic) | 978-3-030-37599-7 |
ISBN (Print) | 978-3-030-37598-0 |
DOIs | |
Publication status | Published - 3 Jan 2020 |
Event | Fifth International Conference on Machine Learning, Optimization, and Data Science - Tuscany, Italy. Duration: 10 Sep 2019 → 13 Sep 2019. https://lod2019.icas.xyz/ |
Publication series
Name | Lecture Notes in Computer Science |
---|---|
Publisher | Springer |
Volume | 11943 |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Conference
Conference | Fifth International Conference on Machine Learning, Optimization, and Data Science |
---|---|
Abbreviated title | LOD 2019 |
Country/Territory | Italy |
City | Tuscany |
Period | 10/09/19 → 13/09/19 |
Internet address |
Keywords
- Metric-learning
- Rate bound analysis
- Similarity-learning
- Tree-based algorithms
- U-processes