Abstract
We investigate the problem of fine-grained sketch-based image retrieval (SBIR), where free-hand human sketches are used as queries to perform instance-level retrieval of images. This is an extremely challenging task because (i) visual comparisons need not only to be fine-grained but also to be executed cross-domain, (ii) free-hand (finger) sketches are highly abstract, making fine-grained matching harder, and most importantly (iii) the annotated cross-domain sketch-photo datasets required for training are scarce, challenging many state-of-the-art machine learning techniques.
In this paper, for the first time, we address all these challenges, providing a step towards the capabilities that would underpin a commercial sketch-based image retrieval application. We introduce a new database of 1,432 sketch-photo pairs from two categories with 32,000 fine-grained triplet ranking annotations. We then develop a deep triplet-ranking model for instance-level SBIR with a novel data augmentation and staged pre-training strategy to alleviate the issue of insufficient fine-grained training data. Extensive experiments are carried out to contribute a variety of insights into the challenges of data sufficiency and over-fitting avoidance when training deep networks for fine-grained cross-domain ranking tasks.
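To make the idea of triplet ranking concrete, the following is a minimal sketch of a margin-based triplet loss on already-computed feature vectors. It is an illustration only, not the model described in the paper: the function names, the squared Euclidean distance, and the margin value of 1.0 are all assumptions chosen for the example, and the deep networks that would produce the embeddings are omitted.

```python
from typing import Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_ranking_loss(
    sketch: Sequence[float],
    positive: Sequence[float],
    negative: Sequence[float],
    margin: float = 1.0,  # hypothetical margin, not taken from the paper
) -> float:
    """Hinge loss that is zero once the sketch is closer to the positive
    photo than to the negative photo by at least `margin`."""
    d_pos = squared_distance(sketch, positive)
    d_neg = squared_distance(sketch, negative)
    return max(0.0, margin + d_pos - d_neg)


# Toy 2-D embeddings (made up for illustration).
sketch = [0.0, 1.0]
matching_photo = [0.0, 1.5]
other_photo = [2.0, 1.0]

# Correct ordering: the matching photo is much closer, so no penalty.
satisfied = triplet_ranking_loss(sketch, matching_photo, other_photo)

# Swapped ordering: the ranking constraint is violated and penalised.
violated = triplet_ranking_loss(sketch, other_photo, matching_photo)
```

In a trained system the three vectors would come from a learned embedding of the sketch and the two photos, and the loss would be averaged over many annotated triplets.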
Original language | English |
---|---|
Title of host publication | 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) |
Publisher | Institute of Electrical and Electronics Engineers |
Pages | 799-807 |
Number of pages | 9 |
ISBN (Electronic) | 978-1-4673-8851-1 |
ISBN (Print) | 978-1-4673-8852-8 |
Publication status | Published - 12 Dec 2016 |
Event | 29th IEEE Conference on Computer Vision and Pattern Recognition - Las Vegas, United States. Duration: 26 Jun 2016 → 1 Jul 2016. http://cvpr2016.thecvf.com/ |
Publication series
Name | |
---|---|
Publisher | IEEE |
ISSN (Electronic) | 1063-6919 |
Conference
Conference | 29th IEEE Conference on Computer Vision and Pattern Recognition |
---|---|
Abbreviated title | CVPR 2016 |
Country/Territory | United States |
City | Las Vegas |
Period | 26/06/16 → 1/07/16 |
Internet address | http://cvpr2016.thecvf.com/ |
Fingerprint
Dive into the research topics of 'Sketch Me That Shoe'. Together they form a unique fingerprint.
Profiles
- Timothy Hospedales
  - School of Informatics - Personal Chair of Artificial Intelligence
  - Institute of Perception, Action and Behaviour
  - Language, Interaction, and Robotics
  - Person: Academic: Research Active