Deep Spatial-Semantic Attention for Fine-Grained Sketch-Based Image Retrieval

Jifei Song, Qian Yu, Yi-Zhe Song, Tao Xiang, Timothy Hospedales

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract / Description of output

Human sketches are unique in capturing both the spatial topology of a visual object and its subtle appearance details. Fine-grained sketch-based image retrieval (FG-SBIR) leverages these fine-grained characteristics of sketches to perform instance-level retrieval of photos. However, human sketches are often highly abstract and iconic, which causes severe misalignment with candidate photos and in turn makes matching subtle visual details difficult. Existing FG-SBIR approaches focus only on coarse holistic matching via deep cross-domain representation learning and do not explicitly account for fine-grained details and their spatial context. In this paper, a novel deep FG-SBIR model is proposed that differs significantly from existing models in three ways: (1) it is spatially aware, achieved by introducing an attention module that is sensitive to the spatial position of visual details; (2) it combines coarse and fine semantic information via a shortcut connection fusion block; and (3) it models feature correlation and is robust to misalignments between the extracted features across the two domains by introducing a novel higher-order learnable energy function (HOLEF) based loss. Extensive experiments show that the proposed deep spatial-semantic attention model significantly outperforms the state-of-the-art.
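Points (1) and (2) of the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the layers, so the softmax-over-locations mask, the additive shortcut, and the concatenation-based fusion are all assumptions, and the names `spatial_attention`, `fuse_coarse_fine`, and `score_weights` are hypothetical. The HOLEF loss of point (3) is not covered.

```python
import numpy as np


def spatial_attention(feature_map, score_weights):
    """Reweight a (H, W, C) feature map with a softmax mask over locations.

    Assumption: one scalar score per location from a linear projection
    (score_weights has shape (C,)); the abstract does not give this detail.
    """
    height, width, channels = feature_map.shape
    scores = feature_map.reshape(-1, channels) @ score_weights  # (H*W,)
    scores = scores - scores.max()          # stabilise the softmax
    mask = np.exp(scores) / np.exp(scores).sum()
    mask = mask.reshape(height, width, 1)   # sums to 1 over all locations
    attended = feature_map * mask
    # Assumed shortcut connection: keep the original features alongside
    # the attended ones so a noisy mask cannot erase information.
    return feature_map + attended, mask


def fuse_coarse_fine(attended_map, coarse_vector):
    """Concatenate a pooled fine-grained feature with a coarse feature."""
    fine_vector = attended_map.mean(axis=(0, 1))  # global average pooling
    return np.concatenate([fine_vector, coarse_vector])


rng = np.random.default_rng(0)
fmap = rng.normal(size=(4, 4, 8))            # toy convolutional feature map
out, mask = spatial_attention(fmap, rng.normal(size=8))
descriptor = fuse_coarse_fine(out, rng.normal(size=16))
print(out.shape, mask.shape, descriptor.shape)
```

In a retrieval setting, the same function would be applied to the sketch branch and the photo branch, and the resulting descriptors compared by a distance or learned energy function.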
Original language: English
Title of host publication: The International Conference on Computer Vision (ICCV 2017)
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Number of pages: 10
ISBN (Electronic): 978-1-5386-1032-9
ISBN (Print): 978-1-5386-1033-6
Publication status: Published - 25 Dec 2017
Event: International Conference on Computer Vision Workshop 2017 - Venice, Italy
Duration: 22 Oct 2017 to 29 Oct 2017

Publication series

ISSN (Electronic): 2380-7504


Workshop: International Conference on Computer Vision Workshop 2017
Abbreviated title: ICCVW 2017

