Iterative Visual Relationship Detection via Commonsense Knowledge Graph

Hai Wan, Jialing Ou, Baoyi Wang, Jianfeng Du, Jeff Z. Pan, Juan Zeng

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Visual relationship detection, i.e., discovering the interactions between pairs of objects in an image, plays a significant role in image understanding. However, most recent works consider only visual features and ignore the implicit effect of common sense. Motivated by iterative visual reasoning in image recognition, we propose a novel model, named Iterative Visual Relationship Detection with Commonsense Knowledge Graph (IVRDC), that takes advantage of common sense, in the form of a knowledge graph, for visual relationship detection. Our model consists of two modules: a feature module that predicts predicates from visual and semantic features with a bi-directional RNN, and a commonsense knowledge module that constructs a specific commonsense knowledge graph for predicate prediction. After combining the predictions from both modules in each iteration, IVRDC updates the memory and the commonsense knowledge graph. The final predictions are made by taking the result of each iteration into account with an attention mechanism. Our experiments on the Visual Relationship Detection (VRD) dataset and the Visual Genome (VG) dataset demonstrate that our proposed model is competitive.
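
The last step described in the abstract, weighting the result of each iteration with an attention mechanism, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the attention weights are computed, so the sketch assumes they come from one scalar logit per iteration, and the function names `softmax` and `combine_iterations` are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def combine_iterations(per_iteration_scores, attention_logits):
    """Attention-weighted average of per-iteration predicate scores.

    per_iteration_scores: one list of predicate scores per iteration.
    attention_logits: one scalar per iteration (an assumption of this
    sketch; the abstract does not specify how attention is computed).
    """
    weights = softmax(attention_logits)
    n_predicates = len(per_iteration_scores[0])
    return [
        sum(w * scores[p] for w, scores in zip(weights, per_iteration_scores))
        for p in range(n_predicates)
    ]


# Two iterations scoring two candidate predicates, with equal attention.
final_scores = combine_iterations([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
```

With equal logits, every iteration contributes equally to the final scores; a larger logit shifts the result toward that iteration's prediction.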
Original language: English
Title of host publication: Semantic Technology
Subtitle of host publication: 9th Joint International Conference, JIST 2019, Hangzhou, China, November 25–27, 2019, Proceedings
Editors: Xin Wang, Francesca Alessandra Lisi, Guohui Xiao, Elena Botoeva
Place of Publication: Cham
Publisher: Springer
Pages: 210-225
Number of pages: 16
ISBN (Electronic): 978-3-030-41407-8
ISBN (Print): 978-3-030-41406-1
Publication status: Published - 14 Feb 2020
Event: The 9th Joint International Semantic Technology Conference - Hangzhou, China
Duration: 25 Nov 2019 to 27 Nov 2019
http://jist2019.openkg.cn/

Publication series

Name: Lecture Notes in Computer Science
Publisher: Springer, Cham
Volume: 12032
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: The 9th Joint International Semantic Technology Conference
Abbreviated title: JIST 2019
Country/Territory: China
City: Hangzhou
Period: 25/11/19 to 27/11/19

Keywords / Materials (for Non-textual outputs)

  • Commonsense knowledge graph
  • Visual relationship detection
  • Visual Genome
