Abstract
Image tagging is a well-known challenge in image processing. It is typically addressed through multi-instance multi-label (MIML) classification methodologies. Convolutional Neural Networks (CNNs) possess great potential to perform well on MIML tasks, since multi-level convolution and max pooling coincide with the multi-instance setting and the sharing of hidden representations may benefit multi-label modeling. However, CNNs usually require a large amount of carefully labeled data for training, which is hard to obtain in many real applications. In this paper, we propose a new approach for transferring deep networks pre-trained on ImageNet, such as VGG16, to small MIML tasks. We extract features from each group of the network layers and apply multiple binary classifiers to them for multi-label prediction. Moreover, we adopt L1-norm regularized Logistic Regression (L1LR) to find the most effective features for learning the multi-label classifiers. Experimental results on two of the most widely used, relatively small benchmark MIML image datasets demonstrate that the proposed approach can substantially outperform state-of-the-art algorithms on all popular performance metrics.
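The classification stage described in the abstract (one binary classifier per label, with an L1 penalty to select features) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the CNN features have already been extracted, and substitutes random synthetic vectors for them so that the example runs without a pre-trained network. The proximal-gradient solver and all names (`fit_l1_logreg`, `fit_multilabel`, `predict_multilabel`, `lam`, `lr`) are hypothetical choices made for illustration.

```python
import numpy as np

def soft_threshold(w, t):
    # Proximal operator of the L1 norm: shrinks weights toward zero
    # and sets small ones exactly to zero.
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)

def fit_l1_logreg(X, y, lam=0.1, lr=0.1, n_iter=500):
    # L1-regularized logistic regression fitted by proximal gradient
    # descent (an assumed solver, chosen for brevity).
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # predicted probabilities
        grad_w = X.T @ (p - y) / n
        grad_b = np.mean(p - y)
        w = soft_threshold(w - lr * grad_w, lr * lam)
        b -= lr * grad_b                          # bias is not penalized
    return w, b

def fit_multilabel(X, Y, **kwargs):
    # One independent binary classifier per label column.
    return [fit_l1_logreg(X, Y[:, k], **kwargs) for k in range(Y.shape[1])]

def predict_multilabel(models, X):
    return np.column_stack([(X @ w + b > 0).astype(int) for w, b in models])

# Synthetic stand-in for extracted CNN features (assumption: 30-dim vectors).
rng = np.random.default_rng(0)
n, d, n_labels = 200, 30, 3
X = rng.normal(size=(n, d))
# Each synthetic label depends on only two of the 30 features.
Y = np.column_stack(
    [(X[:, 2 * k] + X[:, 2 * k + 1] > 0).astype(int) for k in range(n_labels)]
)

models = fit_multilabel(X, Y)
pred = predict_multilabel(models, X)
accuracy = (pred == Y).mean()                     # training accuracy only
n_nonzero = [int(np.count_nonzero(w)) for w, _ in models]
print("training accuracy:", accuracy)
print("non-zero weights per label:", n_nonzero)
```

In this toy setting the L1 penalty leaves most weights at exactly zero, which is the feature-selection effect the abstract attributes to L1LR. With real data, `X` would hold the features taken from the network's layer groups.
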
Original language | English |
---|---|
Title of host publication | 2017 IEEE International Conference on Image Processing (ICIP) |
Publisher | Institute of Electrical and Electronics Engineers |
Pages | 1332-1336 |
Number of pages | 5 |
ISBN (Electronic) | 978-1-5090-2175-8 |
Publication status | Published - 22 Feb 2018 |
Event | 2017 IEEE International Conference on Image Processing, Beijing, China. Duration: 17 Sept 2017 → 20 Sept 2017. http://2017.ieeeicip.org/ |
Publication series
Name | |
---|---|
Publisher | IEEE |
ISSN (Electronic) | 2381-8549 |
Conference
Conference | 2017 IEEE International Conference on Image Processing |
---|---|
Abbreviated title | ICIP 2017 |
Country/Territory | China |
City | Beijing |
Period | 17/09/17 → 20/09/17 |