Abstract
We present a method for training CNN-based object class detectors directly using mean average precision (mAP) as the training loss, in a truly end-to-end fashion that includes non-maximum suppression (NMS) at training time. This contrasts with the traditional approach of training a CNN for a window classification loss, then applying NMS only at test time, when mAP is used as the evaluation metric in place of classification accuracy. However, mAP following NMS forms a piecewise-constant structured loss over thousands of windows, with gradients that do not convey useful information for gradient descent. Hence, we define new, general gradient-like quantities for piecewise-constant functions, which have wide applicability. We describe how to calculate these efficiently for mAP following NMS, enabling us to train a detector based on Fast R-CNN [1] directly for mAP. This model achieves equivalent performance to the standard Fast R-CNN on the PASCAL VOC 2007 and 2012 datasets, while being conceptually more appealing, as the very same model and loss are used at both training and test time.
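To illustrate why AP following NMS is piecewise constant in the window scores, the following is a minimal sketch, not the authors' implementation. It assumes greedy NMS with an IoU threshold of 0.5, a fixed true-positive flag per window (a simplification of the PASCAL VOC matching rule), and AP computed as the mean precision at each true-positive rank. The boxes, scores, and function names are hypothetical and chosen only for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def greedy_nms(boxes, scores, thresh=0.5):
    """Return indices of kept windows, in order of descending score."""
    order = sorted(range(len(boxes)), key=lambda i: -scores[i])
    keep = []
    for i in order:
        # Keep a window only if it does not overlap a higher-scoring kept one.
        if all(iou(boxes[i], boxes[j]) <= thresh for j in keep):
            keep.append(i)
    return keep


def average_precision(ranked_is_tp, num_positives):
    """Mean of the precision values at each true-positive rank."""
    hits, total = 0, 0.0
    for rank, is_tp in enumerate(ranked_is_tp, start=1):
        if is_tp:
            hits += 1
            total += hits / rank
    return total / num_positives


def ap_after_nms(boxes, scores, is_tp, num_positives):
    """AP of the ranked list that survives greedy NMS."""
    keep = greedy_nms(boxes, scores)
    return average_precision([is_tp[i] for i in keep], num_positives)


# Hypothetical windows: window 1 is a near-duplicate of window 0.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30), (40, 40, 50, 50)]
is_tp = [True, False, False, True]
scores = [0.9, 0.8, 0.6, 0.5]

base_ap = ap_after_nms(boxes, scores, is_tp, num_positives=2)

# A small score change that preserves the ranking leaves AP unchanged,
# so the gradient with respect to that score is zero almost everywhere.
small_ap = ap_after_nms(boxes, [0.9, 0.8, 0.6, 0.55], is_tp, num_positives=2)

# A larger change that swaps windows 2 and 3 makes AP jump to a new level.
large_ap = ap_after_nms(boxes, [0.9, 0.8, 0.6, 0.7], is_tp, num_positives=2)

print(base_ap, small_ap, large_ap)
```

In this toy setting the loss only changes when a score perturbation reorders windows or alters which windows NMS suppresses, which is the behaviour that motivates replacing the ordinary gradient with gradient-like quantities.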
Original language | English |
---|---|
Title of host publication | Computer Vision -- ACCV 2016 |
Publisher | Springer |
Pages | 198-213 |
Number of pages | 15 |
ISBN (Electronic) | 978-3-319-54193-8 |
ISBN (Print) | 978-3-319-54192-1 |
Publication status | Published - 11 Mar 2017 |
Event | 13th Asian Conference on Computer Vision, Taipei, Taiwan, Province of China. Duration: 20 Nov 2016 → 24 Nov 2016. http://www.accv2016.org/ |
Publication series
Name | Lecture Notes in Computer Science |
---|---|
Publisher | Springer, Cham |
Volume | 10115 |
ISSN (Print) | 0302-9743 |
Conference
Conference | 13th Asian Conference on Computer Vision |
---|---|
Abbreviated title | ACCV'16 |
Country/Territory | Taiwan, Province of China |
City | Taipei |
Period | 20/11/16 → 24/11/16 |