Improving Object Detection With One Line of Code

Navaneeth Bodla*, Bharat Singh*, Rama Chellappa, Larry S. Davis
Center for Automation Research, University of Maryland, College Park

Abstract

Non-maximum suppression is an integral part of the object detection pipeline. First, it sorts all detection boxes on the basis of their scores. The detection box M with the maximum score is selected and all other detection boxes with a significant overlap (using a pre-defined threshold) with M are suppressed. This process is recursively applied on the remaining boxes. As per the design of the algorithm, if an object lies within the predefined overlap threshold, it leads to a miss. To this end, we propose Soft-NMS, an algorithm which decays the detection scores of all other objects as a continuous function of their overlap with M. Hence, no object is eliminated in this process. Soft-NMS obtains consistent improvements for the COCO-style mAP metric on standard datasets like PASCAL VOC 2007 (1.7% for both R-FCN and Faster-RCNN) and MS-COCO (1.3% for R-FCN and 1.1% for Faster-RCNN) by just changing the NMS algorithm without any additional hyper-parameters. Using Deformable-RFCN, Soft-NMS improves state-of-the-art in object detection from 39.8% to 40.9% with a single model. Further, the computational complexity of Soft-NMS is the same as traditional NMS and hence it can be efficiently implemented. Since Soft-NMS does not require any extra training and is simple to implement, it can be easily integrated into any object detection pipeline. Code for Soft-NMS is publicly available on GitHub: http://bit.ly/2nJLNMu.

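Introduction to the NMS Algorithm

Object detection is a classic problem in computer vision: it produces bounding boxes for objects of specific categories and assigns each box a classification score. Traditional detection pipelines typically use a multi-scale sliding window and score every window against the foreground/background classifier of each object class, based on features computed in that window. Neighboring windows, however, tend to have correlated scores, which increases the number of false positives. To avoid this, non-maximum suppression (NMS) is applied as a post-processing step to obtain the final detections. To date, NMS remains a widely used post-processing step in object detection and is effective at reducing false positives.

In current detection frameworks (see Figure 1), every detection box receives its own score, so a single object in the image may correspond to several scored boxes. In that case, every box other than the best one (the one with the highest score) counts as a false positive. NMS addresses this by applying an overlap threshold separately for each object class.

Figure 1. Object detection pipeline with NMS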

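Traditional NMS starts with a list of detection boxes B for the image and their corresponding scores S. Once the detection box M with the maximum score is selected, it is removed from B and appended to the set of final detections D. At the same time, every box in B whose overlap with M exceeds the threshold Nt is removed as well. The main problem with NMS is that it forces the scores of neighboring detections to zero. If a real object is present in that overlap region, it will be missed, which lowers the average precision (AP) of the detector.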
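The procedure described above can be sketched in a few lines of plain Python. This is a minimal illustration and not the authors' code: the corner box format `(x1, y1, x2, y2)`, the helper names `iou` and `greedy_nms`, and the sample boxes are assumptions made for the example. A box is treated as suppressed when its overlap with M is at least Nt.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # assumed format: (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes: List[Box], scores: List[float], nt: float) -> List[int]:
    """Return the indices of the kept boxes (the set D), best score first."""
    # B, sorted by descending score
    remaining = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    while remaining:
        m = remaining.pop(0)  # M: highest-scoring box still in B
        keep.append(m)        # move M into D
        # Hard suppression: drop every box that overlaps M by Nt or more
        remaining = [i for i in remaining if iou(boxes[m], boxes[i]) < nt]
    return keep


# Hypothetical data: boxes 0 and 1 overlap heavily (IoU = 81/119), box 2 is far away
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = greedy_nms(boxes, scores, nt=0.5)  # kept == [0, 2]
```

With `nt=0.5`, box 1 disappears entirely even though its score is 0.8, which is exactly the failure mode the paragraph above describes.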

An alternative is to lower the scores of neighboring boxes as a function of their overlap with M instead of eliminating them. Their scores drop, but the neighboring boxes stay in the ranked list of detections. The example in Figure 2 illustrates the idea.

Figure 2. This image has two confident horse detections (shown in red and green) which have a score of 0.95 and 0.8 respectively. The green detection box has a significant overlap with the red one. Is it better to suppress the green box altogether and assign it a score of 0 or a slightly lower score of 0.4?

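Soft-NMS Improves the Average Precision of Object Detection

To address this problem, we propose a new algorithm, Soft-NMS (Figure 3), which improves on traditional greedy NMS by changing a single line of code. Instead of setting the scores of neighboring boxes to zero, it decays them with a function of their overlap with M. Put simply, a box that overlaps M heavily ends up with a very low score, while a box that overlaps M only slightly keeps most of its original score. On the standard PASCAL VOC and MS-COCO datasets, Soft-NMS gives existing detectors a clear gain in average precision, particularly where several objects overlap. Because Soft-NMS needs no extra training and is simple to implement, it is easy to integrate into existing detection pipelines.

Figure 3. The pseudo code in red is replaced with the one in green in Soft-NMS. We propose to revise the detection scores by scaling them as a linear or Gaussian function of overlap.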

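The pruning step of traditional NMS can be written as the following rescoring function:

$$
s_i =
\begin{cases}
s_i, & \mathrm{iou}(M, b_i) < N_t \\
0, & \mathrm{iou}(M, b_i) \ge N_t
\end{cases}
$$

An improved rescoring function should take the following conditions into account: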

• Score of neighboring detections should be decreased to an extent that they have a smaller likelihood of increasing the false positive rate, while being above obvious false positives in the ranked list of detections.

• Removing neighboring detections altogether with a low NMS threshold would be sub-optimal and would increase the miss-rate when evaluation is performed at high overlap thresholds.

• Average precision measured over a range of overlap thresholds would drop when a high NMS threshold is used.

Rescoring Functions for Soft-NMS: Decaying the scores of other detection boxes which have an overlap with M seems to be a promising approach for improving NMS. It is also clear that scores for detection boxes which have a higher overlap with M should be decayed more, as they have a higher likelihood of being false positives. Hence, we propose to update the pruning step with the following rule,

$$
s_i =
\begin{cases}
s_i, & \mathrm{iou}(M, b_i) < N_t \\
s_i \, (1 - \mathrm{iou}(M, b_i)), & \mathrm{iou}(M, b_i) \ge N_t
\end{cases}
$$

The above function would decay the scores of detections above a threshold Nt as a linear function of overlap with M. Hence, detection boxes which are far away from M would not be affected and those which are very close would be assigned a greater penalty.
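As a small illustration of the linear rule, the sketch below rescales one score given its overlap with M. The function name `linear_rescore` and the overlap values are assumptions chosen for the example; the document does not state the actual overlap of the two horse boxes.

```python
def linear_rescore(score: float, overlap: float, nt: float) -> float:
    """Linear Soft-NMS rule: untouched below Nt, scaled by (1 - overlap) at or above it."""
    if overlap < nt:
        return score
    return score * (1.0 - overlap)


# Hypothetical case: a box scored 0.8 that overlaps M by 0.5, with Nt = 0.3
decayed = linear_rescore(0.8, 0.5, nt=0.3)        # 0.8 * (1 - 0.5) = 0.4

# Just below and exactly at the threshold
below = linear_rescore(0.8, 0.29, nt=0.3)         # unchanged: 0.8
at_threshold = linear_rescore(0.8, 0.30, nt=0.3)  # 0.8 * 0.7, about 0.56
```

Under these assumed numbers the box keeps a score of 0.4 instead of being zeroed, and the last two calls show the score falling from 0.8 to about 0.56 as the overlap crosses Nt.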

However, it is not continuous in terms of overlap and a sudden penalty is applied when an NMS threshold of Nt is reached. It would be ideal if the penalty function was continuous, otherwise it could lead to abrupt changes to the ranked list of detections. A continuous penalty function should have no penalty when there is no overlap and very high penalty at a high overlap. Also, when the overlap is low, it should increase the penalty gradually, as M should not affect the scores of boxes which have a very low overlap with it. However, when overlap of a box bi with M becomes close to one, bi should be significantly penalized. Taking this into consideration, we propose to update the pruning step with a Gaussian penalty function as follows,

$$
s_i = s_i \, e^{-\frac{\mathrm{iou}(M, b_i)^2}{\sigma}}, \quad \forall b_i \notin D
$$

This update rule is applied in each iteration and scores of all remaining detection boxes are updated.
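A minimal sketch of the Gaussian penalty follows. The function name `gaussian_rescore` is an assumption, and the default `sigma=0.5` is taken from the setting the experiments section reports for the Gaussian function. The sample calls check the two properties asked for above: no penalty at zero overlap and a strong penalty as the overlap approaches one.

```python
import math


def gaussian_rescore(score: float, overlap: float, sigma: float = 0.5) -> float:
    """Gaussian Soft-NMS rule: score * exp(-overlap^2 / sigma), applied at every overlap."""
    return score * math.exp(-(overlap ** 2) / sigma)


no_overlap = gaussian_rescore(0.8, 0.0)    # exp(0) = 1, so the score stays 0.8
small_overlap = gaussian_rescore(0.8, 0.1)  # only slightly below 0.8
full_overlap = gaussian_rescore(0.8, 1.0)   # 0.8 * exp(-2), a heavy penalty
```

Because there is no threshold branch, the penalty grows smoothly with overlap and the ranked list does not change abruptly at any single overlap value.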

The Soft-NMS algorithm is formally described in Figure 3, where f(iou(M, bi)) is the overlap based weighting function. The computational complexity of each step in Soft-NMS is O(N), where N is the number of detection boxes. This is because scores for all detection boxes which have an overlap with M are updated. So, for N detection boxes, the computational complexity for Soft-NMS is O(N^2), which is the same as traditional greedy-NMS. Since NMS is not applied on all detection boxes (boxes with scores below a minimum threshold are pruned in each iteration), this step is not computationally expensive and hence does not affect the running time of current detectors.
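The loop described above can be sketched as follows. This is an illustrative implementation under stated assumptions, not the authors' released code: the corner box format, the names `soft_nms`, `gaussian` and `iou`, the pruning value `score_min=0.001`, and the sample boxes are all chosen for the example. Each iteration touches every remaining box once, which gives the O(N) step and O(N^2) total cost mentioned in the text.

```python
import math
from typing import Callable, List, Tuple

Box = Tuple[float, float, float, float]  # assumed format: (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def soft_nms(
    boxes: List[Box],
    scores: List[float],
    weight: Callable[[float], float],
    score_min: float = 0.001,  # assumed pruning threshold
) -> List[Tuple[int, float]]:
    """Return (index, final score) pairs in the order the boxes were selected."""
    s = list(scores)                     # work on a copy of S
    remaining = list(range(len(boxes)))  # B
    detections: List[Tuple[int, float]] = []  # D
    while remaining:
        m = max(remaining, key=lambda i: s[i])  # M: best remaining box
        remaining.remove(m)
        detections.append((m, s[m]))
        for i in remaining:                     # O(N) rescoring step
            s[i] *= weight(iou(boxes[m], boxes[i]))
        # Prune boxes whose score fell below the minimum threshold
        remaining = [i for i in remaining if s[i] >= score_min]
    return detections


def gaussian(sigma: float) -> Callable[[float], float]:
    """Gaussian weighting function f(overlap) = exp(-overlap^2 / sigma)."""
    return lambda overlap: math.exp(-(overlap ** 2) / sigma)


# Hypothetical data: boxes 0 and 1 overlap heavily, box 2 is far away
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
result = soft_nms(boxes, scores, gaussian(0.5))
order = [idx for idx, _ in result]  # [0, 2, 1]
```

In this example box 1 is not removed. Its score is decayed to roughly 0.32, so it drops below box 2 in the ranking but remains available as a detection.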

Note that Soft-NMS is also a greedy algorithm and does not find the globally optimal re-scoring of detection boxes. Re-scoring of detection boxes is performed in a greedy fashion and hence those detections which have a high local score are not suppressed. However, Soft-NMS is a generalized version of non-maximum suppression and traditional NMS is a special case of it with a discontinuous binary weighting function. Apart from the two proposed functions, other functions with more parameters can also be explored with Soft-NMS which take overlap and detection scores into account. For example, instances of the generalized logistic function like the Gompertz function can be used, but such functions would increase the number of hyper-parameters.
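To make the "special case" remark concrete, the sketch below runs one generic greedy rescoring loop with interchangeable weighting functions. Everything here is illustrative: the names `rescore`, `hard_weight` and `linear_weight`, the precomputed overlap matrix, and the sample values are assumptions. With the binary weighting function, suppressed boxes end with a score of exactly zero, which is the behavior of traditional NMS.

```python
from typing import Callable, List, Sequence

Weight = Callable[[float], float]


def hard_weight(nt: float) -> Weight:
    """Discontinuous binary weighting: traditional NMS."""
    return lambda overlap: 0.0 if overlap >= nt else 1.0


def linear_weight(nt: float) -> Weight:
    """Linear Soft-NMS weighting."""
    return lambda overlap: 1.0 - overlap if overlap >= nt else 1.0


def rescore(
    scores: Sequence[float],
    overlaps: Sequence[Sequence[float]],  # overlaps[i][j] = iou(box i, box j)
    weight: Weight,
) -> List[float]:
    """Greedy rescoring: repeatedly pick the best pending box and reweight the rest."""
    s = list(scores)
    pending = list(range(len(s)))
    while pending:
        m = max(pending, key=lambda i: s[i])
        pending.remove(m)
        for i in pending:
            s[i] *= weight(overlaps[m][i])
    return s


# Hypothetical overlaps: boxes 0 and 1 overlap by 0.68, box 2 overlaps nothing
overlaps = [
    [1.0, 0.68, 0.0],
    [0.68, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
scores = [0.9, 0.8, 0.7]

nms_scores = rescore(scores, overlaps, hard_weight(0.5))     # [0.9, 0.0, 0.7]
soft_scores = rescore(scores, overlaps, linear_weight(0.5))  # box 1 decays to about 0.256
```

Swapping the weighting function is the only change between the two calls, which mirrors the one-line difference the paper highlights.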

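Experimental Analysis

We run experiments on two standard datasets, PASCAL VOC and MS-COCO. PASCAL has 20 object categories and MS-COCO has 80. We use the VOC 2007 test set to measure performance, and we carry out the sensitivity analysis on a 5,000-image subset of MS-COCO. We also report results on an MS-COCO set of 20,288 images. To evaluate the algorithm, we experiment with two existing detectors, Faster-RCNN and R-FCN.

Table 1 compares R-FCN and Faster-RCNN on MS-COCO with traditional NMS and with Soft-NMS. We set Nt to 0.3 for the linear weighting function and σ to 0.5 for the Gaussian weighting function. Soft-NMS clearly improves performance in every setting, especially where several objects overlap. For example, Soft-NMS raises the average precision of R-FCN and Faster-RCNN by 1.3% and 1.1% respectively, a significant gain on MS-COCO. It is worth stressing that this gain comes from a very small change to the original NMS algorithm. We ran the same experiments on PASCAL: Table 2 shows that Soft-NMS improves the average precision of both Faster-RCNN and R-FCN by 1.7%. All subsequent experiments use Soft-NMS with the Gaussian weighting function.

The experiment in Figure 4 shows that R-FCN with Soft-NMS improves per-class precision on MS-COCO. Animal classes such as zebra, giraffe, sheep, elephant and horse gain 3% to 6%. For classes that rarely appear as multiple co-occurring instances, such as toaster, sports ball and hair drier, the gain is small. Overall, Soft-NMS raises detection performance without affecting running time.

Figure 4. Per-class improvement in precision from Soft-NMS for R-FCN (left) and Faster-RCNN (right)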

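Sensitivity Analysis

As the analysis above shows, Soft-NMS requires the parameter σ and traditional NMS requires the parameter Nt. To analyze sensitivity to these parameters, we vary their values on MS-COCO and observe how the average precision changes. As Figure 5 shows, for both detectors the AP is stable when the parameter lies between 0.3 and 0.6 and drops noticeably outside that range. Compared with traditional NMS, Soft-NMS performs better over the parameter range 0.1 to 0.7. Between 0.4 and 0.7, the AP of both detectors with Soft-NMS is about 1% higher than with traditional NMS. Although Soft-NMS performs slightly better with σ = 0.6, we set σ to 0.5 in all experiments for consistency.

Figure 5. Sensitivity of R-FCN to the parameters σ (Soft-NMS) and Nt (NMS)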
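To build some intuition for what the Gaussian parameter σ controls, the sketch below tabulates the decay factor exp(-overlap^2 / σ) for a few values. It does not reproduce the AP curves in Figure 5; the names `decay_factor` and `decay_table` and the chosen σ and overlap values are assumptions made for illustration.

```python
import math
from typing import Dict, List, Sequence


def decay_factor(overlap: float, sigma: float) -> float:
    """Multiplier applied to a neighbor's score by the Gaussian rule."""
    return math.exp(-(overlap ** 2) / sigma)


def decay_table(sigmas: Sequence[float], overlaps: Sequence[float]) -> Dict[float, List[float]]:
    """Map each sigma to its decay factors at the given overlaps, rounded to 3 places."""
    return {s: [round(decay_factor(o, s), 3) for o in overlaps] for s in sigmas}


table = decay_table(sigmas=[0.1, 0.5, 0.9], overlaps=[0.3, 0.5, 0.7])
for sigma, factors in table.items():
    print(f"sigma={sigma}: {factors}")
```

A small σ suppresses neighbors almost as aggressively as hard NMS, while a large σ leaves even heavily overlapping boxes with much of their score, which is one way to read the drop in AP at both ends of the range.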

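Soft-NMS Localizes More Precisely Than Traditional NMS

Localization performance: average precision alone does not make clear where Soft-NMS obtains its gains. We therefore compute the AP of traditional NMS and Soft-NMS at different evaluation overlap thresholds Ot, and we also vary the NMS and Soft-NMS parameters to understand both algorithms better. In Table 3, AP decreases as the NMS threshold Nt increases. A high Nt does relatively well at high Ot, but at low Ot it causes a large drop in AP. Soft-NMS behaves differently: the good results it obtains at high Ot are preserved at low Ot. Across parameter settings, Soft-NMS outperforms traditional NMS, and a high σ gives a larger improvement at high Ot. Soft-NMS therefore localizes objects better than traditional NMS.

Precision vs. Recall for Soft-NMS and NMS

Finally, we examine how much Soft-NMS improves over NMS at different overlap thresholds. As the overlap threshold and the recall increase, the gain in precision from Soft-NMS grows. Traditional NMS assigns a score of zero to every box in the overlap region, so it misses many objects and precision falls at high recall. Soft-NMS rescores neighboring boxes instead of suppressing them completely, which raises precision at high recall. Moreover, because NMS, which suppresses neighbors outright, is more likely to miss objects at higher overlap thresholds, Soft-NMS also improves detection noticeably at lower recall.

Figure 6. Precision vs. recall at different overlap thresholds (Ot)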

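Qualitative Analysis

Figure 7 shows qualitative results on images from the COCO validation set. We detect objects with R-FCN and use a detection threshold of 0.45. Soft-NMS helps most when a false positive overlaps a true detection only slightly. In image No. 8, for example, a wide box covering several people that NMS keeps is suppressed by Soft-NMS: it has a small overlap with several high-scoring boxes, so the rescoring function decays its score substantially. The same happens in image No. 9. In the beach scene of image No. 1, Soft-NMS decays the large box around the woman's handbag to below 0.45, and the false positive in image No. 4 is suppressed in the same way. In the animal images No. 2, 5, 7 and 13, NMS over-suppresses neighboring boxes, while Soft-NMS only decays their scores and therefore yields more correct detections above the 0.45 threshold.

Figure 7. Qualitative results. In each image pair, the left image uses NMS and the right image uses Soft-NMS. Pairs above the blue line are successful cases and pairs below it are failures. The detected class is person in image No. 14, bench in image No. 15, and potted plant in image No. 21.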

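Conclusion: Soft-NMS Is More Effective for Object Detection

The paper proposes a new soft-weighted non-maximum suppression algorithm, implemented as a function of the overlap between detection boxes and their detection scores. Building on traditional greedy NMS, the authors propose two rescoring functions and validate them on two existing detection datasets. The analysis shows that a soft weighting function based on box overlap and detection scores improves detection precision. Future work could explore learning more complex parametric or non-parametric functions. Ultimately, an end-to-end learning framework for object detection would be the ideal solution, since it would generate detection boxes without a separate non-maximum suppression step that has to weigh factors such as detection scores and box locations.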


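Note: to avoid ambiguity introduced by translation, key passages about the algorithm are quoted from the original paper. Reference article: sohu.com/a/135469270_64; Soft-NMS paper: arxiv.org/pdf/1704.0450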
