Computer Engineering ›› 2023, Vol. 49 ›› Issue (7): 65-75. doi: 10.19678/j.issn.1000-3428.0065328

• Artificial Intelligence and Pattern Recognition •

A Box Regression Algorithm Based on Key Point Distance of Bounding Box

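Zhiyong NIE1, Yuwei YIN2,*, Jiaxin TANG2, Zhigang TU2

  1. General Automation Department, CHN Energy Network Information Technology (Beijing) Co., Ltd., Beijing 100011, China
  2. State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430079, China
  • Received: 2022-07-22 Online: 2023-07-15 Published: 2023-07-14
  • Corresponding author: Yuwei YIN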
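  • About the authors:

    Zhiyong NIE (born 1982), male, engineer; main research interests: automation control, electrical design, and artificial intelligence R&D management

    Jiaxin TANG, master's degree

    Zhigang TU, researcher, Ph.D., doctoral supervisor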

  • Funding:
    National Natural Science Foundation of China (62106177)

A Box Regression Algorithm Based on Key Point Distance of Bounding Box

Zhiyong NIE1, Yuwei YIN2,*, Jiaxin TANG2, Zhigang TU2   

  1. General Automation Department, CHN Energy Network Information Technology (Beijing) Co., Ltd., Beijing 100011, China
  2. State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430079, China
  • Received: 2022-07-22 Online: 2023-07-15 Published: 2023-07-14
  • Contact: Yuwei YIN

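Abstract (translated from the Chinese):

To address the low detection accuracy and slow convergence of current Intersection-over-Union (IoU)-based box regression methods in practical applications, a box regression method based on Key point distance based Intersection-over-Union (KIoU) is proposed. Starting from geometry, three vertices and one center point of a rectangle are taken as key points, and the distances between corresponding points are computed to judge the differences in position and shape between the predicted box and the ground-truth box. A new loss function based on the key-point IoU loss is constructed, which computes the difference between the key-point IoU of the predicted and ground-truth boxes in the actual case and in the ideal case. The distances between corresponding key points serve as a penalty term on IoU to accelerate model convergence, and the efficiency and accuracy of key-point information for localization are exploited to improve detection accuracy. With the single-stage detector SSD and the two-stage detector Faster R-CNN as baseline algorithms, KIoU is compared experimentally with four IoU methods (IoU, GIoU, DIoU, and CIoU) on the PASCAL VOC and COCO datasets. The results show that, in terms of detection accuracy, KIoU on Faster R-CNN improves on IoU by 2.91% and on DIoU, currently among the better-performing methods, by 0.11%, while on SSD it improves on IoU and DIoU by 0.96% and 0.06%, respectively; in terms of visual detection results, KIoU localizes targets more accurately and, to some extent, reduces missed detections.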

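Key words: object detection, bounding box regression, Intersection-over-Union (IoU), Key point distance based Intersection-over-Union (KIoU), key corresponding points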

Abstract:

To address the low detection accuracy and slow convergence of current Intersection-over-Union (IoU)-based box regression methods in practical applications, a box regression method based on Key point distance based Intersection-over-Union (KIoU) is proposed. Drawing on geometry, the method takes three vertices and the center point of a rectangle as key points, and uses the distances between corresponding points to determine the differences in position and shape between the predicted box and the ground-truth box. A new loss function based on the key-point IoU loss is constructed; it computes the difference between the KIoU of the predicted and ground-truth boxes in the actual case and in the ideal case. The distances between corresponding key points serve as a penalty term on IoU, which accelerates model convergence, and the efficiency and accuracy of key-point information for localization are leveraged to improve detection accuracy. Experimental comparisons were conducted on the PASCAL VOC and COCO datasets, with the Single Shot MultiBox Detector (SSD), a single-stage object detection algorithm, and the Faster Region-based Convolutional Neural Network (Faster R-CNN), a two-stage object detection algorithm, as baseline algorithms. KIoU was compared against IoU, Generalized IoU (GIoU), Distance IoU (DIoU), and Complete IoU (CIoU). In terms of detection accuracy, KIoU on Faster R-CNN improved on IoU by 2.91% and on DIoU, currently one of the better-performing methods, by 0.11%; on SSD, KIoU improved on IoU and DIoU by 0.96% and 0.06%, respectively. In terms of visual detection results, KIoU localized targets more accurately and, to some extent, reduced missed detections.

Key words: object detection, bounding box regression, Intersection-over-Union (IoU), Key point distance based Intersection-over-Union (KIoU), key corresponding points
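
The abstract describes the idea of penalizing IoU with distances between corresponding key points but does not state the KIoU formula itself. The following is therefore only a minimal sketch of that general idea, not the authors' definition. The function names, the choice of which three vertices to use, the averaging of squared distances, and the normalization by the squared diagonal of the smallest enclosing box (borrowed from DIoU) are all assumptions made for illustration.

```python
from typing import List, Tuple

# Box format (x1, y1, x2, y2) with x1 < x2 and y1 < y2 (an assumed convention).
Box = Tuple[float, float, float, float]
Point = Tuple[float, float]


def iou(a: Box, b: Box) -> float:
    """Plain Intersection-over-Union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def key_points(box: Box) -> List[Point]:
    """Three vertices plus the center point.

    Which three vertices the paper uses is not stated in the abstract;
    the three chosen here are an assumption.
    """
    x1, y1, x2, y2 = box
    return [(x1, y1), (x2, y1), (x1, y2), ((x1 + x2) / 2, (y1 + y2) / 2)]


def keypoint_penalized_iou_loss(pred: Box, gt: Box) -> float:
    """Hypothetical loss: 1 - IoU + normalized mean squared key-point distance."""
    # Squared diagonal of the smallest box enclosing both inputs (DIoU-style normalizer).
    cw = max(pred[2], gt[2]) - min(pred[0], gt[0])
    ch = max(pred[3], gt[3]) - min(pred[1], gt[1])
    c2 = cw * cw + ch * ch
    # Squared distances between corresponding key points of the two boxes.
    d2 = [
        (px - gx) ** 2 + (py - gy) ** 2
        for (px, py), (gx, gy) in zip(key_points(pred), key_points(gt))
    ]
    penalty = sum(d2) / (len(d2) * c2) if c2 > 0 else 0.0
    return 1.0 - iou(pred, gt) + penalty


if __name__ == "__main__":
    gt_box = (0.0, 0.0, 1.0, 1.0)
    for pred_box in [(0.0, 0.0, 1.0, 1.0), (2.0, 0.0, 3.0, 1.0), (4.0, 0.0, 5.0, 1.0)]:
        print(pred_box, keypoint_penalized_iou_loss(pred_box, gt_box))
```

In this sketch the loss is zero for a perfect match and keeps growing as a non-overlapping predicted box moves farther from the ground truth, whereas a plain 1 - IoU loss is constant once the boxes stop overlapping. Because vertices are included alongside the center, two boxes that share a center but differ in width or height still incur a penalty.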