Computer Engineering

Athlete Detection Algorithm Based on Multi-scale Linear Global Attention

  • Published: 2023-12-05

Abstract: Athletes move quickly and frequently occlude one another during competition, so athlete detection in video is prone to missed detections, redundant detections, and reduced accuracy. Current mainstream methods perform poorly under such motion and occlusion. Occlusion also increases the scale variation of the target bounding boxes. This paper introduces cutout as a data augmentation method to simulate occlusion and constructs an athlete detection algorithm based on EfficientViT with multi-scale linear global attention. Specifically, a linear global attention module reduces the computational cost and is supplemented by a convolution module that strengthens local feature extraction. Lightweight small-kernel convolutions aggregate the tokens of different attention heads to obtain multi-scale information and strengthen global feature extraction. For the loss function, EIoU is chosen as the bounding box loss; it adds the width and height distances between the predicted box and the target box, so that predicted boxes match the ground-truth boxes more closely in scale. Finally, on 4 public basketball game video datasets from SportsMOT, experiments comparing different backbone networks and different improvement methods show that the proposed algorithm achieves 98.0% precision and 98.2% mean average precision. Compared with the original YOLOv5 algorithm, precision improves by 4% and high-confidence mean average precision improves by 8.7%.
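
The three components named in the abstract can be illustrated with a minimal NumPy sketch. It is not the paper's implementation, and every name in it (`cutout`, `relu_linear_attention`, `eiou_loss`, `size`, `eps`) is hypothetical. It assumes a single square cutout patch filled with zeros, a ReLU feature map for the linear attention (the abstract does not state which kernel is used), and boxes given as `(x1, y1, x2, y2)` corners. The convolutional aggregation of tokens across attention heads is not shown.

```python
import numpy as np


def cutout(image, size, rng):
    """Zero out one randomly placed square patch to mimic an occluder.

    Assumes size >= 2; the patch is clipped at the image border.
    """
    h, w = image.shape[:2]
    cy, cx = int(rng.integers(0, h)), int(rng.integers(0, w))  # patch center
    half = size // 2
    y0, y1 = max(cy - half, 0), min(cy + half, h)
    x0, x1 = max(cx - half, 0), min(cx + half, w)
    out = image.copy()  # leave the input image untouched
    out[y0:y1, x0:x1] = 0
    return out


def relu_linear_attention(q, k, v, eps=1e-6):
    """Global attention in O(n) tokens, assuming a ReLU feature map.

    q, k have shape (n, d); v has shape (n, d_v). Computing k.T @ v first
    avoids building the (n, n) attention matrix.
    """
    q, k = np.maximum(q, 0), np.maximum(k, 0)
    kv = k.T @ v               # (d, d_v), shared by all queries
    k_sum = k.sum(axis=0)      # (d,), normalizer shared by all queries
    num = q @ kv               # (n, d_v)
    den = q @ k_sum + eps      # (n,)
    return num / den[:, None]


def eiou_loss(box, gt, eps=1e-9):
    """EIoU = 1 - IoU + center term + width term + height term."""
    x1, y1, x2, y2 = box
    gx1, gy1, gx2, gy2 = gt
    w, h = x2 - x1, y2 - y1
    gw, gh = gx2 - gx1, gy2 - gy1
    # Intersection over union
    iw = max(min(x2, gx2) - max(x1, gx1), 0.0)
    ih = max(min(y2, gy2) - max(y1, gy1), 0.0)
    inter = iw * ih
    iou = inter / (w * h + gw * gh - inter + eps)
    # Width and height of the smallest box enclosing both boxes
    cw = max(x2, gx2) - min(x1, gx1)
    ch = max(y2, gy2) - min(y1, gy1)
    # Squared distance between the two box centers
    center = ((x1 + x2 - gx1 - gx2) ** 2 + (y1 + y2 - gy1 - gy2) ** 2) / 4.0
    return (1.0 - iou
            + center / (cw ** 2 + ch ** 2 + eps)
            + (w - gw) ** 2 / (cw ** 2 + eps)
            + (h - gh) ** 2 / (ch ** 2 + eps))
```

With identical boxes, `eiou_loss` is approximately zero. The width and height terms are what separate EIoU from a center-distance-only penalty: a predicted box that shares the target's center but differs in height is still penalized, which corresponds to the scale matching the abstract describes.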