Computer Engineering, 2026, Vol. 52, Issue (1): 390-399. doi: 10.19678/j.issn.1000-3428.0070111

• Interdisciplinary Integration and Engineering Applications •

Infrared Ship Target Detection Model Integrating Criss-Cross Attention and Dual Feature Interaction

ZOU Shaohua1, LIU Xiaozhang1,*, LI Xiulai2

  1. College of Computer Science and Technology, Hainan University, Haikou 570228, Hainan, China
  2. School of Cyberspace Security, Hainan University, Haikou 570228, Hainan, China
  • Received: 2024-07-12 Revised: 2024-08-14 Online: 2026-01-15 Published: 2026-01-15
  • Corresponding author: LIU Xiaozhang
  • About the authors:

    ZOU Shaohua (CCF student member), male, master's student; main research interest: target detection

    LIU Xiaozhang (corresponding author), professor, Ph.D.

    LI Xiulai, Ph.D. candidate

  • Funding:
    Hainan Province Key Research and Development Project (ZDYF2022GXJS348); Haikou Key Science and Technology Program (2023-054)

Abstract:

To address the small pixel footprint of targets, complex backgrounds, and limited hardware resources in infrared image target detection, this paper proposes a target detection model that combines a multi-head criss-cross attention mechanism with positional encoding and a dual feature interaction refinement structure. In the backbone network, a positional-encoding-based Criss-Cross Attention (CCA) module and a Spatial Pyramid Pooling Cross Stage Partial (SPCP) module are introduced. The CCA module aggregates contextual information in the horizontal and vertical directions through row and column correlation matrix transformations; by sharing parameters across its recurrent criss-cross modules, it reduces the number of parameters required by the self-attention mechanism and strengthens feature extraction. The SPCP module unifies feature maps of different sizes and scales, adopts a Cross Stage Partial (CSP) structure to reduce parameters and computation, and introduces a squeeze-and-excitation attention mechanism to select the channels most useful for target detection. In the neck network, frequency-domain information and a Dual Feature Interaction Refinement (DIR) module are introduced to further extract refined features of small ship targets and enhance the feature fusion capability of the model. Experimental results show that the improved model achieves 89.5% precision, 97% recall, and a 93.1% F1 score on the Infrared Ship Detection Dataset (ISDD), a significant improvement in detection performance over the baseline model. In addition, the proposed model has fewer parameters than other detection models. The multi-head criss-cross attention mechanism with positional encoding and the dual feature interaction refinement structure thus effectively improve the accuracy of infrared ship target detection.
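
The row-and-column aggregation attributed to the CCA module can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation: it assumes a single head, identity query/key/value projections, no positional encoding, and a residual connection, and the function names `criss_cross_attention` and `recurrent_cca` are hypothetical. It only shows how each position attends to its own row and column, and how applying the same parameter-free step twice lets context reach positions outside that row and column.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def criss_cross_attention(feat):
    """One criss-cross step on an H x W grid of C-dimensional vectors.

    Assumption: queries, keys, and values are the raw features
    (identity projections), which a trained model would replace
    with learned 1x1 convolutions.
    """
    h, w = len(feat), len(feat[0])
    channels = len(feat[0][0])
    out = [[None] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            # Same row plus same column; (i, j) itself is counted once,
            # so the path has H + W - 1 positions.
            path = [(i, k) for k in range(w)]
            path += [(k, j) for k in range(h) if k != i]
            query = feat[i][j]
            weights = softmax([dot(query, feat[r][c]) for r, c in path])
            context = [
                sum(wt * feat[r][c][ch] for wt, (r, c) in zip(weights, path))
                for ch in range(channels)
            ]
            # Residual connection: input feature plus aggregated context.
            out[i][j] = [x + y for x, y in zip(query, context)]
    return out


def recurrent_cca(feat, steps=2):
    """Apply the same criss-cross step repeatedly (shared parameters)."""
    for _ in range(steps):
        feat = criss_cross_attention(feat)
    return feat
```

With a 3 x 3 single-channel grid that is zero everywhere except at position (0, 0), one step leaves position (1, 1) at zero because (0, 0) is not on its row or column, while two steps give it a positive value via the intermediate positions (0, 1) and (1, 0).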

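The channel selection performed by the squeeze-and-excitation attention inside the SPCP module can be sketched as follows. This is a generic squeeze-and-excitation step under stated assumptions, not the paper's SPCP module: biases are omitted, the weight matrices `w_reduce` and `w_expand` are hypothetical placeholders for learned parameters, and the function name `squeeze_excite` is invented for illustration.

```python
import math


def squeeze_excite(feat, w_reduce, w_expand):
    """Generic squeeze-and-excitation on a C x H x W nested list.

    w_reduce: (C // r) x C matrix, w_expand: C x (C // r) matrix.
    Both are assumed, illustrative weights; biases are omitted.
    Returns the rescaled feature map and the per-channel gates.
    """
    # Squeeze: global average pooling, one scalar per channel.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feat
    ]
    # Excitation: fully connected -> ReLU -> fully connected -> sigmoid.
    hidden = [
        max(0.0, sum(w * s for w, s in zip(row, squeezed)))
        for row in w_reduce
    ]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * v for w, v in zip(row, hidden))))
        for row in w_expand
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [
        [[gate * value for value in row] for row in channel]
        for channel, gate in zip(feat, gates)
    ]
    return scaled, gates
```

Channels whose gate is close to 1 pass through almost unchanged, while channels with a gate close to 0 are suppressed, which is the sense in which the mechanism "selects" channels.
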
Key words: infrared images, target detection, multi-head Criss-Cross Attention (CCA), multiscale reshaping, Dual Feature Interaction Refinement (DIR)
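
The F1 value reported in the abstract can be checked against the stated precision and recall, since F1 is their harmonic mean. The short sketch below uses a hypothetical helper name, `f1_score`; the only inputs are the figures given in the abstract.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall, both given in [0, 1]."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision 89.5% and recall 97% as reported on ISDD.
f1 = f1_score(0.895, 0.97)
print(f"{f1:.1%}")  # prints 93.1%
```

The result agrees with the 93.1% F1 score stated in the abstract.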