计算机工程 (Computer Engineering), 2024, Vol. 50, Issue 8: 102-112. doi: 10.19678/j.issn.1000-3428.0068301

• 人工智能与模式识别 (Artificial Intelligence and Pattern Recognition) •

基于改进PIDNet的水位线检测算法

李仲1,*, 冒睿瑞2, 王晓龙1, 王根一1, 安国成1

  1. 上海华讯网络系统有限公司服务运作部, 上海 201103
  2. 中国电子科技集团公司第三十二研究所, 上海 201808
  • 收稿日期:2023-08-28 出版日期:2024-08-15 发布日期:2024-01-16
  • 通讯作者: 李仲
  • 基金资助 (Funding):
    国家重点研发计划, National Key Research and Development Program of China (2023YFC3006700)

Water Level Line Detection Algorithm Based on Improved PIDNet

Zhong LI1,*, Ruirui MAO2, Xiaolong WANG1, Genyi WANG1, Guocheng AN1

  1. Service Operations Department, Shanghai Huaxun Network System Co., Ltd., Shanghai 201103, China
  2. The 32nd Research Institute of China Electronics Technology Group Corporation, Shanghai 201808, China
  • Received: 2023-08-28 Published: 2024-08-15 Online: 2024-01-16
  • Corresponding author: Zhong LI

摘要:

PIDNet是三分支网络构成的语义分割模型, 在众多竞赛数据集中均保持优秀的分割精度。然而, 积分分支中进行多次下采样和金字塔池化模块中多分支特征融合冗余的缺点限制了算法精度的提高。在水位线检测任务中, 现有算法的缺点会导致局部细节信息丢失, 使得水体边缘精细化检测的能力有所下降。为了缓解这个问题, 提出一种基于改进PIDNet的水位线检测算法。首先设计一种结合通道注意力的轻量化像素增强模块, 在积分分支下采样过程中进行像素增强, 减少局部信息丢失。然后对金字塔池化模块进行重构, 在减小池化输出特征大小的基础上减少并行分支数, 同时在特征融合时结合通道注意力进一步加强关注重要特征的能力, 提高水位线边缘的分割精度。此外, 融合多场景的河流数据集, 避免复杂场景下检测出的水位线位置发生偏移和断线。实验结果表明, 所提方法(S和M)在水位线检测任务中相对原算法(S和M)在3个性能指标上都有所提高, 以M规模为例, 像素正确率提高了1.47个百分点, 平均交并比提高了1.04个百分点, 检测延迟降低了0.9 ms。

关键词: 语义分割, 水位线检测, 金字塔池化模块, 注意力, 多场景

Abstract:

PIDNet is a semantic segmentation model built from three branches that maintains excellent segmentation accuracy on many competition datasets. However, the repeated downsampling in the integral branch and the redundant multi-branch feature fusion in the pyramid pooling module limit further gains in accuracy. In water level line detection, these shortcomings cause local detail to be lost, which weakens fine-grained detection of water edges. To alleviate this problem, a water level line detection algorithm based on an improved PIDNet is proposed. First, a Lightweight Pixel Enhancement Module (LPEM) combined with channel attention is designed; it performs pixel enhancement during downsampling in the integral branch to reduce the loss of local information. Second, the pyramid pooling module is restructured: the pooled output feature size is reduced and, on that basis, the number of parallel branches is cut, while channel attention is applied during feature fusion to strengthen the focus on important features and improve segmentation accuracy along the water level line edge. In addition, river datasets from multiple scenes are merged so that the detected water level line does not shift or break in complex scenes. Experimental results show that the proposed method (S and M) improves on the original algorithm (S and M) in all three performance metrics for water level line detection. For the M scale, Pixel Accuracy (PA) improves by 1.47 percentage points, Mean Intersection over Union (mIoU) improves by 1.04 percentage points, and detection latency falls by 0.9 ms.

Key words: semantic segmentation, water level line detection, pyramid pooling module, attention, multi-scene
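
The abstract reports its gains in Pixel Accuracy (PA) and Mean Intersection over Union (mIoU). As a reading aid, the sketch below shows how these two metrics are commonly computed from a confusion matrix. It is not the authors' evaluation code: the function names, the two-class labelling (0 = background, 1 = water), and the 4x4 masks are all hypothetical and chosen only to make the arithmetic easy to follow.

```python
def confusion_matrix(pred, target, num_classes):
    """Count pixels; rows are ground-truth classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            matrix[t][p] += 1
    return matrix


def pixel_accuracy(matrix):
    """PA = correctly labelled pixels / all pixels."""
    correct = sum(matrix[c][c] for c in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def mean_iou(matrix):
    """mIoU = mean over classes of TP / (TP + FP + FN).

    Classes that appear in neither mask are skipped.
    """
    ious = []
    for c in range(len(matrix)):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                  # true class c, predicted otherwise
        fp = sum(row[c] for row in matrix) - tp   # predicted c, true class differs
        union = tp + fp + fn
        if union > 0:
            ious.append(tp / union)
    return sum(ious) / len(ious)


# Hypothetical masks (assumption for illustration): 0 = background, 1 = water.
# The prediction places part of the water edge one row too high.
target = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [1, 1, 1, 1],
]
pred = [
    [0, 0, 0, 0],
    [0, 0, 1, 1],
    [1, 1, 1, 1],
    [1, 1, 1, 1],
]

cm = confusion_matrix(pred, target, num_classes=2)
print("PA  :", pixel_accuracy(cm))
print("mIoU:", mean_iou(cm))
```

In this toy case 14 of 16 pixels are correct, so PA is 0.875, while the per-class IoU values are 6/8 and 8/10, giving an mIoU of about 0.775. The two misplaced edge pixels lower mIoU more than PA, which is one reason edge-sensitive tasks such as water level line detection usually report both. Note also that the abstract states improvements in percentage points, that is, as absolute differences between two percentages rather than relative changes.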