
Computer Engineering ›› 2026, Vol. 52 ›› Issue (1): 154-165. doi: 10.19678/j.issn.1000-3428.0252294

• Computer Vision and Graphics/Image Processing •

Pavement Crack Detection Method Based on Multi-Level Feature Fusion

LI Dongfeng, CHEN Yuren*, YU Bo

  1. The Key Laboratory of Road and Traffic Engineering, Ministry of Education, Tongji University, Shanghai 201804, China
  • Received: 2025-04-07 Revised: 2025-06-20 Published: 2026-01-15 Released: 2025-06-25
  • Corresponding author: CHEN Yuren
  • About the authors:

    LI Dongfeng, male, master's student; main research interest: computer-aided engineering for road traffic infrastructure

    CHEN Yuren (corresponding author), professor

    YU Bo, associate professor

  • Funding:
    National Key Research and Development Program of China (2023YFE0202400)

Abstract:

Existing U-Net-based pavement crack detection methods do not fully account for the interaction between features at different encoder levels, so information lost during downsampling can lead to incomplete detections or missed cracks. To address this issue, this study proposes a pavement crack detection method based on multi-level feature fusion. First, in the encoding stage, crack features are extracted at different levels to form a representation that runs from shallow to deep layers. Second, in the skip connections, a cross-level fusion strategy based on an improved Channel Cross Transformer (CCT) strengthens the complementarity between features at each level and enriches the representation of crack features. Finally, in the decoding stage, a feature fusion module improves how the decoder uses encoder features, which promotes the propagation of crack features and improves the model's ability to perceive them. To validate the proposed method, a series of comparative and ablation experiments is conducted on two public datasets, DeepCrack and CRACK500. The results show that the overall performance of the proposed method is better than that of six other methods, including DeepCrack and Swin-UNet. On the DeepCrack dataset, the F1 value of the proposed method is 2.30 and 2.51 percentage points higher than those of DeepCrack and Swin-UNet, respectively; on the CRACK500 dataset, the gains are 1.65 and 1.00 percentage points, respectively.

Key words: pavement crack detection, semantic segmentation, U-Net, multi-level feature fusion, cross-attention mechanism
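
The abstract reports its gains as differences in F1 value, in percentage points. As a reading aid, the following minimal sketch shows how a pixel-wise F1 value can be computed for a binary crack mask. The function name `f1_score`, the toy masks, and the convention for two empty masks are assumptions made for illustration; the abstract does not state the exact evaluation protocol (for example, thresholding or per-image versus per-dataset averaging) used in the paper.

```python
def f1_score(pred, target):
    """Pixel-wise F1 for two binary masks given as equally sized 2D lists of 0/1.

    Hypothetical helper for illustration; not the paper's evaluation code.
    """
    tp = fp = fn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                tp += 1          # crack pixel correctly detected
            elif p and not t:
                fp += 1          # background pixel marked as crack
            elif t and not p:
                fn += 1          # crack pixel missed
    denom = 2 * tp + fp + fn
    # Assumed convention: two empty masks count as perfect agreement.
    return 2 * tp / denom if denom else 1.0


# Toy 3x4 masks: 1 marks a crack pixel, 0 marks background.
target = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
]
pred = [
    [0, 1, 0, 0],
    [0, 1, 0, 0],
    [1, 1, 0, 0],
]

score = f1_score(pred, target)
print(f"F1 = {score * 100:.2f}%")
```

Under this definition, a gain of 2.30 percentage points means an absolute difference in F1 value, for example from 85.00% to 87.30%, not a relative increase of 2.30%.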