
Computer Engineering ›› 2026, Vol. 52 ›› Issue (1): 381-389. doi: 10.19678/j.issn.1000-3428.0069751

• Cross-Disciplinary Integration and Engineering Application •

Cross-layer Channel Pruning Method Based on Variable Sequences

LI Bingye1,2, BAO Yu1,2,*, CAO Wei3, LI Mingze3, XU Fangzheng1,2

  1. School of Computer Science and Technology/School of Artificial Intelligence, China University of Mining and Technology, Xuzhou 221116, Jiangsu, China
    2. Mine Digitization Engineering Research Center of the Ministry of Education, China University of Mining and Technology, Xuzhou 221116, Jiangsu, China
    3. Xuzhou No. 1 People's Hospital, Xuzhou 221116, Jiangsu, China
  • Received: 2024-04-15  Revised: 2024-07-10  Online: 2026-01-15  Published: 2024-10-18
  • Corresponding author: BAO Yu
  • About the authors:

    LI Bingye, male, master's student; his main research interests include software engineering and artificial intelligence.

    BAO Yu (corresponding author, CCF professional member), associate professor, Ph.D.

    CAO Wei, chief physician.

    LI Mingze, resident physician.

    XU Fangzheng, master's student.

  • Funding:
    National Natural Science Foundation of China (52374164); Xuzhou Medical and Health General Research Project (KC22110)


Abstract:

Radiotherapy is an important treatment modality for liver cancer. Deep learning-based semantic segmentation of medical images can assist physicians in delineating radiation target areas, thereby improving the accuracy of radiotherapy. However, existing medical image semantic segmentation models are structurally complex and have large numbers of parameters, which makes them difficult to deploy on resource-constrained devices. An analysis of parameter importance in the vision Transformer model reveals that the important parameters of different layers follow a distinctive distribution pattern. Based on this finding, this study proposes a cross-layer channel pruning method based on variable sequences. Following the distribution pattern of the important parameters, the importance weights of the Multi-head Self-Attention (MSA) and Feed-Forward Network (FFN) layers are measured and adjusted to form a layer-wise sequence of importance weight values. A pruning rate is then assigned to each element of the sequence, yielding a variable pruning-rate sequence that changes with network depth and enables fine-grained pruning of the MSA and FFN layers. The method also introduces a cyclic pruning strategy that iteratively updates the variable pruning-rate sequence in each pruning round, so that the redundant structures of the MSA and FFN layers are fully removed. The model is trained and tested on the public liver segmentation dataset 3D-IRCADb-01. After pruning, the segmentation accuracy of the vision Transformer model is unchanged, while its Floating-Point Operations (FLOPs) and parameter count are reduced by 60.26% and 66.07%, respectively. Experimental results indicate that the proposed method achieves a higher pruning rate than fixed-pruning-rate methods while preserving segmentation accuracy.
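To illustrate the idea described above, the following is a minimal PyTorch sketch of a depth-dependent pruning-rate sequence that is re-derived on every pruning round. It is not the authors' implementation: the L1-norm channel importance measure, the linear mapping from layer importance to pruning rate, the mask-style zeroing of channels, and all module and variable names are assumptions made purely for demonstration.

```python
# Hypothetical, minimal sketch of variable-sequence, cross-layer channel pruning.
# Importance measure, rate schedule, and module names are illustrative only.
import torch
import torch.nn as nn


def channel_importance(linear: nn.Linear) -> torch.Tensor:
    # L1 norm of each output channel's weight row, used as an importance proxy.
    return linear.weight.detach().abs().sum(dim=1)


def pruning_rate_sequence(layer_scores, base_rate=0.5):
    # Map each layer's mean importance to a pruning rate: layers whose channels
    # are more important on average are pruned less, and vice versa.
    scores = torch.tensor(layer_scores)
    norm = (scores - scores.min()) / (scores.max() - scores.min() + 1e-8)
    return (base_rate * (1.0 - norm)).tolist()


def prune_channels(linear: nn.Linear, rate: float) -> None:
    # Soft (mask-style) pruning: zero out the lowest-importance output channels.
    imp = channel_importance(linear)
    k = int(rate * imp.numel())
    if k == 0:
        return
    idx = torch.argsort(imp)[:k]
    with torch.no_grad():
        linear.weight[idx] = 0.0
        if linear.bias is not None:
            linear.bias[idx] = 0.0


# Toy stand-ins for the MSA output projection and the FFN hidden layer of each
# Transformer block; a real ViT would expose these inside its encoder blocks.
blocks = [{"msa_proj": nn.Linear(256, 256), "ffn_hidden": nn.Linear(256, 1024)}
          for _ in range(12)]

for pruning_round in range(3):  # cyclic pruning: rebuild the rate sequence each round
    msa_scores = [channel_importance(b["msa_proj"]).mean().item() for b in blocks]
    ffn_scores = [channel_importance(b["ffn_hidden"]).mean().item() for b in blocks]
    msa_rates = pruning_rate_sequence(msa_scores)
    ffn_rates = pruning_rate_sequence(ffn_scores)
    for block, r_msa, r_ffn in zip(blocks, msa_rates, ffn_rates):
        prune_channels(block["msa_proj"], r_msa)
        prune_channels(block["ffn_hidden"], r_ffn)
    # ... fine-tune the pruned model here before the next round ...
```

In a complete pipeline the masked channels would be structurally removed and the model fine-tuned between rounds; re-deriving the rate sequence after each round is what makes the pruning rate variable across both network depth and pruning iterations.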

Key words: medical imaging, semantic segmentation, vision Transformer, variable sequence, model pruning