
Computer Engineering ›› 2022, Vol. 48 ›› Issue (12): 304-311. doi: 10.19678/j.issn.1000-3428.0063687

• Development Research and Engineering Application •

3D Medical Image Segmentation Based on Inverted Pyramid Deep Learning Network

ZHANG Xiangfen, LIU Yan, YUAN Feiniu

  1. College of Information, Mechanical and Electrical Engineering, Shanghai Normal University, Shanghai 201400, China
  • Received: 2022-01-04 Revised: 2022-02-16 Published: 2022-06-30
  • About the authors: ZHANG Xiangfen (born 1977), female, associate professor, Ph.D.; main research interests are medical image processing and information fusion. LIU Yan, master's student. YUAN Feiniu, professor, Ph.D.
  • Funding:
    National Natural Science Foundation of China (61862029, 62171285); General Research Fund of Shanghai Normal University (KF2021100).

Abstract: Deep learning-based medical image segmentation is of great significance to both medical research and clinical disease diagnosis. However, existing 3D brain image segmentation networks rely on single-modality information only, and the feature representation of their last layer is inaccurate, which reduces segmentation accuracy. This paper introduces an attention mechanism and proposes MCRAIP-Net, a deep learning-based inverted pyramid network with multi-modality cross reconstruction. Taking multi-modality Magnetic Resonance Imaging (MRI) images as input, three independent encoders extract the features of each modality, and the extracted features are preliminarily fused at the same resolution level. A Dual-channel Cross Reconstruction Attention (DCRA) module then refines and fuses the multi-modality features. On this basis, an inverted pyramid decoder integrates the features of different resolutions from each decoder stage to complete the brain tissue segmentation task. Experimental results on the MRBrainS13 and IBSR18 datasets show that, compared with networks such as 3D U-Net, MMAN, and SW-3D-Unet, MCRAIP-Net makes full use of the complementary information in multi-modality images, captures more accurate and richer detail features, and achieves better segmentation accuracy: the Dice coefficients of white matter, gray matter, and cerebrospinal fluid reach 91.67%, 88.95%, and 84.79%, respectively.

Key words: multi-modality fusion, cross reconstruction attention, inverted pyramid decoder, medical image segmentation, deep learning
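
The Dice coefficients quoted in the abstract measure the voxel overlap between a predicted mask and the reference mask for one tissue class. As a minimal sketch, assuming the standard definition Dice = 2|P ∩ T| / (|P| + |T|), the per-class score could be computed as below. The function name `dice_coefficient`, the integer label encoding, and the toy data are hypothetical illustrations; the abstract does not describe the authors' evaluation code.

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[int], truth: Sequence[int], label: int) -> float:
    """Dice = 2|P ∩ T| / (|P| + |T|) over the voxels that carry `label`.

    `pred` and `truth` are flattened label volumes of equal length.
    """
    if len(pred) != len(truth):
        raise ValueError("pred and truth must contain the same number of voxels")

    pred_count = sum(1 for p in pred if p == label)    # |P|
    truth_count = sum(1 for t in truth if t == label)  # |T|
    overlap = sum(1 for p, t in zip(pred, truth) if p == label and t == label)

    denom = pred_count + truth_count
    # Assumed convention: if the class is absent from both volumes,
    # treat the two (empty) masks as being in perfect agreement.
    return 1.0 if denom == 0 else 2.0 * overlap / denom


# Toy flattened volumes. The label values 0..3 are hypothetical
# (for example background and three tissue classes).
pred = [0, 1, 1, 2, 2, 3, 3, 3]
truth = [0, 1, 2, 2, 2, 3, 3, 1]

scores = {label: dice_coefficient(pred, truth, label) for label in (1, 2, 3)}
print(scores)
```

A score of 1.0 means the two masks coincide exactly and 0.0 means they do not overlap; the percentages reported in the abstract are this value multiplied by 100.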

CLC number: