Computer Engineering ›› 2025, Vol. 51 ›› Issue (10): 308-318. doi: 10.19678/j.issn.1000-3428.0069172

• Graphics and Image Processing •

MRI Liver Image Segmentation Based on Cascade Transformer and U-Net

ZHANG Tiansen1, XU Xiaona1,*, ZHAO Yue1, ZHANG Xinning2

  1. School of Information Engineering, Minzu University of China, Beijing 100081, China
  2. The First Medical Centre of PLA General Hospital, Beijing 100086, China
  • Received: 2024-01-04 Revised: 2024-03-19 Online: 2025-10-15 Published: 2025-10-29
  • Contact: XU Xiaona

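  • Funding:
    AI+ Health Collaborative Innovation Cultivation Project of the Beijing Municipal Science and Technology Commission (Z221100003522005); Fundamental Research Funds for the Central Universities (2024GJYY46)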

Abstract:

Accurate liver segmentation in Magnetic Resonance Imaging (MRI) is of great clinical significance: it helps doctors locate the target region quickly, supports treatment, and plays a key role in postoperative observation. However, MRI images contain rich semantic information and considerable noise. Conventional convolution operations have limited receptive fields and limited global modeling capability, which makes it difficult for them to capture global information. Moreover, convolution-based networks should not be made too deep, because deeper networks increase the number of parameters and lose important high-resolution semantic information. To address these problems, this study introduces the Transformer to establish global information associations, so as to better capture global information and locate the target accurately. However, the Transformer may destroy local details when processing fine image features, and it provides little inductive bias. To combine the advantages of the Transformer and convolution, this study proposes a cascaded feature modeling method. First, the Medical Transformer (MedT) network, which has relatively few parameters and a relatively low computational cost, serves as the upstream network and performs coarse segmentation of the Region of Interest (RoI). Then, the extracted RoI is processed and fed into a downstream U-Net for a second segmentation pass, which pays special attention to local information to obtain finer predictions. Experiments on the CHAOS dataset show that the proposed method achieves a Dice Similarity Coefficient (DSC) of 0.922 and an Intersection over Union (IoU) of 0.877 on the liver segmentation task.
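
The two-stage data flow and the two reported metrics can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The abstract does not say how the RoI is extracted or processed, so the padded bounding-box crop and the `margin` parameter are assumptions, and `coarse_fn` and `fine_fn` are hypothetical callables standing in for the trained MedT and U-Net models. All function names are illustrative, and the input is assumed to be a single 2D slice.

```python
import numpy as np


def dice_coefficient(pred, target):
    """Dice Similarity Coefficient, 2|A ∩ B| / (|A| + |B|), for binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    total = pred.sum() + target.sum()
    if total == 0:
        return 1.0  # both masks empty: scored as perfect agreement (a convention)
    return float(2.0 * np.logical_and(pred, target).sum() / total)


def iou_score(pred, target):
    """Intersection over Union, |A ∩ B| / |A ∪ B|, for binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        return 1.0  # both masks empty: same convention as above
    return float(np.logical_and(pred, target).sum() / union)


def roi_bounding_box(coarse_mask, margin=8):
    """Return (r0, r1, c0, c1) around the coarse mask, padded by `margin`
    pixels and clipped to the image, or None if the mask is empty.
    The padded-box strategy is an assumption, not taken from the paper."""
    coarse_mask = np.asarray(coarse_mask, dtype=bool)
    rows = np.flatnonzero(coarse_mask.any(axis=1))
    cols = np.flatnonzero(coarse_mask.any(axis=0))
    if rows.size == 0:
        return None
    height, width = coarse_mask.shape
    r0 = max(int(rows[0]) - margin, 0)
    r1 = min(int(rows[-1]) + 1 + margin, height)
    c0 = max(int(cols[0]) - margin, 0)
    c1 = min(int(cols[-1]) + 1 + margin, width)
    return r0, r1, c0, c1


def cascade_segment(image, coarse_fn, fine_fn, margin=8):
    """Stage 1: coarse mask on the full slice (stand-in for MedT).
    Stage 2: refine inside the cropped RoI (stand-in for U-Net),
    then paste the refined mask back at its original position."""
    image = np.asarray(image)
    coarse = np.asarray(coarse_fn(image), dtype=bool)
    result = np.zeros(image.shape[:2], dtype=bool)
    box = roi_bounding_box(coarse, margin)
    if box is None:
        return result  # nothing found upstream, so nothing to refine
    r0, r1, c0, c1 = box
    result[r0:r1, c0:c1] = np.asarray(fine_fn(image[r0:r1, c0:c1]), dtype=bool)
    return result
```

For a single pair of masks the two metrics are tied by IoU = DSC / (2 − DSC). Scores averaged over many cases, such as those reported on CHAOS, need not satisfy this identity exactly.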

Key words: liver segmentation, Medical Transformer (MedT) network, U-Net structure, Magnetic Resonance Imaging (MRI), cascade
