MRI Liver Image Segmentation Based on Cascade Transformer and U-Net

doi:10.19678/j.issn.1000-3428.0069172

Abstract

Accurate liver segmentation in Magnetic Resonance Imaging (MRI) is of great clinical significance: it helps doctors locate the target region quickly, supports treatment, and plays a key role in postoperative observation. However, MRI images contain rich semantic information and considerable noise. Conventional convolution operations have limited receptive fields and limited global modeling capability, which makes it difficult for them to capture global information. Moreover, convolution-based networks should not be made too deep, because deeper networks increase the number of parameters and lose important semantic information that is available at high resolution. To address these problems, this study introduces the Transformer to establish global information associations, which helps capture global context and locate the target accurately. However, the Transformer may destroy local details when processing fine image features, and it provides only a weak inductive bias. To combine the advantages of the Transformer and convolution, this study proposes a cascaded feature modeling method. First, the Medical Transformer (MedT) network, which has relatively few parameters and a relatively low computational cost, serves as the upstream network and produces a coarse segmentation of the Region of Interest (RoI). The extracted RoI is then preprocessed and fed into a downstream U-Net for a second segmentation, which pays particular attention to local information in order to obtain finer predictions. Experiments on the CHAOS dataset show that, on the liver segmentation task, the proposed method achieves a Dice Similarity Coefficient (DSC) of 0.922 and an Intersection over Union (IoU) of 0.877.
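The two reported metrics and the hand-off between the two stages can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`dice_and_iou`, `roi_bounding_box`, `crop`), the bounding-box cropping strategy, and the `margin` parameter are assumptions made for illustration, since the abstract does not say how the RoI is extracted or preprocessed. Masks are plain nested lists of 0/1 values so that the example needs no third-party packages.

```python
from typing import List, Optional, Tuple

Mask = List[List[int]]
Box = Tuple[int, int, int, int]  # (top, bottom, left, right), end-exclusive


def dice_and_iou(pred: Mask, truth: Mask) -> Tuple[float, float]:
    """Return (DSC, IoU) for two binary masks of the same shape."""
    inter = pred_sum = truth_sum = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            inter += 1 if (p and t) else 0
            pred_sum += 1 if p else 0
            truth_sum += 1 if t else 0
    union = pred_sum + truth_sum - inter
    if union == 0:
        # Both masks are empty; treating this as a perfect match is a
        # convention chosen for this sketch, not taken from the paper.
        return 1.0, 1.0
    dsc = 2.0 * inter / (pred_sum + truth_sum)  # 2|A∩B| / (|A| + |B|)
    iou = inter / union                         # |A∩B| / |A∪B|
    return dsc, iou


def roi_bounding_box(coarse_mask: Mask, margin: int = 0) -> Optional[Box]:
    """Bounding box of the coarse (upstream) mask, padded by `margin` pixels.

    Hypothetical RoI extraction step; returns None if the mask is empty.
    """
    rows = [r for r, row in enumerate(coarse_mask) if any(row)]
    cols = [c for row in coarse_mask for c, v in enumerate(row) if v]
    if not rows:
        return None
    height, width = len(coarse_mask), len(coarse_mask[0])
    top = max(min(rows) - margin, 0)
    bottom = min(max(rows) + margin + 1, height)
    left = max(min(cols) - margin, 0)
    right = min(max(cols) + margin + 1, width)
    return top, bottom, left, right


def crop(image: List[List[int]], box: Box) -> List[List[int]]:
    """Cut the RoI out of an image so it can be passed to the second stage."""
    top, bottom, left, right = box
    return [row[left:right] for row in image[top:bottom]]
```

In a cascade of this kind, the box computed from the upstream prediction would be used to crop the input slice before it is passed to the downstream network, and `dice_and_iou` would be evaluated on the final mask. For a single pair of masks the two metrics are tied by IoU = DSC / (2 - DSC); scores averaged over many slices or cases, such as those reported above, need not satisfy this identity exactly.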

Keywords: liver segmentation, Medical Transformer (MedT) network, U-Net structure, Magnetic Resonance Imaging (MRI), cascade

ZHANG Tiansen, XU Xiaona, ZHAO Yue, ZHANG Xinning. MRI Liver Image Segmentation Based on Cascade Transformer and U-Net[J]. Computer Engineering, 2025, 51(10): 308-318.

URL: https://www.ecice06.com/EN/10.19678/j.issn.1000-3428.0069172

https://www.ecice06.com/EN/Y2025/V51/I10/308

Figures/Tables 16

Fig.1 The overall procedure of the model

Fig.2 The visualization results of feature reuse

Fig.3 Schematic diagram of the self-attention mechanism

Fig.4 Schematic diagram of Axial-Attention mechanism

Fig.5 Overall network structure

Fig.6 Segmentation results of different algorithms

Fig.7 The loss change curve of Global+16 patch

Fig.8 U-Net loss variation curve
