Review on Deep Learning Model Architectures and Performance Evaluation Methods for Medical Image Segmentation

doi:10.19678/j.issn.1000-3428.0253035

Abstract

Medical image segmentation localizes lesions or anatomical structures at the pixel level in multimodal imaging data and is a key task underpinning computer-aided diagnosis and clinical decision-making. Segmentation network architectures are evolving rapidly, while common evaluation metrics suffer from semantic ambiguity and statistical instability. This review therefore systematically examines the alignment among network structure, task characteristics, and evaluation metrics, traces how methods have developed and where their performance limits lie, and builds a structure-metric matching mechanism oriented to practical application needs. Based on representative literature from the Web of Science Core Collection published between 2020 and 2025, it first reviews the design mechanisms and evolutionary pathways of backbone architectures such as Transformers, Graph Neural Networks (GNNs), and diffusion models, and then summarizes the key characteristics of lightweight, hybrid, and prompt-guided paradigms. Drawing on empirical studies on public datasets, it next compares the segmentation performance of different architectures quantitatively on typical tasks involving organs, tumors, and brain tissue, using common metrics such as the Dice Similarity Coefficient (DSC), 95% Hausdorff Distance (HD95), and Intersection over Union (IoU). The comparison indicates that HD95 fluctuates considerably in tasks with complex boundaries, DSC is insufficiently sensitive to small targets, and IoU has limited ability to discriminate between structures. Finally, the review identifies the statistical causes of metric misuse and task-metric mismatch, constructs a recommendation mapping from task structure to metric, proposes a metric selection strategy based on task granularity, and discusses how dynamic networks, self-supervised learning, and cross-modal modeling might improve model generalization.

Key words: medical image segmentation, deep learning, network architecture, evaluation metric system, task adaptation
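As a rough illustration of the three metrics named in the abstract, the sketch below computes DSC, IoU, and HD95 for binary 2D masks using NumPy only. It is not taken from the reviewed paper. The helper names (`square`, `boundary_points`), the 64x64 toy masks, and the HD95 convention used here (the larger of the two directed 95th-percentile boundary distances, measured in pixels) are all assumptions made for this example; other toolkits may define HD95 slightly differently.

```python
import numpy as np


def dice(pred, gt):
    """Dice Similarity Coefficient: 2|A∩B| / (|A| + |B|)."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    denom = pred.sum() + gt.sum()
    return 1.0 if denom == 0 else float(2.0 * np.logical_and(pred, gt).sum() / denom)


def iou(pred, gt):
    """Intersection over Union: |A∩B| / |A∪B|."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    union = np.logical_or(pred, gt).sum()
    return 1.0 if union == 0 else float(np.logical_and(pred, gt).sum() / union)


def boundary_points(mask):
    """Coordinates of foreground pixels that have a background 4-neighbour."""
    mask = mask.astype(bool)
    padded = np.pad(mask, 1, constant_values=False)
    interior = (padded[:-2, 1:-1] & padded[2:, 1:-1]
                & padded[1:-1, :-2] & padded[1:-1, 2:])
    return np.argwhere(mask & ~interior)


def hd95(pred, gt):
    """Assumed convention: max of the two directed 95th-percentile distances."""
    p, g = boundary_points(pred), boundary_points(gt)
    if len(p) == 0 or len(g) == 0:
        return float("nan")  # undefined when either mask is empty
    # Brute-force pairwise Euclidean distances; fine for small toy masks.
    d = np.linalg.norm(p[:, None, :] - g[None, :, :], axis=-1)
    return float(max(np.percentile(d.min(axis=1), 95),
                     np.percentile(d.min(axis=0), 95)))


def square(row, col, size, shape=64):
    """Hypothetical helper: a filled square on an empty shape x shape canvas."""
    m = np.zeros((shape, shape), dtype=bool)
    m[row:row + size, col:col + size] = True
    return m


# The same 2-pixel horizontal shift applied to a large and a small target.
gt_large, pred_large = square(10, 10, 40), square(10, 12, 40)
gt_small, pred_small = square(30, 30, 4), square(30, 32, 4)

for name, p, g in [("large", pred_large, gt_large), ("small", pred_small, gt_small)]:
    print(f"{name}: DSC={dice(p, g):.3f} IoU={iou(p, g):.3f} HD95={hd95(p, g):.1f}")
# large: DSC=0.950 IoU=0.905 HD95=2.0
# small: DSC=0.500 IoU=0.333 HD95=2.0
```

In this toy setting the displacement is identical for both targets, so HD95 is the same, while the overlap metrics fall sharply for the small target. This is one simple way to see why metric choice may need to account for target size; it does not reproduce the quantitative comparisons reported in the review.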

LI Hui, LIU Jiayu, XU Yaping. Review on Deep Learning Model Architectures and Performance Evaluation Methods for Medical Image Segmentation[J]. Computer Engineering, 2026, 52(5): 81-94.

https://www.ecice06.com/CN/Y2026/V52/I5/81

Figures/Tables (10)

Fig.1 Statistical chart of the literature over the past five years

Fig.2 Evolution of single-dominant paradigms

Fig.3 Flowchart of multimodal fusion methods

Fig.4 Schematic comparison of overlap metrics

Fig.5 Schematic illustration of common DSC misuse issues

