FMVM-DFFT: Fusion of factorized VSS and feature frequency-band separation for segmentation of thyroid nodules

doi:10.19678/j.issn.1000-3428.0252441

Abstract

Abstract: Current thyroid nodule segmentation methods can blur boundaries or lose detail during image feature analysis, and thyroid ultrasound images are themselves low in quality and noisy, which hinders precise feature extraction. To address these issues, we build on the recent visual state space model VMamba and propose FMVM-DFFT, a thyroid nodule ultrasound image segmentation network that fuses a factorized VSS with feature frequency-band separation. The architecture makes three main contributions: (1) By combining a Factorization Machine (FM) with External Attention (EA), we propose FMVSS, a factorized variant of the Visual State Space (VSS) module, which efficiently extracts information from the input features along different dimensions and adaptively adjusts feature weights, strengthening the capture of key information and local details. (2) We propose a DFFT module built on a dual-branch fast Fourier transform, which dynamically separates the encoder output features into frequency bands and extracts them finely, improving the network's capture of both fine detail and global information; it is combined with Channel Attention (CA) to adaptively control the weight of each channel. (3) We propose an edge optimization strategy, based on the Laplacian operator and a new loss function, BDELoss, that is applied during training to further strengthen the network's learning of image edge regions. Comparative experiments on the TN3K and DDTI datasets show that FMVM-DFFT achieves the best segmentation performance among the mainstream and recent segmentation networks compared, particularly on the key metrics DSC and IoU: 88.50% DSC and 79.37% IoU on TN3K, and 78.85% DSC and 65.09% IoU on DDTI.
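The two reported metrics, DSC (Dice similarity coefficient) and IoU (intersection over union), can be illustrated with a short sketch. This is not the authors' evaluation code: the function names `dice_and_iou` and `iou_from_dsc`, the smoothing term `eps`, and the toy masks are assumptions made for illustration only. For a single pair of binary masks the two metrics are linked by IoU = DSC / (2 - DSC); the reported pairs are consistent with this identity to within about 0.01 percentage points, although averages taken over many images need not satisfy it exactly.

```python
import numpy as np


def dice_and_iou(pred, target, eps=1e-8):
    """Return (DSC, IoU) for two binary masks of the same shape.

    eps is a small smoothing term (an assumption of this sketch) that
    avoids division by zero when both masks are empty.
    """
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    dsc = 2.0 * inter / (pred.sum() + target.sum() + eps)
    iou = inter / (union + eps)
    return float(dsc), float(iou)


def iou_from_dsc(dsc):
    """Convert a DSC value to IoU; exact for a single pair of masks."""
    return dsc / (2.0 - dsc)


# Toy example: two 4x4 squares (16 pixels each) overlapping in a 3x3 patch.
pred = np.zeros((8, 8), dtype=bool)
pred[2:6, 2:6] = True
target = np.zeros((8, 8), dtype=bool)
target[3:7, 3:7] = True

dsc, iou = dice_and_iou(pred, target)
print(f"toy masks: DSC={dsc:.4f}, IoU={iou:.4f}")

# Consistency check against the DSC values reported in the abstract.
for name, reported_dsc in [("TN3K", 0.8850), ("DDTI", 0.7885)]:
    print(f"{name}: IoU implied by DSC = {iou_from_dsc(reported_dsc):.4f}")
```

Because IoU is a monotonic function of DSC for a single mask pair, the two metrics rank individual predictions identically; reporting both mainly helps comparison with prior work that quotes only one of them.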

LIU Fengchun, HAN Hongshuai, ZHANG Chunying, MA Jiang. FMVM-DFFT: Fusion of factorized VSS and feature frequency-band separation for segmentation of thyroid nodules[J]. Computer Engineering, doi: 10.19678/j.issn.1000-3428.0252441.

