
Computer Engineering (计算机工程)


FMVM-DFFT: Fusion of factorized VSS and feature frequency-band separation for segmentation of thyroid nodules

  • Published: 2025-09-01


Abstract: Current thyroid nodule segmentation methods can blur boundaries or lose detail when parsing image features, and thyroid ultrasound images themselves are low in quality and noisy, which further hinders accurate feature extraction. To address these issues, we propose FMVM-DFFT, a thyroid nodule ultrasound image segmentation network based on the recent visual state space model VMamba that fuses a factorized VSS with feature frequency-band separation. The architecture has three main innovations: (1) By combining a Factorization Machine (FM) with External Attention (EA), we propose FMVSS, a factorized variant of the Visual State Space (VSS) module, which efficiently extracts information from the input features along different dimensions and adaptively adjusts feature weights, strengthening the capture of key information and local details; (2) We propose a DFFT module containing a dual-branch fast Fourier transform that dynamically separates the encoder output features into high-frequency and low-frequency bands and extracts them finely, improving the network's capture of both detail and global information, and we combine it with Channel Attention (CA) to adaptively control the weight of each channel; (3) We propose an edge optimization strategy, based on the Laplacian operator and a new loss function, BDELoss, that is applied during training to further strengthen the network's learning of image edge regions. Comparative experiments on the TN3K and DDTI datasets show that FMVM-DFFT achieves the best segmentation performance among the mainstream and recent segmentation networks compared, with DSC and IoU reaching 88.50% and 79.37% on TN3K and 78.85% and 65.09% on DDTI.
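The abstract reports results in DSC and IoU and describes an edge strategy built on the Laplacian operator. As a rough illustration only, the sketch below computes both metrics for a pair of binary masks and derives an edge map with a 4-neighbour discrete Laplacian. The kernel, the zero padding, the thresholding rule, and the function names `laplacian_edges` and `dice_and_iou` are assumptions made for this example; the paper's actual edge pipeline and BDELoss are not specified in the abstract and are not reproduced here.

```python
from typing import List, Tuple

Mask = List[List[int]]  # binary mask as nested lists of 0/1

# 4-neighbour discrete Laplacian kernel (assumed; the abstract does not give the kernel).
LAPLACIAN = [[0, 1, 0],
             [1, -4, 1],
             [0, 1, 0]]


def laplacian_edges(mask: Mask) -> Mask:
    """Return a binary edge map: 1 wherever the Laplacian response is non-zero."""
    h, w = len(mask), len(mask[0])
    edges = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            resp = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy, xx = y + dy, x + dx
                    # Zero padding: pixels outside the image contribute nothing.
                    if 0 <= yy < h and 0 <= xx < w:
                        resp += LAPLACIAN[dy + 1][dx + 1] * mask[yy][xx]
            edges[y][x] = 1 if resp != 0 else 0
    return edges


def dice_and_iou(pred: Mask, target: Mask) -> Tuple[float, float]:
    """Compute (DSC, IoU) for two binary masks of the same shape."""
    inter = sum(p & t for pr, tr in zip(pred, target) for p, t in zip(pr, tr))
    pred_sum = sum(map(sum, pred))
    target_sum = sum(map(sum, target))
    union = pred_sum + target_sum - inter
    if union == 0:
        # Both masks are empty: treat as a perfect match (a convention, not a rule).
        return 1.0, 1.0
    dsc = 2 * inter / (pred_sum + target_sum)
    iou = inter / union
    return dsc, iou


if __name__ == "__main__":
    target = [[0, 0, 0, 0],
              [0, 1, 1, 0],
              [0, 1, 1, 0],
              [0, 0, 0, 0]]
    pred = [[0, 0, 0, 0],
            [0, 1, 1, 1],
            [0, 1, 1, 1],
            [0, 0, 0, 0]]
    print(dice_and_iou(pred, target))
    for row in laplacian_edges(target):
        print(row)
```

For a single pair of masks the two metrics are tied by DSC = 2·IoU / (1 + IoU), so they rank individual predictions identically; they can still diverge once averaged over a dataset. Note that with this thresholding rule the edge map marks pixels on both sides of the boundary, since background pixels adjacent to the mask also have a non-zero response.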