Computer Engineering ›› 2026, Vol. 52 ›› Issue (5): 203-215. doi: 10.19678/j.issn.1000-3428.0070281

• Computer Vision and Graphics & Image Processing •

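Cell Counting Method Combining Multi-Scale Feature Fusion and Improved ViT

TIAN Hui, DUAN Xinlong, HAO Qiya, SUI Wenhao, MA Yuying, YU Zuhua, XU Yang, CAO Yangjie*

  1. School of Cyber Science and Engineering, Zhengzhou University, Zhengzhou 450000, Henan, China
  • Received: 2024-08-22 Revised: 2024-10-25 Online: 2026-05-15 Published: 2024-12-18
  • Corresponding author: CAO Yangjie
  • About the authors:

    TIAN Hui, male, lecturer, Ph.D.; main research interests: computer vision, medical image segmentation, and digital image processing

    DUAN Xinlong, master's student

    HAO Qiya, master's student

    SUI Wenhao, master's student

    MA Yuying, master's student

    YU Zuhua, master's student

    XU Yang, master's student

    CAO Yangjie (corresponding author), professor, Ph.D.

  • Funding:
    Natural Science Foundation of Henan Province (242300421474); Science and Technology Research Project of Henan Province (222102310547)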

Abstract:

Cell counting is a common task in clinical medical research and plays an important role in biology and clinical medicine. When cells overlap, many counting methods count several cells as one, which lowers counting accuracy. To address this problem, this paper adopts and improves the U-Net medical image segmentation model and proposes a cell counting method that combines an improved Vision Transformer (ViT) module with multi-scale feature fusion. The method comprises four parts: an encoder that extracts deep features, a multi-scale feature fusion module that concatenates encoder and decoder features, an improved ViT module that captures global context information, and a decoder that restores the feature size and outputs the segmentation result. The improved ViT module uses novel spatial and channel attention modules to address the limited ability of the traditional ViT to extract information along specific spatial and channel dimensions. The multi-scale feature fusion module fuses feature maps of different scales, which improves the model's ability to segment the boundaries of cells of different sizes and reduces the impact of cell overlap on counting accuracy. To further improve the segmentation of overlapping cells, the paper also proposes a data augmentation strategy: the original cell annotations are converted into circular annotations of a given radius, which adjusts the distance between annotations and guides the model to separate overlapping cells more effectively. Experiments on the LiveCell, MBM cells, and DCC datasets show that the proposed method achieves good results and effectively addresses the loss of counting accuracy caused by cell overlap.

Key words: cell counting, medical image segmentation, multi-scale feature fusion, Vision Transformer (ViT), channel attention module, spatial attention module
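
To make the circular-annotation idea in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each circle is centred on the centroid of the original annotation, that the radius is a free parameter (3 pixels here), and that a count is read off a binary mask as the number of 4-connected foreground components. None of these details are stated in the abstract, and the helper names `disk`, `circular_annotations`, and `count_components` are hypothetical.

```python
from collections import deque

import numpy as np


def disk(shape, center, radius):
    """Boolean mask of a filled disk; used for toy cells and for circular labels."""
    yy, xx = np.mgrid[0:shape[0], 0:shape[1]]
    return (yy - center[0]) ** 2 + (xx - center[1]) ** 2 <= radius ** 2


def circular_annotations(instance_masks, radius):
    """Replace each instance mask by a fixed-radius disk at its centroid.

    Using the centroid as the circle centre is an assumption of this sketch.
    """
    label = np.zeros(instance_masks[0].shape, dtype=bool)
    for mask in instance_masks:
        ys, xs = np.nonzero(mask)
        label |= disk(mask.shape, (ys.mean(), xs.mean()), radius)
    return label


def count_components(binary):
    """Count 4-connected foreground components with a breadth-first flood fill."""
    binary = np.asarray(binary, dtype=bool)
    seen = np.zeros_like(binary)
    height, width = binary.shape
    count = 0
    for y in range(height):
        for x in range(width):
            if not binary[y, x] or seen[y, x]:
                continue
            count += 1
            seen[y, x] = True
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and binary[ny, nx] and not seen[ny, nx]):
                        seen[ny, nx] = True
                        queue.append((ny, nx))
    return count


# Two toy "cells" of radius 6 whose centres are only 9 pixels apart, so they overlap.
cells = [disk((24, 32), (10, 10), 6), disk((24, 32), (10, 19), 6)]
merged = cells[0] | cells[1]                      # original-style annotation
separated = circular_annotations(cells, radius=3)  # circular annotation

print(count_components(merged))     # 1: the overlapping cells fuse into one region
print(count_components(separated))  # 2: the smaller circles leave a gap between them
```

In this toy case the shrunken circles restore a background gap between touching cells, which is the kind of separation signal the abstract attributes to the augmentation strategy. A library routine such as `scipy.ndimage.label` could replace the hand-written flood fill.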