
Computer Engineering ›› 2023, Vol. 49 ›› Issue (6): 274-283, 291. doi: 10.19678/j.issn.1000-3428.0065065

• Development Research and Engineering Application •

  • About the authors: SHEN Xiuxuan (b. 1998), female, M.S. candidate; her research interests include medical image recognition and natural language processing. WU Chunlei, professor, Ph.D. FENG Yeqi and CHENG Ming, M.S. candidates. ZHANG Junsan (corresponding author) and ZHU Jie, associate professors, Ph.D.
  • Supported by:
    Natural Science Foundation of Shandong Province (ZR2020MF006); Natural Science Foundation of Hebei Province (F2022511001); Independent Innovation Research Program of China University of Petroleum (East China) (20CX05019A).

Medical Report Generation Method Based on Dual-Branch Feature Fusion

SHEN Xiuxuan1, WU Chunlei1, FENG Yeqi2, CHENG Ming1, ZHANG Junsan1, ZHU Jie3   

  1. College of Computer Science and Technology, China University of Petroleum (East China), Qingdao 266580, Shandong, China;
    2. Chenxin Technology Co., Ltd., Qingdao 266580, Shandong, China;
    3. Department of Information Management, The National Police University for Criminal Justice, Baoding 071000, Hebei, China
  • Received: 2022-06-23  Revised: 2022-08-18  Published: 2022-09-29



Abstract: The global features of medical images play an important role in deep learning-based automatic medical report generation. However, previous methods typically adopt only a single-branch convolutional neural network to extract image semantic features; these methods focus on local detail features while ignoring the global features of medical images. Therefore, a new medical imaging report generation method called DBFFN, based on dual-branch feature fusion, is proposed. It combines the respective advantages of the Convolutional Neural Network (CNN) and the vision Transformer in image feature extraction, extracting the global and local features of a given medical image so that subtle local semantic information is preserved while the global relationships within the image are computed, with the two branches complementing each other. In addition, a Multi-Scale Feature Fusion Module (MSFFM) is designed for medical image feature fusion. First, the features extracted by the two branches undergo adaptive scale alignment; then, matrix-operation and spatial-information-enhancement methods are combined to effectively fuse the semantic information contained in the global and local features. Experiments on the IU-X-Ray dataset show that DBFFN achieves average scores of 0.496, 0.331, 0.234, 0.170, 0.214, and 0.370 on the BLEU-1 to BLEU-4, METEOR, and ROUGE-L metrics, respectively, outperforming methods such as HRNN, HRGR, and CMAS-RL and demonstrating its effectiveness on the automatic medical report generation task.
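The abstract describes three fusion steps: adaptive scale alignment of the two branches' features, matrix-operation-based fusion, and spatial information enhancement. The toy numpy sketch below illustrates this general pattern only; all shapes, names, and the specific gating choice are illustrative assumptions, not the paper's actual MSFFM implementation.

```python
import numpy as np

# Assumed toy shapes: a 7x7x512 CNN feature map and 49 ViT tokens of
# width 512 (these values are illustrative, not from the paper).
H, W, C = 7, 7, 512          # CNN branch: local feature map (H, W, C)
N, D = 49, 512               # ViT branch: N global patch tokens of width D

rng = np.random.default_rng(0)
local_feat = rng.standard_normal((H, W, C))    # CNN local features
global_feat = rng.standard_normal((N, D))      # Transformer global tokens

# 1) Adaptive scale alignment: flatten the CNN map so both branches share
#    one token layout (here N = H*W and D = C, so a reshape suffices).
local_tokens = local_feat.reshape(H * W, C)

# 2) Matrix-operation fusion: let the global tokens attend over the local
#    tokens via a scaled dot product, mixing the two feature sources.
attn = global_feat @ local_tokens.T / np.sqrt(D)        # (N, N) affinities
attn = np.exp(attn - attn.max(axis=-1, keepdims=True))  # stable softmax
attn /= attn.sum(axis=-1, keepdims=True)
fused = attn @ local_tokens                             # (N, D)

# 3) Spatial information enhancement (a simple stand-in): gate each fused
#    token by a sigmoid of its own channel mean, then add a residual path.
gate = 1.0 / (1.0 + np.exp(-fused.mean(axis=-1, keepdims=True)))
enhanced = fused * gate + global_feat                   # (N, D)

print(enhanced.shape)  # (49, 512)
```

In a real model the alignment step would use learned pooling or interpolation rather than a bare reshape, and the gate would be a learned projection; the sketch only shows how the global and local token streams can be brought to a common scale and fused with matrix operations.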

Key words: medical imaging report generation, global feature, local feature, feature fusion, image-text generation
