Computer Engineering, 2019, Vol. 45, Issue (1): 9-16, 22. doi: 10.19678/j.issn.1000-3428.0049485

• Architecture and Software Technology •

Research on Data Storage of Medical Record Based on Knowledge Graph

XIA Yuhang 1, GAO Daqi 1, RUAN Tong 1, WANG Haofen 2, YIN Yichao 3

  1. School of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237, China; 2. Gowild, Shenzhen 518057, China; 3. Shuguang Hospital Affiliated to Shanghai University of Traditional Chinese Medicine, Shanghai 200021, China
  • Received: 2017-11-29 Online: 2019-01-15 Published: 2019-01-15
  • About the authors: XIA Yuhang (born 1993), male, master's student, whose main research interests are knowledge graphs and graph databases; GAO Daqi and RUAN Tong, professors, Ph.D.; WANG Haofen, Ph.D.; YIN Yichao, M.S.
  • Funding:

    National High Technology Research and Development Program of China (2015AA020107)

Abstract:

Most Resource Description Framework (RDF) storage schemes built on relational databases do not take domain characteristics into account, which leads to insufficient query performance. To address this, an improved storage scheme for medical record graphs is proposed. Because the original medical record data contains n-ary (multi-way) relations, a scheme for converting n-ary relations into RDF triples is designed. Because the original data also has many null values and a large, non-fixed set of predicates, a storage scheme based on an improved triple table is adopted, in which the entities and attributes of the medical record RDF triples are replaced with IDs. On this basis, an entity type table is designed, and a SPARQL-to-SQL query translation algorithm is applied to the medical record graph. Experimental results show that, compared with the type-based storage scheme, the proposed scheme achieves higher query efficiency.

Key words: medical record, knowledge graph, data storage, query efficiency, statistical analysis
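The abstract names four ingredients: converting n-ary relations into triples, an ID-encoded triple table, an entity type table, and SPARQL-to-SQL translation. The following is a minimal sketch of how these pieces could fit together, not the authors' actual schema or algorithm, which the abstract does not specify. The table names (`dict`, `triples`, `entity_type`), the `reify` helper, the `TripleStore` class, and the sample prescription records are all hypothetical and made up for illustration; the SQL is hand-translated for one fixed query shape rather than produced by a general translator.

```python
import sqlite3


def reify(record_id, relation, fields):
    """Turn one n-ary record into triples via an intermediate node.

    Fields whose value is None produce no triple, so null-heavy
    records do not waste storage (assumed behaviour, for illustration).
    """
    node = f"{relation}/{record_id}"
    triples = [(node, "rdf:type", relation)]
    triples += [(node, p, str(v)) for p, v in fields.items() if v is not None]
    return triples


class TripleStore:
    """Hypothetical ID-encoded triple table plus an entity type table."""

    def __init__(self):
        self.db = sqlite3.connect(":memory:")
        self.db.executescript("""
            CREATE TABLE dict (id INTEGER PRIMARY KEY, term TEXT UNIQUE);
            CREATE TABLE triples (s INTEGER, p INTEGER, o INTEGER);
            CREATE TABLE entity_type (entity INTEGER, type INTEGER);
            CREATE INDEX idx_pos ON triples (p, o, s);
        """)

    def term_id(self, term):
        # Map a string term to a compact integer ID, creating it if needed.
        self.db.execute("INSERT OR IGNORE INTO dict (term) VALUES (?)", (term,))
        row = self.db.execute("SELECT id FROM dict WHERE term = ?", (term,))
        return row.fetchone()[0]

    def add(self, s, p, o):
        if p == "rdf:type":
            # Type assertions go to the dedicated entity type table.
            self.db.execute("INSERT INTO entity_type VALUES (?, ?)",
                            (self.term_id(s), self.term_id(o)))
        else:
            self.db.execute("INSERT INTO triples VALUES (?, ?, ?)",
                            (self.term_id(s), self.term_id(p), self.term_id(o)))

    def select_by_type_and_value(self, type_name, predicate, value):
        """SQL for: SELECT ?x WHERE { ?x rdf:type <T> . ?x <P> "V" }"""
        sql = """
            SELECT d.term
            FROM entity_type et
            JOIN triples t ON t.s = et.entity
            JOIN dict d ON d.id = et.entity
            WHERE et.type = (SELECT id FROM dict WHERE term = ?)
              AND t.p = (SELECT id FROM dict WHERE term = ?)
              AND t.o = (SELECT id FROM dict WHERE term = ?)
            ORDER BY d.term
        """
        rows = self.db.execute(sql, (type_name, predicate, value))
        return [row[0] for row in rows]


# Made-up sample data: each record is one n-ary "Prescription" relation.
records = [
    ("1", {"patient": "P001", "drug": "aspirin", "dose": "100mg"}),
    ("2", {"patient": "P002", "drug": "aspirin", "dose": None}),
    ("3", {"patient": "P001", "drug": "metformin", "dose": "500mg"}),
]

store = TripleStore()
for rid, fields in records:
    for s, p, o in reify(rid, "Prescription", fields):
        store.add(s, p, o)

result = store.select_by_type_and_value("Prescription", "drug", "aspirin")
print(result)
```

In this sketch the type lookup touches only the small `entity_type` table before joining the triple table, which is one plausible reason a separate type table could help query speed; whether that matches the paper's design would need to be checked against the full text.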

CLC Number: