Computer Engineering, 2009, Vol. 35, Issue 24: 60-62. doi: 10.3969/j.issn.1000-3428.2009.24.020

• Software Technology and Database •

Record Cluster Matching Model and Its Algorithm Based on Weighted Bipartite Graph

CHEN Bo (1,2), WANG Yan-zhang (1)

  1. School of Management, Dalian University of Technology, Dalian 116023; 2. Credit Reference Center, The People’s Bank of China, Beijing 100140
  • Online: 2009-12-20 Published: 2009-12-20

Abstract: When an entity is represented by a group of member records, the similar-record matching problem extends to a record cluster matching problem. This paper proposes two record cluster matching patterns, builds a mathematical model of record cluster matching using weighted bipartite graph theory, and designs an upper-and-lower-bound matching algorithm for record clusters. The algorithm quickly derives upper and lower bounds for the record cluster matching threshold, which reduces the number of maximum weight matching computations performed on the member records of the clusters. Experimental results show that the algorithm improves both matching accuracy and computational efficiency.

Key words: information integration, record cluster matching, maximum weight matching of bipartite graph
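
The abstract does not state which bounds the paper uses, so the following Python sketch only illustrates the general idea of bound-based pruning under stated assumptions. It assumes that member-record similarities are non-negative weights in a matrix `w`, and that two clusters match when the maximum weight matching reaches a hypothetical `threshold`. The greedy lower bound, the row/column-maximum upper bound, and every function name here are illustrative choices, not the authors' algorithm. The exact step is a brute-force stand-in for a proper maximum weight matching routine such as Kuhn-Munkres.

```python
from itertools import permutations


def lower_bound(w):
    """Weight of a greedy matching: any valid matching bounds the optimum from below."""
    edges = sorted(
        ((w[i][j], i, j) for i in range(len(w)) for j in range(len(w[0]))),
        reverse=True,
    )
    used_rows, used_cols, total = set(), set(), 0.0
    for weight, i, j in edges:
        if i not in used_rows and j not in used_cols:
            used_rows.add(i)
            used_cols.add(j)
            total += weight
    return total


def upper_bound(w):
    """Each record is matched at most once, so the sum of row maxima (or of
    column maxima) bounds the optimum from above; take the tighter of the two.
    Assumes non-negative weights."""
    row_sum = sum(max(row) for row in w)
    col_sum = sum(max(col) for col in zip(*w))
    return min(row_sum, col_sum)


def exact_max_weight(w):
    """Brute-force maximum weight matching (small clusters only).
    Stand-in for a polynomial-time algorithm; assumes non-negative weights."""
    if len(w) > len(w[0]):
        w = [list(col) for col in zip(*w)]  # make rows the smaller side
    n, m = len(w), len(w[0])
    return max(
        sum(w[i][p[i]] for i in range(n)) for p in permutations(range(m), n)
    )


def clusters_match(w, threshold):
    """Decide the match, calling the expensive exact step only when the
    threshold falls between the two cheap bounds."""
    if lower_bound(w) >= threshold:
        return True, "lower bound"
    if upper_bound(w) < threshold:
        return False, "upper bound"
    return exact_max_weight(w) >= threshold, "exact"
```

In this sketch, `clusters_match([[0.9, 0.1], [0.2, 0.8]], 1.5)` is settled by the lower bound alone, while `clusters_match([[0.9, 0.8], [0.8, 0.1]], 1.5)` falls between the bounds and needs the exact computation.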