Computer Engineering ›› 2007, Vol. 33 ›› Issue (22): 66-67,8. doi: 10.3969/j.issn.1000-3428.2007.22.023

• Software Technology and Database •

Probabilistic Query Rewriting in Inconsistent Databases

XIE Dong, YANG Lu-ming, PU Bao-xing, LIU Bo   

  1. (School of Information Science and Engineering, Central South University, Changsha 410083)
  • Online: 2007-11-20 Published: 2007-11-20

Abstract: Drawing on probabilistic database techniques and building on the clusters produced by tuple matching, this paper proposes a new cluster-based probabilistic approach to inconsistent data. It presents a basic query rewriting technique based on believable clusters and, for queries with aggregation, considers appropriate tuple probabilities, interval values, and expected values. Because the approach requires no procedural pre-processing, the rewritten queries can be efficiently optimized and executed by commercial database systems. To assess how well the approach adapts to different degrees of inconsistency and different database sizes, the experiments use the data and queries of the TPC-H benchmark. The results show that the approach is effective.

Key words: relational database, inconsistent database, query rewriting, cluster probability
