作者投稿和查稿 主编审稿 专家审稿 编委审稿 远程编辑

计算机工程 ›› 2013, Vol. 39 ›› Issue (4): 44-47. doi: 10.3969/j.issn.1000-3428.2013.04.011

• 先进计算与数据处理 • 上一篇    下一篇

基于x-tuple的概率阈值top-k查询算法

黄冬梅,舒 博,王 建,熊中敏   

  1. (上海海洋大学信息学院,上海 201306)
  • 收稿日期:2012-04-25 出版日期:2013-04-15 发布日期:2013-04-12
  • 作者简介:黄冬梅(1963-),女,教授,主研方向:WebGIS技术,智能信息处理,人机交互;舒 博,硕士研究生;王 建,博士;熊中敏,副教授、博士后
  • 基金资助:
    国家“973”计划基金资助项目“海量信息可用性基础理论与关键技术研究”(2012CB316200);南北极环境综合考察与评估专项基金资助项目(CHINARE2012-04-07)

Probabilistic Threshold top-k Query Algorithm Based on x-tuple

HUANG Dong-mei, SHU Bo, WANG Jian, XIONG Zhong-min   

  1. (College of Information Technology, Shanghai Ocean University, Shanghai 201306, China)
  • Received:2012-04-25 Online:2013-04-15 Published:2013-04-12

摘要: 不确定数据库中的概率阈值top-k查询是计算元组排在前k位的概率和,返回概率和不小于p的元组,但现有的查询语义没有将x-tuple内的元组进行整体处理。针对该情况,定义一种新的查询语义——概率阈值x-top-k查询,并给出查询处理算法。在该查询语义下采用动态规划方法求取x-tuple内每个元组排在前k位的概率和,对其进行聚集后做概率阈值top-k查询,并利用观察法、最大上限值等剪枝方法进行优化。实验结果表明,该算法平均扫描全体数据集中60%的数据即可返回正确结果集,证明其查询处理效率较高。

关键词: 不确定数据库, 概率阈值top-k查询, x-元组, 动态规划算法, 聚集

Abstract: Probabilistic threshold top-k query calculation sum of the probability of the tuple ranked top-k and return the tuples whose sum of the probability are at least p. But top-k query does not take x-tuple as a whole, thus a new top-k query semantic probabilistic threshold x-top-k query is defined and an algorithm is given to process it, which uses dynamic method to acquire sum of the probability of the tuple, then process aggregate probabilities with top-k query. It uses several pruning methods like the upper bound method and so on to optimize the algorithm. Experimental result shows that the algorithm return the answer set for scanning about 60% of data set, and it demonstrates that the algorithm is efficient.

Key words: uncertain database, probabilistic threshold top-k query, x-tuple, dynamic programming algorithm, aggregation

中图分类号: