
Computer Engineering (计算机工程)

• Advanced Computing and Data Processing

Improved Top-k Query Algorithm in Distributed Networks

YANG Hao, LIN Xijun, QU Haipeng

  1. (College of Information Science and Engineering, Ocean University of China, Qingdao, Shandong 266100, China)
  • Received: 2016-01-14 Online: 2017-02-15 Published: 2017-02-15
  • About the authors: YANG Hao (born 1989), male, master's student, whose research interests are big data and cloud computing; LIN Xijun, Ph.D.; QU Haipeng (corresponding author), associate professor.
  • Supported by:
    National Natural Science Foundation of China (61379127); National Special Scientific Research Fund for the Marine Public Welfare Industry (201105033).

Abstract: Existing Top-k query algorithms are designed mainly for centralized relational databases. When applied to distributed networks, they incur heavy communication overhead, which makes them inefficient. To address this, an improved Top-k query algorithm is proposed. It uses a Pretreatment Index Table (PIT) to prune irrelevant data in the distributed network, builds a candidate subset that contains the correct Top-k results, and answers the Top-k query from that subset. Experimental results show that, compared with the Fagin and Naive Top-k query algorithms, the improved algorithm returns more accurate query results, runs in less time, and incurs less network overhead.

Key words: Top-k query, distributed networks, data pruning strategy, Pretreatment Index Table (PIT), big data
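
The abstract does not say how the Pretreatment Index Table is structured, so the following is only an illustrative sketch of index-based pruning in general, not the authors' algorithm. It assumes each object is stored on exactly one node and that the index records one precomputed value per node, namely its maximum local score. The names `build_index` and `top_k_pruned` and the data layout are hypothetical.

```python
import heapq
from typing import Dict, List, Tuple

# Assumed layout: node name -> list of (object_id, score) pairs,
# with every object stored on exactly one node.
Nodes = Dict[str, List[Tuple[str, float]]]


def build_index(nodes: Nodes) -> Dict[str, float]:
    """Precompute an upper bound per node: its maximum local score."""
    return {
        name: max((score for _, score in items), default=float("-inf"))
        for name, items in nodes.items()
    }


def top_k_pruned(nodes: Nodes, index: Dict[str, float], k: int):
    """Return (top-k list, names of the nodes actually contacted)."""
    if k <= 0:
        return [], []
    best: List[Tuple[float, str]] = []  # min-heap holding the current k best
    contacted: List[str] = []
    # Visit nodes in descending order of their upper bound.
    for name in sorted(index, key=index.get, reverse=True):
        # Once k results are held and this node's bound cannot beat the
        # k-th score, neither can any later node, so stop contacting nodes.
        if len(best) == k and index[name] <= best[0][0]:
            break
        contacted.append(name)
        # A node only needs to ship its local top-k to the coordinator.
        for obj, score in heapq.nlargest(k, nodes[name], key=lambda p: p[1]):
            if len(best) < k:
                heapq.heappush(best, (score, obj))
            elif score > best[0][0]:
                heapq.heapreplace(best, (score, obj))
    ranked = sorted(best, reverse=True)
    return [(obj, score) for score, obj in ranked], contacted


if __name__ == "__main__":
    demo = {
        "A": [("a1", 9.0), ("a2", 8.0)],
        "B": [("b1", 7.5), ("b2", 1.0)],
        "C": [("c1", 3.0)],
    }
    print(top_k_pruned(demo, build_index(demo), k=2))
```

In this sketch the saving comes from nodes that are never contacted: a naive approach would pull data from every node, whereas here a node is skipped as soon as its indexed bound shows it cannot contribute to the result.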

CLC number: