Computer Engineering, 2011, Vol. 37, Issue (19): 41-43, 46. doi: 10.3969/j.issn.1000-3428.2011.19.012

• Software Technology and Database •

A Clustering Method for Multi-Attribute Range Queries

MA Hui¹, WU Ling-kun²

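  (1. Department of Computer Engineering, Zhongshan Institute, University of Electronic Science and Technology of China, Zhongshan, Guangdong 528402, China; 2. Tencent Computer Systems Co., Ltd., Shenzhen, Guangdong 518057, China)
  • Received: 2011-04-18 Published: 2011-10-05 Released: 2011-10-05
  • About the authors: MA Hui (born 1981), female, lecturer, Ph.D.; main research interests: data mining and database technology. WU Ling-kun, Ph.D.
  • Supported by:
    Research Start-up Fund of Zhongshan Institute, University of Electronic Science and Technology of China (409YKQ 04)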

Clustering Method for Multiple Attributes Range Query

MA Hui¹, WU Ling-kun²

  (1. Department of Computer Engineering, Zhongshan Institute, University of Electronic Science and Technology of China, Zhongshan 528402, China; 2. Tencent Holdings Limited, Shenzhen 518057, China)
  • Received: 2011-04-18 Online: 2011-10-05 Published: 2011-10-05

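Abstract (Chinese version): To improve the efficiency of multi-attribute range queries, the records are rearranged at the physical layer so that a query accesses fewer disk blocks. On this basis, a mathematical model is constructed that maps the records to be queried, by their attribute values, to points in a multi-dimensional coordinate space, and seeks a linear order in which points that are farther apart in the space are also farther apart in the order. A clustering method suited to multi-attribute range queries is proposed. Experimental results show that the method achieves better query performance than the spectral algorithm and traditional clustering algorithms.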

Key words (Chinese version): multi-dimensional clustering, data reorganization, range query, clustering index, query efficiency

Abstract: To improve the performance of range queries over multiple attributes in a static data file, the records in the file are reorganized so that a query reads fewer disk blocks. A mathematical model is constructed for this problem: each record is mapped to a point in a multi-dimensional space according to the values of its queried attributes, and the aim is to find a linear order of these points such that the closer two points are in the multi-dimensional space, the closer they are in the linear order. A heuristic method called FPF is proposed. Experimental results show that the method performs better than the spectral algorithm and the traditional clustering algorithm.

Key words: multi-dimensional clustering, data reorganization, range query, clustering index, query efficiency
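
The problem setting in the abstract (map each record to a point, choose a linear order, count the disk blocks a range query touches) can be illustrated with a small sketch. This is not the paper's FPF heuristic, whose details are not given on this page. As a stand-in locality-preserving order it uses a Z-order (Morton) curve, and it compares that with a traditional clustering on the first attribute. The function names `morton_key` and `blocks_touched`, the 16 x 16 grid of records, and the block size of 16 records are all assumptions made for illustration.

```python
def morton_key(x, y, bits=4):
    """Interleave the low `bits` bits of x and y into a Z-order key."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)        # x bits go to even positions
        key |= ((y >> i) & 1) << (2 * i + 1)    # y bits go to odd positions
    return key


def blocks_touched(order, query, block_size):
    """Count distinct blocks that hold at least one record matching the query.

    `order` is the physical record order; record i lives in block i // block_size.
    `query` is ((x_lo, x_hi), (y_lo, y_hi)) with inclusive bounds.
    """
    (x_lo, x_hi), (y_lo, y_hi) = query
    return len({
        pos // block_size
        for pos, (x, y) in enumerate(order)
        if x_lo <= x <= x_hi and y_lo <= y <= y_hi
    })


# Hypothetical data: one record per point of a 16 x 16 attribute grid.
records = [(x, y) for x in range(16) for y in range(16)]
BLOCK_SIZE = 16

# Traditional clustering: sort on attribute x, then y.
by_attribute = sorted(records)
# Stand-in multi-attribute clustering: sort along the Z-order curve.
by_z_order = sorted(records, key=lambda r: morton_key(r[0], r[1]))

square = ((0, 3), (0, 3))    # range on both attributes
stripe = ((0, 0), (0, 15))   # range on y only, x fixed

print(blocks_touched(by_attribute, square, BLOCK_SIZE))  # 4
print(blocks_touched(by_z_order, square, BLOCK_SIZE))    # 1
print(blocks_touched(by_attribute, stripe, BLOCK_SIZE))  # 1
print(blocks_touched(by_z_order, stripe, BLOCK_SIZE))    # 4
```

In this toy setup the order that keeps nearby points together serves the two-attribute square query from a single block, while the single-attribute sort spreads it over four. The stripe query shows the opposite case, so which linear order is better depends on the query workload.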
