Abstract: This paper takes into account the star schema of a distributed data warehouse and the way its data are fragmented across sites. Each site encodes and compresses the grouping keys, the aggregation is carried out in a distributed manner, and the complete grouped aggregation result is generated at the requesting site. This lowers the sorting cost within each site and reduces both the size and the number of tuples transferred between sites. As a result, intra-site processing cost and inter-site data transmission cost both decrease, and group-by aggregation in the distributed data warehouse becomes more efficient.
Key words:
Distributed data warehouse; Aggregation query; Star schema
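The scheme summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how grouping keys are encoded, so the mixed-radix integer code used here is an assumption, as are the dimension names (`region`, `year`), the shared `DIMENSIONS` dictionary, the choice of SUM and COUNT as aggregates, and all function names. Each site packs its grouping key into one small integer, aggregates its own fragment, and ships only `(code, sum, count)` tuples; the requesting site merges them and decodes the keys.

```python
from collections import defaultdict

# Hypothetical dimension domains, assumed to be known at every site.
DIMENSIONS = {
    "region": ["east", "north", "south", "west"],
    "year": [2004, 2005],
}
DIM_ORDER = ["region", "year"]


def encode_key(values):
    """Pack one grouping-key tuple into a single integer (mixed-radix code)."""
    code = 0
    for name, value in zip(DIM_ORDER, values):
        domain = DIMENSIONS[name]
        code = code * len(domain) + domain.index(value)
    return code


def decode_key(code):
    """Invert encode_key: recover the original grouping-key tuple."""
    values = []
    for name in reversed(DIM_ORDER):
        domain = DIMENSIONS[name]
        code, idx = divmod(code, len(domain))
        values.append(domain[idx])
    return tuple(reversed(values))


def local_aggregate(fragment):
    """Run at each site: partial (sum, count) per encoded grouping key."""
    partial = defaultdict(lambda: [0, 0])
    for region, year, amount in fragment:
        acc = partial[encode_key((region, year))]
        acc[0] += amount
        acc[1] += 1
    # Sorting compares single integers rather than multi-column keys.
    return sorted((code, s, c) for code, (s, c) in partial.items())


def merge_at_request_site(partials):
    """Run at the requesting site: combine partial results and decode keys."""
    total = defaultdict(lambda: [0, 0])
    for partial in partials:
        for code, s, c in partial:
            total[code][0] += s
            total[code][1] += c
    return {
        decode_key(code): (s, c, s / c)  # (sum, count, average)
        for code, (s, c) in sorted(total.items())
    }
```

In this sketch a site sends at most one tuple per group instead of one per fact row, and each tuple carries a single integer in place of the full key, which is the kind of saving in tuple count and tuple size that the abstract describes. The average is derived from the merged sum and count, because averages computed at individual sites cannot be combined directly.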
李 强,王秀坤,赫然,孟凡辉,唐一源. 分布式数据仓库的一种聚集运算算法[J]. 计算机工程, 2006, 32(4): 91-93.
LI Qiang, WANG Xiukun, HE Ran, MENG Fanhui, TANG Yiyuan. A New Aggregation Query Algorithm for Distributed Data Warehouse[J]. Computer Engineering, 2006, 32(4): 91-93.