Abstract: This paper studies the clustering of XML data streams and proposes SW-XSCLS, a sliding-window-based clustering algorithm for XML data streams. The algorithm uses the Exponential Histogram of Clustering Feature (EHCF) as its synopsis data structure. This lets it dynamically discard outdated data and preserve the data distribution within the current window reasonably well, and thereby obtain higher-quality clustering results. Theoretical analysis and experimental results show that the algorithm achieves relatively high clustering quality and fast processing speed.
Key words: XML data stream, sliding window, clustering, exponential histogram
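The abstract names the synopsis structure but does not give its details, so the following is only a minimal sketch of how an exponential histogram of cluster features can track one cluster over a sliding window. It is not the SW-XSCLS implementation. Several things here are assumptions rather than facts from the paper: XML documents are taken to be already mapped to numeric feature vectors, the window is measured in arrival timestamps, the error parameter `eps` and the bucket limit `ceil(1/eps) + 1` follow the usual exponential-histogram construction, and the names `Bucket`, `EHCF`, `window` and `eps` are hypothetical.

```python
from dataclasses import dataclass
from math import ceil


@dataclass
class Bucket:
    """Cluster feature of a run of points: count, linear sum, squared sum, newest timestamp."""
    n: int
    ls: list
    ss: float
    t: int


class EHCF:
    """Illustrative exponential histogram of cluster features for a single cluster."""

    def __init__(self, window, eps=0.5):
        self.window = window                # assumed: window length in timestamps
        self.max_same = ceil(1 / eps) + 1   # assumed: max buckets of equal size
        self.buckets = []                   # ordered oldest -> newest

    def insert(self, x, t):
        """Add feature vector x arriving at time t, then merge and expire buckets."""
        self.buckets.append(Bucket(1, list(x), sum(v * v for v in x), t))
        self._merge()
        self.expire(t)

    def _merge(self):
        """Whenever too many buckets share a size, merge the two oldest of that size."""
        size = 1
        while True:
            idx = [i for i, b in enumerate(self.buckets) if b.n == size]
            if len(idx) <= self.max_same:
                return
            i = idx[0]  # equal-sized buckets are contiguous, so i + 1 is the second oldest
            a, b = self.buckets[i], self.buckets[i + 1]
            merged = Bucket(a.n + b.n,
                            [p + q for p, q in zip(a.ls, b.ls)],
                            a.ss + b.ss,
                            max(a.t, b.t))
            self.buckets[i:i + 2] = [merged]
            size *= 2

    def expire(self, now):
        """Drop buckets whose newest point has left the window (the "outdated" data)."""
        self.buckets = [b for b in self.buckets if b.t > now - self.window]

    def count(self):
        return sum(b.n for b in self.buckets)

    def centroid(self):
        n = self.count()
        dim = len(self.buckets[0].ls)
        return [sum(b.ls[d] for b in self.buckets) / n for d in range(dim)]


if __name__ == "__main__":
    h = EHCF(window=4, eps=0.5)
    for t in range(1, 11):
        h.insert([float(t)], t)   # one-dimensional toy stream
    print(h.count(), h.centroid(), [b.n for b in h.buckets])
```

Because a bucket is only dropped once its newest point has expired, the oldest surviving bucket may still contain a few points that are already outside the window. That bounded over-count is the usual price of keeping only a logarithmic number of buckets per cluster.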
YAO Wen-Ji (姚文集), GAO Ming-Xia (高明霞), MAO Guo-Jun (毛国君), LI Guang-Kui (李广奎). XML Data Stream Clustering Algorithm Based on Sliding Window[J]. Computer Engineering (计算机工程), 2010, 36(13): 87-89,92.