
Computer Engineering ›› 2009, Vol. 35 ›› Issue (12): 11-13. doi: 10.3969/j.issn.1000-3428.2009.12.004

• Doctoral Dissertation •

Stream Data Outlier Mining Algorithm Based on Reverse k Nearest Neighbors

ZHANG Zhong-ping, LIANG Yong-xin

  1. (College of Information Science and Engineering, Yanshan University, Qinhuangdao 066004)
  • Received: 1900-01-01 Revised: 1900-01-01 Online: 2009-06-20 Published: 2009-06-20

Abstract: Incremental outlier mining algorithms based on the local outlier factor require multiple scans of the data set. Because the reverse k nearest neighbors of a point are well suited to measuring its degree of outlierness, a stream data outlier mining algorithm based on reverse k nearest neighbors (SOMRNN) is proposed. A sliding window model is used to update the current window with only a single scan, which improves the efficiency of the algorithm. A query procedure allows the whole current window to be queried at any specified time, so that concept drift in the data stream is captured promptly. Experimental results show that SOMRNN is applicable and effective.

Key words: data stream, outlier, reverse k nearest neighbors, sliding window
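
The idea summarized in the abstract can be illustrated with a small sketch. It is not the authors' SOMRNN implementation, whose details are not given on this page: it is a naive brute-force version that assumes Euclidean distance and recomputes every neighbor list at query time, whereas the paper describes a one-scan sliding-window update. The class name `SlidingWindowRkNN`, the `threshold` parameter, and the rule "a point is an outlier if at most `threshold` other points count it among their k nearest neighbors" are assumptions made for illustration only.

```python
from collections import deque
import math


def knn_indices(points, i, k):
    """Indices of the k points nearest to points[i], excluding i itself."""
    others = [j for j in range(len(points)) if j != i]
    # Ties in distance are broken by index so the result is deterministic.
    others.sort(key=lambda j: (math.dist(points[i], points[j]), j))
    return others[:k]


def rknn_counts(points, k):
    """For each point, count how many other points have it among their k nearest neighbors."""
    counts = [0] * len(points)
    for i in range(len(points)):
        for j in knn_indices(points, i, k):
            counts[j] += 1
    return counts


class SlidingWindowRkNN:
    """Hypothetical helper: keeps the most recent points and answers outlier queries."""

    def __init__(self, window_size, k, threshold):
        # deque(maxlen=...) discards the oldest point when a new one arrives.
        self.window = deque(maxlen=window_size)
        self.k = k
        self.threshold = threshold  # assumed cut-off on the reverse-kNN count

    def insert(self, point):
        self.window.append(tuple(point))

    def query(self):
        """Return the points of the current window with few reverse k nearest neighbors."""
        points = list(self.window)
        if len(points) <= self.k:
            return []  # too few points to form k-neighborhoods
        counts = rknn_counts(points, self.k)
        return [p for p, c in zip(points, counts) if c <= self.threshold]


detector = SlidingWindowRkNN(window_size=5, k=2, threshold=0)
for p in [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]:
    detector.insert(p)
first_result = detector.query()

# The stream drifts: new points arrive near (10, 10) and push the old cluster out.
for p in [(10, 11), (11, 10), (11, 11), (10.5, 10.5)]:
    detector.insert(p)
second_result = detector.query()
```

In the first query the isolated point `(10, 10)` is in nobody's 2-neighborhood and is reported; after the window slides onto the new cluster it acquires reverse neighbors and is no longer reported, which is a small instance of the concept drift the abstract mentions. The brute-force query costs roughly O(n² log n) per call, so it is meant only to make the reverse k nearest neighbor measure concrete.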
