
Computer Engineering

• Advanced Computing and Data Processing

Anomaly Detection Algorithm Based on High-dimensional Data Stream

YU Liping, LI Yunfei, ZHU Shixing

  1. (School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215006, China)
  • Received: 2016-11-29 Online: 2018-01-15 Published: 2018-01-15
  • About the authors: YU Liping (born 1992), female, master's student, main research interest: data mining; LI Yunfei, professor; ZHU Shixing, master's student.
  • Funding:
    National Natural Science Foundation of China (61201212, 61272449).

Abstract: Traditional anomaly detection algorithms based on Euclidean distance cannot guarantee accuracy and take too long to run on high-dimensional data. To address this, an improved data stream anomaly detection algorithm based on angle variance is proposed, drawing on the characteristics of high-dimensional data streams. The algorithm builds a small-scale computation set from the optimal data set grid and the most recent data grid, so that the degree of abnormality of the newest data point can be measured quickly and in real time. The improved algorithm is applied to real elevator data streams collected by a wireless sensor network in order to detect elevator faults. Experimental results show that, compared with algorithms such as ABOD and HODA, the improved algorithm effectively identifies anomalous points in high-dimensional data streams and is suitable for high-dimensional sensor data streams with strict real-time requirements.

Key words: data mining, high-dimensional data stream, anomaly detection, massive data, angle variance
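
The angle-variance idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it is a simplified, unweighted score (the variance of the cosines of the angles a point forms with every pair of reference points), and it does not show how the paper builds its grids. The function name `angle_variance` and the sample points are assumptions made for illustration only. The intuition is that a point far from the reference set sees all reference points in roughly the same direction, so its angle variance is small.

```python
import itertools
import math
import statistics


def angle_variance(point, reference):
    """Population variance of cos(angle) over all pairs of reference points,
    with the angle measured at `point`. Hypothetical, simplified score:
    a lower value suggests a more anomalous point."""
    cosines = []
    for a, b in itertools.combinations(reference, 2):
        # Difference vectors from the query point to each reference point.
        va = [x - p for x, p in zip(a, point)]
        vb = [x - p for x, p in zip(b, point)]
        na = math.sqrt(sum(x * x for x in va))
        nb = math.sqrt(sum(x * x for x in vb))
        if na == 0.0 or nb == 0.0:
            continue  # skip reference points that coincide with `point`
        dot = sum(x * y for x, y in zip(va, vb))
        cosines.append(dot / (na * nb))
    if len(cosines) < 2:
        raise ValueError("need at least two usable pairs of reference points")
    return statistics.pvariance(cosines)


# Illustrative data (assumed): a small 3x3 cluster standing in for the
# small-scale computation set, one point inside it and one far away.
reference = [(float(x), float(y)) for x in (-1, 0, 1) for y in (-1, 0, 1)]
inlier = (0.1, -0.2)
outlier = (10.0, 10.0)

print("inlier score: ", angle_variance(inlier, reference))
print("outlier score:", angle_variance(outlier, reference))
```

In this sketch the cost grows quadratically with the size of the reference set, which is one reason a small computation set, as described in the abstract, matters for streaming use.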
