Computer Engineering

MySQL Outlier Detection Algorithm Based on Monitoring Data

LING Jun 1,2; YIN Boxue 2; LI Sheng 2; WANG Xin 1

  • 1. School of Computer Science and Technology, Tianjin University, Tianjin 300072, China; 2. Baidu (China) Co., Ltd., Beijing 100085, China
  • Received: 2014-11-03  Online: 2015-11-15  Published: 2015-11-13

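  • About the authors: LING Jun (born 1991), male, master's student, whose main research interests are RDF graph data management and MySQL database technology; YIN Boxue and LI Sheng, master's degree holders; WANG Xin, associate professor, Ph.D.
  • Supported by:
    The Third "Baidu Theme Research" Fund Project.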

Abstract: With the explosive growth of data on the Internet, the scale of server clusters is expanding rapidly, and monitoring and analyzing large-scale clusters has become a difficult operations problem in the Internet industry. To address this, and based on the observation that monitoring statistics jitter sharply, this paper presents a MySQL outlier detection and analysis algorithm. It adopts a pattern-based outlier detection method that requires no threshold to be set, extracts pattern feature values segment by segment, and calculates the outliers, the abnormal ranges, and the degrees of abnormality. Experimental results show that the algorithm extracts data features well from time sequences of heavily jittering monitoring data and that, compared with the outlier detection algorithm based on mean-variance, it achieves higher precision and better applicability to monitoring data.

Key words: outlier detection, monitoring data, statistics, pattern, time sequence
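
For orientation, the mean-variance baseline that the abstract compares against can be sketched as follows. This is only a minimal illustration under stated assumptions, not the paper's pattern-based algorithm and not necessarily the exact baseline the authors evaluated: the function name `mean_variance_outliers`, the multiplier `k`, and the sample values are all hypothetical.

```python
from statistics import mean, pstdev


def mean_variance_outliers(values, k=3.0):
    """Return indices of points farther than k standard deviations from the mean.

    Assumption: a plain "k-sigma" rule over the whole series; the source does
    not specify the baseline in this detail.
    """
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return []  # a constant series has no outliers under this rule
    return [i for i, v in enumerate(values) if abs(v - mu) > k * sigma]


# Hypothetical monitoring samples (e.g. queries per second) with one spike.
qps = [100, 102, 98, 101, 99, 100, 103, 97, 100, 500]

flagged_k2 = mean_variance_outliers(qps, k=2.0)
flagged_k3 = mean_variance_outliers(qps, k=3.0)
```

On this sample the spike at index 9 is flagged with `k=2.0` but not with `k=3.0`, because the spike itself inflates the mean and the standard deviation. This sensitivity to a hand-picked threshold is the kind of limitation that motivates a method requiring no threshold.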

