
Computer Engineering


MySQL Outlier Detection Algorithm Based on Monitoring Data

LING Jun 1,2, YIN Boxue 2, LI Sheng 2, WANG Xin 1

  1. (1. School of Computer Science and Technology, Tianjin University, Tianjin 300072, China; 2. Baidu (China) Co., Ltd., Beijing 100085, China)
  • Received: 2014-11-03 Online: 2015-11-15 Published: 2015-11-13

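  • About the authors: LING Jun (born 1991), male, master's student; research interests: RDF graph data management and MySQL database technology. YIN Boxue and LI Sheng, master's degrees. WANG Xin, associate professor, Ph.D.
  • Funding:
    Supported by the Third "Baidu Thematic Research" Fund Project.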

Abstract: With the explosive growth of data on the Internet, server clusters are expanding rapidly, and monitoring and analyzing large-scale clusters has become a difficult operations problem in the Internet industry. To address this, this paper proposes a MySQL outlier detection and analysis algorithm designed for monitoring statistics that jitter sharply. The algorithm adopts a pattern-based outlier detection method that requires no threshold to be set: it extracts pattern feature values segment by segment and computes the outlier points, the abnormal intervals, and the degree of abnormality. Experimental results show that the algorithm extracts data features well from time series of heavily jittering monitoring data, and that it achieves higher precision and better applicability to monitoring data than an outlier detection algorithm based on mean and variance.

Key words: outlier detection, monitoring data, statistics, pattern, time series
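
For orientation, the mean-variance baseline that the abstract compares against can be sketched as follows. This is a minimal illustration of the generic technique (flag every point lying more than k standard deviations from the mean), not the authors' implementation and not the pattern-based algorithm itself. The function name `mean_variance_outliers`, the parameter `k`, and the sample values are assumptions made for this example.

```python
from statistics import mean, pstdev


def mean_variance_outliers(values, k=3.0):
    """Return indices of points farther than k standard deviations from the mean.

    Generic mean-variance baseline; k is a hand-picked threshold (an assumption
    of this sketch), which is exactly the kind of setting a threshold-free
    method tries to avoid.
    """
    if not values:
        return []
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return []  # constant series: nothing deviates
    return [i for i, v in enumerate(values) if abs(v - mu) > k * sigma]


# Hypothetical monitoring samples (e.g. queries per second) with one spike.
qps = [100, 102, 98, 101, 99, 100, 103, 97, 100, 500]
print(mean_variance_outliers(qps, k=2.0))
```

On this sample the spike at index 9 is flagged with `k=2.0` but not with `k=3.0`, because the spike itself inflates the mean and the standard deviation. That sensitivity to the chosen threshold is one reason a global mean-variance rule can be awkward on heavily jittering data.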


CLC Number: