
Computer Engineering ›› 2022, Vol. 48 ›› Issue (11): 161-169. doi: 10.19678/j.issn.1000-3428.0063817

• Cyberspace Security •

An Online Real-Time Anomaly Detection Method for Microservice Call Chains

ZHANG Pan1, GAO Feng2, ZHOU Yi1, RAO Hanyu3, MAO Dong3, LI Jing2

  1. State Grid Information & Telecommunication Branch, Beijing 100031, China;
  2. College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China;
  3. Information & Telecommunication Branch, State Grid Zhejiang Electric Power Co., Ltd., Hangzhou 310016, China
  • Received: 2022-01-23 Revised: 2022-04-24 Published: 2022-06-29
  • About the authors: ZHANG Pan (born 1989), male, senior engineer, Ph.D., main research interest: electric power information and communication; GAO Feng, master's student; ZHOU Yi, M.S.; RAO Hanyu and MAO Dong, engineers, M.S.; LI Jing (corresponding author), associate professor, Ph.D.
  • Funding:
    Science and Technology Project of State Grid Corporation of China, "Research on Cloud Migration of Business Applications and Full-Link Operation Analysis Technology" (5700-202152169A-0-0-00).

Abstract: Microservice architecture is becoming the mainstream design for large-scale cloud applications, and the reliability of microservice-based systems is a key issue that cloud service providers must address. Accurately detecting and locating faults in microservice applications helps ensure application reliability and stability, and anomaly detection based on microservice call chains can identify abnormal system behavior and trigger an alarm promptly when a failure occurs. However, current mainstream detection methods, which require building a knowledge base by augmenting call chain data, cannot guarantee the timeliness and accuracy of anomaly alarms. To address this problem, an anomaly detection method for microservice call chains, MicroTrace, is proposed based on natural language processing and a Bi-directional Long Short-Term Memory (BiLSTM) network. First, the events recorded in a call chain are parsed and represented as semantic sequences and response time sequences. Second, a word embedding algorithm (Word2vec) is used to extract vectorized representations of the events. Third, an attention-based BiLSTM detects call path anomalies and performance anomalies of microservice instances at the same time. Experimental results on a real microservice call chain dataset show that the precision and recall of the proposed method both exceed 96%, and that its F1 score improves by at least 6.8% compared with the Multimodal-LSTM method.
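The first step of the abstract, turning call chain events into a semantic sequence and a response time sequence, can be illustrated with a minimal sketch. It is not the authors' parser: the span fields (`service`, `operation`, `start_us`, `duration_us`) and the helper names `tokenize` and `trace_to_sequences` are hypothetical, chosen only for illustration, and the later embedding and BiLSTM stages are not shown.

```python
import re
from typing import Any, Dict, List, Tuple

# Hypothetical span record; the field names are assumptions, not the paper's schema.
Span = Dict[str, Any]


def tokenize(name: str) -> List[str]:
    """Split identifiers such as 'createOrder' or 'order-service' into lowercase words."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return [p.lower() for p in parts]


def trace_to_sequences(spans: List[Span]) -> Tuple[List[List[str]], List[float]]:
    """Convert one call chain into (semantic sequence, response time sequence).

    Events are ordered by start time. Each event contributes a list of word
    tokens (input for a word embedding model) and its response time in
    milliseconds (input for performance anomaly detection).
    """
    ordered = sorted(spans, key=lambda s: s["start_us"])
    semantic = [tokenize(f'{s["service"]} {s["operation"]}') for s in ordered]
    response_ms = [s["duration_us"] / 1000.0 for s in ordered]
    return semantic, response_ms
```

Under these assumptions, the token lists would feed a word embedding model such as Word2vec, while the response time sequence would be passed alongside them to the sequence model.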

Key words: microservice, call chain, deep learning, anomaly detection, data mining

CLC number: