Computer Engineering ›› 2021, Vol. 47 ›› Issue (2): 69-76. doi: 10.19678/j.issn.1000-3428.0057092

• Artificial Intelligence and Pattern Recognition •

Anomaly Detection and Relation Extraction for Time Series Data Based on Transformer Reconstruction

MENG Hengyu, LI Yuanxiang

  1. School of Aeronautics and Astronautics, Shanghai Jiao Tong University, Shanghai 200240, China
  • Received: 2020-01-02 Revised: 2020-02-13 Online: 2021-02-15 Published: 2020-02-25
  • About the authors: MENG Hengyu (born 1995), male, master's student, whose main research interests are aviation data mining and image processing; LI Yuanxiang (corresponding author), associate professor, Ph.D.
  • Supported by:
    National Natural Science Foundation of China, "Marine Environmental Dynamics and Numerical Simulation" (U1406404).

Abstract: Existing anomaly detection (AD) methods for time series suffer from low computational efficiency and poor interpretability. Because the Transformer model offers high parallel efficiency and can extract relations regardless of distance in natural language processing (NLP) tasks, this paper proposes a Transformer-based method, Masked Time Series Modeling (MTSM). A parallel, non-directional model of the time series data is built, and a masking mechanism is used to reconstruct the current time step and, from there, the whole sequence. Experimental results on a storage system dataset and a NASA spacecraft dataset show that, compared with a detection method based on the Long Short-Term Memory (LSTM) model, the proposed method saves about 80.7% of the computation time and achieves an F1 score of 0.582 on the Range-based metric. Moreover, visualizing its relation matrix accurately reflects the relation between human instructions and anomalies.

Key words: time series data, attention mechanism, anomaly detection (AD), relation extraction, auto-encoder (AE)
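The mask-and-reconstruct idea in the abstract can be illustrated with a small sketch. This is not the paper's MTSM model: the learned Transformer attention is replaced here by a fixed, untrained softmax over time distance, and the names `masked_reconstruction`, `anomaly_scores`, `window`, and `temperature` are hypothetical, chosen only for this illustration. Each time step is hidden, rebuilt from its neighbours on both sides (no direction), and the reconstruction error serves as the anomaly score.

```python
import numpy as np

def masked_reconstruction(x, window=8, temperature=1.0):
    """Rebuild every time step from its neighbours with the step itself masked.

    Assumption: fixed distance-based softmax weights stand in for the
    learned attention weights a trained Transformer would provide.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    recon = np.empty(n)
    for t in range(n):
        lo, hi = max(0, t - window), min(n, t + window + 1)
        # Neighbour indices on both sides of t; step t itself is masked out.
        idx = np.array([i for i in range(lo, hi) if i != t])
        # Attention-style weights: softmax over negative time distance.
        logits = -np.abs(idx - t) / temperature
        weights = np.exp(logits - logits.max())
        weights /= weights.sum()
        recon[t] = weights @ x[idx]
    return recon

def anomaly_scores(x, **kwargs):
    """Absolute reconstruction error per time step."""
    x = np.asarray(x, dtype=float)
    return np.abs(x - masked_reconstruction(x, **kwargs))

# Synthetic example: a sine wave with one injected spike at index 50.
t = np.arange(100)
series = np.sin(2 * np.pi * t / 25)
series[50] += 5.0
scores = anomaly_scores(series)
print(int(np.argmax(scores)))
```

In this toy setting the injected spike receives the largest score because its masked reconstruction depends only on its normal neighbours. In the method the abstract describes, the weights would come from a trained Transformer, and those attention weights are what would be visualized as the relation matrix.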

CLC Number: