
Computer Engineering, 2024, Vol. 50, Issue (6): 179-187. doi: 10.19678/j.issn.1000-3428.0067570

• Cyberspace Security •

Anomaly Detection of IoT Water Quality Monitoring Data Based on Explainable Deep Learning

LI Yongfei, LI Mingyang, CHANG Xin, CAO Kexin

  1. School of Computer, North China Institute of Science and Technology, Langfang 065201, Hebei, China
  • Received: 2023-05-08 Revised: 2023-08-31 Online: 2024-06-15 Published: 2024-06-23
  • Corresponding author: LI Yongfei, E-mail: lyf518@ncist.edu.cn
  • Funding:
    Fundamental Research Funds for the Central Universities (3142017067, 3142023060); Key Research and Development Program of Hebei Province (19270318D).

Abstract: As Internet of Things (IoT) technology develops and its range of applications expands, the number and variety of IoT devices and sensors continue to grow. IoT water quality sensors play a vital role in ecological monitoring and protection. To address the large volume, high dimensionality, and lack of labels in the monitoring data these sensors collect, this study proposes an unsupervised anomaly detection algorithm based on explainable deep learning. The algorithm applies an Auto-Encoder (AE) and the SHAP algorithm to multi-dimensional water quality datasets. The AE model is trained and the data with large reconstruction errors are flagged; SHAP is then used to interpret the AE and compute the importance of each feature in the flagged data. The final anomalies are determined from these feature importances. Experimental results on an IoT water quality monitoring dataset show that the algorithm detects anomalous data effectively, with an F1 value of 0.875, outperforming algorithms commonly used in unsupervised anomaly detection. The algorithm therefore has practical value for processing IoT water quality monitoring data, and it can also be applied to anomaly detection on massive IoT monitoring data in other fields, such as meteorology and the environment.
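
The three steps in the abstract (flag by reconstruction error, attribute the error to features, confirm anomalies) can be illustrated with a minimal sketch. The abstract does not give the network architecture, thresholds, or decision rule, so everything below is an assumption made for illustration: a rank-2 linear autoencoder (equivalent to PCA) stands in for the deep AE, exact Shapley values computed by subset enumeration with the training mean as baseline stand in for the SHAP library, the data are synthetic, and the 99th-percentile threshold and `MIN_CONTRIBUTION` rule are hypothetical.

```python
import itertools
import math

import numpy as np


def fit_linear_autoencoder(X, k):
    """Rank-k linear autoencoder (PCA); an assumed stand-in for a trained deep AE."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:k]  # vt[:k] encodes; its transpose decodes


def reconstruction_error(X, mean, components):
    """Squared reconstruction error of each row of X."""
    centered = X - mean
    reconstructed = centered @ components.T @ components
    return ((centered - reconstructed) ** 2).sum(axis=1)


def shapley_values(x, baseline, score_fn):
    """Exact Shapley values of score_fn at x; absent features take baseline values."""
    d = len(x)
    phi = np.zeros(d)
    for i in range(d):
        others = [j for j in range(d) if j != i]
        for r in range(d):
            weight = math.factorial(r) * math.factorial(d - r - 1) / math.factorial(d)
            for subset in itertools.combinations(others, r):
                idx = np.array(subset, dtype=int)
                without_i = baseline.copy()
                without_i[idx] = x[idx]
                with_i = without_i.copy()
                with_i[i] = x[i]
                phi[i] += weight * (score_fn(with_i) - score_fn(without_i))
    return phi


# Synthetic stand-in data: 4 correlated features driven by 2 latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 2))
mixing = np.array([[1.0, 1.0, 1.0, 1.0], [1.0, -1.0, 1.0, -1.0]])
X = latent @ mixing + 0.05 * rng.normal(size=(500, 4))
X[10] = 0.0
X[10, 2] = 6.0  # injected anomaly in feature 2 of row 10

mean, components = fit_linear_autoencoder(X, k=2)
errors = reconstruction_error(X, mean, components)

# Step 1: flag rows whose reconstruction error exceeds a (hypothetical) percentile.
threshold = np.percentile(errors, 99)
flagged = [int(i) for i in np.flatnonzero(errors > threshold)]


# Step 2: attribute each flagged row's error to its features.
def score(v):
    return reconstruction_error(v[None, :], mean, components)[0]


contributions = {i: shapley_values(X[i], mean, score) for i in flagged}

# Step 3: hypothetical rule: confirm a row if one feature dominates its error.
MIN_CONTRIBUTION = 5.0
confirmed = [i for i in flagged if contributions[i].max() > MIN_CONTRIBUTION]
top_feature = {i: int(np.argmax(contributions[i])) for i in confirmed}
print(confirmed, top_feature)
```

Because the baseline reconstructs with zero error, each row's Shapley values sum to its reconstruction error, so the attribution splits the flagged error across features. Exact enumeration costs on the order of 2^d score evaluations per feature, which is workable only for a handful of features; for higher-dimensional data a sampling-based approximation would be needed.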

Key words: deep learning, Auto-Encoder (AE), anomaly detection, explainable machine learning, unsupervised learning

CLC Number: