
Computer Engineering ›› 2024, Vol. 50 ›› Issue (8): 113-122. doi: 10.19678/j.issn.1000-3428.0068028

• Cyberspace Security •

Safety Assessment Method Based on Degree of Feature Matching and Fusion of Heterogeneous Sub-Models

Xiaobin XU1,2, Yunshuo ZHANG1,2, Fan SHI1,2, Leilei CHANG1,2,*, Zhigang TAO3

  1. China-Austria Belt and Road Joint Laboratory on Artificial Intelligence and Advanced Manufacturing, Hangzhou Dianzi University, Hangzhou 310018, Zhejiang, China
  2. College of Automation, Hangzhou Dianzi University, Hangzhou 310018, Zhejiang, China
  3. School of Mechanics and Civil Engineering, China University of Mining and Technology (Beijing), Beijing 100083, China
  • Received: 2023-07-07 Published: 2024-08-15 Online: 2024-05-08
  • Contact: Leilei CHANG
  • Funding:
    Fundamental Research Funds for the Provincial Universities of Zhejiang (GK239909299001-010); Zhejiang Provincial Basic Public Welfare Research Program (LTGG23F030009); National Key Research and Development Program of China (2022YFE0210700); Zhejiang Provincial Outstanding Youth Science Foundation (LR21030001)

Abstract:

The quality of a machine learning model determines its prediction accuracy and how well it fits the relationship between inputs and outputs. In complex systems, a single model used to assess system safety is easily affected by data volume, data format, model structure, environmental interference, and other factors, so a model that performs well on one problem may give unsatisfactory results on others. To address this, a safety assessment method based on the degree of feature matching and the fusion of heterogeneous sub-models is proposed. First, the sampled data are divided by output value into datasets of different sizes, and a sub-model is built on each. Second, the matching degree of each new data sample with respect to these sub-models is calculated to obtain the weight of each sub-model. Finally, the sub-outputs of all sub-models are fused according to these weights to give the final multi-model fusion result. The method is applied to a mining dataset from the Xiaoyun Coal Mine in Jining City, Shandong Province. Experimental results show that, when the sub-models are built at a ratio of 330/70, the Root Mean Square Error (RMSE) of the proposed method is 15.13%, 51.67%, and 12.46% lower than that of the large-sample single model, the small-sample single model, and the traditional multi-model method, respectively. By integrating the useful information that each sub-model provides, the method reduces and disperses the prediction error of any single model, improving prediction accuracy and generalization ability.

Key words: degree of feature matching, heterogeneous sub-models, single model, multi-model fusion, safety assessment
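
The three steps named in the abstract (split by output value, weight sub-models by matching degree, fuse the sub-outputs) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the single scalar feature, the Gaussian form of `matching_degree`, the choice of a least-squares line and a k-nearest-neighbour regressor as the two heterogeneous sub-models, the split threshold, and the synthetic data, which merely stands in for the coal-mine dataset.

```python
import math
import random


def fit_linear(xs, ys):
    """Least-squares line y = slope * x + intercept (first sub-model type)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    intercept = my - slope * mx
    return lambda x: slope * x + intercept


def fit_knn(xs, ys, k=3):
    """k-nearest-neighbour regressor (second, different sub-model type)."""
    points = list(zip(xs, ys))

    def predict(x):
        nearest = sorted(points, key=lambda p: abs(p[0] - x))[:k]
        return sum(y for _, y in nearest) / len(nearest)

    return predict


def build_submodels(samples, threshold):
    """Step 1: split samples by output value, fit one sub-model per subset."""
    low = [(x, y) for x, y in samples if y < threshold]
    high = [(x, y) for x, y in samples if y >= threshold]
    submodels = []
    for subset, fit in ((low, fit_linear), (high, fit_knn)):
        xs = [x for x, _ in subset]
        ys = [y for _, y in subset]
        submodels.append({"xs": xs, "predict": fit(xs, ys)})
    return submodels


def matching_degree(x, xs):
    """ASSUMPTION: Gaussian similarity of x to a sub-model's training features."""
    mean = sum(xs) / len(xs)
    std = math.sqrt(sum((v - mean) ** 2 for v in xs) / len(xs)) or 1.0
    return math.exp(-0.5 * ((x - mean) / std) ** 2)


def fuse(x, submodels):
    """Steps 2 and 3: normalise matching degrees into weights, then fuse."""
    degrees = [matching_degree(x, m["xs"]) for m in submodels]
    total = sum(degrees)
    if total == 0.0:  # x is far from every subset: fall back to equal weights
        weights = [1.0 / len(submodels)] * len(submodels)
    else:
        weights = [d / total for d in degrees]
    fused = sum(w * m["predict"](x) for w, m in zip(weights, submodels))
    return fused, weights


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def true_output(x):
    """Synthetic ground truth (NOT the coal-mine data): slope changes at x = 8."""
    return x if x < 8 else 8 + 5 * (x - 8)


random.seed(0)
train = [(x, true_output(x) + random.gauss(0, 0.1))
         for x in (random.uniform(0, 10) for _ in range(400))]
submodels = build_submodels(train, threshold=8.0)

test_xs = [0.5 * i for i in range(21)]
fused_preds = [fuse(x, submodels)[0] for x in test_xs]
print("fused RMSE:", rmse([true_output(x) for x in test_xs], fused_preds))
```

Because the weights are non-negative and sum to one, the fused output is always a convex combination of the sub-outputs, so a sample that closely matches one subset is predicted almost entirely by that subset's sub-model. The numbers this sketch prints say nothing about the RMSE reductions reported in the abstract.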