Computer Engineering ›› 2023, Vol. 49 ›› Issue (5): 129-138. doi: 10.19678/j.issn.1000-3428.0063892

• Cyberspace Security •

Dendritic Cell Model Based on mRMR and Gini Importance

ZHANG Kailin, DONG Hongbin   

  1. Key Laboratory of Aerospace Information Security and Trusted Computing, Ministry of Education, School of Cyber Science and Engineering, Wuhan University, Wuhan 430072, China
  • Received: 2022-02-08 Revised: 2022-05-25 Published: 2023-05-10

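  • About the authors: ZHANG Kailin (张凯林, born 1996), male, master's student, whose research interests are computer immunology and cyberspace security; DONG Hongbin (董红斌), professor, Ph.D.
  • Funding: National Natural Science Foundation of China, "Continuous Response Mechanism of Computer Immune Intelligence and Its Applications" (61877045).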

Abstract: The Dendritic Cell Algorithm (DCA) simulates how Dendritic Cells (DCs) in the human immune system recognize and present antigens. It is a fast and effective anomaly detection method whose key step is selecting effective features from the data to represent specific input signals. However, existing signal selection methods suffer from redundant feature subsets and high time complexity, so the antigen signals they generate are less effective and they run slowly on high-dimensional, large-sample data sets. To account for both the usability of antigen signals and the time efficiency of signal selection, a DC model named MRGI-DCA is proposed, based on minimal Redundancy Maximal Relevance (mRMR) and Gini Importance (GI). mRMR quickly extracts the most relevant feature subset from the original data set while minimizing the redundancy within that subset. On top of this mRMR pre-dimensionality reduction, GI is computed with a CART tree model, which is fast and accurate, to obtain more effective antigen signals. Experimental results show that MRGI-DCA overall outperforms IG-DCA, COR-DCA, GA-DCA, and SVM-DCA. Averaged over high-dimensional, low-dimensional, and abnormal data sets, its accuracy, F1 value, and AUC are 6.01%, 5.86%, and 9.96% higher than those of COR-DCA, respectively, and its average running time is approximately 1/5 that of COR-DCA.

Key words: Dendritic Cell Algorithm (DCA), signal selection, minimal Redundancy Maximal Relevance (mRMR) algorithm, Gini Importance (GI), Artificial Immune System (AIS)
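
The two-stage selection summarized in the abstract (mRMR pre-reduction, then Gini-based scoring) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy data, the feature names (`a`, `a_copy`, `b`, `noise`), the "relevance minus mean redundancy" form of the mRMR criterion, and the use of a single-split stump in place of a full CART tree are all assumptions made for illustration. How GI scores are mapped to DCA input signals is not described in the abstract and is not modeled here.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * log2(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())


def mrmr_select(columns, labels, k):
    """Greedy mRMR (assumed difference form): at each step add the feature
    with the largest relevance to the label minus its mean mutual
    information with the features already selected."""
    relevance = {name: mutual_information(col, labels)
                 for name, col in columns.items()}
    selected = []
    while len(selected) < min(k, len(columns)):
        def score(name):
            if not selected:
                return relevance[name]
            redundancy = sum(mutual_information(columns[name], columns[s])
                             for s in selected) / len(selected)
            return relevance[name] - redundancy
        candidates = [name for name in columns if name not in selected]
        selected.append(max(candidates, key=score))
    return selected


def gini(labels):
    """Gini impurity of a non-empty label list."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def gini_decrease(col, labels):
    """Largest impurity decrease from one binary split `value <= t`.
    A one-split stand-in for Gini importance; a full CART tree would
    accumulate such decreases over every node that uses the feature."""
    n = len(labels)
    best = 0.0
    for t in sorted(set(col))[:-1]:  # skip the max so both sides are non-empty
        left = [y for x, y in zip(col, labels) if x <= t]
        right = [y for x, y in zip(col, labels) if x > t]
        gain = (gini(labels)
                - len(left) / n * gini(left)
                - len(right) / n * gini(right))
        best = max(best, gain)
    return best


# Hypothetical toy data: the label is (a OR b); a_copy duplicates a,
# and noise is unrelated to the label.
columns = {
    "a":      [0, 0, 0, 0, 1, 1, 1, 1],
    "a_copy": [0, 0, 0, 0, 1, 1, 1, 1],
    "b":      [0, 0, 1, 1, 0, 0, 1, 1],
    "noise":  [0, 1, 0, 1, 0, 1, 0, 1],
}
labels = [0, 0, 1, 1, 1, 1, 1, 1]

selected = mrmr_select(columns, labels, k=2)  # stage 1: mRMR pre-reduction
scores = {name: gini_decrease(columns[name], labels) for name in selected}
ranked = sorted(scores, key=scores.get, reverse=True)  # stage 2: Gini ranking
print(selected, scores, ranked)
```

In this toy case the redundancy term keeps `a_copy` out of the selected pair even though it is as relevant to the label as `a`, which is the behavior the abstract attributes to the mRMR stage.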

