Computer Engineering

• Security Technology

Malicious Domain Name Detection in Large-scale Network Based on Ensemble Learning

MA Yang, QIANG Xiaohui, CAI Bing, WANG Linru

  1. (Jiangsu Branch of National Computer Network Emergency Response Technical Team/Coordination Center of China, Nanjing 210003, China)
  • Received: 2015-10-10 Online: 2016-11-15 Published: 2016-11-15
  • About the authors: MA Yang (born 1980), male, senior engineer, M.S.; main research interests are network and information security and big data processing. QIANG Xiaohui, CAI Bing, and WANG Linru hold M.S. degrees.

Abstract: Existing malicious domain name detection schemes fall short when handling large-scale data and multiple types of malicious domain names. To address this, the paper proposes a new detection scheme based on three kinds of features: temporal behavior, the set of related domain names, and the corresponding IP addresses. A parallelized random forests algorithm is used to build a combined domain name classifier, which improves detection precision and fault tolerance. Experimental results show that the combined classifier achieves higher precision and accuracy than a decision tree classifier, and that the new scheme detects malicious domain names in large-scale networks more effectively.

Key words: malicious domain name detection, ensemble learning, random forests algorithm, combined classifier, big data, parallelization

CLC Number:
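The abstract names the technique (a combined classifier built from many trees trained in parallel) but gives no implementation details. The following is a minimal sketch of that general idea only, not the authors' method. It makes several simplifying assumptions: depth-1 trees (decision stumps) stand in for full decision trees, a thread pool stands in for whatever parallel platform the paper uses, and the three feature names and all toy values are hypothetical, chosen loosely after the three feature groups the abstract mentions.

```python
import random
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical features per domain: [domain_age_days, ttl_seconds, days_on_same_ip].
# Toy values invented for illustration; label 1 = malicious, 0 = benign.
rows = [
    [2, 60, 1], [5, 120, 1], [1, 30, 2], [9, 300, 1], [3, 60, 3], [7, 90, 2],
    [900, 3600, 200], [2500, 86400, 365], [1200, 7200, 150],
    [3000, 43200, 400], [1800, 14400, 90], [4000, 86400, 300],
]
labels = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]


def train_stump(rows, labels, seed):
    """Fit one depth-1 tree on a bootstrap sample, using one random feature."""
    rng = random.Random(seed)
    n = len(rows)
    idx = [rng.randrange(n) for _ in range(n)]  # sample n rows with replacement
    feat = rng.randrange(len(rows[0]))          # random feature for this stump
    best = None                                 # (errors, threshold, label_if_above)
    for t in sorted({rows[i][feat] for i in idx}):
        for above in (0, 1):
            # Count sample rows this (threshold, label) pair would misclassify.
            errors = sum(
                (above if rows[i][feat] >= t else 1 - above) != labels[i]
                for i in idx
            )
            if best is None or errors < best[0]:
                best = (errors, t, above)
    _, threshold, above = best
    return feat, threshold, above


def train_forest(rows, labels, n_trees=25, workers=4, seed=0):
    """Train the stumps independently; each task only needs its own seed."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(
            pool.map(lambda s: train_stump(rows, labels, s),
                     range(seed, seed + n_trees))
        )


def predict(forest, row):
    """Combine the stumps by majority vote."""
    votes = Counter(
        above if row[feat] >= t else 1 - above for feat, t, above in forest
    )
    return votes.most_common(1)[0][0]


forest = train_forest(rows, labels)
print(predict(forest, [1, 30, 1]))          # a young, short-TTL domain
print(predict(forest, [4000, 86400, 400]))  # an old, stable domain
```

Because each stump depends only on the shared training data and its own seed, the training tasks are independent and can be distributed across workers. An odd number of voters avoids ties for two-class labels, and a single poorly fitted stump is outvoted by the rest, which is the intuition behind the fault tolerance that the abstract attributes to the combined classifier.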