Monitoring Method of Domain Name Data Based on Machine Learning

doi:10.3969/j.issn.1000-3428.2014.09.053

Computer Engineering

Topic: Machine learning

LIU Ming-xing, JIN Jian, LI Xiao-dong

(Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China)

Received: 2013-09-16 Online: 2014-09-15 Published: 2014-09-12
About the authors: LIU Ming-xing (born 1985), male, M.S., main research interests: network security and next-generation Internet technology; JIN Jian, senior engineer, M.S.; LI Xiao-dong, research professor, Ph.D., doctoral supervisor.
Funding:
National Natural Science Foundation of China (61005029); research project fund of the Open Laboratory of Internet Fundamental Technology.

Abstract

Abstract: Tampering with Domain Name System (DNS) resource records seriously endangers DNS applications. Because such tampering is hard to notice, a fast and effective method for finding dangerous changes in DNS data is urgently needed. To address this problem, this paper proposes a method for monitoring DNS data based on machine learning. Domain names whose resource records have changed are selected from a set of domain names, and the information related to each one is analyzed to produce a tuple of attributes, including the literal characteristics of the domain name and the forward-inverse match degree. Each tuple is manually assigned a class label according to whether the change is dangerous, and each tuple together with its class label forms one instance of the training set. Two classification algorithms, decision tree and Support Vector Machine (SVM), analyze the training set to build classifiers that detect dangerous changes in DNS data. The two classifiers are validated with 10-fold cross-validation. The results show that both are effective at identifying dangerous changes in domain name data, with weighted average accuracies of 73.8% and 82.4%, respectively.

Key words: Domain Name System (DNS), security, machine learning, DNS monitoring, decision tree, Support Vector Machine (SVM)
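The evaluation step described in the abstract (labeled attribute tuples, a classifier, 10-fold cross-validation, a weighted average score) can be illustrated with a minimal sketch. Everything specific here is an assumption and not taken from the paper: the data are synthetic, the two attribute names are hypothetical, a one-level decision stump stands in for the decision tree and SVM classifiers, and the "weighted average" is assumed to mean per-class precision weighted by class support.

```python
import random
from collections import Counter


def make_synthetic_instances(n=200, seed=0):
    """Return (attribute_tuple, label) pairs; purely synthetic stand-in data."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        dangerous = rng.random() < 0.3
        # Hypothetical attributes, both scaled to [0, 1].
        literal_similarity = rng.uniform(0.0, 0.6) if dangerous else rng.uniform(0.3, 1.0)
        match_degree = rng.uniform(0.0, 0.7) if dangerous else rng.uniform(0.4, 1.0)
        label = "dangerous" if dangerous else "benign"
        data.append(((literal_similarity, match_degree), label))
    return data


def train_stump(train):
    """Fit a one-level decision tree by exhaustive search over splits."""
    best = None
    for i in range(len(train[0][0])):
        for threshold in sorted({x[i] for x, _ in train}):
            for low, high in (("dangerous", "benign"), ("benign", "dangerous")):
                errors = sum((low if x[i] < threshold else high) != y for x, y in train)
                if best is None or errors < best[0]:
                    best = (errors, i, threshold, low, high)
    _, i, threshold, low, high = best
    return lambda x: low if x[i] < threshold else high


def k_fold_indices(n, k, seed=0):
    """Shuffle 0..n-1 and deal the indices into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[f::k] for f in range(k)]


def weighted_precision(true_labels, predicted_labels):
    """Per-class precision averaged with weights equal to class support."""
    support = Counter(true_labels)
    predicted_count = Counter(predicted_labels)
    correct = Counter(t for t, p in zip(true_labels, predicted_labels) if t == p)
    total = len(true_labels)
    score = 0.0
    for label, n_true in support.items():
        precision = correct[label] / predicted_count[label] if predicted_count[label] else 0.0
        score += precision * n_true / total
    return score


def cross_validate(data, k=10, seed=0):
    """Train on k-1 folds, predict the held-out fold, and pool all predictions."""
    true_labels, predicted_labels = [], []
    for held_out in k_fold_indices(len(data), k, seed):
        held = set(held_out)
        train = [data[j] for j in range(len(data)) if j not in held]
        classify = train_stump(train)
        for j in held_out:
            x, y = data[j]
            true_labels.append(y)
            predicted_labels.append(classify(x))
    return weighted_precision(true_labels, predicted_labels)


if __name__ == "__main__":
    score = cross_validate(make_synthetic_instances())
    print(f"weighted average precision over 10 folds: {score:.3f}")
```

Pooling the held-out predictions from all ten folds before scoring means every instance is classified exactly once by a model that never saw it during training. The score printed here reflects only the synthetic data and says nothing about the figures reported in the abstract.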

CLC number: TP18

LIU Ming-xing, JIN Jian, LI Xiao-dong. Monitoring Method of Domain Name Data Based on Machine Learning[J]. Computer Engineering, doi: 10.3969/j.issn.1000-3428.2014.09.053.

https://www.ecice06.com/CN/Y2014/V40/I9/263

Editor: GU Yifei

