
Computer Engineering

• Advanced Computing and Data Processing •

Fast Detector Screening Algorithm for Big Data System

CAI Tao, NI Xiaorong, WANG Weisheng, NIU Dejiao

  1. (School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang 212013, China)
  • Received: 2014-09-15 Online: 2015-09-15 Published: 2015-09-15
  • About the authors: CAI Tao (born 1976), male, associate professor, Ph.D., CCF member; main research interests: big data systems and artificial intelligence. NI Xiaorong and WANG Weisheng, master's students. NIU Dejiao, associate professor, Ph.D. candidate.
  • Supported by:
    National Natural Science Foundation of China (61300228); Natural Science Foundation of Zhejiang Province (LY13F020012); Science and Technology Support Program of Jiangsu Province (BE2013103); Science and Technology Foundation of Shenzhen (JCYJ20130401095947222).

Abstract: Screening mature detectors is a key factor in the performance and efficiency of an Artificial Immune System (AIS). In a big data environment, however, the number of initial detectors is extremely large, so existing detector screening algorithms incur excessive time overhead. To address this problem, a new fast screening algorithm for massive sets of initial detectors is proposed. A distributed storage scheme for massive initial detectors is designed as the basis for efficient screening. By analyzing the different characteristics of the map and reduce stages of the Map/Reduce model, a hybrid fast screening architecture for initial detectors, a partitioned checking strategy for massive initial detectors, and a mature detector set optimization strategy are given, which improve the efficiency of screening initial detectors and optimize the mature detector set. A prototype of the algorithm is implemented on a Hadoop cluster and evaluated with the CERT synthetic sendmail data set. The results show that, compared with the traditional algorithm, the proposed algorithm reduces time overhead by 58.87% and keeps the time overhead stable as the number of initial detectors increases.

Key words: detector generation algorithm, big data system, Artificial Immune System (AIS), Map/Reduce model
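
The screening flow summarized in the abstract can be illustrated with a minimal single-process sketch. It is not the authors' implementation: the abstract does not specify the matching rule, the partitioning scheme, or the optimization step, so the r-contiguous matching rule, the round-robin partitioning, the duplicate removal used as "optimization", and all function names below are assumptions chosen for illustration. On a Hadoop cluster, each partition would presumably be handled by a separate map task rather than by a local loop.

```python
from itertools import chain


def r_contiguous_match(detector: str, sample: str, r: int) -> bool:
    """Assumed matching rule: True if the two strings agree on
    at least r contiguous positions."""
    run = 0
    for a, b in zip(detector, sample):
        run = run + 1 if a == b else 0
        if run >= r:
            return True
    return False


def partition(items, n_parts):
    """Assumed partitioning: split the candidates round-robin into n_parts lists."""
    return [items[i::n_parts] for i in range(n_parts)]


def map_screen(candidates, self_set, r):
    """Map stage (sketch): keep only candidates that match no self sample."""
    return [d for d in candidates
            if not any(r_contiguous_match(d, s, r) for s in self_set)]


def reduce_optimize(partial_results):
    """Reduce stage (sketch): merge the per-partition survivors and drop
    duplicate detectors. This stands in for the optimization strategy,
    whose details the abstract does not give."""
    return list(dict.fromkeys(chain.from_iterable(partial_results)))


def screen_detectors(candidates, self_set, r, n_parts=4):
    """Screen each partition independently, then merge the results."""
    survivors = [map_screen(p, self_set, r)
                 for p in partition(candidates, n_parts)]
    return reduce_optimize(survivors)


# Hypothetical toy data, not taken from the CERT data set.
self_set = ["0000", "0011"]
candidates = ["0001", "1111", "1100", "1111", "0010"]
mature = screen_detectors(candidates, self_set, r=3, n_parts=2)
```

Because each partition is checked against the self set independently, the map stage needs no coordination between partitions; only the final merge sees all surviving detectors.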

CLC number: