
Computer Engineering, 2012, Vol. 38, Issue 18: 34-36. doi: 10.3969/j.issn.1000-3428.2012.18.009

• Software Technology and Database •

Distributed Data Stream Mining Based on Support Vector Data Description

CAI Guo-zhen (蔡国祯) 1, MAO Guo-jun (毛国君) 2

  (1. College of Computer Science, Beijing University of Technology, Beijing 100124, China; 2. School of Information, Central University of Finance and Economics, Beijing 100081, China)
  • Received: 2011-12-15 Revised: 2012-02-07 Online: 2012-09-20 Published: 2012-09-18
  • About the authors: CAI Guo-zhen (born 1985), male, master's student, research interest: data mining; MAO Guo-jun, professor
  • Funding: National Natural Science Foundation of China (60873145)

Abstract: Traditional distributed data stream mining algorithms suffer from high communication overhead and low classification accuracy. To address these problems, this paper proposes a distributed data stream mining algorithm based on Support Vector Data Description (SVDD). Each local site rapidly updates its data stream information, learns meta-level data with a Support Vector Machine (SVM) algorithm, and transmits the meta-level data to the central site. The central site receives and merges the meta-level data to form the global classification result. Experimental results show that the algorithm reduces the network traffic between the local sites and the central site while achieving relatively high global classification accuracy.

Key words: distributed data stream, data mining, Support Vector Data Description (SVDD), Support Vector Machine (SVM), incremental mining
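
The local-site/central-site flow described in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm, which is not given in the abstract. It is a simplified stand-in under stated assumptions: with a linear kernel and no slack variables, SVDD reduces to the minimum enclosing ball problem, which is approximated here by the Badoiu-Clarkson iteration. The points that iteration visits play the role of the meta-level data. All function names (`enclosing_ball`, `local_summary`, `merge_summaries`, `classify`) and the distance-minus-radius decision rule are hypothetical choices made for illustration.

```python
import math


def enclosing_ball(points, iterations=100):
    """Approximate the minimum enclosing ball (hard-margin, linear-kernel SVDD).

    Uses the Badoiu-Clarkson iteration: repeatedly step the center toward the
    current farthest point. Returns (center, radius, core_set), where core_set
    holds the farthest points visited, a support-vector-like summary.
    """
    center = list(points[0])
    core = set()
    for i in range(1, iterations + 1):
        far = max(points, key=lambda p: math.dist(p, center))
        core.add(tuple(far))
        center = [c + (f - c) / (i + 1) for c, f in zip(center, far)]
    radius = max(math.dist(p, center) for p in points)
    return center, radius, sorted(core)


def local_summary(chunk):
    """Local site: reduce a chunk of (point, label) pairs to per-class core sets.

    Only these core sets (the assumed meta-level data) are sent onward.
    """
    by_label = {}
    for point, label in chunk:
        by_label.setdefault(label, []).append(tuple(point))
    return {label: enclosing_ball(pts)[2] for label, pts in by_label.items()}


def merge_summaries(summaries):
    """Central site: pool the core sets per class and refit one ball per class."""
    merged = {}
    for summary in summaries:
        for label, core in summary.items():
            merged.setdefault(label, []).extend(core)
    return {label: enclosing_ball(pts)[:2] for label, pts in merged.items()}


def classify(model, point):
    """Assign the class whose ball boundary is nearest (distance minus radius)."""
    return min(model, key=lambda lbl: math.dist(point, model[lbl][0]) - model[lbl][1])


if __name__ == "__main__":
    # Two hypothetical local sites, each holding a labeled chunk of a stream.
    site1 = [((0, 0), "a"), ((1, 0), "a"), ((0, 1), "a"), ((10, 10), "b"), ((11, 10), "b")]
    site2 = [((1, 1), "a"), ((0.5, 0.5), "a"), ((10, 11), "b"), ((11, 11), "b")]
    model = merge_summaries([local_summary(site1), local_summary(site2)])
    print(classify(model, (0.2, 0.3)), classify(model, (10.4, 10.6)))
```

In this sketch the central site only ever sees the core sets, never the raw chunks, which is where a communication saving of the kind the abstract reports would come from. A kernelized, soft-margin SVDD would need a quadratic programming solver and is outside the scope of this illustration.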
