Abstract:
In distributed data stream mining, communication overhead and global classification accuracy are the main concerns. To address the high communication cost and low classification accuracy of traditional distributed data stream mining algorithms, this paper presents a distributed data stream mining algorithm based on Support Vector Data Description (SVDD). Each local site quickly updates its data stream information, learns meta-level data with a Support Vector Machine (SVM), and transmits the meta-level data to the central site. The central site receives and merges the meta-level data and learns a global classification model. Experimental results show that the algorithm reduces the traffic between the local sites and the central site while maintaining relatively high global classification accuracy.
Key words:
distributed data stream,
data mining,
Support Vector Data Description (SVDD),
Support Vector Machine (SVM),
incremental mining
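The local-site / central-site flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the SVDD formulation and the incremental update are not reproduced, and a plain linear SVM trained by subgradient descent stands in for the local learner. All function names, the margin tolerance `tol`, and the synthetic two-class data are assumptions made for illustration only.

```python
import random

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def train_linear_svm(samples, epochs=100, eta=0.01, lam=0.01, seed=0):
    """Subgradient descent on the regularized hinge loss.
    samples: list of (features, label) pairs with label in {-1, +1}."""
    rng = random.Random(seed)
    w, b = [0.0] * len(samples[0][0]), 0.0
    order = list(samples)
    for _ in range(epochs):
        rng.shuffle(order)
        for x, y in order:
            violated = y * (dot(w, x) + b) < 1.0
            w = [(1.0 - eta * lam) * wi for wi in w]  # regularization shrink
            if violated:                                # hinge-loss step
                w = [wi + eta * y * xi for wi, xi in zip(w, x)]
                b += eta * y
    return w, b

def local_meta_data(samples, tol=0.1):
    """Local site: fit an SVM on the current window and keep only the
    near-margin samples (approximate support vectors) as meta-level data."""
    w, b = train_linear_svm(samples)
    margin = lambda s: s[1] * (dot(w, s[0]) + b)
    meta = [s for s in samples if margin(s) <= 1.0 + tol]
    # Assumption: always send the closest sample of each class so that the
    # central site sees both labels.
    for label in (-1, 1):
        same = [s for s in samples if s[1] == label]
        if same and not any(s[1] == label for s in meta):
            meta.append(min(same, key=margin))
    return meta

def central_model(meta_sets):
    """Central site: merge the received meta-level data, learn a global model."""
    merged = [s for meta in meta_sets for s in meta]
    return train_linear_svm(merged)

def predict(model, x):
    w, b = model
    return 1 if dot(w, x) + b >= 0 else -1

def make_window(rng, n=200):
    """Synthetic two-class window of stream data (hypothetical)."""
    out = []
    for _ in range(n):
        y = rng.choice((-1, 1))
        out.append(([rng.gauss(2.0 * y, 0.5), rng.gauss(2.0 * y, 0.5)], y))
    return out

rng = random.Random(42)
windows = [make_window(rng) for _ in range(3)]          # three local sites
meta_sets = [local_meta_data(win) for win in windows]   # what each site sends
model = central_model(meta_sets)

sent = sum(len(m) for m in meta_sets)
total = sum(len(win) for win in windows)
test_set = make_window(rng, 500)
accuracy = sum(predict(model, x) == y for x, y in test_set) / len(test_set)
print(f"transmitted {sent} of {total} samples; global accuracy {accuracy:.2f}")
```

Only the near-margin samples leave each site, which is where the communication saving in this sketch comes from; the central site never sees the raw windows.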
摘要: 针对传统分布式数据流挖掘算法的通信开销较大、分类精度较低的问题,提出一种基于支持向量数据描述的分布式数据流挖掘算法。利用局部站点快速更新数据流信息,采用支持向量机算法学习元级数据并传递到中心站点。中心站点负责接收及合并元级数据,形成全局分类结果。实验结果表明,该算法能在降低局部站点和中心站点网络通信量的同时,获得较高精度的全局分类结果。
关键词:
分布式数据流,
数据挖掘,
支持向量数据描述,
支持向量机,
增量式挖掘
CAI Guo-Zhen, MAO Guo-Jun. Distributed Data Stream Mining Based on Support Vector Data Description[J]. Computer Engineering, 2012, 38(18): 34-36.
蔡国祯, 毛国君. 基于支持向量数据描述的分布式数据流挖掘[J]. 计算机工程, 2012, 38(18): 34-36.