
Computer Engineering ›› 2008, Vol. 34 ›› Issue (6): 35-37. doi: 10.3969/j.issn.1000-3428.2008.06.012

• Degree Paper •

Diversity and Performance Comparison for Ensemble Learning Algorithms

LI Kai1, CUI Li-juan2   

  (1. School of Mathematics and Computer, Hebei University, Baoding 071002; 2. Library of Hebei University, Baoding 071002)
  • Online:2008-03-20 Published:2008-03-20


Abstract: From the viewpoint of diversity, this paper studies ensemble learning algorithms based on feature sets and on data, and analyzes how each kind of algorithm creates diversity. Experiments using decision trees and neural networks as base models are conducted on 10 standard data sets. The results show that the performance of ensemble learning algorithms depends on factors such as the characteristics of the data sets and the method of creating diversity. In general, data-based ensemble learning algorithms outperform feature-set-based ones.

Key words: diversity, ensemble learning, feature set, sampling, performance

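Abstract (translated from Chinese): Starting from diversity, this paper studies ensemble learning algorithms based on feature-set techniques (which select different feature sets by some strategy to form the training sets) and on data techniques (which select different training sets by sampling), and analyzes how the two kinds of algorithm generate diversity. For decision tree and neural network models, the performance of the ensemble learning algorithms is studied experimentally on standard data sets. The results show that performance depends on factors such as the characteristics of the data set and the method used to generate diversity. In terms of overall performance, data-based ensemble learning algorithms outperform feature-set-based ones on most data sets.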

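The two ways of creating diversity named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy data, the use of bootstrap sampling for the data-based case, and the use of uniformly random feature selection for the feature-set case are all assumptions made for illustration. The abstract says only that training sets are chosen "by sampling" and feature sets "by some strategy".

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Data-based diversity (assumed here to be bootstrap sampling):
    draw len(rows) examples with replacement."""
    return [rng.choice(rows) for _ in rows]


def feature_subset_view(rows, k, rng):
    """Feature-set-based diversity (assumed here to be random selection):
    keep every example but only k of its feature columns."""
    n_features = len(rows[0][0])
    cols = sorted(rng.sample(range(n_features), k))
    view = [([x[c] for c in cols], y) for x, y in rows]
    return cols, view


def majority_vote(predictions):
    """Combine the base models' predictions for one example by plurality."""
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical toy data: (feature vector, class label) pairs.
rng = random.Random(0)
rows = [
    ([1.0, 2.0, 3.0, 4.0], "a"),
    ([4.0, 3.0, 2.0, 1.0], "b"),
    ([1.5, 2.5, 3.5, 4.5], "a"),
]

bag = bootstrap_sample(rows, rng)               # training set for one base model
cols, view = feature_subset_view(rows, 2, rng)  # training set for another base model
```

Each base model (a decision tree or a neural network in the paper) would be trained on its own `bag` or `view`, and `majority_vote` would combine their outputs for a test example.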
