
Computer Engineering ›› 2012, Vol. 38 ›› Issue (13): 163-165, 168. doi: 10.3969/j.issn.1000-3428.2012.13.048

• Artificial Intelligence and Recognition Technology •

Semi-supervised Collaborative Training Algorithm Based on Graph

GUO Tao1,2, LI Gui-yang1,2, LAN Xia2

  1. Visual Computing and Virtual Reality Key Laboratory of Sichuan Province, Chengdu 610068, China; 2. College of Computer Science, Sichuan Normal University, Chengdu 610101, China
  • Received: 2011-11-08 Online: 2012-07-05 Published: 2012-07-05
  • About the authors: GUO Tao (born 1967), female, associate professor, M.S., main research interests: data mining and information visualization; LI Gui-yang, associate professor, Ph.D.; LAN Xia, master's student
  • Funding:
    Supported by the Key Laboratory Fund of the Sichuan Provincial Department of Science and Technology, "Visual Computing and Virtual Reality" (PJ201102)

Abstract: When a classifier is trained, introducing unlabeled data tends to add noise, which reduces classification accuracy. To address this, this paper proposes a graph-based algorithm, Confidence Estimation for Semi-supervised Learning (CESL). The algorithm uses the structural information of the sample data to compute the class probabilities of unlabeled samples explicitly. It then uses multiple classifiers to estimate the confidence of the unlabeled data implicitly, which tightens the selection criteria and reduces the amount of noisy data introduced. Only unlabeled data that pass this dual-confidence estimation are selected to update the classifiers. Comparative experiments on UCI datasets verify the effectiveness of the algorithm.

Key words: semi-supervised learning, collaborative training, confidence, classification, unlabeled data

CLC number:
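The dual-confidence idea described in the abstract can be illustrated with a small sketch. This is not the CESL algorithm from the paper, whose details are not given on this page. It is a minimal illustration under assumptions: the graph is fully connected with Gaussian edge weights, class probabilities come from plain iterative label propagation, the committee consists of a nearest-centroid and a 1-nearest-neighbor classifier, and a sample is accepted only if its graph probability reaches a hypothetical threshold of 0.8 and every classifier agrees with the graph label. All function names, parameters, and the toy data are invented for illustration.

```python
import math

def graph_class_probs(points, labels, n_classes, sigma=1.0, iters=50):
    """Propagate labels over a fully connected Gaussian-weighted graph.

    labels[i] is a class index for a labeled point, or None if unlabeled.
    Returns one class-probability vector per point.
    """
    n = len(points)
    # Edge weight exp(-d^2 / (2 sigma^2)); no self-loops.
    w = [[0.0 if i == j else
          math.exp(-math.dist(points[i], points[j]) ** 2 / (2 * sigma ** 2))
          for j in range(n)] for i in range(n)]
    # Labeled points start (and stay) one-hot; unlabeled points start uniform.
    probs = [[1.0 / n_classes] * n_classes if y is None else
             [1.0 if c == y else 0.0 for c in range(n_classes)]
             for y in labels]
    for _ in range(iters):
        updated = []
        for i in range(n):
            if labels[i] is not None:
                updated.append(probs[i])  # clamp labeled nodes
                continue
            total = sum(w[i])
            updated.append([sum(w[i][j] * probs[j][c] for j in range(n)) / total
                            for c in range(n_classes)])
        probs = updated
    return probs

def make_nearest_centroid(X, y, n_classes):
    """Classifier that predicts the class whose mean is closest."""
    centroids = []
    for c in range(n_classes):
        members = [x for x, t in zip(X, y) if t == c]
        centroids.append([sum(col) / len(members) for col in zip(*members)])
    def predict(x):
        return min(range(n_classes), key=lambda c: math.dist(x, centroids[c]))
    return predict

def make_one_nn(X, y):
    """Classifier that predicts the label of the nearest labeled point."""
    def predict(x):
        return y[min(range(len(X)), key=lambda i: math.dist(x, X[i]))]
    return predict

def select_confident(points, labels, n_classes, classifiers, threshold=0.8):
    """Return (index, label) pairs that pass both confidence checks."""
    probs = graph_class_probs(points, labels, n_classes)
    selected = []
    for i, y in enumerate(labels):
        if y is not None:
            continue
        graph_label = max(range(n_classes), key=lambda c: probs[i][c])
        explicit_ok = probs[i][graph_label] >= threshold       # graph confidence
        implicit_ok = all(clf(points[i]) == graph_label        # committee agreement
                          for clf in classifiers)
        if explicit_ok and implicit_ok:
            selected.append((i, graph_label))
    return selected

# Toy data (hypothetical): two clusters plus one ambiguous midpoint.
points = [(0.0, 0.0), (0.3, 0.1), (5.0, 5.0), (5.2, 4.9),
          (0.2, 0.3), (4.8, 5.1), (2.5, 2.5)]
labels = [0, 0, 1, 1, None, None, None]

X_l = [p for p, y in zip(points, labels) if y is not None]
y_l = [y for y in labels if y is not None]
classifiers = [make_nearest_centroid(X_l, y_l, 2), make_one_nn(X_l, y_l)]

selected = select_confident(points, labels, 2, classifiers)
print(selected)
```

In a co-training loop, the selected pairs would be added to the labeled set and the classifiers retrained, repeating until no sample passes both checks. In this toy data the two unlabeled points that sit inside a cluster pass, while the midpoint is left unlabeled because its graph probability stays below the threshold.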