
Computer Engineering ›› 2021, Vol. 47 ›› Issue (2): 77-83,89. doi: 10.19678/j.issn.1000-3428.0057017

• Artificial Intelligence and Pattern Recognition •

Optimization of LGC Semi-Supervised Learning Method Combined with Improved Density Peaks Clustering

XUE Zihan, PAN Di, HE Li

  1. College of Science and Technology, Tianjin University of Finance and Economics, Tianjin 300222, China
  • Received: 2019-12-25 Revised: 2020-02-12 Online: 2021-02-15 Published: 2020-02-20
  • About the authors: XUE Zihan (born 1995), male, master's student, main research interest: machine learning; PAN Di, master's student; HE Li, professor.
  • Funding:
    Natural Science Foundation of Tianjin (16JCYBJC42000, 18JCYBJC85100); Scientific Research Program of the Tianjin Municipal Education Commission (2017KJ237); Humanities and Social Sciences Research Planning Fund of the Ministry of Education (19YJA630046).

Abstract: The graph-based semi-supervised learning method with Local and Global Consistency (LGC) achieves high labeling accuracy, but its high time complexity makes it difficult to apply to practical large-scale data sets. To address this problem, this paper proposes an LGC optimization method that reduces the size of the graph. The method uses an improved Density Peaks Clustering (DPC) algorithm to iteratively select multiple center points from the data set. Local clustering is then performed with each center point as a cluster center, and a graph is constructed with the center points as vertices, on which LGC-based semi-supervised learning is carried out. Experimental results show that the optimized LGC method is robust on data sets such as D31 and Aggregation, and has clear advantages in labeling accuracy and execution time.

Key words: semi-supervised learning, Density Peaks Clustering (DPC), graph-based methods, label propagation, iteration
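
The pipeline summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the improved DPC algorithm, so center selection below falls back on the standard density-peaks score (local density times distance to the nearest denser point) applied in a single pass rather than iteratively. The function names, the `cutoff` and `sigma` parameters, and the rule that a center inherits the majority label of the labeled points in its local cluster are all assumptions made for illustration. The LGC step uses the well-known closed form F* = (I - αS)^(-1) Y on the reduced graph.

```python
import numpy as np


def pairwise_sq_dists(a, b):
    # Squared Euclidean distances between every row of a and every row of b.
    diff = a[:, None, :] - b[None, :, :]
    return np.sum(diff * diff, axis=-1)


def lgc(points, labels, n_classes, alpha=0.99, sigma=1.0):
    """Closed-form LGC, F* = (I - alpha*S)^(-1) Y; a label of -1 means unlabeled."""
    n = len(points)
    w = np.exp(-pairwise_sq_dists(points, points) / (2.0 * sigma ** 2))
    np.fill_diagonal(w, 0.0)  # no self-loops in the affinity matrix
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(w.sum(axis=1), 1e-12))
    s = w * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]  # D^(-1/2) W D^(-1/2)
    y = np.zeros((n, n_classes))
    known = labels >= 0
    y[known, labels[known]] = 1.0
    f = np.linalg.solve(np.eye(n) - alpha * s, y)
    return f.argmax(axis=1)


def select_centers(points, n_centers, cutoff):
    """Standard (not the paper's improved) DPC scoring, applied in one pass."""
    d = np.sqrt(pairwise_sq_dists(points, points))
    rho = np.exp(-(d / cutoff) ** 2).sum(axis=1) - 1.0  # Gaussian local density
    delta = np.empty(len(points))
    for i in range(len(points)):
        higher = rho > rho[i]
        # Distance to the nearest denser point; the densest point gets the max.
        delta[i] = d[i, higher].min() if higher.any() else d[i].max()
    return np.argsort(rho * delta)[-n_centers:]


def reduced_lgc(points, labels, n_classes, n_centers, cutoff,
                alpha=0.99, sigma=1.0):
    """Run LGC on a graph of centers only, then copy labels back to members."""
    centers = select_centers(points, n_centers, cutoff)
    # Local clustering: each point joins its nearest center.
    member_of = pairwise_sq_dists(points, points[centers]).argmin(axis=1)
    # Assumption: a center takes the majority label of its labeled members.
    center_labels = np.full(n_centers, -1)
    for k in range(n_centers):
        local = labels[(member_of == k) & (labels >= 0)]
        if local.size:
            center_labels[k] = np.bincount(local).argmax()
    center_pred = lgc(points[centers], center_labels, n_classes, alpha, sigma)
    return center_pred[member_of]
```

The linear solve in `lgc` costs roughly cubic time in the number of graph vertices, so running it on `n_centers` vertices instead of all points is where the reduction in execution time would come from in this sketch. A call such as `reduced_lgc(points, labels, n_classes=2, n_centers=6, cutoff=0.5)` expects `labels` to be an integer array with -1 for unlabeled points.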

CLC Number: