
Computer Engineering ›› 2022, Vol. 48 ›› Issue (10): 37-44,54. doi: 10.19678/j.issn.1000-3428.0063091

• Research Hotspots and Reviews •

Orthogonal Basis-Based Multiview Transfer Spectral Clustering

WANG Lijuan1, ZHANG Lin1, YIN Ming2, HAO Zhifeng3, CAI Ruichu1, WEN Wen1   

  1. School of Computing, Guangdong University of Technology, Guangzhou 510006, China;
    2. School of Automation, Guangdong University of Technology, Guangzhou 510006, China;
    3. Shantou University, Shantou, Guangdong 515063, China
  • Received: 2021-10-30; Revised: 2022-01-27; Published: 2022-10-09

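  • About the authors: WANG Lijuan (born 1978), female, associate professor, Ph.D.; her main research interest is clustering analysis of high-dimensional data. ZHANG Lin, master's degree. YIN Ming, HAO Zhifeng, CAI Ruichu, and WEN Wen, professors, Ph.D.
  • Funding:
    National Natural Science Foundation of China (61876042, 61876043, 61976052); Guangdong Basic and Applied Basic Research Foundation (2020A1515011493); Guangzhou Science and Technology Program (201902010058).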

Abstract: Mining the consistency of multiview data is key to improving multiview clustering performance. To better learn a consistent representation from multiview data, this paper proposes a new multiview clustering algorithm, OMTSC. The OMTSC algorithm simultaneously learns the cluster assignment matrix and the feature embedding of each view, and decomposes each cluster assignment matrix into a shared orthogonal basis matrix and a cluster coding matrix. The orthogonal basis matrix captures and stores the information that is consistent across views, forming latent cluster centers. The multiview cluster coding matrices are fused with weights, which better balances the quality differences among views. Bipartite-graph-based co-clustering is introduced to transfer knowledge among the three matrices (orthogonal basis, cluster coding, and feature embedding). This improves the learning of both consistency and diversity in multiview data and lets the algorithm use the diversity of the feature embeddings to maximize multiview consistency and learn the optimal latent cluster centers, thereby improving multiview clustering performance. In addition, the feature embedding is learned under a group sparsity constraint, which suppresses noise in the view data and makes the algorithm more robust. Experimental results on the WikipediaArticles, COIL20, and ORL datasets show that, compared with SC-Best, Co-Reg, and other advanced multiview clustering algorithms, the OMTSC algorithm achieves the best overall values on the three evaluation metrics ACC, NMI, and ARI; on the COIL20 and ORL datasets, its NMI exceeds 0.9.

Key words: multiview, orthogonal basis clustering, transfer learning, spectral clustering, co-regularization
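
The abstract reports ACC, NMI, and ARI without defining them. As an illustration only, the sketch below assumes ACC is the usual clustering accuracy, i.e., the fraction of samples labeled correctly under the best one-to-one mapping between predicted clusters and ground-truth classes, found with the Hungarian algorithm. The function name `clustering_accuracy` and the toy labels are hypothetical and are not taken from the paper.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def clustering_accuracy(y_true, y_pred):
    """Best-match accuracy between true classes and predicted clusters.

    Assumption: this is the standard ACC definition; the paper's exact
    evaluation protocol is not given in the abstract.
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    # Map arbitrary label values to contiguous indices 0..k-1.
    _, true_idx = np.unique(y_true, return_inverse=True)
    _, pred_idx = np.unique(y_pred, return_inverse=True)
    n = max(true_idx.max(), pred_idx.max()) + 1
    # counts[i, j] = number of samples in predicted cluster i with true class j.
    counts = np.zeros((n, n), dtype=np.int64)
    np.add.at(counts, (pred_idx, true_idx), 1)
    # Hungarian algorithm: one-to-one mapping that maximizes matched samples.
    rows, cols = linear_sum_assignment(counts, maximize=True)
    return counts[rows, cols].sum() / y_true.size


# Toy labels (hypothetical): cluster ids are permuted and one sample is wrong.
truth = [0, 0, 0, 1, 1, 2, 2, 2]
pred = [1, 1, 0, 0, 0, 2, 2, 2]
print(clustering_accuracy(truth, pred))  # 0.875
```

NMI and ARI do not depend on how cluster ids are numbered, so they need no matching step; scikit-learn provides them as `normalized_mutual_info_score` and `adjusted_rand_score` in `sklearn.metrics`.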

