
Computer Engineering ›› 2024, Vol. 50 ›› Issue (2): 122-131. doi: 10.19678/j.issn.1000-3428.0066433

• Artificial Intelligence and Pattern Recognition •

基于一致性图的权重自适应多视角谱聚类算法

王丽娟1,*, 邢津萍1, 尹明2, 郝志峰3, 蔡瑞初1, 温雯1

Weight Adaptive Multi-view Spectral Clustering Algorithm Based on Consistent Graphs

Lijuan WANG1,*, Jinping XING1, Ming YIN2, Zhifeng HAO3, Ruichu CAI1, Wen WEN1

  1. School of Computing Science and Technology, Guangdong University of Technology, Guangzhou 510006, Guangdong, China
  2. School of Automation, Guangdong University of Technology, Guangzhou 510006, Guangdong, China
  3. Shantou University, Shantou 515063, Guangdong, China
  • Received: 2022-12-05 Published: 2024-02-15 Online: 2023-04-06
  • Contact: Lijuan WANG
  • Supported by:
    National Natural Science Foundation of China (61876042); National Natural Science Foundation of China (61876043); National Natural Science Foundation of China (61976052)

Abstract:

With the spread of mobile devices and the Internet, multi-view data have become easier to collect and share, and describing the same objects from multiple views characterizes them more accurately. However, some existing multi-view clustering algorithms ignore both the consistent latent knowledge shared across views and the differing importance of individual views. To address this problem, this paper proposes a multi-view clustering algorithm that balances the consistency information among views. First, the algorithm learns a shared similarity matrix that is consistent across views by adjusting the view weights, which improves the consistency of the shared matrix: strongly correlated views carry more consistent information, so they receive larger weights and play a greater role in consistency learning, whereas views that differ substantially from the others receive smaller weights and play a smaller role. Second, the algorithm learns a consistent sample embedding across views together with a feature embedding for each view, and transfers the diverse feature information contained in the feature embeddings to the sample embedding to promote its consistent representation. Because the features of different views contain diverse information, they can compensate for the fact that the shared similarity matrix captures only sample-to-sample relationships. To this end, bipartite graph co-clustering is adopted: a relationship graph linking the sample data, the sample embedding, and the feature embeddings is built, the feature embeddings are learned from it, and they are transferred to the sample embedding. Finally, graph learning, spectral clustering, and feature embedding learning are integrated into a unified framework for joint optimization, which yields the optimal sample embedding. In the experiments, K-means clustering is applied to the learned sample embedding, and the algorithm is evaluated on five real-world datasets against seven clustering algorithms. On the 3-Sources, Yale, and MRSCV1 datasets, its accuracy exceeds that of the comparison algorithms by more than 5%, which validates the effectiveness of the algorithm.

Key words: multi-view clustering, consistent learning, weight adaptive, co-clustering, spectral clustering
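
The first stage described in the abstract (adaptive view weights, a shared similarity matrix, then spectral embedding and K-means) can be illustrated with a minimal sketch. It is not the paper's algorithm: the Gaussian-kernel graphs, the weight rule (each weight proportional to 1/‖S − S_v‖_F, where S is the shared graph and S_v the graph of view v), the toy data, and all function names are assumptions chosen for illustration. The feature-embedding transfer via bipartite graph co-clustering and the joint optimization are omitted.

```python
import numpy as np


def gaussian_similarity(X, sigma=1.0):
    """Gaussian-kernel similarity matrix for one view (rows of X are samples)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    S = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(S, 0.0)  # no self-loops
    return S


def consensus_graph(view_graphs, n_iter=30, eps=1e-12):
    """Alternate between the shared graph S and the view weights w.

    Assumed rule (illustrative, not from the paper): w_v is proportional to
    1 / ||S - S_v||_F, so views that agree with the consensus get larger
    weights and outlying views get smaller ones.
    """
    w = np.full(len(view_graphs), 1.0 / len(view_graphs))
    for _ in range(n_iter):
        S = sum(wv * Sv for wv, Sv in zip(w, view_graphs))
        dist = np.array([np.linalg.norm(S - Sv) for Sv in view_graphs])
        w = 1.0 / (dist + eps)
        w /= w.sum()
    S = sum(wv * Sv for wv, Sv in zip(w, view_graphs))
    return S, w


def spectral_embedding(S, k):
    """Row-normalized first k eigenvectors of the normalized Laplacian of S."""
    S = (S + S.T) / 2.0
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(S.sum(axis=1), 1e-12))
    L = np.eye(len(S)) - d_inv_sqrt[:, None] * S * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)  # eigenvalues in ascending order
    F = vecs[:, :k]
    return F / np.maximum(np.linalg.norm(F, axis=1, keepdims=True), 1e-12)


def kmeans(F, k, n_iter=50):
    """Lloyd's K-means with deterministic farthest-point initialization."""
    centers = [F[0]]
    for _ in range(1, k):
        d = np.min([np.sum((F - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(F[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        d2 = ((F[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = np.argmin(d2, axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = F[labels == j].mean(axis=0)
    return labels


# Toy data: two informative views and one pure-noise view (all hypothetical).
rng = np.random.default_rng(0)
truth = np.repeat([0, 1], 20)
means = np.array([[-3.0, 0.0], [3.0, 0.0]])[truth]
view1 = means + 0.5 * rng.standard_normal((40, 2))
view2 = means + 0.5 * rng.standard_normal((40, 2))
view3 = rng.standard_normal((40, 2))  # carries no cluster structure

graphs = [gaussian_similarity(v) for v in (view1, view2, view3)]
S, w = consensus_graph(graphs)
labels = kmeans(spectral_embedding(S, k=2), k=2)
print("view weights:", np.round(w, 3))
print("labels:", labels)
```

In this toy setting the noise view should end up with the smallest weight, which mirrors the abstract's statement that views differing strongly from the others contribute less to the shared matrix. A full implementation would replace the fixed Gaussian graphs and the simple reweighting loop with the paper's joint objective.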