Self-Weighted Multi-View k-means Algorithm

doi:10.19678/j.issn.1000-3428.0069575

Abstract

With advances in information technology, objects can be described in increasingly diverse and complex ways, which has given rise to multi-view data. Clustering multi-view data is a fundamental topic in data mining, machine learning, pattern recognition, and related fields. In the current era of information explosion, data dimensionality keeps growing, and clustering such data efficiently remains a major challenge. Existing multi-view k-means algorithms have limited capability when handling high-dimensional data. To address this issue, this paper proposes a new multi-view clustering framework, the Self-weighted Multi-view k-Means (SwMKM) algorithm. First, the algorithm adopts the least absolute criterion to improve robustness, which reduces the effect of outliers on the results. Next, the Iteratively Reweighted Least Squares (IRLS) method is used to minimize the absolute residual, and the distribution of the multiple weights is adjusted adaptively to control the reweighting. Finally, by introducing a projection matrix with an $\ell_{2,1}$-norm penalty term, the high-dimensional feature space of the original dataset is transformed into a statistically uncorrelated, low-dimensional subspace for feature selection and noise suppression. Experimental results show that the proposed algorithm performs significantly better than other multi-view k-means algorithms on the Handwritten numerals, MSRCv1, Outdoor Scene, and other datasets.
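The self-weighting idea in the abstract can be illustrated with a small sketch. It is not the SwMKM objective from the paper: it assumes the common formulation in which each view contributes an unsquared residual norm, so that the IRLS step gives each view a weight of 1 / (2 × residual). The function names, the farthest-point seeding, and the omission of the projection matrix are all assumptions made for illustration.

```python
import numpy as np

def farthest_point_seeds(X, k):
    """Deterministic seeding (an assumption): start at row 0, then keep taking the farthest row."""
    seeds = [0]
    d = ((X - X[0]) ** 2).sum(axis=1)
    for _ in range(k - 1):
        nxt = int(d.argmax())
        seeds.append(nxt)
        d = np.minimum(d, ((X - X[nxt]) ** 2).sum(axis=1))
    return seeds

def self_weighted_mv_kmeans(views, k, n_iter=30, eps=1e-12):
    """Toy self-weighted multi-view k-means; views is a list of (n, d_v) arrays."""
    seeds = farthest_point_seeds(np.hstack(views), k)
    centroids = [X[seeds].copy() for X in views]
    weights = np.full(len(views), 1.0 / len(views))
    labels = None
    for _ in range(n_iter):
        # Assignment: one shared label per sample, using the weighted sum of per-view squared distances.
        cost = sum(w * ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
                   for w, X, C in zip(weights, views, centroids))
        new_labels = cost.argmin(axis=1)
        # Centroid update in every view (empty clusters keep their previous centroid).
        for X, C in zip(views, centroids):
            for j in range(k):
                members = X[new_labels == j]
                if len(members):
                    C[j] = members.mean(axis=0)
        # IRLS-style reweighting: views with a large residual receive a small weight.
        residuals = np.array([np.linalg.norm(X - C[new_labels])
                              for X, C in zip(views, centroids)])
        weights = 1.0 / (2.0 * np.maximum(residuals, eps))
        if labels is not None and np.array_equal(labels, new_labels):
            break
        labels = new_labels
    return new_labels, weights / weights.sum()
```

In this sketch a clean view ends up with a larger normalized weight than a noisy one, which is the qualitative behaviour that per-view weight plots such as Fig.2 are meant to show.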

Key words: unsupervised learning, k-means, multi-view clustering, $\ell_{2,1}$-norm, self-weighting
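To show why an $\ell_{2,1}$-norm penalty on a projection matrix acts as feature selection, the following sketch computes the norm (the sum of the row-wise $\ell_2$ norms) and solves a generic $\ell_{2,1}$-regularized least-squares problem by IRLS. This is a standard textbook-style scheme, not the paper's update rule: the target matrix `Y`, the parameter `beta`, and both function names are hypothetical, and the statistical-uncorrelatedness constraint mentioned in the abstract is left out.

```python
import numpy as np

def l21_norm(W):
    """Sum of the Euclidean norms of the rows of W."""
    return float(np.linalg.norm(W, axis=1).sum())

def l21_regularized_projection(X, Y, beta=1.0, n_iter=50, eps=1e-8):
    """Approximately minimize ||X W - Y||_F^2 + beta * ||W||_{2,1} by IRLS."""
    d = X.shape[1]
    XtX, XtY = X.T @ X, X.T @ Y
    # Ridge solution as a starting point.
    W = np.linalg.solve(XtX + beta * np.eye(d), XtY)
    for _ in range(n_iter):
        # Reweighting: D_ii = 1 / (2 * ||w_i||_2), floored by eps to avoid division by zero.
        row_norms = np.maximum(np.linalg.norm(W, axis=1), eps)
        D = np.diag(1.0 / (2.0 * row_norms))
        # Closed-form minimizer of the reweighted quadratic surrogate.
        W = np.linalg.solve(XtX + beta * D, XtY)
    return W
```

Rows of `W` whose norm shrinks toward zero correspond to features that the projection effectively discards, so ranking features by row norm gives a simple selection rule.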

LIN Hechuan, XU Huiying, ZHU Xinzhong, HUANG Xiao, LIU Ziyang. Self-Weighted Multi-View k-means Algorithm[J]. Computer Engineering, 2025, 51(8): 141-150.

URL: https://www.ecice06.com/EN/10.19678/j.issn.1000-3428.0069575

https://www.ecice06.com/EN/Y2025/V51/I8/141

Figures/Tables 8

Fig.1 Dimensionality for each view on different datasets

Fig.2 The learned weight for each view on different datasets

Fig.3 The convergence curves on different datasets

Fig.4 ACC, NMI and Purity values for each dataset under different β
