Computer Engineering, 2024, Vol. 50, Issue (4): 104-112. doi: 10.19678/j.issn.1000-3428.0067602

• Artificial Intelligence and Pattern Recognition •

Multilabel Learning with Incomplete Labels Using Dual-Manifold Mapping

XU Zhilei, HUANG Rui

  1. School of Communication and Information Engineering, Shanghai University, Shanghai 200444, China
  • Received: 2023-05-11 Revised: 2023-07-03 Published: 2023-08-17
  • Corresponding author: HUANG Rui, E-mail: huangr@shu.edu.cn

Abstract: In multilabel learning, effective use of label correlations can improve classification performance. However, owing to the subjectivity of manual annotation and the semantic similarity of labels in practical applications, usually only an incomplete label space is observed. This makes the estimated label correlations inaccurate and degrades algorithm performance. To address this problem, a Multilabel Learning with incomplete labels using Dual-Manifold Mapping (ML-DMM) algorithm is proposed. The algorithm constructs two manifold mappings: a feature manifold mapping, which preserves the local structural information of the instance data space, and a label manifold mapping, which is based on label correlations obtained through iterative learning. The algorithm first constructs a low-dimensional manifold of the data through Laplacian eigenmaps. It then maps the original feature space and the original label space onto this low-dimensional manifold via a regression coefficient matrix and a label correlation matrix, respectively, forming a dual-manifold mapping structure that improves algorithm performance. Finally, the regression coefficient matrix obtained via iterative learning is used for multilabel classification. Comparative experiments on eight multilabel datasets under three label missing rates show that ML-DMM outperforms other multilabel classification algorithms designed for missing labels.

Key words: multilabel learning, missing labels, label correlations, low-dimensional manifold, dual-manifold mapping
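
The structure described in the abstract can be illustrated with a minimal NumPy sketch. It is not the ML-DMM algorithm itself: the abstract gives no objective function, so the sketch assumes a k-nearest-neighbor graph for the Laplacian eigenmap and replaces the iterative learning step with a one-shot closed-form ridge fit. The names `laplacian_eigenmap`, `ridge_map`, `W`, and `C`, the toy data, and all parameter values are hypothetical; `C` is only a label-side linear map standing in for the learned label correlation matrix.

```python
import numpy as np

def laplacian_eigenmap(X, n_neighbors=5, n_components=2):
    """Embed the rows of X using a symmetric kNN graph (assumed graph choice)."""
    n = X.shape[0]
    # Pairwise squared Euclidean distances; the diagonal is excluded.
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(d2, np.inf)
    # Binary adjacency: connect each point to its nearest neighbors.
    idx = np.argsort(d2, axis=1)[:, :n_neighbors]
    A = np.zeros((n, n))
    A[np.repeat(np.arange(n), n_neighbors), idx.ravel()] = 1.0
    A = np.maximum(A, A.T)  # symmetrize the graph
    # Solve L v = lambda D v through the symmetric normalized Laplacian.
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    L_sym = np.eye(n) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L_sym)  # eigenvalues in ascending order
    # Skip the first (trivial) eigenvector and map back to v = D^{-1/2} u.
    return d_inv_sqrt[:, None] * vecs[:, 1:n_components + 1]

def ridge_map(M, Z, lam=1e-2):
    """Closed-form minimizer of ||M P - Z||_F^2 + lam * ||P||_F^2 over P."""
    d = M.shape[1]
    return np.linalg.solve(M.T @ M + lam * np.eye(d), M.T @ Z)

# Toy data (hypothetical): 30 instances, 6 features, 4 labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 6))
Y = (rng.random((30, 4)) < 0.4).astype(float)
Y_obs = Y * (rng.random(Y.shape) > 0.3)  # hide roughly 30% of the positive labels

Z = laplacian_eigenmap(X, n_neighbors=5, n_components=2)  # low-dimensional manifold
W = ridge_map(X, Z)      # feature side: X @ W approximates Z
C = ridge_map(Y_obs, Z)  # label side: Y_obs @ C approximates Z
print(W.shape, C.shape)
```

Both maps share the same target `Z`, which is the sense in which the feature space and the label space are tied to one manifold here. In the algorithm as described in the abstract, the two matrices are instead learned iteratively.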
