
Computer Engineering ›› 2023, Vol. 49 ›› Issue (11): 49-60. doi: 10.19678/j.issn.1000-3428.0065787

• Artificial Intelligence and Pattern Recognition •

Multi-Label Learning Based on Double Laplace Regularization and Causal Inference

Jun LUO, Qingwei GAO*, Yi TAN, Dawei ZHAO, Yixiang LU, Dong SUN

  1. School of Electrical Engineering and Automation, Anhui University, Hefei 230601, China
  • Received: 2022-09-19 Published: 2023-11-15 Released online: 2023-11-06
  • Contact: Qingwei GAO
  • About the authors:

    Jun LUO (born 1997), male, master's student; main research interest: machine learning

    Yi TAN, master's student

    Dawei ZHAO, Ph.D.

    Yixiang LU, associate professor, Ph.D.

    Dong SUN, associate professor, Ph.D.

  • Funding:
    National Natural Science Foundation of China (62071001); Natural Science Foundation of Anhui Province (2008085MF183); Key Project of the Education Department of Anhui Province (KJ2018A0012); Doctoral Research Fund of Anhui University (J01003266)

Abstract:

Label-specific features are an active research topic in multi-label learning: label-specific feature extraction is used to handle instances that carry multiple class labels. Existing multi-label classification methods usually consider only the correlation between labels and ignore the local manifold structure of the original data, which may reduce classification accuracy. In addition, when modeling label correlation, the structural relationship between features and labels and the inherent causal relationships between labels are often overlooked. To address these issues, this study proposes a multi-label learning algorithm based on double Laplace regularization and causal inference. A linear regression model establishes the basic multi-label classification framework, which is combined with causal learning to explore the inherent causal relationships between labels and thereby mine their essential connections. On this basis, to fully exploit the structural relationship between features and labels, double Laplace regularization is added to mine local label correlation information and to preserve the local manifold structure of the original data. The effectiveness of the proposed algorithm is verified on public multi-label data. The experimental results show that, compared with algorithms such as LLSF, ML-KNN, and LIFT, the proposed algorithm improves Hamming Loss (HL), Average Precision (AP), One Error (OE), Ranking Loss (RL), coverage, and AUC by an average of 8.82%, 4.98%, 9.43%, 16.27%, 12.19%, and 3.35%, respectively.
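The abstract does not state the objective function, so the following is only a minimal sketch of what a linear regression model with two graph Laplacian regularizers could look like; it is not the authors' formulation. The objective, the function names (`knn_laplacian`, `label_laplacian`, `fit_double_laplacian`), the weights `alpha`, `beta`, `gamma`, and the way both graphs are built are all assumptions made for illustration. The causal-inference component is omitted entirely; the label graph here uses plain cosine similarity between label columns as a stand-in.

```python
import numpy as np

def knn_laplacian(Z, k=5):
    """Unnormalized Laplacian L = D - S of a symmetrized k-NN graph over the rows of Z."""
    n = Z.shape[0]
    sq = np.sum(Z ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T  # squared Euclidean distances
    np.fill_diagonal(d2, np.inf)                    # a point is not its own neighbour
    idx = np.argsort(d2, axis=1)[:, :k]             # k nearest neighbours per row
    S = np.zeros((n, n))
    S[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    S = np.maximum(S, S.T)                          # make the graph undirected
    return np.diag(S.sum(axis=1)) - S

def label_laplacian(Y):
    """Laplacian of a label graph weighted by cosine similarity of label columns (assumed)."""
    norms = np.linalg.norm(Y, axis=0)
    norms[norms == 0] = 1.0                         # guard against labels that never occur
    C = (Y.T @ Y) / np.outer(norms, norms)
    np.fill_diagonal(C, 0.0)
    return np.diag(C.sum(axis=1)) - C

def fit_double_laplacian(X, Y, alpha=0.1, beta=0.1, gamma=0.1, k=5):
    """Minimize the assumed objective
        1/2 ||XW - Y||_F^2 + alpha/2 tr((XW)^T Lx (XW))
        + beta/2 tr(W Ly W^T) + gamma/2 ||W||_F^2.
    Setting the gradient to zero gives the Sylvester equation A W + W B = C,
    solved here through the eigendecompositions of the symmetric matrices A and B."""
    Lx = knn_laplacian(X, k)
    Ly = label_laplacian(Y)
    A = X.T @ X + alpha * X.T @ Lx @ X + gamma * np.eye(X.shape[1])
    B = beta * Ly
    C = X.T @ Y
    lam, U = np.linalg.eigh(A)                      # lam >= gamma > 0
    mu, V = np.linalg.eigh(B)                       # mu >= 0 up to rounding
    return U @ ((U.T @ C @ V) / (lam[:, None] + mu[None, :])) @ V.T

# Toy data: 40 instances, 6 features, 4 labels (random, for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 6))
Y = (rng.random((40, 4)) < 0.3).astype(float)
W = fit_double_laplacian(X, Y)
scores = X @ W                                      # real-valued label scores
Y_pred = (scores >= 0.5).astype(int)                # simple fixed threshold (assumed)
```

The instance Laplacian penalizes predictions that differ between neighbouring instances, and the label Laplacian penalizes weight vectors that differ between similar labels, which is one common reading of "preserving local manifold structure" and "mining local label correlation". The closed-form solve works only because both regularizers are quadratic in `W`; a sparsity-inducing penalty would need an iterative solver instead.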

Key words: multi-label classification, double Laplace, label correlation, manifold structure, causal inference
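For readers unfamiliar with the evaluation measures named in the abstract, the sketch below computes five of them with NumPy, following the definitions commonly used in the multi-label literature. The function names are illustrative, coverage is left unnormalized, and AUC is not included. It assumes `Y` is a 0/1 matrix in which every instance has at least one relevant label. Whether the paper uses exactly these variants is not stated in the abstract.

```python
import numpy as np

def ranks_desc(scores):
    """Rank labels per instance; rank 1 is the highest score (ties go to the lower index)."""
    order = np.argsort(-scores, axis=1, kind="stable")
    ranks = np.empty_like(order)
    rows = np.arange(scores.shape[0])[:, None]
    ranks[rows, order] = np.arange(1, scores.shape[1] + 1)
    return ranks

def hamming_loss(Y, Y_pred):
    """Fraction of instance-label pairs that are misclassified (lower is better)."""
    return float(np.mean(Y != Y_pred))

def one_error(Y, scores):
    """Fraction of instances whose top-ranked label is not relevant (lower is better)."""
    top = np.argmax(scores, axis=1)
    return float(np.mean(Y[np.arange(len(Y)), top] == 0))

def coverage(Y, scores):
    """Average number of steps down the ranking needed to cover all relevant labels."""
    ranks = ranks_desc(scores)
    return float(np.mean(np.max(ranks * Y, axis=1) - 1))

def ranking_loss(Y, scores):
    """Average fraction of (relevant, irrelevant) label pairs that are ordered wrongly."""
    losses = []
    for y, s in zip(Y, scores):
        rel, irr = s[y == 1], s[y == 0]
        if len(rel) == 0 or len(irr) == 0:
            continue  # the pairwise loss is undefined for this instance
        losses.append(np.mean(rel[:, None] <= irr[None, :]))
    return float(np.mean(losses))

def average_precision(Y, scores):
    """Average precision at the rank of each relevant label (higher is better)."""
    ranks = ranks_desc(scores)
    precisions = []
    for y, r in zip(Y, ranks):
        rel_ranks = np.sort(r[y == 1])
        if len(rel_ranks) == 0:
            continue
        precisions.append(np.mean(np.arange(1, len(rel_ranks) + 1) / rel_ranks))
    return float(np.mean(precisions))

# Tiny hand-made example: 2 instances, 3 labels.
Y_true = np.array([[1, 0, 1], [0, 1, 0]])
label_scores = np.array([[0.9, 0.8, 0.1], [0.2, 0.7, 0.3]])
Y_hat = (label_scores >= 0.5).astype(int)
results = {
    "hamming_loss": hamming_loss(Y_true, Y_hat),
    "one_error": one_error(Y_true, label_scores),
    "coverage": coverage(Y_true, label_scores),
    "ranking_loss": ranking_loss(Y_true, label_scores),
    "average_precision": average_precision(Y_true, label_scores),
}
```

Only average precision (and AUC) are higher-is-better; for the other four, an "improvement" means a lower value.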