Computer Engineering ›› 2025, Vol. 51 ›› Issue (1): 71-80. doi: 10.19678/j.issn.1000-3428.0068598

• Artificial Intelligence and Pattern Recognition •

Classification Algorithm and Application Based on Semi-Supervised Deep Auto-Encoder Network

ZHANG Xinbo, ZHANG Xueying, HUANG Lixia*, CHEN Guijun

  1. College of Electronic Information and Optical Engineering, Taiyuan University of Technology, Taiyuan 030024, Shanxi, China
  • Received: 2023-10-16 Online: 2024-04-11 Published: 2025-01-15
  • Contact: HUANG Lixia
  • Supported by:
    National Natural Science Foundation of China (62271342); Major Science and Technology Special Project of Shanxi Province (20181102008)

Abstract:

In industrial classification prediction, labeled data are scarce and expensive to obtain, which leads to inaccurate model predictions. At the same time, the features in most unlabeled data are not effectively used, so models generalize poorly. To address these problems, this study proposes a Semi-Supervised Deep Auto-Encoder network (SSup-DDSAE-Link) that combines labeled and unlabeled data through supervised and unsupervised learning to improve prediction accuracy. First, Gaussian noise and sparsity constraints are added separately to the deep Auto-Encoder (AE) channels to extract more representative, classification-relevant features. Second, a lateral connection is introduced between the encoder and decoder to filter out information irrelevant to the classification task, so that the network can better learn the feature representations of key variables, and a supervised learning path is added at the top layer of the network to perform classification. Third, the original encoder is added and trained together with the outputs of the corresponding hidden layers in the decoder, forming an unsupervised learning path that makes effective use of the information in the unlabeled data. Finally, the total loss function is constructed from the supervised and unsupervised loss functions to classify and predict key variables in industrial production. Experimental results show that, compared with commonly used supervised learning models and traditional semi-supervised learning models, SSup-DDSAE-Link achieves higher classification prediction accuracy, together with improved precision, recall, and F1 score.

Key words: semi-supervised learning, Denoising Auto-Encoder (DAE), Sparse Auto-Encoder (SAE), feature extraction, classification prediction
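
The abstract describes a total loss built from a supervised term and an unsupervised term, with Gaussian noise and a sparsity constraint applied in the auto-encoder channels. The abstract gives no formulas, so the following is only a minimal NumPy sketch of how such a combined objective could look. The function names, the per-layer weights `layer_weights`, the sparsity target `rho`, the weight `beta`, and the choice of cross-entropy, mean squared error, and a KL sparsity penalty are all assumptions made for illustration, not the authors' published formulation.

```python
import numpy as np

def add_gaussian_noise(x, sigma, rng):
    """Corrupt inputs with zero-mean Gaussian noise (denoising channel)."""
    return x + rng.normal(0.0, sigma, size=x.shape)

def softmax(z):
    """Row-wise softmax with the usual max-shift for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def supervised_loss(logits, labels):
    """Mean cross-entropy, computed on the labeled samples only."""
    p = softmax(logits)
    return -np.mean(np.log(p[np.arange(len(labels)), labels] + 1e-12))

def unsupervised_loss(clean_acts, decoded_acts, layer_weights):
    """Weighted per-layer MSE between clean-encoder activations and
    the decoder outputs of the corresponding hidden layers."""
    return sum(w * np.mean((c - d) ** 2)
               for w, c, d in zip(layer_weights, clean_acts, decoded_acts))

def sparsity_penalty(hidden, rho=0.05):
    """KL divergence between a target activation rate rho and the mean
    activation of each hidden unit (assumes activations in (0, 1))."""
    rho_hat = np.clip(hidden.mean(axis=0), 1e-6, 1 - 1e-6)
    return np.sum(rho * np.log(rho / rho_hat)
                  + (1 - rho) * np.log((1 - rho) / (1 - rho_hat)))

def total_loss(logits, labels, clean_acts, decoded_acts,
               layer_weights, hidden, beta=0.1):
    """Supervised + unsupervised terms, plus an assumed sparsity term."""
    return (supervised_loss(logits, labels)
            + unsupervised_loss(clean_acts, decoded_acts, layer_weights)
            + beta * sparsity_penalty(hidden))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.random((8, 5))                       # toy inputs
    decoded = add_gaussian_noise(x, 0.1, rng)    # placeholder for a decoder output
    logits = rng.normal(size=(8, 3))             # placeholder classifier outputs
    labels = rng.integers(0, 3, size=8)
    hidden = 1 / (1 + np.exp(-rng.normal(size=(8, 4))))  # sigmoid activations
    print(total_loss(logits, labels, [x], [decoded], [1.0], hidden))
```

In a semi-supervised batch, `supervised_loss` would be evaluated only on the rows that carry labels, while `unsupervised_loss` can use every row, which is how the unlabeled data contribute to training in this sketch.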