
Computer Engineering ›› 2025, Vol. 51 ›› Issue (6): 57-64. doi: 10.19678/j.issn.1000-3428.0069310

• Hotspots and Reviews •

A Continuous Learning Algorithm Based on Block Average and Orthogonal Weight Modification

LIAO Dingding1, LIU Junfeng1, ZENG Jun2,*, QIU Xiaohuan3

  1. School of Automation Science and Engineering, South China University of Technology, Guangzhou 510641, Guangdong, China
    2. School of Electric Power Engineering, South China University of Technology, Guangzhou 510641, Guangdong, China
    3. Science and Technology Industry Division, Guangzhou Railway Polytechnic, Guangzhou 511300, Guangdong, China
  • Received: 2024-01-29 Online: 2025-06-15 Published: 2024-05-28
  • Contact: ZENG Jun
  • Supported by: National Natural Science Foundation of China (62173148, 52377186); Special Project in Key Fields of Ordinary Universities of Guangdong Province (New-Generation Information Technology) (2021ZDZX1136)

Abstract:

Continuous learning is an important aspect of human intelligent behavior that enables humans to acquire new knowledge continually. However, numerous studies have shown that conventional deep neural networks lack this capability: after learning a sequence of new tasks, they often suffer catastrophic forgetting of previously learned tasks and therefore cannot accumulate knowledge over time, which limits further improvement in intelligence. Endowing deep neural networks with continuous learning capability is thus an important problem on the path toward strong artificial intelligence. This study proposes a continuous learning algorithm based on block-averaged Orthogonal Weight Modification, named B-OWM. The algorithm represents the input space with a set of block-average vectors of the input samples, computed with an optimal number of blocks, and combines this representation with the idea of Orthogonal Weight Modification (OWM) to update the network parameters, so that a deep neural network can overcome catastrophic forgetting of previously learned knowledge while learning new tasks. Extensive class-incremental continuous learning experiments with disjoint tasks on multiple datasets show that B-OWM significantly outperforms the OWM algorithm in continuous learning performance; in scenarios with a large number of training batches, the test accuracy improvement rate reaches up to 80%.
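
The mechanism summarized in the abstract can be made concrete with a short sketch. The NumPy code below is an illustrative reading of the abstract, not the authors' published implementation: owm_projector_update applies the standard recursive OWM projector update P ← P − (Px)(Px)ᵀ / (α + xᵀPx), and b_owm_update is a hypothetical block-average variant that splits a training batch into several blocks and feeds each block's mean input to the projector, rather than the single whole-batch mean used in plain OWM. All function names, the block count, and the hyperparameter values are assumptions for illustration.

    import numpy as np

    def owm_projector_update(P, x, alpha=1e-3):
        # Deflate P along direction x so that later weight updates of the
        # form P @ grad leave responses to x (approximately) unchanged.
        x = x.reshape(-1, 1)              # column vector
        Px = P @ x
        return P - (Px @ Px.T) / (alpha + x.T @ Px)

    def b_owm_update(P, batch_inputs, num_blocks, alpha=1e-3):
        # Block-average variant (assumed reading of B-OWM): split the batch
        # into num_blocks blocks and update the projector with each block's
        # mean input vector, instead of one mean over the whole batch.
        for block in np.array_split(batch_inputs, num_blocks, axis=0):
            P = owm_projector_update(P, block.mean(axis=0), alpha)
        return P

    # Usage sketch: constrain a gradient step to the orthogonal
    # complement of the input space seen so far.
    dim_in, dim_out = 64, 10
    P = np.eye(dim_in)                    # projector for one layer's inputs
    X = np.random.randn(128, dim_in)      # one batch of layer inputs
    P = b_owm_update(P, X, num_blocks=8)
    grad = np.random.randn(dim_in, dim_out)
    projected_grad = P @ grad             # apply this step instead of grad

In published OWM work the projector is typically initialized as P = I/α and the projected gradient is further scaled by a learning-rate factor; those details are omitted from this sketch.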

Key words: continuous learning, Orthogonal Weight Modification (OWM), deep learning, regularization, catastrophic forgetting