
Computer Engineering ›› 2025, Vol. 51 ›› Issue (10): 1-17. doi: 10.19678/j.issn.1000-3428.0070575

• Research Hotspots and Reviews •

Survey of Pre-training-based Continual Learning Methods (Invited)

LU Yue1, ZHOU Xiangyu1, ZHANG Shizhou1,*, LIANG Guoqiang1, XING Yinghui1, CHENG De2, ZHANG Yanning1

  1. School of Computer Science, Northwestern Polytechnical University, Xi'an 710129, Shaanxi, China
    2. School of Telecommunications Engineering, Xidian University, Xi'an 710071, Shaanxi, China
  • Received: 2024-11-04 Revised: 2025-01-16 Online: 2025-10-15 Published: 2025-01-22
  • Contact: ZHANG Shizhou

  • Supported by: National Natural Science Foundation of China (62201467); China Postdoctoral Science Foundation (2022TQ0260); China Postdoctoral Science Foundation (2023M742842); Xi'an Association for Science and Technology Young Talent Support Program (959202313088); Shaanxi Province Innovation Capability Support Program (2024ZC-KJXX-043); Natural Science Basic Research Program of Shaanxi Province (2022JC-DW-08)

Abstract:

Traditional machine learning algorithms perform well only when the training and testing sets are identically distributed, and they cannot incrementally learn new categories or tasks that were absent from the original training set. Continual learning enables models to adaptively acquire new knowledge while preventing the forgetting of old tasks. However, existing continual learning methods still face challenges such as high computational and storage overhead and unstable performance. Recent advances in pre-trained models have opened new research directions for continual learning and promise further performance gains. This survey summarizes existing pre-training-based continual learning methods. According to their anti-forgetting mechanism, they are categorized into five types: methods based on prompt pools, methods with slow parameter updating, methods based on backbone branch extension, methods based on parameter regularization, and methods based on classifier design. These methods are further classified according to the number of training phases, the fine-tuning approach, and whether the language modality is used, and the main characteristics and advantages of each category are outlined. Subsequently, the overall challenges facing continual learning methods are analyzed, and the applicable scenarios and limitations of the various categories are summarized. Comprehensive experiments are conducted on multiple benchmarks, followed by in-depth discussion of the performance gaps among the different methods. Finally, the survey discusses research trends in pre-training-based continual learning methods.

Key words: continual learning, catastrophic forgetting, pre-trained model, parameter-efficient fine-tuning, deep neural network
