
Computer Engineering (计算机工程)


Constrained Decomposition Multi-Objective Optimization Algorithm Based on Reinforcement Learning and Dual Populations

  • Published: 2025-09-19

Li Wei, Li Xiaoling, Liu Ziqiong, Huang Ying*


Abstract: The central challenge in constrained multi-objective optimization is balancing the diversity and convergence of the population while satisfying the constraints and minimizing the objective functions. Existing decomposition-based constrained multi-objective optimization algorithms make poor use of the information carried by infeasible solutions on problems with complex constrained fronts, and they struggle to balance population convergence and diversity. To address this problem, a constrained decomposition multi-objective optimization algorithm based on reinforcement learning and dual populations is proposed. The algorithm combines a reinforcement-learning-based ε-constraint adaptive strategy with a dual-population cooperative information learning strategy to help the population converge to the true constrained front. The first strategy uses Q-learning to select the ε-constraint method adaptively, so that the population can determine the most suitable ε-constraint method from its real-time evolutionary state; this strengthens global search and lets the algorithm better approximate the true front. The second strategy has the two populations exchange and learn from each other's information while using different offspring generation and offspring selection strategies, which guides the algorithm to make full use of infeasible solutions in locating the true constrained front and thereby balances convergence and diversity. Finally, the proposed algorithm is compared with six state-of-the-art constrained multi-objective optimization algorithms on 33 test problems and applied to a real-world four-bar truss problem in simulation experiments. The experimental results show that the proposed algorithm performs better than the other algorithms on both the theoretical and the practical problems.
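
The Q-learning-driven choice of ε-constraint method mentioned in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the states, actions, reward, or hyperparameters, so everything named here is an assumption made for illustration: the `QSelector` class, its integer state and action indices, and the defaults for `alpha`, `gamma`, and `explore` are hypothetical. The comparison rule in `eps_better` is the standard ε-constrained comparison on a scalar (for example, decomposed) objective value and an overall constraint violation, not necessarily the exact rule used by the authors.

```python
import random


def eps_better(f_a, cv_a, f_b, cv_b, eps):
    """Standard epsilon-constrained comparison (minimization).

    f_*  : scalar objective value, e.g. an aggregated subproblem value
    cv_* : overall constraint violation (0 means feasible)
    eps  : current relaxation level for the constraints
    Returns True if solution a is preferred to solution b.
    """
    # Both solutions are "epsilon-feasible", or equally violated:
    # decide on the objective value alone.
    if (cv_a <= eps and cv_b <= eps) or cv_a == cv_b:
        return f_a < f_b
    # Otherwise the smaller constraint violation wins.
    return cv_a < cv_b


class QSelector:
    """Tabular Q-learning over a small set of candidate epsilon methods.

    Hypothetical setup: a state is an integer label for the population's
    evolutionary state, an action is the index of an epsilon-constraint
    method. Neither definition is taken from the paper.
    """

    def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.9,
                 explore=0.1, seed=0):
        self.q = [[0.0] * n_actions for _ in range(n_states)]
        self.n_actions = n_actions
        self.alpha = alpha        # learning rate
        self.gamma = gamma        # discount factor
        self.explore = explore    # probability of a random action
        self.rng = random.Random(seed)

    def choose(self, state):
        # Explore with probability `explore`, otherwise act greedily.
        if self.rng.random() < self.explore:
            return self.rng.randrange(self.n_actions)
        row = self.q[state]
        return row.index(max(row))

    def update(self, state, action, reward, next_state):
        # One-step Q-learning update.
        target = reward + self.gamma * max(self.q[next_state])
        self.q[state][action] += self.alpha * (target - self.q[state][action])
```

In an evolutionary loop, one would presumably call `choose` once per generation to pick an ε-constraint method, run the generation using `eps_better` with the resulting ε level, and then call `update` with a reward that reflects the observed improvement; how that reward is defined is left open here because the abstract does not state it.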