Constrained Decomposition Multi-Objective Optimization Algorithm Based on Reinforcement Learning and Dual Populations

doi:10.19678/j.issn.1000-3428.0252293

Abstract

The central challenge in constrained multi-objective optimization is to balance the diversity and convergence of the algorithm while satisfying the constraints and minimizing the objective functions. Existing decomposition-based constrained multi-objective optimization algorithms make poor use of the information carried by infeasible solutions when a problem has a complex constrained front, and they struggle to balance population convergence and diversity. To address this problem, a constrained decomposition multi-objective optimization algorithm based on reinforcement learning and dual populations is proposed. The algorithm combines two strategies to help the population converge to the true constrained front: a reinforcement-learning-based adaptive ε-constraint strategy and a dual-population cooperative information learning strategy. The former uses Q-learning to select adaptively among ε-constraint methods, so that the population chooses the most suitable ε-constraint method for its current evolutionary state; this strengthens global search and lets the algorithm approximate the true front more closely. The latter has two populations exchange and learn from each other's information while applying different offspring generation and offspring selection strategies, which guides the algorithm to make full use of infeasible solutions in locating the true constrained front and thereby balances convergence and diversity. Finally, the proposed algorithm is compared with six state-of-the-art constrained multi-objective optimization algorithms on 33 test problems and applied to a real-world four-bar truss problem in simulation experiments. The results show that the proposed algorithm outperforms the other algorithms on both the theoretical and the practical problems.
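The abstract does not give the state, action, or reward definitions used by the Q-learning component, so the following is only a minimal sketch of the general idea under stated assumptions. It shows a common form of the ε-constrained comparison (solutions whose constraint violation is within ε are compared by objective value, otherwise by violation) and a tabular Q-learning selector that picks one of several candidate ε-constraint methods. The names `eps_better`, `EpsMethodSelector`, the integer state and action encodings, and the reward signal are all hypothetical and are not taken from the paper. Note that `explore` is the exploration rate of the selector and is unrelated to the ε constraint threshold.

```python
import random


def eps_better(f_a, cv_a, f_b, cv_b, eps):
    """Return True if solution a beats solution b under an epsilon-constrained comparison.

    f_* is a scalarized objective value (minimized) and cv_* is the total
    constraint violation (0 means feasible). This is a generic form of the
    rule, assumed here for illustration.
    """
    if (cv_a <= eps and cv_b <= eps) or cv_a == cv_b:
        return f_a < f_b      # both treated as feasible: compare objectives
    return cv_a < cv_b        # otherwise prefer the smaller violation


class EpsMethodSelector:
    """Tabular Q-learning over hypothetical evolutionary states.

    Each action index stands for one candidate epsilon-constraint method.
    """

    def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.9,
                 explore=0.1, seed=0):
        self.q = [[0.0] * n_actions for _ in range(n_states)]
        self.n_actions = n_actions
        self.alpha = alpha        # learning rate
        self.gamma = gamma        # discount factor
        self.explore = explore    # probability of a random action
        self.rng = random.Random(seed)

    def select(self, state):
        """Pick a method index: random with probability `explore`, else greedy."""
        if self.rng.random() < self.explore:
            return self.rng.randrange(self.n_actions)
        row = self.q[state]
        return max(range(self.n_actions), key=row.__getitem__)

    def update(self, state, action, reward, next_state):
        """Standard one-step Q-learning update."""
        target = reward + self.gamma * max(self.q[next_state])
        self.q[state][action] += self.alpha * (target - self.q[state][action])
```

In an evolutionary loop, one would call `select` once per generation with the current state, apply the chosen ε-constraint method when comparing solutions, and then call `update` with a reward that reflects how much the population improved. How the paper actually defines these quantities is not stated in the abstract.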

LI Wei, LI Xiaoling, LIU Ziqiong, HUANG Ying. Constrained decomposition multi-objective optimization algorithm based on reinforcement learning and dual populations[J]. Computer Engineering, doi: 10.19678/j.issn.1000-3428.0252293.

