Author Login Chief Editor Login Reviewer Login Editor Login Remote Office

Computer Engineering

   

Neural Combinatorial Optimization Model Based on Bidirectional Construction Strategy

  

  • Published:2025-05-20

基于双向构造策略的神经组合优化模型

Abstract: Combinatorial optimization problems have important applications in areas such as logistics path planning, but their solution space exponentially expands with the problem size, leading to severe challenges for traditional methods. In recent years, neural combinatorial optimization methods based on reinforcement learning have been able to achieve solution quality close to that of traditional solvers while keeping the solution consumption time short. The mainstream method POMO (Policy Optimization with Multiple Optima) enhances the training stability through symmetry optimization, but its unidirectional sequence generation mechanism still suffers from a double limitation: on the one hand, it is difficult for the traditional constructive method to fully exploit the symmetry features of the problem; on the other hand, the endpoint information can’t effectively participate in the decision-making process of the remote node. To address this problem, this paper proposes a Bidirectional Construction Strategy (BCS)-based POMO model, named BCS-POMO, which dynamically selects the extension direction with higher confidence by constructing the solution sequence in parallel from the start point and the end point, avoiding models that are caught in a dilemma due to unidirectional constructions. The model exploits the symmetry of the construction sequence to achieve weight parameter sharing and improves the efficiency through batch parallel computation. Experiments have shown that the BCS-POMO effectively reinforces the role of endpoint information as a decision aid in the construction process, which reduces the error by 16% and 18% for the traveling salesman problem (TSP) and the capacitated vehicle routing problem (CVRP), respectively, verifying the effectiveness of the bidirectional construction strategy in exploiting the endpoint information and the advantages of symmetry modelling.

摘要: 组合优化问题在物流路径规划等领域具有重要应用价值,但其解空间随问题规模呈指数级扩张,导致传统方法面临严峻挑战。近年来基于强化学习的神经组合优化方法能够在保持较短求解耗时的同时,解质量已接近传统求解器的水平。主流方法POMO(Policy Optimization with Multiple Optima)通过对称性优化增强了训练稳定性,但其单向序列生成机制仍存在双重局限:一方面,传统构造式方法难以充分挖掘问题对称性特征;另一方面,终点信息无法有效参与远端节点的决策过程。针对这一问题,提出了基于双向构造策略(BCS)的BCS-POMO模型,它能通过起点与终点双向并行构造解序列,动态选择更有把握的扩展方向,避免模型因单向构造而陷入两难的抉择之中。该模型利用构造序列对称性实现权重参数共享,并通过批量并行计算提升效率。实验表明,BCS-POMO有效强化了终点信息在构造过程中的决策辅助作用,在旅行商问题(TSP)和有能力约束的车辆路径调度问题(CVRP)上分别使误差降低了16%和18%,验证了双向构造策略对终点信息利用的有效性和对称性建模的优势。