计算机工程 ›› 2020, Vol. 46 ›› Issue (11): 293-300.doi: 10.19678/j.issn.1000-3428.0055910

• 开发研究与工程应用 • 上一篇    下一篇

基于强化学习的旅行商问题解构造方法

王若愚1,2, 陈勇全2,3   

  1. 1. 深圳供电局有限公司 输电规划科, 广东 深圳 518001;
    2. 香港中文大学(深圳) 机器人与智能制造研究院, 广东 深圳 518172;
    3. 深圳市人工智能与机器人研究院 无人系统研究中心, 广东 深圳 518129
  • 收稿日期:2019-09-03 修回日期:2019-11-22 发布日期:2019-11-29
  • 作者简介:王若愚(1986-),工程师、硕士,主研方向为电力系统巡检机器人调度及路径规划;陈勇全(通信作者),副研究员、博士。
  • 基金项目:
    国家自然科学基金(U1613216);深圳市基础研究项目(JCYJ20180508162406177);住房和城乡建设部软科学研究项目(2018-K8-034)。

Solution Construction Methods Based on Reinforcement Learning for the Traveling Salesman Problem

WANG Ruoyu1,2, CHEN Yongquan2,3   

  1. 1. Transmission Planning Section, Shenzhen Power Supply Co., Ltd., Shenzhen, Guangdong 518001, China;
    2. Institute of Robotics and Intelligent Manufacturing, The Chinese University of Hong Kong, Shenzhen, Guangdong 518172, China;
    3. Research Center on Unmanned Systems, Shenzhen Institute of Artificial Intelligence and Robotics for Society, Shenzhen, Guangdong 518129, China
  • Received:2019-09-03 Revised:2019-11-22 Published:2019-11-29

摘要: 基于迭代局部搜索(ILS)的启发式算法是目前最为先进的旅行商问题求解算法,在多数国际公开算例上保持着世界最优纪录。解构造方法是影响ILS性能的重要因素,为此,提出4种不同的解构造方法。解构造方法1为基准算法,其仅利用城市间的距离等静态结构信息来构造初始解,解构造方法2~解构造方法4则尝试利用搜索过程中积累的历史数据,通过强化学习挖掘有用信息,用于引导解的构造过程。在25个国际公开算例上的测试结果表明,基于历史信息的强化学习方法可有效优化构造解的质量,提升ILS整体性能。

关键词: 旅行商问题, 迭代局部搜索, 解构造, 强化学习, 过滤网络

Abstract: Among the existing algorithms for TSP solution,the heuristic algorithm based on Iterated Local Search(ILS) performs the best,holding the world record on most of the public instances.The method for solution construction has a significant influence on the performance of ILS,and thus should be carefully designed.This paper proposes four different methods for solution construction,including a baseline algorithm that uses only static information such as the distances between cities to construct the initial solution,and three reinforcement-learning-based algorithms that attempt to utilize reinforcement learning to dig useful information from the historic information collected during the search for the construction of initial solutions.Experimental results on 25 public instances show that the reinforcement-learning-based methods using historic information can significantly improve the quality of the constructed solution as well as the performance of ILS.

Key words: Traveling Salesman Problem(TSP), Iterated Local Search(ILS), solution construction, reinforcement learning, filter network

中图分类号: