
Computer Engineering ›› 2021, Vol. 47 ›› Issue (11): 207-213. doi: 10.19678/j.issn.1000-3428.0059591

• Mobile Internet and Communication Technology •

Optimized FANET Routing Algorithm with Reinforcement Learning Based on Function Approximation

XIE Yongsheng, YANG Yuwang, QIU Xiulin, WANG Yinyin   

  1. School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China
  • Received: 2020-09-27; Revised: 2020-11-16; Published: 2020-12-07

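  • About the authors: XIE Yongsheng (born 1995), male, master's student, whose main research interest is routing protocols for flying ad-hoc networks; YANG Yuwang (corresponding author), professor and doctoral supervisor; QIU Xiulin and WANG Yinyin, doctoral students.
  • Supported by:
    Key Research and Development Program of Jiangsu Province (BE2018393); Suzhou Key Industry Technology Innovation Project (SYG201826).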

Abstract: The high-speed movement of nodes in a Flying Ad-Hoc Network (FANET) makes it difficult for FANET routing protocols to maintain links. To address this problem, an adaptive link-state routing optimization algorithm named QLA-OLSR is proposed based on Reinforcement Learning (RL). By sensing changes in the number of neighbor nodes and in the traffic load in a dynamic environment, QLA-OLSR uses the Q-learning algorithm from RL to construct a value function, from which the optimal HELLO time slot is solved to improve each node's ability to discover and maintain links. The State Similarity Mechanism (SSM) of an optimized Kanerva coding algorithm is then used to reduce the complexity of QLA-OLSR and increase its stability. Simulation results show that QLA-OLSR can significantly improve network throughput and reduce routing maintenance overhead, and that it is capable of self-learning, which makes it suitable for FANETs in highly dynamic environments.

Key words: Flying Ad-Hoc Network (FANET), function approximation, Q-learning, routing algorithm, adaptive HELLO time slot

