[1] Wang L, Wang J, Liu H, et al. Decision-Making Strategies for Close-Range Air Combat Based on Reinforcement Learning with Variable-Scale Actions[J]. Aerospace, 2023, 10(5). DOI:10.3390/aerospace10050401.
[2] Chai J, Chen W, Zhu Y, et al. A Hierarchical Deep Reinforcement Learning Framework for 6-DOF UCAV Air-to-Air Combat[J]. IEEE Transactions on Systems, Man, and Cybernetics: Systems, 2023, 53(9): 5417-5429. DOI:10.1109/tsmc.2023.3270444.
[3] Zhang J, Chen Z, Zhang Y, et al. Reinforcement Learning of Aerial Combat Maneuver Decisions Based on UAVs Situation[C]//2023 6th International Conference on Computer Network, Electronic and Automation (ICCNEA). IEEE, 2023: 370-375. DOI:10.1109/iccnea60107.2023.00085.
[4] Wang Z, Li H, Wang J, et al. Deep reinforcement learning based conflict detection and resolution in air traffic control[J]. IET Intelligent Transport Systems, 2019, 13(6): 1041-1047. DOI:10.1049/iet-its.2018.5357.
[5] Qiming Y, Jiandong Z, Guoqing S, et al. Maneuver Decision of UAV in Short-Range Air Combat Based on Deep Reinforcement Learning[J]. IEEE Access, 2020, 8: 363-378. DOI:10.1109/access.2019.2961426.
[6] Zhen Y, Hao M. Aircraft Control Method Based on Deep Reinforcement Learning[C]//2020 IEEE 9th Data Driven Control and Learning Systems Conference (DDCLS). IEEE, 2020: 912-917. DOI:10.1109/ddcls49620.2020.9275205.
[7] Feilong J, Minqiang X, Yuqing L, et al. Short-range air combat maneuver decision of UAV swarm based on multi-agent Transformer introducing virtual objects[J]. Engineering Applications of Artificial Intelligence, 2023, 123(Part B). DOI:10.1016/j.engappai.2023.106358.
[8] 蒋俊哲. 基于时序建模与层次搜索的空战机动决策研究[D]. 广西大学, 2025. DOI:10.27034/d.cnki.ggxiu.2025.000894.
Jiang Junzhe. Research on air combat maneuver decision-making based on temporal modeling and hierarchical search [D]. Nanning: Guangxi University, 2025. DOI:10.27034/d.cnki.ggxiu.2025.000894.
[9] 张百川,毕文豪,张安,等.基于Transformer模型的空战飞行器轨迹预测误差补偿方法[J].航空学报,2023,44(09):291-304.
Zhang Baichuan, Bi Wenhao, Zhang An, et al. Error compensation method for trajectory prediction of air combat vehicles based on Transformer model[J]. Acta Aeronautica et Astronautica Sinica, 2023, 44(09): 291-304.
[10] Zihao G, Yang X, Delin L. UAV Cooperative Air Combat Maneuvering Confrontation Based on Multi-agent Reinforcement Learning[J]. Unmanned Systems, 2023, 11(3). DOI:10.1142/S2301385023410029.
[11] 姚昌华,张晓慧,毕珊宁,等.电子战无人机与陆战多智能体协同决策[J/OL].计算机应用与软件,1-9[2025-10-24].https://link.cnki.net/urlid/31.1260.tp.20250902.1117.004.
Yao Changhua, Zhang Xiaohui, Bi Shanning, et al. Cooperative decision-making of electronic-warfare UAVs and land-combat multi-agent systems[J/OL]. Computer Applications and Software, 1-9 [2025-10-24]. https://link.cnki.net/urlid/31.1260.tp.20250902.1117.004.
[12] Li B, Huang J, Bai S, et al. Autonomous air combat decision-making of UAV based on parallel self-play reinforcement learning[J]. CAAI Transactions on Intelligence Technology, 2022, 8(1): 64-81. DOI:10.1049/cit2.12109.
[13] 唐媛.基于深度强化学习的无人机编队避障导航关键技术研究[D].电子科技大学,2025. DOI:10.27005/d.cnki.gdzku.2025.002226.
Tang Yuan. Research on Key Technologies for Obstacle Avoidance and Navigation of UAV Formation Based on Deep Reinforcement Learning [D]. University of Electronic Science and Technology of China, 2025. DOI:10.27005/d.cnki.gdzku.2025.002226.
[14] Yongfeng L, Yongxi L, Jingping S, et al. Autonomous Maneuver Decision of Air Combat Based on Simulated Operation Command and FRV-DDPG Algorithm[J]. Aerospace, 2022, 9(11): 658. DOI:10.3390/aerospace9110658.
[15] Hu J, Wang L, Hu T, et al. Autonomous Maneuver Decision Making of Dual-UAV Cooperative Air Combat Based on Deep Reinforcement Learning[J]. Electronics, 2022, 11(3): 467. DOI:10.3390/electronics11030467.
[16] Lee S M, Shin M, Son H. Robust predictor-based control for multirotor UAV with various time delays[J]. IEEE Transactions on Industrial Electronics, 2022, 70(8): 8151-8162.
[17] Wu Y H, Li C Y, Lin Y B, et al. Modeling control delays for edge-enabled UAVs in cellular networks[J]. IEEE Internet of Things Journal, 2022, 9(17): 16222-16233.
[18] Yu Z, Zhang Y, Jiang B, et al. Refined fractional-order fault-tolerant coordinated tracking control of networked fixed-wing UAVs against faults and communication delays via double recurrent perturbation FNNs[J]. IEEE Transactions on Cybernetics, 2022, 54(2): 1189-1201.
[19] Muskardin T, Coelho A, Della Noce E R, et al. Energy-based cooperative control for landing fixed-wing UAVs on mobile platforms under communication delays[J]. IEEE Robotics and Automation Letters, 2020, 5(4): 5081-5088.
[20] Wang W, Han D, Luo X, et al. Addressing signal delay in deep reinforcement learning[C]//The Twelfth International Conference on Learning Representations. 2024.
[21] Anokhin I, Rishav R, Riemer M, et al. Handling Delay in Real-Time Reinforcement Learning[C]//The Thirteenth International Conference on Learning Representations. 2025.
[22] Lee J, Kim J, Jeong J, et al. Reinforcement Learning via Conservative Agent for Environments with Random Delays[J]. arXiv preprint arXiv:2507.18992, 2025.
[23] De Vries S C. UAVs and control delays[R]. 2005.
[24] Ding Y, Xu J, Yang Y, et al. AoI-MDP: An AoI Optimized Markov Decision Process Dedicated in The Underwater Task (Student Abstract)[C]//Proceedings of the AAAI Conference on Artificial Intelligence. 2025, 39(28): 29348-29350.
[25] Kosinski R J. A literature review on reaction time[J]. Clemson University, 2008, 10(1): 337-344.
[26] Stevens B L, Lewis F L, Johnson E N. Aircraft control and simulation: dynamics, controls design, and autonomous systems[M]. John Wiley & Sons, 2015.
[27] Chen B, Xu M, Li L, et al. Delay-aware model-based reinforcement learning for continuous control[J]. Neurocomputing, 2021, 450: 119-128.
[28] Towers M, Kwiatkowski A, Terry J, et al. Gymnasium: A standard interface for reinforcement learning environments[J]. arXiv preprint arXiv:2407.17032, 2024.
[29] Hochreiter S, Schmidhuber J. Long short-term memory[J]. Neural Computation, 1997, 9(8): 1735-1780.
[30] Hochreiter S, Schmidhuber J. LSTM can solve hard long time lag problems[J]. Advances in Neural Information Processing Systems, 1996, 9.
[31] Van Houdt G, Mosquera C, Nápoles G. A review on the long short-term memory model[J]. Artificial Intelligence Review, 2020, 53(8): 5929-5955.
[32] Armah S K, Yi S. Analysis of time delays in quadrotor systems and design of control[M]//Time Delay Systems: Theory, Numerics, Applications, and Experiments. Cham: Springer International Publishing, 2017: 299-313.
[33] Kali Y, Rodas J, Gregor R, et al. Attitude tracking of a tri-rotor UAV based on robust sliding mode with time delay estimation[C]//2018 International Conference on Unmanned Aircraft Systems (ICUAS). IEEE, 2018: 346-351.