[1] Kong F, Du F, Zhao D. Station-viewpoint joint coverage path
planning towards mobile visual inspection[J]. Robotics and
Computer-Integrated Manufacturing, 2025, 91: 102821.
[2] Babataev I, Fedoseev A, Weerakkodi N, et al. Hyperguider:
Virtual reality framework for interactive path planning of
quadruped robot in cluttered and multi-terrain
environments[C]// IEEE International Conference on
Systems, Man, and Cybernetics. Prague, Czech Republic:
IEEE, 2022: 2037-2042.
[3] Huang Y, Tsao C T, Lee H H. Efficiency Improvement to
Neural-Network-Driven Optimal Path Planning via Region
and Guideline Prediction[J]. IEEE Robotics and Automation
Letters, 2024, 9(2): 1851-1858.
[4] Li S, Chen X, Zhang M, et al. A UAV coverage path planning
algorithm based on double deep Q-network[J]. Journal of
Physics: Conference Series, 2022, 2216(1): 12-17.
[5] Cheng X, Zhou J, Zhou Z, et al. An improved RRT-Connect
path planning algorithm of robotic arm for automatic
sampling of exhaust emission detection in Industry 4.0[J].
Journal of Industrial Information Integration, 2023, 33:
100436.
[6] Huang T, Fan K, Sun W. Density gradient-RRT: An improved
rapidly exploring random tree algorithm for UAV path
planning[J]. Expert Systems with Applications, 2024, 252:
124121.
[7] Zhang J, Chen D, Han G, et al. Formation Path Planning for
Collaborative Autonomous Underwater Vehicles Based on
Consensus-Sparrow Search Algorithm[J]. IEEE Internet of
Things Journal, 2023, 11(8): 13810-13823.
[8] Li B, Tan C, Lian Y, et al. Mobile robot global planning
based on improved A* algorithm path planning
research[C]//Proceedings of the 2023 International
Conference on Advances in Artificial Intelligence and
Applications. Wuhan, China: ACM, 2023: 305-311.
[9] Li H, Qian L, Hong M, et al. Effective anti-submarine decision
support system based on heuristic rank-based Dijkstra and adaptive threshold partitioning mechanism[J]. Applied Soft
Computing, 2024, 161: 111718.
[10] Ganesan S, Ramalingam B, Mohan R E. A hybrid
sampling-based RRT* path planning algorithm for
autonomous mobile robot navigation[J]. Expert Systems with
Applications, 2024, 258: 125206.
[11] Ab Wahab M N, Nazir A, Khalil A, et al. Improved genetic
algorithm for mobile robot path planning in static
environments[J]. Expert Systems with Applications, 2024,
249: 123762.
[12] Wang Y, Hu F, Xu H, et al. A Multi-Groups Cooperative
Particle Swarm Algorithm for Optimization of Multi-Vehicle
Path Planning in Internet of Vehicles[J]. IEEE Internet of
Things Journal, 2024, 11(22): 35839-35851.
[13] Yu K, Xu B. Mobile Robot Path Planning Based on Improved
Elite Ant Colony Algorithm[C]//International Conference on
Robotics, Control and Automation. Shanghai, China: IEEE,
2024: 63-67.
[14] Jiang Z, Li F, Yang R. Multi-objective optimization in mobile
robot path planning: a joint strategy of A* and simulated
annealing algorithms[C]//IEEE International Conference on
Cybernetics and Intelligent Systems and IEEE International
Conference on Robotics, Automation and Mechatronics.
Hangzhou, China: IEEE, 2024: 162-167.
[15] Lee J, Seo Y. Q-learning based on strategic artificial potential
field for path planning enabling concealment and cover in
ground battlefield environments[J]. Applied Intelligence,
2024, 54(13-14): 7170-7200.
[16] 罗彪, 胡天萌, 周育豪, 等. 多智能体强化学习控制与决策研究综述[J/OL]. 自动化学报, 1-30[2024-12-01]. https://doi.org/10.16383/j.aas.c240392.
Luo B, Hu T, Zhou Y, et al. Survey on multi-agent reinforcement learning for control and decision-making[J/OL]. Acta Automatica Sinica, 1-30[2024-12-01]. https://doi.org/10.16383/j.aas.c240392. (in Chinese)
[17] Lowe R, Wu Y I, Tamar A, et al. Multi-agent actor-critic for
mixed cooperative-competitive environments[C]// Neural
Information Processing Systems. Long Beach, CA, USA:
Curran Associates, 2017: 6382-6393.
[18] Pathak D, Agrawal P, Efros A A, et al. Curiosity-driven
exploration by self-supervised prediction[C]//International
Conference on Machine Learning. Sydney, Australia: PMLR,
2017: 2778-2787.
[19] Reizinger P, Szemenyei M. Attention-based curiosity-driven
exploration in deep reinforcement learning[C]//IEEE
International Conference on Acoustics, Speech and Signal
Processing. Barcelona, Spain: IEEE, 2020: 3542-3546.
[20] 乔和, 李增辉, 刘春, 等. 基于改进好奇心的深度强化学习方法[J]. 计算机应用研究, 2024, 41(9): 2635-2640.
Qiao H, Li Z, Liu C, et al. Research on deep reinforcement learning method based on improved curiosity[J]. Application Research of Computers, 2024, 41(9): 2635-2640. (in Chinese)
[21] 金志军, 王浩, 方宝富. 稀疏场景下基于理性好奇心的多智能体强化学习[J]. 计算机工程, 2023, 49(5): 302-309.
Jin Z, Wang H, Fang B. Multi-Agent Reinforcement Learning Based on Rational Curiosity in Sparse Scenarios[J]. Computer Engineering, 2023, 49(5): 302-309. (in Chinese)
[22] Greff K, Srivastava R K, Koutnik J, et al. LSTM: A search space odyssey[J]. IEEE Transactions on Neural Networks and Learning Systems, 2017, 28(10): 2222-2232.
[23] Rashid T, Samvelyan M, De Witt C S, et al. Monotonic value
function factorisation for deep multi-agent reinforcement
learning[J]. Journal of Machine Learning Research, 2020,
21(178): 1-51.
[24] Wang J, Ren Z, Liu T, et al. QPLEX: Duplex Dueling
Multi-Agent Q-Learning[C]//International Conference on
Learning Representations. Virtual Event, Austria:
OpenReview, 2021: 1-27.
[25] Peng B, Rashid T, Schroeder de Witt C, et al. FACMAC: Factored multi-agent centralised policy gradients[C]//Neural Information Processing Systems. Virtual Event: Curran Associates, 2021, 34: 12208-12221.
[26] Foerster J, Farquhar G, Afouras T, et al. Counterfactual multi-agent policy gradients[C]//Proceedings of the AAAI Conference on Artificial Intelligence. Louisiana, USA: AAAI, 2018, 32(1): 2974-2982.
[27] 袁雷, 张子谦, 李立和, 等. 开放环境下的协作多智能体强化学习进展[J]. 中国科学: 信息科学, 2025, 55(2): 217-268.
Yuan L, Zhang Z, Li L, et al. Progress on cooperative multi-agent reinforcement learning in open environment[J]. SCIENTIA SINICA Informationis, 2025, 55(2): 217-268. (in Chinese)
[28] 刘奇儒, 耿霞. 基于改进 DQN 算法的机器人路径规划[J]. 计算机工程, 2023, 49(12): 111-120.
Liu Q, Geng X. Robot Path Planning Based on Improved DQN Algorithm[J]. Computer Engineering, 2023, 49(12): 111-120. (in Chinese)
[29] Haarnoja T, Zhou A, Abbeel P, et al. Soft actor-critic:
Off-policy maximum entropy deep reinforcement learning
with a stochastic actor[C]//International Conference on
Machine Learning. Stockholm, Sweden: PMLR, 2018:
1861-1870.
[30] Zheng L, Chen J, Wang J, et al. Episodic multi-agent
reinforcement learning with curiosity-driven exploration[J].
Advances in Neural Information Processing Systems, 2021,
34: 3757-3769.
[31] Guan H, Gao Y, Zhao M, et al. AB-Mapper: Attention and BicNet based multi-agent path planning for dynamic
environment[C]//2022 IEEE/RSJ International Conference on
Intelligent Robots and Systems. Kyoto, Japan: IEEE, 2022:
13799-13806.
[32] Xu F, Kaneko T. Curiosity-driven Exploration for
Cooperative Multi-Agent Reinforcement Learning[C]//
International Joint Conference on Neural Networks. Gold Coast, Australia: IEEE, 2023: 1-8.
[33] Zhang Z, Duan T, Sun Z, et al. Prediction-based Hierarchical
Reinforcement Learning for Robot Soccer[C]//2024
IEEE/CIC International Conference on Communications in
China. Hangzhou, China: IEEE, 2024: 1721-1726.
[34] Zhang S, Cao J, Yuan L, et al. Self-Motivated Multi-Agent
Exploration[C]// 2023 International Conference on
Autonomous Agents and Multiagent Systems. London,
United Kingdom: Springer, 2023: 476-484.
[35] 方城亮, 杨飞生, 潘泉. 基于 MASAC 强化学习算法的多无人机协同路径规划[J]. 中国科学: 信息科学, 2024, 54(8): 1871-1883.
Fang C, Yang F, Pan Q. Multi-UAV collaborative path planning based on multi-agent soft actor critic[J]. SCIENTIA SINICA Informationis, 2024, 54(8): 1871-1883. (in Chinese)
[36] Paszke A, Gross S, Massa F, et al. PyTorch: An imperative style, high-performance deep learning library[C]//Neural Information Processing Systems. Vancouver, BC, Canada: Curran Associates, 2019: 8024-8035.
[37] Hasselt H V. Double Q-learning[C]//Neural Information
Processing Systems. Vancouver, BC, Canada: Curran
Associates, 2010: 2613–2621.
[38] Musa S. Techniques for quadcopter modeling and design: A
review[J]. Journal of Unmanned System Technology, 2018,
5(3): 66-75.
[39] 石喜玲, 孙运强, 李静, 等. 四旋翼动力学建模及非线性 PID 轨迹跟踪控制[J]. 科学技术与工程, 2020, 20(6): 2489-2493.
Shi X, Sun Y, Li J, et al. Quadrotor Dynamics Modeling and Nonlinear PID Trajectory Tracking Control[J]. Science Technology and Engineering, 2020, 20(6): 2489-2493. (in Chinese)
[40] Yu C, Velu A, Vinitsky E, et al. The surprising effectiveness of PPO in cooperative multi-agent games[J]. Advances in Neural Information Processing Systems, 2022, 35: 24611-24624.
[41] Raffin A, Hill A, Gleave A, et al. Stable-Baselines3: Reliable
Reinforcement Learning Implementations[J]. Journal of
Machine Learning Research, 2021, 22(268): 1-8.