[1] 黄子蓉. 基于深度强化学习的多智能体协同研究[D]. 太原:太原理工大学, 2021. HUANG Z R. Research of multi-agent cooperation based on deep reinforcement learning[D]. Taiyuan:Taiyuan University of Technology, 2021. (in Chinese)
[2] 侯胜男. 基于强化学习的多智能体协同对抗算法[D]. 杭州:浙江大学, 2022. HOU S N. Multi-agent confrontation algorithm based on reinforcement learning[D]. Hangzhou:Zhejiang University, 2022. (in Chinese)
[3] 李雪松, 张锲石, 宋呈群, 等. 自动驾驶场景下的轨迹预测技术综述[J]. 计算机工程, 2023, 49(5):1-11. LI X S, ZHANG Q S, SONG C Q, et al. Review of trajectory prediction technology in autonomous driving scenes[J]. Computer Engineering, 2023, 49(5):1-11. (in Chinese)
[4] 熊丽琴, 曹雷, 赖俊, 等. 基于值分解的多智能体深度强化学习综述[J]. 计算机科学, 2022, 49(9):172-182. XIONG L Q, CAO L, LAI J, et al. Overview of multi-agent deep reinforcement learning based on value factorization[J]. Computer Science, 2022, 49(9):172-182. (in Chinese)
[5] ZHANG S Q, ZHANG Q, LIN J. Efficient communication in multi-agent reinforcement learning via variance based control[C]//Proceedings of the Annual Conference on Neural Information Processing Systems. Cambridge, USA:MIT Press, 2019:5471-5483.
[6] YUAN L, WANG J H, ZHANG F X, et al. Multi-agent incentive communication via decentralized teammate modeling[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2022, 36(9):9466-9474.
[7] WU B, YANG X Y, SUN C X, et al. Learning effective value function factorization via attentional communication[C]//Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics. Washington D. C., USA:IEEE Press, 2020:629-634.
[8] RASHID T, FARQUHAR G, PENG B, et al. Weighted QMIX:expanding monotonic value function factorisation for deep multi-agent reinforcement learning[C]//Proceedings of the Annual Conference on Neural Information Processing Systems. Cambridge, USA:MIT Press, 2020:10199-10210.
[9] OLIEHOEK F A, AMATO C. A concise introduction to decentralized POMDPs[M]. Berlin, Germany:Springer, 2016.
[10] SON K, KIM D, KANG W J, et al.
QTRAN:learning to factorize with transformation for cooperative multi-agent reinforcement learning[C]//Proceedings of International Conference on Machine Learning. Washington D. C., USA:ACM Press, 2019:5887-5896.
[11] TAN M. Multi-agent reinforcement learning:independent vs. cooperative agents[C]//Proceedings of International Conference on Machine Learning. [S.l.]:Elsevier, 1993:330-337.
[12] OLIEHOEK F A, SPAAN M T J, VLASSIS N. Optimal and approximate Q-value functions for decentralized POMDPs[J]. Journal of Artificial Intelligence Research, 2008, 32:289-353.
[13] 邓晖奕, 李勇振, 尹奇跃. 引入通信与探索的多智能体强化学习QMIX算法[J]. 计算机应用, 2023, 43(1):202-208. DENG H Y, LI Y Z, YIN Q Y. Improved QMIX algorithm from communication and exploration for multi-agent reinforcement learning[J]. Journal of Computer Applications, 2023, 43(1):202-208. (in Chinese)
[14] SUNEHAG P, LEVER G, GRUSLYS A, et al. Value-decomposition networks for cooperative multi-agent learning based on team reward[C]//Proceedings of the 17th International Conference on Autonomous Agents and Multiagent Systems. New York, USA:ACM Press, 2018:2085-2087.
[15] RASHID T, SAMVELYAN M, WITT C D, et al. QMIX:monotonic value function factorisation for deep multi-agent reinforcement learning[C]//Proceedings of International Conference on Learning Representations. Washington D. C., USA:IEEE Press, 2018:1-14.
[16] WANG J, REN Z, LIU T, et al. QPLEX:duplex dueling multi-agent Q-learning[C]//Proceedings of International Conference on Learning Representations. Washington D. C., USA:IEEE Press, 2021:301-310.
[17] 宋健, 王子磊. 基于值分解的多目标多智能体深度强化学习方法[J]. 计算机工程, 2023, 49(1):31-40. SONG J, WANG Z L. Multi-goal multi-agent deep reinforcement learning method based on value decomposition[J]. Computer Engineering, 2023, 49(1):31-40. (in Chinese)
[18] SHEN S, QIU M, LIU J, et al.
ResQ:a residual Q function-based approach for multi-agent reinforcement learning value factorization[C]//Proceedings of the Annual Conference on Neural Information Processing Systems. Cambridge, USA:MIT Press, 2022:457-464.
[19] DAS A, GERVET T, ROMOFF J, et al. TarMAC:targeted multi-agent communication[C]//Proceedings of International Conference on Machine Learning. Washington D. C., USA:ACM Press, 2019:1538-1546.
[20] SINGH A, JAIN T, SUKHBAATAR S. Learning when to communicate at scale in multiagent cooperative and competitive tasks[EB/OL]. [2023-05-02]. https://arxiv.org/abs/1812.09755.
[21] LIU Y, WANG W X, HU Y J, et al. Multi-agent game abstraction via graph attention neural network[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2020, 34(5):7211-7218.
[22] SUKHBAATAR S, SZLAM A, FERGUS R. Learning multiagent communication with backpropagation[C]//Proceedings of the Annual Conference on Neural Information Processing Systems. Cambridge, USA:MIT Press, 2016:2252-2260.
[23] JIANG J, DUN C, HUANG T, et al. Graph convolutional reinforcement learning[C]//Proceedings of International Conference on Learning Representations. Washington D. C., USA:IEEE Press, 2020:25-30.
[24] HAUSKNECHT M, STONE P. Deep recurrent Q-learning for partially observable MDPs[EB/OL]. [2023-05-02]. https://arxiv.org/abs/1507.06527v1.
[25] WANG Y, SUN Y B, LIU Z W, et al. Dynamic graph CNN for learning on point clouds[J]. ACM Transactions on Graphics, 2019, 38(5):146.